Common issues when you run or scale large AKS clusters FAQ
This article answers frequently asked questions about common issues that might occur when you run or scale large clusters in Microsoft Azure Kubernetes Service (AKS). A large cluster is any cluster that runs at more than a 500-node scale.
I get a "quota exceeded" error during creation, scale up, or upgrade
To resolve this issue, create a support request in the subscription in which you're trying to create, scale, or upgrade the cluster, and request a quota increase for the corresponding resource type. For more information, see regional compute quotas.
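Before you file the request, you can check your current usage against the regional quota. The following command is a minimal sketch that assumes the eastus region; replace it with your cluster's region:

az vm list-usage --location eastus --output table

The output lists the current vCPU usage and limits per virtual machine family in that region.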
I get an "insufficientSubnetSize" error when I deploy an AKS cluster that uses advanced networking
This error indicates that the subnet that's in use for a cluster no longer has enough available IP addresses within its CIDR range for successful resource assignment. This issue can occur during upgrades, scale-outs, or node pool creation. This issue occurs because the number of free IPs in the subnet is less than the result of the following formula:

number of nodes requested * the node pool's --max-pods value

For example, a scale-out of 50 nodes in a node pool that has a --max-pods value of 30 requires at least 50 * 30 = 1,500 free IP addresses in the subnet.
Prerequisites
To scale beyond 400 nodes, you have to use the Azure CNI networking plug-in.
To help plan your virtual network and subnets to accommodate the number of nodes and pods that you're deploying, see planning IP addresses for your cluster. To reduce the overhead of subnet planning or cluster re-creation that you would do because of IP exhaustion, see Dynamic IP allocation.
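For reference, the following Azure CLI command is a sketch of creating a cluster that uses Azure CNI with a dedicated pod subnet (dynamic IP allocation). The resource group, cluster name, and subnet resource IDs are placeholder values:

az aks create --resource-group MyResourceGroup --name MyManagedCluster --network-plugin azure --vnet-subnet-id <node-subnet-resource-id> --pod-subnet-id <pod-subnet-resource-id>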
Solution
Because you can't update an existing subnet's CIDR range, you must have permission to create a new subnet to resolve this issue. Follow these steps, as sketched in the commands after this list:

1. Create a new subnet that has a larger, non-overlapping CIDR range that's sufficient for your operational goals.
2. Create a new node pool on the new subnet.
3. Drain pods from the old node pool that resides in the subnet that will be replaced.
4. Delete the old node pool, and then delete the old subnet.
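The following commands sketch this workflow. The subnet names, address prefix, node pool names, and node name are placeholder values that depend on your environment:

# Step 1: Create a new, larger, non-overlapping subnet
az network vnet subnet create --resource-group MyResourceGroup --vnet-name MyVnet --name new-subnet --address-prefixes 10.1.0.0/16

# Step 2: Create a new node pool on the new subnet
az aks nodepool add --resource-group MyResourceGroup --cluster-name MyManagedCluster --name newpool --vnet-subnet-id <new-subnet-resource-id>

# Step 3: Drain each node in the old node pool so that pods reschedule onto the new pool
kubectl drain <old-node-name> --ignore-daemonsets --delete-emptydir-data

# Step 4: Delete the old node pool, and then delete the old subnet
az aks nodepool delete --resource-group MyResourceGroup --cluster-name MyManagedCluster --name oldpool
az network vnet subnet delete --resource-group MyResourceGroup --vnet-name MyVnet --name old-subnet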
I'm having sporadic egress connectivity failures because of SNAT port exhaustion
For clusters that run at a relatively large scale (more than 500 nodes), we recommend that you use the AKS Managed Network Address Translation (NAT) Gateway for greater scalability. Azure NAT Gateway allows up to 64,512 outbound UDP and TCP traffic flows per IP address, and a maximum of 16 IP addresses.
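For example, the following command sketches how you might create a cluster that uses a managed NAT gateway for outbound traffic. The resource names and outbound IP count are illustrative only:

az aks create --resource-group MyResourceGroup --name MyManagedCluster --outbound-type managedNATGateway --nat-gateway-managed-outbound-ip-count 4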
If you're not using Managed NAT, see Troubleshoot source network address translation (SNAT) exhaustion and connection timeouts to understand and resolve SNAT port exhaustion issues.
I can't scale up to 5,000 nodes using the Azure portal
Use the Azure CLI to scale up to a maximum of 5,000 nodes by following these steps:
Because the maximum number of nodes per node pool is 1,000, create enough node pools to reach your target node count (for example, five node pools for 5,000 nodes) by running the following command for each pool:
az aks nodepool add --resource-group MyResourceGroup --name nodepool1 --cluster-name MyManagedCluster
Scale up the node pools one at a time. Ideally, allow five minutes of sleep time between consecutive scale-ups of 1,000 nodes. Run the following command:
az aks nodepool scale --resource-group MyResourceGroup --name nodepool1 --cluster-name MyManagedCluster --node-count 1000
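If you prefer to script the process, the following Bash sketch scales several node pools sequentially and pauses for five minutes between operations. The node pool names and target node count are placeholders:

for pool in nodepool1 nodepool2 nodepool3 nodepool4 nodepool5; do
  az aks nodepool scale --resource-group MyResourceGroup --cluster-name MyManagedCluster --name "$pool" --node-count 1000
  sleep 300
done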
My upgrade is running, but it's slow
In its default configuration, AKS surges during an upgrade by taking the following actions:
- Creating one new node.
- Scaling the node pool beyond the desired number of nodes by one node.
For the max surge settings, a default value of one node means that AKS creates one new node before it drains the existing applications and replaces an earlier-versioned node. This extra node lets AKS minimize workload disruption.
When you upgrade clusters that have many nodes, it can take several hours to upgrade the entire cluster if you use the default max-surge value. You can customize the max-surge property per node pool to enable a tradeoff between upgrade speed and upgrade disruption. By increasing the max surge value, you enable the upgrade process to finish sooner. However, a large max surge value might also cause disruptions during the upgrade process.
Run the following command to increase or customize the max surge for an existing node pool:
az aks nodepool update --resource-group MyResourceGroup --name mynodepool --cluster-name MyManagedCluster --max-surge 5
It's also important to consider how your deployment settings might delay the completion of the upgrade or scale operation:
- B-series VMs aren't supported by AKS for the system node pool, and they can experience low performance during and after upgrades.
- Check your deployment's pod disruption budget (PDB) settings to make sure they allow pods to be evicted so that the upgrade can succeed; see the sample manifest after this list. For more information, see AKS workload best practices.
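As an illustration, the following PodDisruptionBudget manifest lets one pod of a hypothetical internal-app deployment be evicted at a time so that node drains can proceed during the upgrade. A PDB that never allows disruptions (for example, maxUnavailable: 0) can stall node drains and delay the upgrade:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: internal-app-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: internal-app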
Tip
To get more insights about this behavior, you can view error details on the Activity Log page in the Azure portal or review the resource logs on your cluster.
My upgrade is reaching the quota limit (5,000-node cluster)
To resolve this issue, see Increase regional vCPU quotas.
My internal service creation at more than 750 nodes is slow or failing because of a time-out error
Standard Load Balancer (SLB) back-end pool updates are a known performance bottleneck. We're working on a new capability that will enable faster creation of services and SLB at scale. To send us your feedback about this issue, see Azure Kubernetes support for load balancer with IP-based back-end pool.
Solution
We recommend that you scale down the cluster to fewer than 750 nodes, and then create an internal load balancer for the cluster. To create an internal load balancer, create a service of the LoadBalancer type that has the azure-load-balancer-internal annotation, per the following example procedure.
Step 1: Create an internal load balancer
To create an internal load balancer, create a service manifest that's named internal-lb.yaml and that contains the LoadBalancer service type and the azure-load-balancer-internal annotation, as shown in the following example:
apiVersion: v1
kind: Service
metadata:
  name: internal-app
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: internal-app
Step 2: Deploy the internal load balancer
Deploy the internal load balancer by using the kubectl apply command, and specify the name of your YAML manifest, as shown in the following example:
kubectl apply -f internal-lb.yaml
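You can then verify that the service received a private IP address from the cluster's virtual network:

kubectl get service internal-app

For an internal load balancer, the EXTERNAL-IP column shows the private IP address that's assigned to the service.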
After your cluster is created, you can also provision an internal load balancer (per this procedure), and keep an internal load-balanced service running. Doing this enables you to add more services to the load balancer at scale.
SLB service creation at scale takes hours to run
SLB back-end pool updates are a known performance bottleneck. We're working on a new capability that will allow you to run load balanced services at scale with considerably faster performance for create, update, and delete operations. To send us your feedback, see Azure Kubernetes support for load balancer with IP-based back-end pool.
Third-party information disclaimer
The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.