Strategies regarding number of node pools and their management?

Tanul 1,251 Reputation points
2023-04-13T07:52:50.76+00:00

Team, we need to create a new cluster, and I have a few questions about it:

  1. How many node pools are recommended for a well-managed production cluster?
  2. If multiple node pools are a good approach, how should we organize the user and system nodes across them, and how many nodes should we reserve for system node pools?
  3. We are going to use an internal load balancer and delete the default load balancer that AKS creates. Will this lead to connectivity issues across node pools?
  4. Can each node pool have nodes of a different size? For example, node pool 1 has 2 nodes of SKU B2s and node pool 2 has all nodes of size D2S.
  5. Should we use Azure CNI or Kubenet for networking in this case?
  6. Once our architecture diagram is ready, is it possible to connect with a Microsoft architect who can review our architecture and sign off on the design? Should we reach out to our CSAM for this?

We would be grateful if someone could advise us on these queries. Thank you very much. Kind regards, Tanul

Azure Kubernetes Service (AKS)
An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.
Azure Virtual Machine Scale Sets
Azure compute resources that are used to create and manage groups of heterogeneous load-balanced virtual machines.

Accepted answer
  1. Andrei Barbu 2,576 Reputation points Microsoft Employee
    2023-04-13T08:19:03.42+00:00

    Hello Tanul, let me try to answer your questions below:
    1 - As per https://learn.microsoft.com/en-us/azure/aks/use-system-pools?tabs=azure-cli, it is recommended to have at least one system node pool dedicated to system components and one user node pool for application workloads. So the answer is at least two node pools.

    2 - Labels and/or taints/tolerations can be used to separate system and application workloads. See the link above for more details. The number of nodes depends on your workload, so assess it based on how resource-intensive your applications are. As per that link, "If you run a single system node pool for your AKS cluster in a production environment, we recommend you use at least three nodes for the node pool."
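As a rough illustration of the taint approach, the sketch below assembles an `az aks nodepool add` call for a dedicated three-node system pool. The helper function and the resource group, cluster, and pool names are hypothetical placeholders; the `CriticalAddonsOnly=true:NoSchedule` taint is the one described in the linked system node pools article.

```python
import shlex


def system_pool_command(resource_group: str, cluster: str,
                        pool_name: str = "systempool",
                        node_count: int = 3) -> list[str]:
    """Build (but do not run) an az CLI call that adds a dedicated system pool.

    All names passed in are placeholders chosen by the caller.
    """
    return [
        "az", "aks", "nodepool", "add",
        "--resource-group", resource_group,
        "--cluster-name", cluster,
        "--name", pool_name,
        "--node-count", str(node_count),
        "--mode", "System",
        # Keeps application pods off this pool unless they tolerate the taint.
        "--node-taints", "CriticalAddonsOnly=true:NoSchedule",
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(system_pool_command("myResourceGroup", "myAKSCluster")))
```

Printing the command rather than executing it lets you review the flags before running them against a real cluster.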

    3 - You should not delete the default load balancer (called "kubernetes") that the AKS cluster creates. That is not supported and will put your AKS cluster in an unsupported state. If you don't want to have the load balancer, you should use the UDR (user-defined routing) outbound type. More details: https://learn.microsoft.com/en-us/azure/aks/egress-udr
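For orientation only, a cluster with the UDR outbound type might be created along these lines. The helper function and every name passed to it are hypothetical, and the sketch assumes the subnet already exists and already has a route table with a default route to your firewall or appliance, as the egress article requires.

```python
import shlex


def udr_cluster_command(resource_group: str, cluster: str,
                        subnet_id: str) -> list[str]:
    """Build (but do not run) an az CLI call for a cluster with UDR egress.

    subnet_id is the full resource ID of a pre-configured subnet (assumption:
    its route table is already in place before the cluster is created).
    """
    return [
        "az", "aks", "create",
        "--resource-group", resource_group,
        "--name", cluster,
        "--vnet-subnet-id", subnet_id,
        # Egress follows the subnet's route table rather than the
        # load balancer's outbound rules.
        "--outbound-type", "userDefinedRouting",
    ]


if __name__ == "__main__":
    # "<subnet-resource-id>" is a placeholder to be filled in by the reader.
    print(shlex.join(udr_cluster_command(
        "myResourceGroup", "myAKSCluster", "<subnet-resource-id>")))
```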

    4 - Yes, each node pool can have different SKU sizes or the same size. There is no restriction here.

    5 - The details provided are not enough to say whether Azure CNI or Kubenet should be used. I recommend reading the following to understand which is best for you:

    https://learn.microsoft.com/en-us/azure/aks/concepts-network#kubenet-basic-networking

    https://learn.microsoft.com/en-us/azure/aks/configure-kubenet

    https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni

    6 - Yes, you should contact the CSAM to assign a Cloud Solution Architect, if available as per your contract.

    Please "Accept the answer" and "Upvote" it if it was helpful.

    Thank you!


1 additional answer

  1. Ammar-Abdelqader01 926 Reputation points Microsoft Employee
    2023-04-13T08:38:04.4166667+00:00

    Hello @Tanul, let me answer your questions below:

    1 - How many node pools are considered as part of better management of the production cluster? At least two node pools:

    a - A system node pool with at least one node: https://learn.microsoft.com/en-us/azure/aks/use-system-pools?tabs=azure-cli

    To keep the system node pool dedicated to system pods, add the CriticalAddonsOnly=true:NoSchedule taint to it, as described here: https://learn.microsoft.com/en-us/azure/aks/use-system-pools?tabs=azure-cli#add-a-dedicated-system-node-pool-to-an-existing-aks-cluster


    b - You can add as many user node pools as your workload requires.

    2 - If multiple node pools are a good approach, then how to organize the user and system nodes across the multiple node pools and how many nodes should we keep only for system node pools?

    As mentioned under the first question, you need at least one system node pool. The number of user node pools depends on your application's resource requirements, such as CPU and memory. Calculate those first, and add more user node pools as the workload requires.

    3 - We are going to use an internal load balancer and delete the default load balancer which AKS creates. Does using this lead to issues in the connectivity across node pools?

    Manually deleting AKS resources after the cluster has been created is not supported. Instead, you can create the AKS cluster with the UDR outbound type; it will not create a public load balancer, and you can restrict the egress traffic:
    https://learn.microsoft.com/en-us/azure/aks/limit-egress-traffic#deploy-aks-with-outbound-type-of-udr-to-the-existing-network
    Or you can go with a private AKS cluster:
    https://learn.microsoft.com/en-us/azure/aks/private-clusters
    4 - Can each node pool have nodes of a different size? For example, node pool 1 has 2 nodes of SKU B2s and node pool 2 has all nodes of size D2S.

    Yes. When you create the AKS cluster or add a new node pool, select the VM SKU with this flag:
    --node-vm-size Standard_D2pds_v5

    https://learn.microsoft.com/en-us/azure/aks/use-multiple-node-pools#add-a-node-pool
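To show how the flag fits into a full command, the sketch below builds one `az aks nodepool add` call per user pool, each with its own VM size. The pool names, node counts, and the exact SKUs (`Standard_B2s`, `Standard_D2s_v3`) are illustrative assumptions based on the sizes mentioned in the question, and the helper function is hypothetical.

```python
import shlex

# Hypothetical layout: pool name -> (VM size, node count).
POOLS = {
    "userb2s": ("Standard_B2s", 2),
    "userd2s": ("Standard_D2s_v3", 3),
}


def user_pool_commands(resource_group: str, cluster: str,
                       pools: dict[str, tuple[str, int]]) -> list[list[str]]:
    """Build (but do not run) one az CLI call per user node pool."""
    commands = []
    for name, (vm_size, node_count) in pools.items():
        commands.append([
            "az", "aks", "nodepool", "add",
            "--resource-group", resource_group,
            "--cluster-name", cluster,
            "--name", name,
            "--mode", "User",
            "--node-count", str(node_count),
            # Each pool can use a different VM SKU.
            "--node-vm-size", vm_size,
        ])
    return commands


if __name__ == "__main__":
    for cmd in user_pool_commands("myResourceGroup", "myAKSCluster", POOLS):
        print(shlex.join(cmd))
```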
    5 - Should we use Azure CNI or Kubenet for networking in such case?

    You should be familiar with both Kubenet and Azure CNI to select what is better for your environment:

    1 - Kubenet is used when you have many pods and nodes and want to reduce the number of IPs consumed on the subnet. Expect more latency, because the number of hops increases. https://learn.microsoft.com/en-us/azure/aks/configure-kubenet
    2 - Azure CNI uses a higher number of IPs on the subnet, because the pods take their IPs from the same subnet range as the nodes. Performance is much better, because latency is reduced. https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni
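To make the IP trade-off concrete, here is a small sizing sketch. It assumes traditional Azure CNI, where every node and every pod slot takes an address from the subnet, and Kubenet, where only the nodes do. The one extra node reserved for upgrades and the 30 pods per node in the example are assumptions to adjust for your cluster, not values from this thread; the five addresses that Azure reserves in every subnet are subtracted when picking a prefix.

```python
AZURE_RESERVED_PER_SUBNET = 5  # addresses Azure reserves in each subnet


def azure_cni_ips(nodes: int, max_pods: int, surge_nodes: int = 1) -> int:
    """Subnet IPs needed when nodes and pod slots both use subnet addresses."""
    total_nodes = nodes + surge_nodes
    return total_nodes + total_nodes * max_pods


def kubenet_ips(nodes: int, surge_nodes: int = 1) -> int:
    """Subnet IPs needed when only nodes use subnet addresses."""
    return nodes + surge_nodes


def smallest_prefix(required_ips: int) -> int:
    """Longest IPv4 prefix (smallest subnet) that still fits required_ips."""
    for prefix in range(29, 7, -1):  # /29 down to /8
        usable = 2 ** (32 - prefix) - AZURE_RESERVED_PER_SUBNET
        if usable >= required_ips:
            return prefix
    raise ValueError("requirement does not fit in a /8 subnet")


if __name__ == "__main__":
    nodes, max_pods = 50, 30  # example figures, not a recommendation
    for label, needed in (("Azure CNI", azure_cni_ips(nodes, max_pods)),
                          ("Kubenet", kubenet_ips(nodes))):
        print(f"{label}: {needed} IPs, subnet /{smallest_prefix(needed)}")
```

With these example figures, the Azure CNI subnet has to be many times larger than the Kubenet one, which is the main planning cost to weigh against the lower latency.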
    6 - Once our architecture diagram is ready, in Microsoft, is it possible to connect with any architect who can review our architecture and give us the sign off on the design. Should we reach out to our CSAM for this?

    It depends on your support plan. If your plan includes a CSAM, the CSAM will help you find a CSA (Cloud Solution Architect) to assist with your architecture.
    Please "Accept the answer" and "Upvote" it if it was helpful. Thank you!