AKS Scheduler Behavior with Taints and Tolerations on Mixed Nodes

Suresh Shankar 20 Reputation points
2024-11-29T16:27:30.04+00:00

I recently applied taints and tolerations in an AKS cluster with multiple nodes. The setup consists of one system node and two user nodes. I tainted one of the user nodes, while the remaining two nodes (one user node and the system node) do not have taints.

When I deploy a pod whose YAML includes a toleration matching the taint on the tainted user node, the pod ends up scheduled on the system node, which has no taints. This behavior is unexpected.

It seems that if any node in the cluster is missing the specified taint, the scheduler will assign the pod to that node, even if there are nodes with matching taints and tolerations. Is this the expected behavior of the AKS scheduler?
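Roughly what I am applying, shown here with placeholder names rather than my actual node and pod names:

    # Taint on one of the user nodes (placeholder node name and key/value)
    kubectl taint nodes <user-node-name> workload=special:NoSchedule

    # pod.yaml - the pod carries a matching toleration
    apiVersion: v1
    kind: Pod
    metadata:
      name: toleration-test
    spec:
      containers:
      - name: app
        image: nginx
      tolerations:
      - key: "workload"
        operator: "Equal"
        value: "special"
        effect: "NoSchedule"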

Azure Kubernetes Service (AKS)

Accepted answer
  1. Akram Kathimi 1,201 Reputation points Microsoft Employee
    2024-11-30T09:07:12.01+00:00

    Hi @Suresh Shankar ,

    Akram from the AKS team here. Thank you for posting this question.

    Yes, this behavior is expected.

    When you taint a node or node pool, you are telling the scheduler that only pods which tolerate that taint may be placed on those nodes.

    However, you are not telling the pod that it is only allowed to run on those nodes.
    To do that, use a node selector or node affinity on the pod; see the sketch below and the Kubernetes documentation on assigning pods to nodes: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
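    As a minimal sketch (the label key/value below are placeholders, assuming the tainted user node also carries a matching label such as workload=special), the pod needs both the toleration and a node selector, or node affinity, to be pinned to that node:

        apiVersion: v1
        kind: Pod
        metadata:
          name: pinned-pod
        spec:
          containers:
          - name: app
            image: nginx
          # The toleration allows scheduling onto the tainted node ...
          tolerations:
          - key: "workload"
            operator: "Equal"
            value: "special"
            effect: "NoSchedule"
          # ... and the node selector restricts the pod to nodes carrying the
          # matching label, so it can no longer land on the untainted nodes.
          nodeSelector:
            workload: special

    With only the toleration, every untainted node (including the system node) remains a valid scheduling target, which is exactly what you observed.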

    Please mark this as the correct answer if it helped you.

    Thanks.


1 additional answer

  1. Andriy Bilous 11,536 Reputation points MVP
    2024-11-29T21:00:02.6333333+00:00

    Hello Suresh Shankar

    Node taints work by marking a node so that the scheduler avoids placing certain pods on the marked nodes. You can place tolerations on a pod to allow the scheduler to schedule that pod on a node with a matching taint. Taints and tolerations work together to help you control how the scheduler places pods onto nodes. For more information, see example use cases of taints and tolerations.

    Taints are key-value pairs with an effect. There are three values for the effect field when using node taints: NoExecute, NoSchedule, and PreferNoSchedule.

    • NoExecute: Pods already running on the node are immediately evicted if they don't have a matching toleration. Pods with a matching toleration stay bound; if tolerationSeconds is specified, they are evicted once that time elapses.
    • NoSchedule: Only pods with a matching toleration are placed on this node. Existing pods aren't evicted.
    • PreferNoSchedule: A soft version of NoSchedule. The scheduler tries to avoid placing pods that don't have a matching toleration on this node, but it isn't guaranteed.

    https://learn.microsoft.com/en-us/azure/aks/use-node-taints
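    For illustration, applying and removing a taint with kubectl looks like this (node name and key/value are placeholders). Note that on AKS the recommended approach is to set taints on the node pool (for example with az aks nodepool add --node-taints) so they survive scale and upgrade operations, rather than tainting individual nodes with kubectl:

        # NoSchedule: new pods without a matching toleration are not placed here
        kubectl taint nodes <node-name> sku=gpu:NoSchedule

        # NoExecute: additionally evicts running pods that don't tolerate the taint
        kubectl taint nodes <node-name> sku=gpu:NoExecute

        # PreferNoSchedule: soft preference; the scheduler tries to avoid the node
        kubectl taint nodes <node-name> sku=gpu:PreferNoSchedule

        # Remove a taint by appending "-" to the effect
        kubectl taint nodes <node-name> sku=gpu:NoSchedule-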

    Dedicated Nodes: If you want to dedicate a set of nodes for exclusive use by a particular set of users, you can add a taint to those nodes (say, kubectl taint nodes nodename dedicated=groupName:NoSchedule) and then add a corresponding toleration to their pods (this would be done most easily by writing a custom admission controller). The pods with the tolerations will then be allowed to use the tainted (dedicated) nodes as well as any other nodes in the cluster.

    If you want to dedicate the nodes to them and ensure they only use the dedicated nodes, then you should additionally add a label similar to the taint to the same set of nodes (e.g. dedicated=groupName), and the admission controller should additionally add a node affinity to require that the pods can only schedule onto nodes labeled with dedicated=groupName.

    https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
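    To make that pattern concrete, here is a minimal sketch (group name and node name are placeholders): taint and label the dedicated nodes, then give the pod both a toleration and a required node affinity so it can only land on those nodes:

        # Taint and label the dedicated node(s)
        kubectl taint nodes <node-name> dedicated=groupName:NoSchedule
        kubectl label nodes <node-name> dedicated=groupName

        # Pod that tolerates the taint and is restricted to the labeled nodes
        apiVersion: v1
        kind: Pod
        metadata:
          name: dedicated-workload
        spec:
          containers:
          - name: app
            image: nginx
          tolerations:
          - key: "dedicated"
            operator: "Equal"
            value: "groupName"
            effect: "NoSchedule"
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: dedicated
                    operator: In
                    values:
                    - groupName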

    1 person found this answer helpful.
