As you manage clusters in Azure Kubernetes Service (AKS), you often need to isolate teams and workloads. Advanced features provided by the Kubernetes scheduler let you control which pods can be scheduled on certain nodes and how multi-pod applications are distributed across the cluster.

This best practices article focuses on advanced Kubernetes scheduling features for cluster operators. In this article, you learn how to:

- Use taints and tolerations to limit what pods can be scheduled on nodes.
- Give preference to pods to run on certain nodes with node selectors or node affinity.
- Split apart or group together pods with inter-pod affinity or anti-affinity.
Best practice guidance:
Limit access for resource-intensive applications, such as ingress controllers, to specific nodes. Keep node resources available for workloads that require them, and don't allow scheduling of other workloads on the nodes.
When you create your AKS cluster, you can deploy nodes with GPU support or a large number of powerful CPUs. For more information, see Use GPUs on AKS. You can use these nodes for large data processing workloads such as machine learning (ML) or artificial intelligence (AI).
Because this node resource hardware is typically expensive to deploy, limit the workloads that can be scheduled on these nodes. Similarly, dedicate some nodes in the cluster to run ingress services and prevent other workloads from running on them.
This support for different nodes is provided by using multiple node pools. An AKS cluster supports one or more node pools.
The Kubernetes scheduler uses taints and tolerations to restrict what workloads can run on nodes.
When you deploy a pod to an AKS cluster, Kubernetes only schedules pods on nodes whose taint aligns with the toleration. Taints and tolerations work together to ensure that pods aren't scheduled onto inappropriate nodes. One or more taints are applied to a node, marking the node so that it doesn't accept any pods that don't tolerate the taints.
For example, assume you added a node pool in your AKS cluster for nodes with GPU support. You define a key/value pair, such as sku=gpu, and a scheduling effect, such as NoSchedule. The NoSchedule effect restricts the Kubernetes scheduler from placing pods that don't define a matching toleration on those nodes. The following command applies the taint while creating the node pool:
```azurecli
az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name taintnp \
    --node-taints sku=gpu:NoSchedule \
    --no-wait
```
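To confirm the taint is in place, you can query the node pool or inspect the nodes directly. The following commands are a minimal sketch that assumes your kubeconfig already points at myAKSCluster:

```bash
# Query the taints configured on the node pool through the AKS API
az aks nodepool show \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name taintnp \
    --query nodeTaints

# Inspect the taints Kubernetes sees on each node
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
```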
With a taint applied to nodes in the node pool, you define a toleration in the pod specification that allows scheduling on the nodes. The following example defines sku: gpu and effect: NoSchedule to tolerate the taint applied to the node pool in the previous step:
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: app
spec:
  containers:
  - name: app
    image: <your-workload>:gpu
    resources:
      requests:
        cpu: 0.5
        memory: 2Gi
      limits:
        cpu: 4.0
        memory: 16Gi
  tolerations:
  - key: "sku"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```
When this pod is deployed using kubectl apply -f gpu-toleration.yaml, Kubernetes can successfully schedule the pod on the nodes with the taint applied. This logical isolation lets you control access to resources within a cluster.
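As a quick, hypothetical verification (the pod name app comes from the manifest above), you can check which node the scheduler chose:

```bash
kubectl apply -f gpu-toleration.yaml

# The NODE column should show a node from the tainted taintnp pool
kubectl get pod app -o wide
```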
When you apply taints, work with your application developers and owners to allow them to define the required tolerations in their deployments.
For more information about how to use multiple node pools in AKS, see Create multiple node pools for a cluster in AKS.
When you upgrade a node pool in AKS, taints and tolerations follow a set pattern as they're applied to new nodes.

For default clusters that use Virtual Machine Scale Sets, you can taint a node pool from the AKS API to have newly scaled out nodes receive API-specified node taints. Let's assume:

1. You begin with a two-node cluster: node1 and node2.
2. You upgrade the node pool.
3. Two more nodes are created: node3 and node4.
4. The taints are passed on respectively.
5. The original node1 and node2 are deleted.

For clusters without Virtual Machine Scale Set support, again, let's assume:

1. You have a two-node cluster: node1 and node2.
2. You upgrade the node pool.
3. An extra node is created: node3.
4. The taints from node1 are applied to node3.
5. node1 is deleted.
6. A new node1 is created to replace the original node2, and the node2 taints are applied to the new node1.
7. node2 is deleted.

In essence, node1 becomes node3, and node2 becomes the new node1.
When you scale a node pool in AKS, taints and tolerations don't carry over by design.
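Because taints applied directly to individual nodes don't carry over, set them through the AKS API, as in the earlier az aks nodepool add command, if you want newly scaled nodes to receive them. As a sketch, assuming the same taintnp pool, you can also change pool-level taints after creation:

```azurecli
# Update taints at the pool level so newly scaled nodes receive them
az aks nodepool update \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name taintnp \
    --node-taints sku=gpu:NoSchedule
```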
Best practice guidance:
Control the scheduling of pods on nodes using node selectors, node affinity, or inter-pod affinity. These settings allow the Kubernetes scheduler to logically isolate workloads, such as by hardware in the node.
Taints and tolerations logically isolate resources with a hard cut-off. If the pod doesn't tolerate a node's taint, it isn't scheduled on the node.
Alternatively, you can use node selectors. For example, you label nodes to indicate locally attached SSD storage or a large amount of memory, and then define a node selector in the pod specification. Kubernetes schedules those pods on a matching node.

Unlike tolerations, pods without a matching node selector can still be scheduled on labeled nodes. This behavior lets otherwise unused resources on the nodes be consumed, while priority goes to pods that define the matching node selector.
Let's look at an example of nodes with a high amount of memory. These nodes prioritize pods that request a high amount of memory. To ensure the resources don't sit idle, they also allow other pods to run. The following example command adds a node pool with the label hardware=highmem to the myAKSCluster in the myResourceGroup. All nodes in that node pool have this label.
```azurecli
az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name labelnp \
    --node-count 1 \
    --labels hardware=highmem \
    --no-wait
```
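To verify that the label landed on the new nodes (a minimal sketch, assuming kubectl access to the cluster), filter nodes by the label:

```bash
# List only the nodes that carry the hardware=highmem label
kubectl get nodes -l hardware=highmem
```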
A pod specification then adds the nodeSelector property to define a node selector that matches the label set on a node:
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: app
spec:
  containers:
  - name: app
    image: <your-workload>:gpu
    resources:
      requests:
        cpu: 0.5
        memory: 2Gi
      limits:
        cpu: 4.0
        memory: 16Gi
  nodeSelector:
    hardware: highmem
```
When you use these scheduler options, work with your application developers and owners to allow them to correctly define their pod specifications.
For more information about using node selectors, see Assigning Pods to Nodes.
A node selector is a basic solution for assigning pods to a given node. Node affinity provides more flexibility, allowing you to define what happens if the pod can't be matched with a node. You can:

- Require that the Kubernetes scheduler matches a pod with a labeled host.
- Prefer a match, but allow the pod to be scheduled on a different host if no match is available.
The following example sets the node affinity to requiredDuringSchedulingIgnoredDuringExecution. This affinity requires the Kubernetes scheduler to use a node with a matching label. If no node is available, the pod has to wait for scheduling to continue. To allow the pod to be scheduled on a different node, you can instead set the value to preferredDuringSchedulingIgnoredDuringExecution:
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: app
spec:
  containers:
  - name: app
    image: <your-workload>:gpu
    resources:
      requests:
        cpu: 0.5
        memory: 2Gi
      limits:
        cpu: 4.0
        memory: 16Gi
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: hardware
            operator: In
            values:
            - highmem
```
The IgnoredDuringExecution part of the setting indicates that the pod shouldn't be evicted from the node if the node labels change. The Kubernetes scheduler only uses the updated node labels for new pods being scheduled, not pods already scheduled on the nodes.
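For comparison, here's a hypothetical variant of the affinity stanza above using preferredDuringSchedulingIgnoredDuringExecution. The weight (1-100) expresses how strongly the scheduler favors matching nodes; the rest of the pod specification stays the same:

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    # The scheduler favors highmem nodes but falls back to any node
    - weight: 100
      preference:
        matchExpressions:
        - key: hardware
          operator: In
          values:
          - highmem
```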
For more information, see Affinity and anti-affinity.
One final approach for the Kubernetes scheduler to logically isolate workloads is using inter-pod affinity or anti-affinity. These settings define that pods either shouldn't or should be scheduled on a node that has an existing matching pod. By default, the Kubernetes scheduler tries to schedule multiple pods in a replica set across nodes. You can define more specific rules around this behavior.
For example, you have a web application that also uses an Azure Cache for Redis:

1. You use pod anti-affinity rules to request that the Kubernetes scheduler distributes the cache replicas across nodes.
2. You use affinity rules to ensure each web app component is scheduled on the same host as a corresponding cache.
The distribution of pods across nodes looks like the following example:
| Node 1 | Node 2 | Node 3 |
|--------|--------|--------|
| webapp-1 | webapp-2 | webapp-3 |
| cache-1 | cache-2 | cache-3 |
Inter-pod affinity and anti-affinity enable more complex deployments than node selectors or node affinity alone. With these rules, you logically isolate resources and control how Kubernetes schedules pods on nodes.
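As an illustration only (the app: cache label and the pod-template placement are hypothetical, not taken from the linked example), affinity stanzas that produce the layout in the table above might look like this:

```yaml
# In the cache pod template: spread cache replicas across nodes
# by repelling other pods that carry the (hypothetical) app: cache label.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: cache
      topologyKey: kubernetes.io/hostname

# In the web app pod template: co-locate each replica with a cache pod
# on the same node.
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: cache
      topologyKey: kubernetes.io/hostname
```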
For a complete example of this web application with Azure Cache for Redis scenario, see Co-locate pods on the same node.
This article focused on advanced Kubernetes scheduler features. For more information about cluster operations in AKS, see the following best practices:

- Multi-tenancy and cluster isolation
- Basic Kubernetes scheduler features