This section of the Azure Kubernetes Service (AKS) day-2 operations guide describes patching and upgrading strategies for AKS worker nodes and Kubernetes versions. As a cluster operator, you need to have a plan for keeping your clusters up to date and monitoring Kubernetes API changes and deprecations over time.
There are three types of updates for AKS, each one building on the next:
Update type | Frequency of upgrade | Planned Maintenance supported | Supported operation methods | Target | Link to documentation |
---|---|---|---|---|---|
Node OS security patches | Nightly | Yes | Automatic (weekly), manual/unmanaged (nightly) | Node | Auto Upgrade Node Images |
Node image version upgrades | Linux: Weekly; Windows: Monthly | Yes | Automatic, manual | Node pool | AKS node image upgrade |
Kubernetes version (cluster) upgrades | Quarterly | Yes | Automatic, manual | Cluster and node pool | AKS cluster upgrade |
Node OS security patches (Linux only). For Linux nodes, both Canonical Ubuntu and Azure Linux make operating system security patches available once per day. Microsoft tests and bundles these patches in the weekly updates to node images.
Weekly updates to node images. AKS provides weekly updates to node images. These updates include the latest OS and AKS security patches, bug fixes, and enhancements. Node updates don't change the Kubernetes version. Versions are formatted by date (for example, 202311.07.0) for Linux and by Windows Server OS build and date (for example, 20348.2113.231115) for Windows. For more information, see AKS Release Status.
Quarterly Kubernetes releases. AKS provides quarterly updates for Kubernetes releases. These updates allow AKS users to take advantage of the latest Kubernetes features and enhancements. They include security patches and node image updates. For more information, see Supported Kubernetes versions in AKS.
To ensure the smooth operation of your AKS cluster during maintenance, follow these best practices:
Warning
Misconfigured pod disruption budgets (PDBs) can block the upgrade process because the Kubernetes API prevents the cordon and drain that a rolling node image upgrade requires. Conversely, if too many pods are moved simultaneously, an application outage can occur. A correctly configured PDB mitigates this risk.
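As a minimal sketch of a PDB that still lets a node drain make progress, the following snippet writes a manifest that allows one pod of a workload to be unavailable at a time. The PDB name `my-app-pdb`, the label `app: my-app`, and the file name `pdb.yaml` are hypothetical placeholders, and the right `maxUnavailable` value depends on your replica count.

```shell
# Write a PodDisruptionBudget manifest to pdb.yaml.
# The name "my-app-pdb" and the label "app: my-app" are placeholders (assumptions).
cat > pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  # Allow at most one pod to be evicted at a time, so a drain can proceed.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
EOF

# Apply it to the cluster once the selector matches your workload:
# kubectl apply -f pdb.yaml
```

A PDB that allows zero disruptions, for example `maxUnavailable: 0` or a `minAvailable` value equal to the replica count, is the kind of misconfiguration that blocks the drain.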
Microsoft creates a new node image for AKS nodes approximately once per week. A node image contains up-to-date OS security patches, OS kernel updates, Kubernetes security updates, updated versions of binaries like kubelet, and component version updates that are listed in the release notes.
When a node image is updated, a cordon and drain action is triggered on the target node pool's nodes:
A similar process occurs during a cluster upgrade.
Most clusters should use the NodeImage update channel. This channel provides an updated node image VHD weekly, and the update is applied according to your cluster's maintenance window.
Available channels include the following:
None. No updates are automatically applied.
Unmanaged. Ubuntu and Azure Linux updates are applied by the OS on a nightly basis. Reboots must be managed separately. AKS can neither test these updates nor control their cadence.
SecurityPatch. OS security patches that are AKS-tested, fully managed, and applied with safe deployment practices. This channel contains only security updates, not OS bug fixes.
NodeImage. AKS updates the nodes with a newly patched VHD containing security fixes and bug fixes on a weekly cadence. This is fully tested and deployed with safe deployment practices. For real-time information about currently deployed node images, see AKS node image updates.
To understand the default cadences without a maintenance window established, see update ownership and cadence.
If you choose the Unmanaged update channel, you need to manage the reboot process by using a tool like kured. Unmanaged doesn't come with AKS-provided safe deployment practices and doesn't work with maintenance windows. If you choose the SecurityPatch update channel, updates can be applied as frequently as weekly. This patch level requires the VHDs to be stored in your resource group, which incurs a nominal charge. To control when SecurityPatch updates are applied, set an aksManagedNodeOSUpgradeSchedule maintenance window with a cadence that works best for your workload. For more information, see Creating a maintenance window. If you also need the bug fixes that typically come with new node images (VHDs), choose the NodeImage channel instead of SecurityPatch.
As a best practice, use the NodeImage update channel and configure an aksManagedNodeOSUpgradeSchedule maintenance window to a time when the cluster is outside of peak usage windows.
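The channel itself is a cluster setting. A minimal sketch of selecting it with the Azure CLI follows; the function name `set_node_os_channel` is illustrative, and the sketch assumes that the Azure CLI is installed and signed in.

```shell
# Set the node OS upgrade channel for a cluster.
# Usage: set_node_os_channel <ResourceGroupName> <AKSClusterName> [channel]
# The function name is illustrative; the channel defaults to NodeImage.
set_node_os_channel() {
  az aks update \
    --resource-group "$1" \
    --name "$2" \
    --node-os-upgrade-channel "${3:-NodeImage}"
}

# Example (placeholders, as in the rest of this article):
# set_node_os_channel <ResourceGroupName> <AKSClusterName> SecurityPatch
```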
See Creating a maintenance window for attributes that you can use to configure the cluster maintenance window. The key attributes are:
name. Use aksManagedNodeOSUpgradeSchedule for node OS updates.
utcOffset. Configure the time zone.
startTime. Set the start time of the maintenance window.
dayOfWeek. Set the days of the week for the window. For example, Saturday.
schedule. Set the frequency of the window. For NodeImage updates, we recommend weekly.
durationHours. Set this attribute to at least four hours.
This example sets a weekly maintenance window to 8:00 PM Eastern Time on Saturdays:
az aks maintenanceconfiguration add -g <ResourceGroupName> --cluster-name <AKSClusterName> --name aksManagedNodeOSUpgradeSchedule --utc-offset=-05:00 --start-time 20:00 --day-of-week Saturday --schedule-type weekly --duration 4
For more examples, see Add a maintenance window configuration with Azure CLI.
This configuration would ideally be deployed as part of the infrastructure-as-code deployment of the cluster.
You can check for configured maintenance windows by using the Azure CLI:
az aks maintenanceconfiguration list -g <ResourceGroupName> --cluster-name <AKSClusterName>
You can also check the details of a specific maintenance window by using the CLI:
az aks maintenanceconfiguration show -g <ResourceGroupName> --cluster-name <AKSClusterName> --name aksManagedNodeOSUpgradeSchedule
If a cluster maintenance window isn't configured, node image updates occur biweekly. As much as possible, AKS maintenance occurs within the configured window, but the time of maintenance isn't guaranteed.
Important
If you have a node pool with a large number of nodes but it isn't configured with node surge, the auto upgrade event might not trigger. Node images in a node pool will only be upgraded while the estimated total upgrade time is within 24 hours.
In this situation, you can consider one of the following:
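One way to address the missing node surge is to set a max surge value on the node pool, so that more nodes are upgraded in parallel and the estimated upgrade time drops. The following is a sketch under assumptions: the function name `set_max_surge` and the 33% default are illustrative, not values recommended by this guide.

```shell
# Configure node surge on an existing node pool.
# Usage: set_max_surge <ResourceGroupName> <AKSClusterName> <NodePoolName> [surge]
# The function name and the 33% default are illustrative assumptions.
set_max_surge() {
  az aks nodepool update \
    --resource-group "$1" \
    --cluster-name "$2" \
    --name "$3" \
    --max-surge "${4:-33%}"
}

# Example:
# set_max_surge <ResourceGroupName> <AKSClusterName> <NodePoolName> 50%
```

A higher surge value shortens the upgrade but temporarily needs more quota and moves more pods at once, so it should be weighed against your PDBs.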
To monitor the status of updates automatically, you can use AKS Service Communication Manager, which provides automatic alerts for planned maintenance activities. Alternatively, you can monitor updates via Azure activity logs or by reviewing the events on the cluster with kubectl get events.
You can also subscribe to Azure Kubernetes Service (AKS) events with Azure Event Grid, which include AKS upgrade events. These events can alert you when a new version of Kubernetes is available and help you track node status changes during upgrade processes.
You can also manage the weekly update process by using GitHub Actions. This method provides more granular control of the update process.
You can use the kubectl describe nodes command to determine the OS kernel version and the OS image version of the nodes in your cluster:
kubectl describe nodes <NodeName>
Example output (truncated):
System Info:
  Machine ID:                 bb2e85e682ae475289f2e2ca4ed6c579
  System UUID:                6f80de9d-91ba-490c-8e14-9e68b7b82a76
  Boot ID:                    3aed0fd5-5d1d-4e43-b7d6-4e840c8ee3cf
  Kernel Version:             5.15.173.1-1.cm2
  OS Image:                   CBL-Mariner/Linux
  Operating System:           linux
  Architecture:               arm64
  Container Runtime Version:  containerd://1.6.26
  Kubelet Version:            v1.31.3
  Kube-Proxy Version:         v1.31.3
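To compare these values across all nodes at once instead of describing each node, a custom-columns query can pull the same fields from each node's status. This is a minimal sketch; the function name `list_node_versions` and the column headers are illustrative choices.

```shell
# Print kernel, OS image, and kubelet versions for every node in one table.
# The function name is illustrative; the field paths come from each node's status.nodeInfo.
list_node_versions() {
  kubectl get nodes \
    --output "custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion,OS_IMAGE:.status.nodeInfo.osImage,KUBELET:.status.nodeInfo.kubeletVersion"
}

# Example:
# list_node_versions
```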
Use the Azure CLI az aks nodepool list command to determine the node image versions of the nodes in a cluster:
az aks nodepool list \
--resource-group <ResourceGroupName> --cluster-name <AKSClusterName> \
--query "[].{Name:name,NodeImageVersion:nodeImageVersion}" --output table
Example output:
Name          NodeImageVersion
------------  ---------------------------------------------
systempool    AKSUbuntu-2204gen2containerd-202307.12.0
usernodepool  AKSUbuntu-2204gen2arm64containerd-202307.12.0
Use az aks nodepool get-upgrades to determine the latest available node image version for a specific node pool:
az aks nodepool get-upgrades \
--resource-group <ResourceGroupName> \
--cluster-name <AKSClusterName> \
--nodepool-name <NodePoolName> --output table
Example output:
Name    NodeImageVersion
------  -------------------------------------
system  AKSAzureLinux-V2gen2-202501.12.0
user    AKSAzureLinux-V2gen2arm64-202501.12.0
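The two listings can be combined into a simple check: if the current node image release is older than the latest available one, upgrade only the node image. The following is a sketch that assumes the Linux naming shown in the examples, where the date-based release follows the last hyphen; the function names are illustrative.

```shell
# Print the release part (for example, 202501.12.0) of a Linux node image version.
node_image_release() {
  printf '%s\n' "${1##*-}"
}

# Succeed (exit status 0) when the image in $1 has an older release than the image in $2.
node_image_is_older() {
  local cur new
  cur="$(node_image_release "$1")"
  new="$(node_image_release "$2")"
  [ "$cur" != "$new" ] || return 1
  # Sort the dot-separated fields numerically; the first line is the older release.
  [ "$(printf '%s\n%s\n' "$cur" "$new" | sort -t . -k 1,1n -k 2,2n -k 3,3n | head -n 1)" = "$cur" ]
}

# Upgrade only the node image of a node pool (the Kubernetes version stays the same).
# Usage: upgrade_node_image <ResourceGroupName> <AKSClusterName> <NodePoolName>
upgrade_node_image() {
  az aks nodepool upgrade \
    --resource-group "$1" --cluster-name "$2" --name "$3" \
    --node-image-only
}

# Example:
# node_image_is_older "<CurrentNodeImageVersion>" "<LatestNodeImageVersion>" &&
#   upgrade_node_image <ResourceGroupName> <AKSClusterName> <NodePoolName>
```

Windows node image versions use a build-and-date format, so this comparison isn't meant for them.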
The Kubernetes community releases minor versions of Kubernetes approximately every three months. To keep you informed about new AKS versions and releases, the AKS release notes page is updated regularly. You can also subscribe to the GitHub AKS RSS feed, which provides real-time updates about changes and enhancements.
AKS follows an N - 2 support policy, which means that full support is provided for the latest release (N) and two previous minor versions. Limited platform support is offered for the third prior minor version. For more information, see AKS support policy.
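The N - 2 rule reduces to a subtraction on minor versions. The following sketch illustrates the arithmetic only: the function names are hypothetical, both versions are assumed to share the same major version, and the latest AKS version has to be supplied by the caller.

```shell
# Print the minor number of a Kubernetes version: "1.30.7" -> "30".
k8s_minor() {
  local rest="${1#*.}"
  printf '%s\n' "${rest%%.*}"
}

# Succeed when the cluster version ($1) is within N - 2 of the latest AKS version ($2).
# Assumes both versions share the same major version.
is_fully_supported() {
  local gap=$(( $(k8s_minor "$2") - $(k8s_minor "$1") ))
  [ "$gap" -le 2 ]
}

# Example: with 1.31 as the latest version, 1.29 is still within N - 2 and 1.28 is not.
# is_fully_supported 1.29.4 1.31.3 && echo "full support"
```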
To ensure that your AKS clusters remain supported, you need to establish a continuous cluster upgrade process. This process involves testing new versions in lower environments and planning the upgrade to production before the new version becomes the default. This approach can maintain predictability in your upgrade process and minimize disruptions to applications. For more information, see Upgrade an AKS cluster.
If your cluster requires a longer upgrade cycle, use AKS versions that support the Long Term Support (LTS) option. If you enable the LTS option, Microsoft provides extended support for Kubernetes versions for two years, which enables a more prolonged and controlled upgrade cycle. For more information, see Supported Kubernetes versions in AKS.
A cluster upgrade includes a node upgrade and uses a similar cordon and drain process.
As a best practice, you should always upgrade and test in lower environments to minimize the risk of disruption in production. Cluster upgrades require extra testing because they involve API changes, which can affect Kubernetes deployments. The following resources can assist you in the upgrade process:
In addition to these Microsoft resources, consider using open-source tools to optimize your cluster upgrade process. One such tool is Fairwinds pluto, which can scan your deployments and Helm charts for deprecated Kubernetes APIs. These tools can help you ensure that your applications remain compatible with the latest Kubernetes versions.
To check when your cluster requires an upgrade, use az aks get-upgrades to get a list of available upgrade versions for your AKS cluster. Determine the target version for your cluster from the results.
Here's an example:
az aks get-upgrades \
--resource-group <ResourceGroupName> --name <AKSClusterName> --output table
Example output:
MasterVersion  Upgrades
-------------  ---------------------------------
1.30.7         1.31.1, 1.31.2, 1.31.3
Check the Kubernetes versions of the nodes in your node pools to determine the pools that need to be upgraded:
az aks nodepool list \
--resource-group <ResourceGroupName> --cluster-name <AKSClusterName> \
--query "[].{Name:name,k8version:orchestratorVersion}" --output table
Example output:
Name          K8version
------------  ------------
systempool    1.30.7
usernodepool  1.30.7
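After a control plane upgrade, it can help to list only the node pools that still run a different version. The following is a minimal sketch that filters the node pool list with a JMESPath query; the function name is illustrative, and it assumes that the cluster and its node pools report full versions such as 1.30.7 rather than a minor-version alias.

```shell
# Print the names of node pools whose Kubernetes version differs from the control plane's.
# Usage: pools_behind_control_plane <ResourceGroupName> <AKSClusterName>
# Assumes full version strings (for example, 1.30.7) on both the cluster and the pools.
pools_behind_control_plane() {
  local target
  target="$(az aks show --resource-group "$1" --name "$2" \
    --query kubernetesVersion --output tsv)"
  az aks nodepool list --resource-group "$1" --cluster-name "$2" \
    --query "[?orchestratorVersion!='${target}'].name" --output tsv
}

# Example:
# pools_behind_control_plane <ResourceGroupName> <AKSClusterName>
```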
To minimize disruptions and help ensure a smooth upgrade for your AKS cluster, follow this upgrade approach:
By following this approach, you can minimize disruptions during the upgrade process and maintain the availability of your applications. These are the detailed steps:
Run the az aks upgrade command with the --control-plane-only flag to upgrade only the cluster control plane and not the cluster's node pools:
az aks upgrade \
--resource-group <ResourceGroupName> --name <AKSClusterName> \
--control-plane-only \
--kubernetes-version <KubernetesVersion>
Run az aks nodepool upgrade to upgrade node pools to the target version:
az aks nodepool upgrade \
--resource-group <ResourceGroupName> --cluster-name <AKSClusterName> --name <NodePoolName> \
--no-wait --kubernetes-version <KubernetesVersion>
During the node pool upgrade, AKS creates a surge node, cordons the node that's being upgraded and drains its pods, reimages the node, and then uncordons it. This process then repeats for any other nodes in the node pool.
You can check the status of the upgrade process by running kubectl get events.
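Because the node pool upgrade is started with --no-wait, a script that needs to block until the upgrade finishes can poll the node pool's provisioning state. The following is a sketch under assumptions: the function names and the `POLL_SECONDS` variable are hypothetical, and the loop treats Succeeded, Failed, and Canceled as the terminal states.

```shell
# Print the provisioning state of a node pool (for example, Succeeded).
nodepool_state() {
  az aks nodepool show \
    --resource-group "$1" --cluster-name "$2" --name "$3" \
    --query provisioningState --output tsv
}

# Poll until the node pool reaches a terminal state, then print that state.
# POLL_SECONDS is a hypothetical knob for this sketch; it defaults to 30 seconds.
wait_for_nodepool() {
  local state
  while :; do
    state="$(nodepool_state "$1" "$2" "$3")" || return 1
    case "$state" in
      Succeeded|Failed|Canceled) break ;;
    esac
    sleep "${POLL_SECONDS:-30}"
  done
  printf '%s\n' "$state"
}

# Example:
# wait_for_nodepool <ResourceGroupName> <AKSClusterName> <NodePoolName>
```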
For information about troubleshooting cluster upgrade problems, see AKS troubleshooting documentation.
AKS also offers an automatic cluster upgrade solution to keep your cluster up to date. If you use this solution, you should pair it with a maintenance window to control the timing of upgrades. The upgrade window must be four hours or more. When you enroll a cluster in a release channel, Microsoft automatically manages the version and upgrade cadence for the cluster and its node pools.
The cluster auto-upgrade offers different release channel options. Here's a recommended environment and release channel configuration:
Environment | Upgrade channel | Description |
---|---|---|
Production | stable | For stability and version maturity, use the stable or regular channel for production workloads. |
Staging, testing, development | Same as production | To ensure that your tests are indicative of the version that you'll upgrade your production environment to, use the same release channel as production. |
Canary | rapid | To test the latest Kubernetes releases and new AKS features or APIs, use the rapid channel. You can improve your time to market when the version in rapid is promoted to the channel you're using for production. |
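Enrolling a cluster in one of these channels is a single cluster update. A minimal sketch follows, assuming that the Azure CLI is installed and signed in; the function name is illustrative, and the default of stable mirrors the production row in the table.

```shell
# Enroll a cluster in an auto-upgrade channel.
# Usage: set_auto_upgrade_channel <ResourceGroupName> <AKSClusterName> [channel]
# The function name is illustrative; the channel defaults to stable.
set_auto_upgrade_channel() {
  az aks update \
    --resource-group "$1" --name "$2" \
    --auto-upgrade-channel "${3:-stable}"
}

# Example for a canary cluster:
# set_auto_upgrade_channel <ResourceGroupName> <AKSClusterName> rapid
```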
The following table describes the characteristics of various AKS upgrade and patching scenarios:
Scenario | User initiated | Kubernetes upgrade | OS kernel upgrade | Node image upgrade |
---|---|---|---|---|
Security patching | No | No | Yes, after reboot | Yes |
Cluster creation | Yes | Maybe | Yes, if an updated node image uses an updated kernel | Yes, relative to an existing cluster if a new release is available |
Control plane Kubernetes upgrade | Yes | Yes | No | No |
Node pool Kubernetes upgrade | Yes | Yes | Yes, if an updated node image uses an updated kernel | Yes, if a new release is available |
Node pools scale-up | Yes | No | No | No |
Node image upgrade | Yes | No | Yes, if an updated node image uses an updated kernel | Yes |
Cluster auto-upgrade | No | Yes | Yes, if an updated node image uses an updated kernel | Yes, if a new release is available |
This article is maintained by Microsoft. It was originally written by the following contributors.
Principal author:
Other contributors: