Use planned maintenance to schedule and control upgrades for your Azure Kubernetes Service cluster

This article shows you how to use planned maintenance to schedule and control cluster and node image upgrades in Azure Kubernetes Service (AKS).

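AKS performs regular maintenance on your cluster automatically. There are two types of maintenance operations: the weekly AKS releases, and the automatic upgrades (cluster auto-upgrade and node OS security patching) that your auto-upgrade channels schedule.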

When you use planned maintenance in AKS, you can run both types of maintenance on a cadence of your choice to minimize workload impact. Planned maintenance controls only the timing of automatic upgrades. Enabling or disabling planned maintenance doesn't enable or disable automatic upgrades.

Before you begin

  • This article assumes that you have an existing AKS cluster. If you don't have an AKS cluster, see Create an AKS cluster.
  • If you're using the Azure CLI, upgrade to the latest version by using the az upgrade command.

Considerations

When you use planned maintenance, the following considerations apply:

  • AKS reserves the right to break planned maintenance windows for unplanned, reactive maintenance operations that are urgent or critical. These maintenance operations might even run during the notAllowedTime or notAllowedDates periods defined in your configuration.
  • Maintenance operations are considered best effort only and aren't guaranteed to occur within a specified window.

Schedule configuration types for planned maintenance

Three schedule configuration types are available for planned maintenance:

  • default is a basic configuration for controlling AKS releases. The releases can take up to two weeks to roll out to all regions from the initial time of shipping, because of Azure safe deployment practices.

    Choose default to schedule these updates in a manner that's least disruptive for you. You can monitor the status of an ongoing AKS release by region with the weekly release tracker.

  • aksManagedAutoUpgradeSchedule controls when to perform cluster upgrades scheduled by your designated auto-upgrade channel. You can configure more finely controlled cadence and recurrence settings with this configuration compared to the default configuration. For more information on cluster auto-upgrade, see Automatically upgrade an Azure Kubernetes Service cluster.

  • aksManagedNodeOSUpgradeSchedule controls when to perform the node OS security patching scheduled by your node OS auto-upgrade channel. You can configure more finely controlled cadence and recurrence settings with this configuration compared to the default configuration. For more information on node OS auto-upgrade channels, see Automatically patch and update AKS cluster node images.

We recommend using aksManagedAutoUpgradeSchedule for all cluster upgrade scenarios and aksManagedNodeOSUpgradeSchedule for all node OS security patching scenarios.

The default option is meant exclusively for AKS weekly releases. You can switch the default configuration to the aksManagedAutoUpgradeSchedule or aksManagedNodeOSUpgradeSchedule configuration by using the az aks maintenanceconfiguration update command.

Create a maintenance window

Note

When you're using auto-upgrade, to ensure proper functionality, use a maintenance window with a duration of four hours or more.

Planned maintenance windows are specified in Coordinated Universal Time (UTC).

A default maintenance window has the following legacy properties (no longer recommended):

| Name | Description | Default value |
|---|---|---|
| timeInWeek | In a default configuration, this property contains the day and hourSlots values that define a maintenance window. | Not applicable |
| timeInWeek.day | The day of the week to perform maintenance in a default configuration. | Not applicable |
| timeInWeek.hourSlots | A list of hour-long time slots to perform maintenance on a particular day in a default configuration. | Not applicable |
| notAllowedTime | A range of dates during which maintenance can't run, determined by start and end child properties. This property is applicable only when you're creating the maintenance window by using a configuration file. | Not applicable |

Note

Starting with API version 2023-05-01, use the following properties for the default configuration.

An aksManagedAutoUpgradeSchedule or aksManagedNodeOSUpgradeSchedule maintenance window, and a default configuration from API version 2023-05-01 onward, have the following properties:

| Name | Description | Default value |
|---|---|---|
| utcOffset | The time zone for cluster maintenance. | +00:00 |
| startDate | The date on which the maintenance window begins to take effect. | The current date at creation time |
| startTime | The time for maintenance to begin, based on the time zone determined by utcOffset. | Not applicable |
| schedule | The upgrade frequency. Four types are available: Daily, Weekly, AbsoluteMonthly, and RelativeMonthly. Daily is applicable only to aksManagedNodeOSUpgradeSchedule. | Not applicable |
| intervalDays | The interval in days for maintenance runs. It's applicable only to aksManagedNodeOSUpgradeSchedule. | Not applicable |
| intervalWeeks | The interval in weeks for maintenance runs. | Not applicable |
| intervalMonths | The interval in months for maintenance runs. | Not applicable |
| dayOfWeek | The specified day of the week for maintenance to begin. | Not applicable |
| durationHours | The duration of the window for maintenance to run. | Not applicable |
| notAllowedDates | A range of dates during which maintenance can't run, determined by start and end child properties. It's applicable only when you're creating the maintenance window by using a configuration file. | Not applicable |

Schedule types

Four schedule types are available: Daily, Weekly, AbsoluteMonthly, and RelativeMonthly.

Weekly, AbsoluteMonthly, and RelativeMonthly schedule types are applicable only to aksManagedAutoUpgradeSchedule and aksManagedNodeOSUpgradeSchedule configurations. Daily schedules are applicable only to aksManagedNodeOSUpgradeSchedule configurations.

All of the fields shown for each schedule type are required.

A Daily schedule might look like "every three days":

"schedule": {
    "daily": {
        "intervalDays": 3
    }
}

A Weekly schedule might look like "every two weeks on Friday":

"schedule": {
    "weekly": {
        "intervalWeeks": 2,
        "dayOfWeek": "Friday"
    }
}

An AbsoluteMonthly schedule might look like "every three months on the first day of the month":

"schedule": {
    "absoluteMonthly": {
        "intervalMonths": 3,
        "dayOfMonth": 1
    }
}

A RelativeMonthly schedule might look like "every two months on the last Monday":

"schedule": {
    "relativeMonthly": {
        "intervalMonths": 2,
        "dayOfWeek": "Monday",
        "weekIndex": "Last"
    }
}

Valid values for weekIndex include First, Second, Third, Fourth, and Last.
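
The JSON schedule fields above have counterparts among the flags of the az aks maintenanceconfiguration add command. The following is a minimal sketch of that mapping, not an authoritative reference. The schedule_flags helper is hypothetical and exists only for illustration, the values mirror the four JSON examples, and the --duration and --start-time values are assumed placeholders. The script only prints the commands so that you can review them before running any of them.

```shell
#!/bin/sh
# schedule_flags TYPE
# Hypothetical helper: prints the CLI flags that correspond to one schedule type.
# Values mirror the JSON examples in this section and are illustrative only.
schedule_flags() {
  case "$1" in
    Daily)           echo "--schedule-type Daily --interval-days 3" ;;
    Weekly)          echo "--schedule-type Weekly --interval-weeks 2 --day-of-week Friday" ;;
    AbsoluteMonthly) echo "--schedule-type AbsoluteMonthly --interval-months 3 --day-of-month 1" ;;
    RelativeMonthly) echo "--schedule-type RelativeMonthly --interval-months 2 --day-of-week Monday --week-index Last" ;;
    *)               echo "unknown schedule type: $1" >&2; return 1 ;;
  esac
}

# Print one candidate command per schedule type. These are alternatives:
# each would define the schedule for the named configuration.
for type in Daily Weekly AbsoluteMonthly RelativeMonthly; do
  # Daily schedules apply only to aksManagedNodeOSUpgradeSchedule.
  if [ "$type" = "Daily" ]; then
    name="aksManagedNodeOSUpgradeSchedule"
  else
    name="aksManagedAutoUpgradeSchedule"
  fi
  echo "az aks maintenanceconfiguration add --resource-group myResourceGroup --cluster-name myAKSCluster --name $name $(schedule_flags "$type") --duration 4 --start-time 00:00"
done
```

Printing the commands instead of running them keeps the sketch safe to try without a cluster. Check the flag names against az aks maintenanceconfiguration add --help for your CLI version.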

Add a maintenance window configuration

Add a maintenance window configuration to an AKS cluster by using the az aks maintenanceconfiguration add command.

The first example adds a new default configuration that schedules maintenance to run from 1:00 AM to 2:00 AM every Monday. The second example adds a new aksManagedAutoUpgradeSchedule configuration that schedules maintenance to run every third Friday between 12:00 AM and 8:00 AM in the UTC+5:30 time zone.

# Add a new default configuration
az aks maintenanceconfiguration add --resource-group myResourceGroup --cluster-name myAKSCluster --name default --weekday Monday --start-hour 1

# Add a new aksManagedAutoUpgradeSchedule configuration
az aks maintenanceconfiguration add --resource-group myResourceGroup --cluster-name myAKSCluster --name aksManagedAutoUpgradeSchedule --schedule-type Weekly --day-of-week Friday --interval-weeks 3 --duration 8 --utc-offset +05:30 --start-time 00:00

Note

When you're using a default configuration type, you can omit the --start-hour parameter to allow maintenance anytime during a day.
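
Because notAllowedDates can be set only through a configuration file, a file-based variant of the add command is useful when you need blackout dates. The following sketch writes a JSON file and passes it with the --config-file parameter. The file name is a hypothetical placeholder, the values mirror the example output shown later in this article, and the top-level properties wrapper is an assumption that you should verify against your CLI version. The az call runs only when RUN_AZ=1 is set, so the script is safe to run without a cluster.

```shell
#!/bin/sh
# Hypothetical file name for the maintenance window definition.
CONFIG_FILE="./autoUpgradeWindow.json"

# Every three months on day 1, 09:00 at UTC-08:00, for 4 hours,
# with no maintenance between 2023-12-23 and 2024-01-05.
cat > "$CONFIG_FILE" <<'EOF'
{
  "properties": {
    "maintenanceWindow": {
      "schedule": {
        "absoluteMonthly": {
          "intervalMonths": 3,
          "dayOfMonth": 1
        }
      },
      "durationHours": 4,
      "utcOffset": "-08:00",
      "startTime": "09:00",
      "notAllowedDates": [
        {
          "start": "2023-12-23",
          "end": "2024-01-05"
        }
      ]
    }
  }
}
EOF

echo "Wrote $CONFIG_FILE"

# Apply the file only when explicitly requested (assumes az is installed and logged in).
if [ "${RUN_AZ:-0}" = "1" ]; then
  az aks maintenanceconfiguration add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name aksManagedAutoUpgradeSchedule \
    --config-file "$CONFIG_FILE"
fi
```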

Update an existing maintenance window

Update an existing maintenance configuration by using the az aks maintenanceconfiguration update command.

The following example updates the default configuration to schedule maintenance to run from 2:00 AM to 3:00 AM every Monday:

az aks maintenanceconfiguration update --resource-group myResourceGroup --cluster-name myAKSCluster --name default --weekday Monday --start-hour 2

List all maintenance windows in an existing cluster

List the current maintenance configuration windows in your AKS cluster by using the az aks maintenanceconfiguration list command:

az aks maintenanceconfiguration list --resource-group myResourceGroup --cluster-name myAKSCluster
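
To get a compact overview instead of the full JSON, you can combine the list command with the global --query and --output parameters of the Azure CLI. The following sketch assumes that list returns an array of objects shaped like the show output in the next section. The selected field names come from that output, and the column aliases are arbitrary. Legacy default configurations don't populate maintenanceWindow, so their columns can be empty. The az call runs only when RUN_AZ=1 is set.

```shell
#!/bin/sh
# JMESPath projection: one row per configuration with a few window fields.
# Field names are taken from the example show output; aliases are arbitrary.
QUERY='[].{name:name, startTime:maintenanceWindow.startTime, utcOffset:maintenanceWindow.utcOffset, durationHours:maintenanceWindow.durationHours}'

echo "Query: $QUERY"

# Run against a real cluster only when explicitly requested.
if [ "${RUN_AZ:-0}" = "1" ]; then
  az aks maintenanceconfiguration list \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --query "$QUERY" \
    --output table
fi
```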

Show a specific maintenance configuration window in an existing cluster

View a specific maintenance configuration window in your AKS cluster by using the az aks maintenanceconfiguration show command with the --name parameter:

az aks maintenanceconfiguration show --resource-group myResourceGroup --cluster-name myAKSCluster --name aksManagedAutoUpgradeSchedule

The following example output shows the maintenance window for aksManagedAutoUpgradeSchedule:

{
  "id": "/subscriptions/<subscription>/resourceGroups/myResourceGroup/providers/Microsoft.ContainerService/managedClusters/myAKSCluster/maintenanceConfigurations/aksManagedAutoUpgradeSchedule",
  "maintenanceWindow": {
    "durationHours": 4,
    "notAllowedDates": [
      {
        "end": "2024-01-05",
        "start": "2023-12-23"
      }
    ],
    "schedule": {
      "absoluteMonthly": {
        "dayOfMonth": 1,
        "intervalMonths": 3
      },
      "daily": null,
      "relativeMonthly": null,
      "weekly": null
    },
    "startDate": "2023-01-20",
    "startTime": "09:00",
    "utcOffset": "-08:00"
  },
  "name": "aksManagedAutoUpgradeSchedule",
  "notAllowedTime": null,
  "resourceGroup": "myResourceGroup",
  "systemData": null,
  "timeInWeek": null,
  "type": null
}

Delete a maintenance configuration window in an existing cluster

Delete a maintenance configuration window in your AKS cluster by using the az aks maintenanceconfiguration delete command.

The following example deletes the aksManagedAutoUpgradeSchedule maintenance configuration:

az aks maintenanceconfiguration delete --resource-group myResourceGroup --cluster-name myAKSCluster --name aksManagedAutoUpgradeSchedule

FAQ

  • How can I check the existing maintenance configurations in my cluster?

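    Use the az aks maintenanceconfiguration list command to see all configurations in the cluster, or the az aks maintenanceconfiguration show command to view a specific one.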

  • Can reactive, unplanned maintenance happen during the notAllowedTime or notAllowedDates periods too?

    Yes. AKS reserves the right to break these windows for unplanned, reactive maintenance operations that are urgent or critical.

  • How can I tell if a maintenance event occurred?

    For releases, check your cluster's region and look up information in weekly releases to see if it matches your maintenance schedule. To view the status of your automatic upgrades, look up activity logs on your cluster. You can also look up specific upgrade-related events, as mentioned in Upgrade an AKS cluster.

    AKS also emits upgrade-related Azure Event Grid events. To learn more, see AKS as an Event Grid source.

  • Can I use more than one maintenance configuration at the same time?

    Yes, you can run all three configurations simultaneously: default, aksManagedAutoUpgradeSchedule, and aksManagedNodeOSUpgradeSchedule. If the windows overlap, AKS decides the running order.

  • I configured a maintenance window, but the upgrade didn't happen. Why?

    AKS auto-upgrade needs a certain amount of time, usually not more than 15 minutes, to take the maintenance window into consideration. We recommend at least 15 minutes between the creation or update of a maintenance configuration and the scheduled start time.

    Also, ensure that your cluster is started when the planned maintenance window starts. If the cluster is stopped, its control plane is deallocated and no operations can be performed.

  • Why was one of my agent pools upgraded outside the maintenance window?

    If an agent pool isn't upgraded during the window (for example, because pod disruption budgets prevented it), it might be upgraded later, outside the maintenance window. This scenario is called a "catch-up upgrade." It prevents agent pools from remaining on a different version than the AKS control plane.

    Another reason an agent pool could be upgraded unexpectedly is that no maintenance configuration is defined, or the configuration was deleted. In that case, a cluster with auto-upgrade enabled is upgraded on a fallback schedule at random times, which might fall in an undesired time frame.

  • Are there any best practices for the maintenance configurations?

    We recommend setting the node OS security updates schedule to a weekly cadence if you're using the NodeImage channel, because a new node image is shipped every week. You can also opt in for the SecurityPatch channel to receive daily security updates.

    Set the auto-upgrade schedule to a monthly cadence to stay current with the Kubernetes N-2 support policy.

    For a detailed discussion of upgrade best practices and other considerations, see AKS patch and upgrade guidance.

  • Can I configure all my clusters in a single subscription to use the same maintenance configuration?

    We don't recommend using the same maintenance configuration for multiple clusters in a single subscription, because simultaneous upgrades can trigger Azure Resource Manager (ARM) throttling errors that cause cluster upgrades to fail. Instead, stagger the maintenance windows for each cluster to avoid these errors.
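
The staggering advice in the last answer can be sketched as a small script that assigns each cluster a different day of the week. The cluster names, the day assignments, the plan_staggered_windows helper, and the start time are all hypothetical placeholders. The weekly node OS schedule follows the best-practice answer above, and the four-hour duration follows the earlier note about auto-upgrade windows. The script only prints the commands so that you can review them before running them.

```shell
#!/bin/sh
# Hypothetical clusters in one subscription, and one day per cluster.
# Keep both lists the same length; a missing day would leave --day-of-week empty.
CLUSTERS="aks-dev aks-staging aks-prod"
DAYS="Monday Wednesday Friday"

# Print one add command per cluster, pairing the Nth cluster with the Nth day.
plan_staggered_windows() {
  i=1
  for cluster in $CLUSTERS; do
    day=$(echo "$DAYS" | cut -d ' ' -f "$i")
    echo "az aks maintenanceconfiguration add --resource-group myResourceGroup --cluster-name $cluster --name aksManagedNodeOSUpgradeSchedule --schedule-type Weekly --interval-weeks 1 --day-of-week $day --duration 4 --start-time 01:00"
    i=$((i + 1))
  done
}

plan_staggered_windows
```

Spreading clusters across days is only one way to stagger. Different start times on the same day would also separate the windows, as long as they don't overlap.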
