Edit

Blue-green node pool upgrades in Azure Kubernetes Service (AKS) (preview)

Blue-green upgrades enable you to upgrade your AKS node pools by adding new green nodes with the updated configuration (Kubernetes and node image version) to the existing node pool, while the original blue nodes continue running. This strategy allows you to test and validate the new configuration before committing to the new version, with the ability to quickly roll back if issues arise.

This article explains when to use blue-green upgrades, how the process works, configuration options, and considerations for using this upgrade strategy.

When to use blue-green upgrades

Note

Keep in mind that blue-green upgrades require double the node capacity during the upgrade process, which can lead to increased costs and resource requirements.

Consider blue-green upgrades when:

  • You require granular testing and verification of workloads batch by batch.
  • You need to validate new node configurations before switching production traffic.
  • You want instant rollback capability without reprovisioning nodes.
  • You're upgrading critical production workloads that can't tolerate disruption.
  • You need to test application compatibility with new Kubernetes versions.

If you're currently using a manual blue-green deployment process for node pool upgrades and want to automate this workflow, consider using AKS blue-green node pool upgrades instead. For more information about the manual blue green upgrade process, see Manual blue-green node pool upgrades.

When to use standard rolling upgrades

Standard rolling upgrades might be more appropriate in the following scenarios:

  • Development or test environments with downtime tolerance.
  • Cost-sensitive deployments where temporary doubling is prohibitive.
  • Simple stateless applications with good disruption handling.
  • Environments with limited available quota or capacity.

Prerequisites

  • Sufficient quota for doubling node pool capacity.
  • Azure CLI version 2.64.0 or higher. Find your version using the az --version command. If you need to install or upgrade, see Install Azure CLI.
  • The aks-preview Azure CLI extension installed and updated to the latest version.
  • API version 2025-08-02-preview or later.
  • Cluster autoscaler configured (recommended, but not required).

Install the aks-preview Azure CLI extension

Important

AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the following support articles:

Install or update the aks-preview extension using the az extension add and az extension update commands.

# Install the aks-preview extension
az extension add --name aks-preview

# Update the aks-preview extension
az extension update --name aks-preview

Supported features for blue-green upgrades

Blue-green upgrades currently support the following features:

Blue-green upgrade limitations and considerations

Blue-green upgrades currently don't support the following features:

Keep the following considerations in mind when using blue-green upgrades:

Resource requirements Complexity considerations Time factors
* Requires double the node capacity during the upgrade process, leading to increased infrastructure costs.
* You need extra compute quota in your Azure subscription to accommodate the temporary doubling of nodes.
* You might encounter regional capacity limits during peak usage periods.
* Requires careful planning for stateful workloads to ensure data consistency during migration.
* Requires extra monitoring for both the blue and green nodes during the transition period.
* Longer overall upgrade duration compared to in-place upgrades.
* Validation period adds time before final cutover.

Blue-green upgrade workflow

The blue-green upgrade process creates a parallel environment for safe transitions between node pool versions. You can either upgrade and commit to the new green pool after validation or upgrade and roll back to the original blue pool if you encounter issues.

Upgrade and commit scenario

The following diagram illustrates the upgrade and commit workflow:

Diagram showing the upgrade and commit scenario workflow.

The upgrade and commit process is as follows:

  1. Cordon blue nodes: Existing blue nodes are marked as unschedulable.
  2. Add green nodes: New green nodes with the updated configuration are added to the existing node pool.
  3. Parallel operation: Both blue and green nodes run simultaneously within the same node pool.
  4. Gradual migration: Workloads are progressively drained from blue nodes and rescheduled to green nodes in batches.
  5. Validate green nodes: Monitor and test workloads on the new nodes during migration.
  6. Complete transition: After the final validation period, the blue nodes are deleted and green nodes become primary.

Upgrade and roll back scenario

The following diagram illustrates the upgrade and roll back workflow:

Diagram showing the upgrade and roll back scenario workflow.

The upgrade and roll back process is as follows:

  1. Cordon blue nodes: Existing blue nodes are marked as unschedulable.
  2. Add green nodes: New green nodes with the updated configuration are added to the existing node pool.
  3. Parallel operation: Both blue and green nodes run simultaneously within the same node pool.
  4. Detect issues: Identify problems during validation on green nodes.
  5. Execute rollback: Uncordon blue nodes, drain green nodes, and migrate workloads back to blue nodes.
  6. Restore state: Green nodes are deleted and the node pool returns to its original configuration.

Choose your upgrade strategy

When creating or upgrading an AKS node pool, you can specify the upgrade strategy (upgradeStrategy) to use. The available strategies include:

Strategy Description
Rolling (default) Standard rolling upgrade where nodes are updated one by one.
BlueGreen Adds new green nodes with the updated configuration to the existing node pool while keeping the original blue nodes running.

Customize blue-green upgrade properties

You can customize the following blue-green upgrade properties (NodePoolBlueGreenUpgradeSettings):

Property Description Allowed values Default value
drainBatchSize Number or percentage of nodes to drain in each batch during upgrade. Percentage is calculated from the total number of blue nodes at the start of the upgrade. Fractional nodes are rounded up. Integer (for example, 5) or percentage (for example, 50%). Must be a non-zero value. 10%
drainTimeoutInMinutes Maximum time (in minutes) to wait for pods to gracefully terminate on each node before failing the upgrade. Honors pod disruption budgets during this wait time. If exceeded, the upgrade fails. Integer between 1 and 1440 (24 hours). 30 minutes
batchSoakDurationInMinutes Pause time (in minutes) between draining batches of nodes to allow for observation and validation. Integer between 0 and 1440 (24 hours). 15 minutes
finalSoakDurationInMinutes Wait time (in minutes) after all nodes are drained before removing old nodes. Provides a final validation period before committing to the upgrade. Rollback operations are only available during this final soak period. Once this period expires and the blue pool is deleted, rollback is no longer possible. Integer between 0 and 10080 (seven days). 60 minutes

Create a node pool with default blue-green upgrade settings

  • Create a node pool with the default blue-green upgrade strategy and settings using the az aks nodepool add command with the --upgrade-strategy parameter set to bluegreen. The following example creates a new node pool named myNodePool in the AKS cluster myAKSCluster within the resource group myResourceGroup:

    az aks nodepool add \
        --name myNodePool \
        --cluster-name myAKSCluster \
        --resource-group myResourceGroup \
        --upgrade-strategy bluegreen
    

Create a node pool with custom blue-green upgrade settings

  • Create a node pool with custom blue-green upgrade settings using the az aks nodepool add command with the --upgrade-strategy parameter set to bluegreen and set any desired custom blue-green upgrade settings. The following example creates a new node pool named myNodePool in the AKS cluster myAKSCluster within the resource group myResourceGroup, with custom blue-green upgrade settings:

    az aks nodepool add \
        --name myNodePool \
        --cluster-name myAKSCluster \
        --resource-group myResourceGroup \
        --upgrade-strategy bluegreen \
        --drain-timeout-bg 5 \
        --batch-soak-duration 5 \
        --drain-batch-size 50% \
        --final-soak-duration 180
    

Start a blue-green upgrade for an existing node pool

Important

When resuming a paused upgrade, you can update the blue-green settings, but you can't change the upgrade strategy or Kubernetes version.

  • Start a blue-green upgrade for an existing node pool using the az aks nodepool upgrade command with the --kubernetes-version parameter set to your desired version. You can start a blue-green upgrade for a node pool already using the blue-green strategy or for a node pool not yet configured with blue-green strategy. The following examples demonstrate both scenarios:

    # Start a blue-green upgrade for an existing node pool already using blue-green strategy
    az aks nodepool upgrade \
        --name myNodePool \
        --cluster-name myAKSCluster \
        --resource-group myResourceGroup \
        --kubernetes-version <kubernetes-version>
    
    # Start a blue-green upgrade for an existing node pool not yet using blue-green strategy
    az aks nodepool upgrade \
        --name myNodePool \
        --cluster-name myAKSCluster \
        --resource-group myResourceGroup \
        --kubernetes-version <kubernetes-version> \
        --upgrade-strategy bluegreen
    

Pause or cancel a blue-green upgrade

  • Pause or cancel an ongoing blue-green upgrade using the az aks nodepool operation-abort command. The following example pauses or cancels the blue-green upgrade for the node pool named myNodePool in the AKS cluster myAKSCluster within the resource group myResourceGroup:

    az aks nodepool operation-abort \
        --name myNodePool \
        --cluster-name myAKSCluster \
        --resource-group myResourceGroup
    

Roll back a blue-green upgrade

Once an ongoing blue-green upgrade is canceled, the rollback can be initiated using the az aks nodepool rollback command.

The rollback is only available during the final soak period as described in the finalSoakDurationInMinutes property.

The following example performs a rollback of the blue-green upgrade for the node pool named myNodePool in the AKS cluster myAKSCluster within the resource group myResourceGroup:

az aks nodepool rollback \
    --name myNodePool \
    --cluster-name myAKSCluster \
    --resource-group myResourceGroup

Frequently asked questions (FAQs)

Do blue-green upgrades support the maxUnavailable setting?

No, the maxUnavailable setting isn't applicable to blue-green upgrades. New green nodes are added to the existing node pool alongside the existing blue nodes, ensuring all original nodes remain available during the upgrade process.

Can blue-green upgrades be used for node pool updates beyond Kubernetes and node image version?

No, the blue-green upgrade strategy is specific to Kubernetes and node image version upgrades at this time. Other changes, such as certificate rotation or kubelet and OS configuration changes cannot be executed via blue-green upgrade strategy.

Which Kubernetes versions are compatible with blue-green upgrades?

Blue-green upgrades work with all AKS-supported Kubernetes versions, including both community-supported versions and Long Term Support (LTS) versions, so long as you use API version 2025-08-02-preview or later.

Can I use the automatic security patch channel with blue-green upgrades?

Yes, as long as the node pool's upgrade strategy is configured to use blue-green. When configured, security patches follow the blue-green upgrade process instead of the default rolling update mechanism.

What happens to persistent volumes during blue-green upgrades?

Persistent volumes remain accessible. Pods are gracefully drained and rescheduled, maintaining their volume attachments.

Can I perform blue-green upgrades across multiple node pools simultaneously?

Yes, different node pools can undergo blue-green upgrades in parallel, but each pool can only have one active upgrade. You currently can't control the order of upgrades across multiple pools.

How do blue-green upgrades handle node-specific configurations like taints and labels?

All node configurations including taints, labels, and annotations are automatically replicated to the green pool.

What's the cost impact of blue-green upgrades?

You're charged for both the original blue nodes and the new green nodes during the upgrade window, so make sure you plan for temporary cost doubling during the transition period.

What happens during a capacity failure?

In the event of a capacity failure when adding the green nodes, the upgrade fails and the blue nodes remain unaffected. You can retry the upgrade once sufficient capacity is available or choose to roll back.

What happens during rollback?

If the number of green nodes is less than or equal to the number of blue nodes when rollback is initiated, the green nodes are removed, and the blue nodes are uncordoned and revived to resume normal operation.

To learn more about node pool upgrades in AKS, see the following articles: