AKS upgrade failing with GPU node pool "(OperationNotAllowed) Code="OperationNotAllowed" Message="The 'Placement' option override for the ephemeral OS disk is not supported. Please upgrade the VM Size with desired placement option for provisioning the "

Vikas Sharma 10 Reputation points
2024-06-12T12:40:47.5833333+00:00

The AKS version upgrade is failing because the cluster has a GPU node pool:

(OperationNotAllowed) Code="OperationNotAllowed" Message="The 'Placement' option override for the ephemeral OS disk is not supported. Please upgrade the VM Size with desired placement option for provisioning the Ephemeral OS disk."

Azure Kubernetes Service (AKS)

2 answers

  1. John Macnamara 0 Reputation points
    2024-06-24T18:44:54.7933333+00:00

    @Goncalo Correia I'm also running into a similar issue while attempting to update an existing node pool. I'm using the Azure CLI to perform the update. The issue arose within the last two weeks and appears to affect only our node pool that uses the Standard_NC6s_v3 machine type. We have an additional GPU node pool using Standard_NC24ads_A100_v4 machines, as well as non-GPU node pools, and none of those are experiencing this issue.

    The affected node pool uses the Ephemeral OS disk type and is running Kubernetes 1.27.13.

    I also hit this issue when I stop and start the node pool, both with the Azure CLI and in the Azure portal.


  2. Goncalo Correia 351 Reputation points Microsoft Employee
    2024-06-25T10:22:21.32+00:00

    Hi @John Macnamara and Vikas,

    There was recently an issue with the AKS Resource Provider (RP). Previously, for agent pool or managed cluster update/upgrade operations, the AKS RP did not recalculate DiskPlacement and simply reused the existing value. After a code refactoring, this logic changed: the value was recalculated on every operation and overwrote the old one. If the recalculated result differed from the existing value, you could encounter this error. This bug has already been fixed in the AKS RP codebase.

    The fix should already be rolled out to all regions, so if you are still experiencing this issue, I urge you to open a Support Request so that support can investigate your cluster specifically.
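
The behaviour described in this answer can be illustrated with a small sketch. This is not AKS RP code: the function names, the `reuse_stored` switch, and the rejection rule are hypothetical stand-ins chosen only to show why recalculating the placement on every update can trigger the error while reusing the stored value does not. The placement strings are illustrative values.

```python
from typing import Optional


def placement_for_update(stored: Optional[str], recalculated: str,
                         reuse_stored: bool) -> str:
    """Pick the placement value that an update request would carry.

    Hypothetical model: reuse_stored=True mimics the earlier behaviour
    (keep the value already on the pool), reuse_stored=False mimics the
    refactored behaviour (always overwrite with a fresh calculation).
    """
    if reuse_stored and stored is not None:
        return stored
    return recalculated


def update_is_rejected(stored: Optional[str], requested: str) -> bool:
    """Assumed rule: changing the placement of an existing pool is refused,
    which is how the 'Placement option override' error is modeled here."""
    return stored is not None and requested != stored


# Illustrative values only: what the pool was created with versus what a
# fresh calculation might produce for the same VM size later on.
stored = "CacheDisk"
recalculated = "ResourceDisk"

old_behaviour = placement_for_update(stored, recalculated, reuse_stored=True)
new_behaviour = placement_for_update(stored, recalculated, reuse_stored=False)

print(update_is_rejected(stored, old_behaviour))  # False: value unchanged
print(update_is_rejected(stored, new_behaviour))  # True: override attempted
```

Under these assumptions, only pools whose recalculated placement differs from the stored one are affected, which would be consistent with the report that a single VM size failed while other node pools kept working.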

