Creating new AKS cluster in private network results in FAILED state

Nam 0 Reputation points
2023-09-07T17:08:38.4733333+00:00

Here is what the diagnostics returned. Nothing in it explains the reason for the failure:

Shows up to the last 50 failing RP operations over the last 10 days. This should show what caused the cluster to go into a failed provisioning state. Should you need to create a support request, please also include any failing operation IDs.

9/7/2023 2:53:54 AM PutManagedClusterHandler.PUT (Creating) 09074ff2-b3eb-451c-b445-5f78da9195b0 CreateVMSSAgentPoolFailed VMExtensionProvisioningError Internal error

1 answer

  1. shiva patpi 13,251 Reputation points Microsoft Employee
    2023-09-07T20:03:55.5666667+00:00

    Hello @Nam,

    The error VMExtensionProvisioningError generally means that an extension deployment failed on the node VMs. If you open the corresponding VMSS, the instances may be in a Running state while the VMSS overview shows the scale set as failed. Clicking that failed status shows the extension failure along with the detailed error message and error code.
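If the portal view is not convenient, the same extension statuses can usually be read from the instance view of a VMSS node. The following is only a rough sketch: it assumes the Azure CLI is installed and logged in, that the instance view JSON carries an `extensions` list whose entries have `name` and `statuses` (each status with `level`, `code`, `message`), and the helper names and arguments are hypothetical.

```python
import json
import subprocess


def failed_extension_statuses(instance_view):
    """Collect (extension name, code, message) for error-level statuses.

    Assumes the instance view shape described in the lead-in; missing
    keys are treated as empty rather than raising.
    """
    failures = []
    for ext in instance_view.get("extensions") or []:
        for status in ext.get("statuses") or []:
            if str(status.get("level", "")).lower() == "error":
                failures.append(
                    (ext.get("name"), status.get("code"), status.get("message"))
                )
    return failures


def fetch_instance_view(resource_group, vmss_name, instance_id):
    """Fetch one node's instance view through the Azure CLI.

    resource_group is expected to be the AKS node resource group that
    holds the VMSS; all three arguments are placeholders you supply.
    """
    result = subprocess.run(
        [
            "az", "vmss", "get-instance-view",
            "--resource-group", resource_group,
            "--name", vmss_name,
            "--instance-id", instance_id,
            "--output", "json",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `failed_extension_statuses(fetch_instance_view(rg, vmss, "0"))` with your own resource group and scale set names should list the failing extension and its message, which is typically more specific than "Internal error".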

    Background:

    Whenever AKS VMSS nodes are bootstrapped, as part of the post-deployment operation they try to reach mcr.microsoft.com or ubuntu.com to install additional software. If the nodes have no outbound connectivity to the internet, that is where, in most cases, you will see "Extension" failure error messages.

    Things to check:

    -> If you are using custom DNS servers, check whether they forward to Azure-provided DNS.

    -> If you have a firewall, check its logs to see whether it is blocking traffic from the nodes.

    -> In the firewall, check that the required outbound rules and FQDN rules are open, based on this document: https://learn.microsoft.com/en-us/azure/aks/outbound-rules-control-egress

    -> In the networking section, check for any NSGs that block traffic.

    -> If it is a private AKS cluster, see the prerequisites:

    https://learn.microsoft.com/en-us/azure/aks/private-clusters?tabs=azure-portal
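The DNS and outbound checks above can be approximated with a small script run from a VM in the same subnet as the AKS nodes. This is a minimal sketch, not an official diagnostic: the helper name is hypothetical, port 443 is an assumption (HTTPS), and the two hosts are only the ones mentioned in this answer, not the full list of required FQDNs from the egress document.

```python
import socket

# Hosts named in the answer above; port 443 is an assumption.
ENDPOINTS = [("mcr.microsoft.com", 443), ("ubuntu.com", 443)]


def check_endpoint(host, port, timeout=5.0):
    """Return (ok, detail) after a DNS lookup followed by a TCP connect."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Points at the custom DNS / forwarder configuration.
        return False, f"DNS resolution failed: {exc}"
    address = infos[0][4][0]
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True, f"resolved to {address}, TCP {port} reachable"
    except OSError as exc:
        # Points at a firewall, NSG, or route blocking outbound traffic.
        return False, f"resolved to {address}, but TCP {port} failed: {exc}"


if __name__ == "__main__":
    for host, port in ENDPOINTS:
        ok, detail = check_endpoint(host, port)
        print(f"{'OK  ' if ok else 'FAIL'} {host}:{port} - {detail}")
```

A DNS failure suggests looking at the custom DNS servers first, while a successful lookup followed by a failed connect suggests the firewall or NSG rules. A TCP connect alone does not prove that an FQDN-filtering firewall will allow the TLS session, so the firewall logs are still worth checking.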

    Let us know if that helps !

    Regards,

    Shiva.

