AKS to ACR connection doesn't work no matter what

Gilad Trachtenberg 0 Reputation points
2023-01-24T15:44:13.01+00:00

Hello there,

I've tried following the documentation below (on existing clusters and ACRs as well as newly created ones):

https://learn.microsoft.com/en-us/azure/aks/use-managed-identity
https://learn.microsoft.com/en-us/azure/container-registry/container-registry-repository-scoped-permissions
https://learn.microsoft.com/en-us/cli/azure/group?view=azure-cli-latest
https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-update
https://learn.microsoft.com/en-us/azure/aks/kubernetes-service-principal?tabs=azure-cli
https://learn.microsoft.com/en-us/troubleshoot/azure/azure-kubernetes/cannot-pull-image-from-acr-to-aks-cluster

Every time I try to set up the connection manually, it fails, regardless of the method I use. Managed identities are not created for me automatically, and whenever I try to add them by any means (CLI command, Terraform, or the UI), I get an error saying that the managed identity (or the kubelet managed identity) is not a valid resource. The error is the same with every method. I have verified that I have sufficient permissions to do this (Global Administrator as well as Subscription Owner).

The connection to ACR itself fails because it doesn't identify a kubelet managed identity, which brings us back to the previous paragraph, where that procedure fails.
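
One thing that may be worth ruling out for the "not a valid resource" error described above is the format of the value being passed. The sketch below assumes (this is not confirmed anywhere in this thread) that the identity options expect the full Azure resource ID of a user-assigned identity rather than its name or client ID. The helper names `build_identity_resource_id` and `looks_like_identity_resource_id` are hypothetical and exist only for this illustration.

```python
import re

# Assumed shape of a user-assigned managed identity resource ID.
# Matching is case-insensitive because Azure tooling prints both
# "resourceGroups" and "resourcegroups".
_IDENTITY_ID = re.compile(
    r"^/subscriptions/[^/]+/resourceGroups/[^/]+"
    r"/providers/Microsoft\.ManagedIdentity/userAssignedIdentities/[^/]+$",
    re.IGNORECASE,
)


def build_identity_resource_id(subscription_id, resource_group, identity_name):
    """Assemble the full resource ID from its three variable parts."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ManagedIdentity"
        f"/userAssignedIdentities/{identity_name}"
    )


def looks_like_identity_resource_id(value):
    """Return True if value has the shape of a full identity resource ID."""
    return bool(_IDENTITY_ID.match(value.strip()))
```

If a bare identity name or a client ID (a GUID) fails this check, that would be consistent with the error above, although it would not prove that this is the cause.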

Please advise :)

Best regards,

Gilad

Azure Container Registry
An Azure service that provides a registry of Docker and Open Container Initiative images.
Azure Kubernetes Service (AKS)
An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.

3 answers

  1. Gilad Trachtenberg 0 Reputation points
    2023-01-25T09:40:49.3+00:00

    Hello Andrei,

    Thanks for getting back to me so quickly! Apologies for not replying sooner.

    I'll do my best to answer your questions and clarify myself:

    1. What I mean by setting up the connection manually is creating managed identities and attaching them to the AKS cluster myself (I also tried the more "automatic" approach, which in my case meant using Terraform). This process fails with an error saying that the user identity I am assigning to the AKS cluster is "an invalid resource ID", even though I am following exactly what the relevant documentation states.
    2. Creating the cluster with a kubelet identity in Terraform. According to the official Terraform documentation on the matter, a managed identity should be created for the kubelet identity automatically.
    3. The full error is "--assign-identity is not a valid resource ID" as well as "--assign-kubelet-identity is not a valid resource ID". These errors appear after creating managed identities and trying to assign them as per this documentation.
    4. I have no preferences. I just want it to work, which currently it doesn't.
    5. It is successful, but ACR points to a service principal called "msi" that I cannot access. Even after I created a service principal manually, ACR still looks to the default "msi" service principal to configure the connection. So in reality, the cluster still isn't able to pull images from the repository.
    6. The output shows: "Merged "<CLUSTERNAME>" as current context in /tmp/tmptz521qdj" followed by "Unable to connect to the server: dial tcp: lookup <CLUSTERDNS> <IP>: no such host". But it doesn't really matter, because I created several test clusters just to try to configure that connection from scratch, and even when the output is valid, the connection still fails because of "Kubelet Identity Authentication".
    7. Output attached at the bottom of this message. Even though it has the ACR pull permission, there is still an "ImagePullErr" when pods try to pull images from the repository.
    8. I have attempted that, but it fails on the same "Kubelet Identity Authentication" issue as detailed in answer no. 6.

    Output for Question 7:

        "kubeletidentity": {
          "resourceId": "/subscriptions/<SUBSCRIPTIONID>/resourcegroups/<CLUSTERRESOURCEGROUP>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<CLUSTERNAME>-agentpool"
        },
        "podIdentityProfile": null,
    

  2. Andrei Barbu 2,576 Reputation points Microsoft Employee
    2023-01-25T10:57:06.1066667+00:00

    Hello Gilad!

    After reading the comment you added, here is my answer:

    I understand that you get "--assign-identity is not a valid resource ID" as well as "--assign-kubelet-identity is not a valid resource ID", but I also understand that you don't mind whether AKS creates your identities or you create them yourself.

    I am not sure why you think there is an issue with the kubelet identity. For the Question 7 output, it looks like the AKS cluster has the default kubelet identity assigned. Let's try to stick to an AKS cluster which has the identities created by AKS.

    As for the error you provided, "Unable to connect to the server: dial tcp: lookup <CLUSTERDNS> <IP>: no such host", it may be a networking issue.

    Could you please try to connect to your AKS node and try to test the connectivity to your ACR via "telnet <ACR-login-name> 443"?

    As a workaround, you may want to try pulling with a pull secret, as per the following docs:
    https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/

    https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-secret-docker-registry-em-

    You can add the admin account credentials of the ACR to the Kubernetes secret.

    For the kubelet issue, I would recommend opening a support request so that a dedicated engineer can see, during a meeting, what you are doing and help you fix it.

    I hope this is helpful. If any clarification is needed, let me know and I will do my best to answer.

    Please "Accept as Answer" and Upvote if it helped, so that it can help others in the community looking for help on similar topics.

    Thank you!


  3. Gilad Trachtenberg 0 Reputation points
    2023-01-29T08:47:50.96+00:00

    As you suggested, I will be opening an official support ticket in order to see this issue through.
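
For anyone following the connectivity test suggested in answer 2 on a machine without telnet, here is a minimal sketch of the same TCP check. It assumes Python 3 is available wherever the check is run, and `can_reach` is a hypothetical helper name. It only confirms that DNS resolution and a TCP connection to port 443 succeed; it says nothing about authentication or the kubelet identity.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    DNS failures, refused connections and timeouts are all reported
    as False, since each of them is a subclass of OSError.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (replace the placeholder with your registry's login server):
#   can_reach("<ACR-login-name>")
```

A result of False here would point toward the networking explanation raised in answer 2 rather than toward an identity problem.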

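
The pull-secret workaround from answer 2 can be sketched as follows. This builds the body of a Secret of type `kubernetes.io/dockerconfigjson`, the format described in the Kubernetes documentation linked in that answer. The function names, the secret name and the credentials are hypothetical placeholders. In practice the username and password would be the ACR admin account credentials mentioned in the answer.

```python
import base64
import json


def docker_config_json(server, username, password):
    """Build the .dockerconfigjson document for a single registry."""
    token = f"{username}:{password}".encode("utf-8")
    auth = base64.b64encode(token).decode("ascii")
    return json.dumps(
        {"auths": {server: {"username": username,
                            "password": password,
                            "auth": auth}}}
    )


def pull_secret_manifest(name, server, username, password):
    """Return a Secret manifest (as a dict) holding registry credentials."""
    payload = docker_config_json(server, username, password)
    encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }
```

The resulting dict can be serialized with `json.dumps` and applied to the cluster, after which pods reference the secret by name under `imagePullSecrets`. Because this stores a long-lived admin credential in the cluster, it is better treated as a temporary workaround than as a replacement for the managed identity setup.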