AKS, cannot pull images from ACR (401, SP permissions in place)

Łukasz Myśliński 0 Reputation points
2023-08-18T11:21:30.29+00:00

This seems like an obvious one and it's driving me nuts. I've devoured all the docs I could find and still not a clue.

I can't connect to my ACR from a cluster located in a different RG. Here's the error:

  Warning  Failed     115s (x4 over 3m26s)  kubelet            Failed to pull image "mlnative.azurecr.io/mlnative/api:39b50aa15a4f7e8cbea35e1d773ecf7ab86f2246": [rpc error: code = NotFound desc = failed to pull and unpack image "mlnative.azurecr.io/mlnative/api:39b50aa15a4f7e8cbea35e1d773ecf7ab86f2246": failed to resolve reference "mlnative.azurecr.io/mlnative/api:39b50aa15a4f7e8cbea35e1d773ecf7ab86f2246": mlnative.azurecr.io/mlnative/api:39b50aa15a4f7e8cbea35e1d773ecf7ab86f2246: not found, rpc error: code = Unknown desc = failed to pull and unpack image "mlnative.azurecr.io/mlnative/api:39b50aa15a4f7e8cbea35e1d773ecf7ab86f2246": failed to resolve reference "mlnative.azurecr.io/mlnative/api:39b50aa15a4f7e8cbea35e1d773ecf7ab86f2246": failed to authorize: failed to fetch anonymous token: unexpected status from GET request to https://mlnative.azurecr.io/oauth2/token?scope=repository%3Amlnative%2Fapi%3Apull&service=mlnative.azurecr.io: 401 Unauthorized]

Here are all the steps I've taken:

I've given ACR permissions to my cluster:

az aks update -n "${CLUSTER}" -g "${RESOURCE_GROUP}" --attach-acr /subscriptions/e14a2948-cdd0-4546-ab8f-57170e9ea640/resourceGroups/mln-dev-rg/providers/Microsoft.ContainerRegistry/registries/mlnative

I've checked that it has worked correctly:

`az aks check-acr -n "${CLUSTER}" -g "${RESOURCE_GROUP}" --acr mlnative.azurecr.io`

Output:
[2023-08-18T11:17:27Z] Checking host name resolution (mlnative.azurecr.io): SUCCEEDED

[2023-08-18T11:17:27Z] Canonical name for ACR (mlnative.azurecr.io): r0726weu-5.westeurope.cloudapp.azure.com.

[2023-08-18T11:17:27Z] ACR location: westeurope

[2023-08-18T11:17:27Z] Validating service principal credential: SUCCEEDED

[2023-08-18T11:17:28Z] Validating image pull permission: SUCCEEDED

[2023-08-18T11:17:28Z]

Your cluster can pull images from mlnative.azurecr.io!

At this point I figured that the SP credentials in the cluster must not have been refreshed, so I stopped and started the cluster again. Then I checked the SP validity (one year) and recreated the credentials, hoping that this would propagate the new permissions:

SP_ID=$(az aks show --resource-group "${RESOURCE_GROUP}" --name "${CLUSTER}" --query servicePrincipalProfile.clientId -o tsv)
SP_SECRET=$(az ad app credential reset --id "$SP_ID" --query password -o tsv)
az aks update-credentials \
    --resource-group "${RESOURCE_GROUP}" \
    --name "${CLUSTER}" \
    --reset-service-principal \
    --service-principal "$SP_ID" \
    --client-secret "${SP_SECRET}"

The nodes have been rolled over, and yet I still get a 401 as above... Is there anything I'm missing? The docs don't even mention the need to refresh the SP; they just say that things should work as soon as you grant the new ACR permission...

Thanks for your help

Azure Container Registry
Azure Kubernetes Service (AKS)

1 answer

  1. mutaz-msft 2,351 Reputation points Microsoft Employee
    2023-08-18T16:50:56.6866667+00:00

    Hi @Łukasz Myśliński,

    Can you check whether the image and tag exist in your ACR?
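
The check suggested above can be sketched roughly as follows. This is only an illustration, not part of the original answer: it assumes the Azure CLI is installed and logged in with read access to the registry, and the variable names are made up for the example. The image reference is the one from the kubelet error in the question, whose first line already reports `not found` for that reference.

```shell
# Image reference copied from the kubelet error in the question.
IMAGE="mlnative.azurecr.io/mlnative/api:39b50aa15a4f7e8cbea35e1d773ecf7ab86f2246"

# Split the reference into its parts with shell parameter expansion.
# (Variable names are illustrative, not from the original post.)
REGISTRY_HOST="${IMAGE%%/*}"          # mlnative.azurecr.io
REGISTRY_NAME="${REGISTRY_HOST%%.*}"  # mlnative
REPO_AND_TAG="${IMAGE#*/}"            # mlnative/api:<tag>
REPOSITORY="${REPO_AND_TAG%:*}"       # mlnative/api
TAG="${REPO_AND_TAG##*:}"             # the commit-hash tag

# List the tags of the repository and look for an exact match.
# Assumption: `az` is installed and `az login` has already been run.
if ! command -v az >/dev/null 2>&1; then
    echo "Azure CLI (az) not found; skipping the registry lookup" >&2
elif az acr repository show-tags \
        --name "$REGISTRY_NAME" \
        --repository "$REPOSITORY" \
        --output tsv | grep -Fxq -- "$TAG"; then
    echo "Tag found: ${REPOSITORY}:${TAG}"
else
    echo "Tag not found (or the lookup failed): ${REPOSITORY}:${TAG}"
fi
```

If the tag turns out to be missing, the 401 on the anonymous token request may just be a side effect of the failed lookup rather than a sign that the SP permissions are wrong.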

