ImagePullBackOff with "rpc error: code = Unknown desc = failed to pull and unpack image" from AKS when pulling from ACR

Parth Patel 6 Reputation points
2021-08-31T06:33:10.057+00:00

When pulling a service-jenkins custom image from ACR, AKS gives the following error:

Warning Failed 0s (x2 over 31s) kubelet Failed to pull image "XXX.azurecr.io/service-jenkins:latest": [rpc error: code = Unknown desc = failed to pull and unpack image "XXX.azurecr.io/service-jenkins:latest": failed to extract layer sha256:XXX: unexpected EOF: unknown, rpc error: code = Unknown desc = failed to pull and unpack image "XXX.azurecr.io/service-jenkins:latest": failed to resolve reference "XXX.azurecr.io/service-jenkins:latest": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]

We have taken the following steps in an attempt to resolve the issue:

  1. Connected AKS to ACR using a service principal (SP) instead of an image pull secret stored in the same namespace
  2. Uploaded a sample hello-world image, which AKS pulls successfully
  3. Verified that the image pull secret matches the ACR keys

We also pulled and ran the service-jenkins image with a local Docker engine to check whether the image build was at fault, but the container runs normally.
We have not been able to pinpoint the exact issue. Any help is appreciated!


4 answers

  1. shiva patpi 12,966 Reputation points Microsoft Employee
    2021-09-01T00:41:19.443+00:00

    Hello @Parth Patel,
    Thanks for your query.
    Are you getting the above error message when you authenticate to ACR using imagePullSecrets?
    Can you also validate by logging into the node and pulling the image manually, just to check whether it goes through?

    When you attempted step 1 (using the SP) as part of the resolution, were you able to pull the same image from ACR successfully, rather than only the hello-world image?
    If not, please make sure your SP has the AcrPull role on your ACR.
    https://learn.microsoft.com/en-us/azure/container-registry/container-registry-faq#how-do-i-grant-access-to-pull-or-push-images-without-permission-to-manage-the-registry-resource-

    Also try attaching the ACR to the AKS cluster, as described in the following documentation, using the command below:
    https://learn.microsoft.com/en-us/azure/aks/cluster-container-registry-integration?tabs=azure-cli

    az aks update -n myAKSCluster -g myResourceGroup --attach-acr <acr-name>
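
    Before attaching, it can help to see whether the principal already holds the AcrPull role on the registry. The following is a minimal sketch, not an official procedure: the function name `acr_pull_assignments` and its two arguments are hypothetical, and it assumes `az` is logged in with permission to read role assignments. With `--scope`, only assignments made directly on the registry are listed; assignments inherited from the resource group or subscription would need `--include-inherited`.

    ```shell
    # Hypothetical helper: list AcrPull role assignments a principal holds on a registry.
    # $1 = registry name (e.g. myacr), $2 = ID of the service principal or identity to check.
    acr_pull_assignments() {
        acr_name="$1"
        assignee="$2"
        # Resolve the registry's resource ID; it is used as the role-assignment scope.
        acr_id="$(az acr show --name "$acr_name" --query id --output tsv)" || return 1
        # Print one line per matching assignment; no output means no direct AcrPull assignment.
        az role assignment list \
            --assignee "$assignee" \
            --scope "$acr_id" \
            --role AcrPull \
            --query "[].roleDefinitionName" \
            --output tsv
    }
    ```

    An empty result suggests that the principal has no direct AcrPull assignment on the registry, in which case the `--attach-acr` command above, or an explicit role assignment, is worth trying.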

    Regards,
    Shiva.

    1 person found this answer helpful.

  2. Szczyra, Jan 1 Reputation point
    2021-09-19T22:34:39.223+00:00

    Same problem here


  3. Mayank Kothari 1 Reputation point
    2021-09-28T09:48:19.273+00:00

    I am facing the same issue. It is due to the upgrade to the latest node image.
    I created my AKS cluster in July. Recently I received an email titled:
    "Action recommended: Update your AKS Worker nodes to at least the 08-26-2021 virtual hard drive (VHD)"

    After updating the node image, I started facing this issue. The July node image did not have it, but there is now no option to downgrade the node image.

    Today I got the latest node image, AKSUbuntu-1804gen2containerd-2021.09.19, and the issue still exists.


  4. BRENT VANDERMEIDE 1 Reputation point
    2021-12-29T19:53:39.65+00:00

    fwiw

    I saw the same error in AKS and assumed the service principal was causing it.

    I even tore down and recreated our dev cluster. Running az aks check-acr then gave me confidence that the service principal was not the issue, so I went looking for other causes.

    For example, the error @Parth Patel posted states: failed to resolve reference "XXX.azurecr.io/service-jenkins:latest". Mine said the same.
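
    One way to read an aggregated kubelet message like this is to split it into its individual entries, so that the `failed to resolve reference` part is not drowned out by the trailing `401 Unauthorized`. The sketch below assumes that each `, rpc error: ` separator starts a new entry. The function name `split_pull_errors` is hypothetical, and whether each entry corresponds to a separate pull attempt is an assumption this thread does not confirm.

    ```shell
    # Hypothetical helper: print each entry of an aggregated kubelet pull error on its own line.
    # $1 = the full "Failed to pull image ..." message copied from the pod events.
    split_pull_errors() {
        printf '%s\n' "$1" | awk '{
            # Split on the separator between aggregated errors (assumed format).
            n = split($0, entries, ", rpc error: ")
            for (i = 1; i <= n; i++) printf "entry %d: %s\n", i, entries[i]
        }'
    }
    ```

    The message can be copied from the events shown by `kubectl describe pod` and passed as a single quoted argument. Reading the entries one at a time makes it easier to spot a `not found` or `failed to extract layer` cause that precedes the 401.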

    Make sure the image you are pulling is actually in the repository.

    My root cause was that our custom image promotion script, which pushes an image from one ACR to another, hit an error, but the pipeline did not report a failure. It continued, and I presumed the image was in the repository. It was not.

    When I changed the image tag in the deployment YAML spec to a version that actually existed in the repository, the image was pulled successfully. When I changed it back, I received the same error as before:

    Failed to pull image "myacr.azurecr.io/myImage:1.1.23114": [rpc error: code = NotFound desc = failed to pull and unpack image "myacr.azurecr.io/myImage:1.1.23114": failed to resolve reference "myacr.azurecr.io/myImage:1.1.23114": myacr.azurecr.io/myImage:1.1.23114: not found, rpc error: code = Unknown desc = failed to pull and unpack image "myacr.azurecr.io/myImage:1.1.23114": failed to resolve reference "myacr.azurecr.io/myImage:1.1.23114": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]
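
    A check like the one described above (confirming that the tag really exists before blaming credentials) could be scripted roughly as follows. This is a sketch under stated assumptions: the function name `image_exists_in_acr` is hypothetical, the image reference is assumed to have the form `<registry>.azurecr.io/<repository>:<tag>` with an explicit tag and no digest, and `az` is assumed to be logged in with read access to the registry.

    ```shell
    # Hypothetical helper: succeed only if the tag of an image reference exists in its ACR repository.
    # $1 = image reference, e.g. myacr.azurecr.io/myImage:1.1.23114
    image_exists_in_acr() {
        image="$1"
        host="${image%%/*}"        # myacr.azurecr.io
        registry="${host%%.*}"     # myacr
        rest="${image#*/}"         # myImage:1.1.23114
        repository="${rest%:*}"    # myImage
        tag="${rest##*:}"          # 1.1.23114
        # List the repository's tags, one per line, and look for an exact match.
        az acr repository show-tags \
            --name "$registry" \
            --repository "$repository" \
            --output tsv | grep -Fxq -- "$tag"
    }
    ```

    A promotion pipeline could call this after its push step, for example `image_exists_in_acr "myacr.azurecr.io/myImage:1.1.23114" || exit 1`, so that a silently failed push stops the pipeline instead of surfacing later as an ImagePullBackOff.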

    Hope this helps someone else.
