ImagePullBackOff persists on AKS node despite confirmed AcrPull role assignment to kubelet identity

David Campbell
2025-04-16T11:07:35.5966667+00:00

Azure Support Ticket: ImagePullBackOff Issue on AKS

Issue Summary

Description:

One of our AKS pods fails with ImagePullBackOff when pulling from our private Azure Container Registry (ACR), even though the AKS kubelet identity has been granted the AcrPull role. Pods on other nodes pull the same image (frontend:latest) successfully. On the affected node, every pull attempt fails with a 401 Unauthorized error.

Environment Details

  • Cluster Name: apip-dev-aks-uaenorth
  • Resource Group: apip-dev-rg-uaenorth
  • ACR Name: apipdevacr
  • Region: uaenorth
  • Image: apipdevacr.azurecr.io/frontend:latest
  • Kubelet Object ID: 747ad783-9416-48a7-bcab-a1bc64898b45
  • Role Assignment Scope: The ACR resource (apipdevacr)
  • Role Assignment Timestamp: 2025-04-16T05:49:16Z

Troubleshooting Performed

  1. Confirmed kubelet identity:
    • Retrieved from az aks show --query identityProfile.kubeletidentity.objectId
  2. Verified ACR role assignment:
    • Used az role assignment list to confirm the AcrPull role is correctly assigned to the kubelet identity for the ACR scope
  3. Verified image in ACR:
    • Pulled frontend:latest manually using az acr login and Docker
    • The same image pulls successfully on other AKS nodes
  4. Pod consistently fails on node aks-agentpool-22692403-vmss000000:
    • Output of kubectl describe pod confirms:
      failed to fetch anonymous token: unexpected status from GET ... 401 Unauthorized
  5. Deleted and recreated pod:
    • Pod reschedules but fails again on the same node
  6. Confirmed issue is isolated to a specific node:
    • Other pods using the same image are running fine on different nodes
  7. Waited more than 30 minutes for role assignment propagation:
    • The error persists beyond the typical Microsoft Entra ID (formerly Azure AD) propagation window
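
For anyone reproducing steps 1, 2, and 6, the checks can be sketched roughly as below. This is a minimal sketch rather than a transcript of the exact commands we ran: the function names (kubelet_object_id, acr_roles_for, has_acrpull, pods_on_node) are hypothetical helpers introduced for illustration, and the resource names are taken from Environment Details.

```shell
#!/usr/bin/env bash
# Sketch of the checks in steps 1, 2 and 6.
# Resource names come from "Environment Details"; the function names
# are hypothetical helpers, not part of the az CLI or kubectl.

RG="apip-dev-rg-uaenorth"
CLUSTER="apip-dev-aks-uaenorth"
ACR="apipdevacr"
NODE="aks-agentpool-22692403-vmss000000"

# Step 1: print the object ID of the cluster's kubelet managed identity.
kubelet_object_id() {
  az aks show --resource-group "$RG" --name "$CLUSTER" \
    --query identityProfile.kubeletidentity.objectId --output tsv
}

# Step 2: print the role names assigned to identity $1 at the ACR scope.
acr_roles_for() {
  local acr_id
  acr_id="$(az acr show --name "$ACR" --query id --output tsv)"
  az role assignment list --assignee "$1" --scope "$acr_id" \
    --query "[].roleDefinitionName" --output tsv
}

# Succeeds only if AcrPull is among the roles listed for identity $1.
has_acrpull() {
  acr_roles_for "$1" | grep -qx "AcrPull"
}

# Step 6: list pods scheduled on node $1, across all namespaces.
pods_on_node() {
  kubectl get pods --all-namespaces --output wide \
    --field-selector "spec.nodeName=$1"
}
```

Typical use would be has_acrpull "$(kubelet_object_id)" && echo "AcrPull present", followed by pods_on_node "$NODE" to compare pod status on the affected node against the others.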

Request

We request assistance from Azure support to:

  • Investigate potential misconfiguration or delay in RBAC token propagation at the node or VMSS instance level
  • Validate whether the kubelet on the specific node has successfully received the updated token permissions
  • Suggest additional diagnostics or a workaround to refresh or reset the identity on the affected node
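
As a possible starting point for the diagnostics requested above, the sketch below runs the ACR pull check against the affected node specifically, shows which identities are attached to its scale set, and moves workloads off the node as an interim workaround. None of this has been run as part of the troubleshooting listed earlier. The function names are hypothetical helpers, and deriving the scale set name by stripping the six-character instance suffix from the node name is an assumption about AKS node naming.

```shell
#!/usr/bin/env bash
# Sketch of node-level diagnostics and an interim workaround.
# Function names are hypothetical helpers; resource names come from
# "Environment Details".

RG="apip-dev-rg-uaenorth"
CLUSTER="apip-dev-aks-uaenorth"
ACR="apipdevacr"
NODE="aks-agentpool-22692403-vmss000000"

# Run the built-in ACR connectivity/auth check from one specific node.
check_acr_from_node() {
  az aks check-acr --resource-group "$RG" --name "$CLUSTER" \
    --acr "$ACR.azurecr.io" --node-name "$1"
}

# ASSUMPTION: node name = scale set name + 6-character instance suffix.
vmss_name_of_node() {
  printf '%s\n' "${1%??????}"
}

# Show the managed identities attached to the node's scale set.
vmss_identities() {
  local node_rg
  node_rg="$(az aks show --resource-group "$RG" --name "$CLUSTER" \
    --query nodeResourceGroup --output tsv)"
  az vmss identity show --resource-group "$node_rg" \
    --name "$(vmss_name_of_node "$1")"
}

# Interim workaround: stop scheduling onto the node, then evict its pods
# so they are rescheduled onto nodes where the pull succeeds.
evacuate_node() {
  kubectl cordon "$1" &&
    kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}
```

For example, check_acr_from_node "$NODE" followed by vmss_identities "$NODE" would show whether the node can authenticate to the registry and whether the kubelet identity is attached to its scale set. evacuate_node "$NODE" only works around the problem; it does not fix the node itself.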