AKS unable to pull images from mcr.microsoft.com

Pankaj Jainani 20 Reputation points
2024-06-23T10:32:47.17+00:00

Hello

I have a few private AKS clusters that were running normally, but overnight I started observing the following error in many kube-system pods:

Failed to pull image "mcr.microsoft.com/***": rpc error: code = Unknown desc = failed to pull and unpack image "mcr.microsoft.com/***": failed to resolve reference "mcr.microsoft.com/***": failed to do request: Head https://mcr.microsoft.com/***: dial tcp 150.171.69.10:443: i/o timeout

Please share some insight to help me understand the root cause of this sudden issue.

Azure Kubernetes Service (AKS)

Accepted answer
  1. Ammar-Abdelqader01 1,156 Reputation points Microsoft Employee
    2024-06-23T10:52:15.3433333+00:00

    Hello @Pankaj Jainani

    Thank you for your question. From the log you provided, it looks like your node can't reach the FQDN mcr.microsoft.com.

    The node couldn't establish the necessary outbound connectivity to obtain packages. For public clusters, the nodes try to communicate with the Microsoft Container Registry (MCR) endpoint (mcr.microsoft.com) on port 443.
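The error text itself can narrow this down a little. The message in the question contains a resolved address (dial tcp 150.171.69.10:443: i/o timeout), which suggests that DNS resolution worked and the TCP connection to port 443 timed out. The sketch below sorts a pull error into rough buckets. The function name classify_pull_error is hypothetical, and the patterns assume the usual wording of Go network errors, so treat the result as a hint rather than a diagnosis.

```shell
# classify_pull_error: hypothetical helper, not part of any AKS tooling.
# Prints a rough category for an image pull error message passed as $1.
classify_pull_error() {
  case "$1" in
    # Name resolution failed ("lookup <host> ...: no such host" or a lookup timeout).
    *"no such host"*|*"lookup "*) echo "dns" ;;
    # An IP address was dialed but the connection never completed.
    *"dial tcp "*": i/o timeout"*) echo "tcp-timeout" ;;
    # The remote end (or something in between) actively rejected the connection.
    *"connection refused"*) echo "tcp-refused" ;;
    *) echo "unknown" ;;
  esac
}

# Example:
#   classify_pull_error "dial tcp 150.171.69.10:443: i/o timeout"
```

A tcp-timeout result points toward egress filtering (firewall, proxy, NSG, or routing) more than toward DNS.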

    There are many reasons why the traffic might be blocked. In any of these situations, the best way to test connectivity is to use the Secure Shell protocol (SSH) to connect to the node. To make the connection, follow the instructions in Connect to Azure Kubernetes Service (AKS) cluster nodes for maintenance or troubleshooting. Then, test the connectivity on the cluster by following these steps:

    nc -vz mcr.microsoft.com 443

    dig mcr.microsoft.com
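If nc or dig is not installed on the node, the same two checks can be approximated with tools that are usually present. This is a minimal sketch, assuming bash, getent, and the coreutils timeout command are available; check_dns and check_tcp are hypothetical helper names.

```shell
# check_dns: succeeds if the system resolver returns at least one address.
check_dns() {
  local host="$1"
  getent ahosts "$host" > /dev/null
}

# check_tcp: succeeds if a TCP connection to host:port opens within
# the given number of seconds (default 5). Uses bash's /dev/tcp redirection.
check_tcp() {
  local host="$1" port="$2" wait="${3:-5}"
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2> /dev/null
}

# Example, run on the node:
#   check_dns mcr.microsoft.com && echo "DNS ok" || echo "DNS failed"
#   check_tcp mcr.microsoft.com 443 && echo "TCP 443 ok" || echo "TCP 443 failed"
```

Running the two checks separately shows whether the failure is in name resolution or in the connection to port 443.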

    Solution

    The following table lists specific reasons why traffic might be blocked, and the corresponding solution for each reason.

    Issue | Solution
    Traffic is blocked by firewall rules or a proxy server | In this scenario, a firewall or a proxy server does egress filtering. To verify that all required domains and ports are allowed, see Control egress traffic for cluster nodes in Azure Kubernetes Service (AKS).
    Traffic is blocked by a cluster network security group (NSG) | On any NSGs that are attached to your cluster, verify that there's no blocking on port 443, port 53, or any other port that might have to be used to connect to the endpoint. For more information, see Control egress traffic for cluster nodes in Azure Kubernetes Service (AKS).
    The AAAA (IPv6) record is blocked on the firewall | On your firewall, verify that nothing exists that would block the endpoint from resolving in Azure DNS.
    Private cluster can't resolve internal Azure resources | In private clusters, the Azure DNS IP address (168.63.129.16) must be added as an upstream DNS server if custom DNS is used. Verify that the address is set on your DNS servers. For more information, see Create a private AKS cluster and What is IP address 168.63.129.16?
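For the custom DNS case, it can help to see which DNS servers a node actually uses and to compare their answer with what Azure DNS returns. The sketch below is only illustrative: list_nameservers is a hypothetical helper, and on nodes that run systemd-resolved the file /etc/resolv.conf may list only a local stub address, with the real upstream servers in /run/systemd/resolve/resolv.conf.

```shell
# list_nameservers: hypothetical helper that prints the nameserver
# addresses found in a resolv.conf-style file (default /etc/resolv.conf).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example, run on a node or on a VM in the same virtual network:
#   list_nameservers
#   dig +short mcr.microsoft.com                   # answer from the configured DNS
#   dig +short mcr.microsoft.com @168.63.129.16    # answer from Azure DNS directly
```

If the two dig queries return different results, or only the second one returns addresses, the custom DNS server is probably not forwarding to 168.63.129.16.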

    If this has been helpful, please take a moment to accept answers as this helps increase the visibility of this question for other members of the Microsoft Q&A community. Thank you for helping to improve Microsoft Q&A!


1 additional answer

  1. akinbade abiola 7,430 Reputation points
    2024-06-23T11:09:17.1166667+00:00

    Hello Pankaj Jainani,

    Thanks for your question.

    Based on the error message, the private AKS clusters are having trouble pulling images from MCR. To resolve this, we need to find out why. It most likely points to a connectivity or identity issue. I recommend the following:

    Test connectivity and DNS resolution

    ping mcr.microsoft.com
    nslookup mcr.microsoft.com
    

    Check if the image being pulled actually exists.
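One way to check whether an image tag exists is to request its manifest through the registry HTTP API (the /v2/<name>/manifests/<reference> endpoint of the OCI distribution specification). The sketch below only builds that URL. manifest_url is a hypothetical helper, it assumes a reference of the form registry/repository:tag (not a digest), and the image name in the example is just an illustration. Run the request from a machine that can reach MCR, since the affected nodes currently cannot.

```shell
# manifest_url: hypothetical helper that turns "registry/repo:tag"
# into the registry v2 manifest URL. Defaults to the "latest" tag.
manifest_url() {
  local ref="$1"
  local registry="${ref%%/*}"   # text before the first slash
  local rest="${ref#*/}"        # repository path plus optional tag
  local repo="$rest" tag="latest"
  case "$rest" in
    *:*) repo="${rest%:*}"; tag="${rest##*:}" ;;
  esac
  echo "https://${registry}/v2/${repo}/manifests/${tag}"
}

# Example (prints only the HTTP status code of a HEAD request):
#   curl -sS -o /dev/null -I -w '%{http_code}\n' \
#     -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
#     -H 'Accept: application/vnd.oci.image.index.v1+json' \
#     "$(manifest_url mcr.microsoft.com/oss/kubernetes/pause:3.6)"
```

A 200 status suggests the tag exists and a 404 suggests it does not. Other manifest media types may need a different Accept header.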

    If these are fine, further troubleshoot with the link here: https://learn.microsoft.com/en-us/troubleshoot/azure/azure-kubernetes/extensions/cannot-pull-image-from-acr-to-aks-cluster

    You can mark it 'Accept Answer' and 'Upvote' if this helped you

    Regards,

    Abiola
