Can't pull ACR image in compute instance of Azure Machine Learning Workspace: VNetPLSetupError

Andrej Aderhold 1 Reputation point
2022-10-18T13:09:54.973+00:00

When I deploy a training job to a compute instance, it fails with the following error:

AzureMLCompute job failed.
VNetPLSetupError: Failed to pull Docker image xxxxxxxxxxx.azurecr.io/azureml/azureml_f306f67d2a96fc883ff0773a2a01394e with authentication mode IdentityToken due to: Docker responded with status code 500: {"message":"Head \"https://xxxxxxxxxx.azurecr.io/v2/azureml/azureml_f306f67d2a96fc883ff0773a2a01394e/manifests/latest\": Post \"https://xxxxxxxx.azurecr.io/oauth2/token\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"}
. This error may be caused by a Deny outgoing rule to Container Registry or AzureFrontDoor service tag, or a missing Allow outgoing rule to the same, which is blocking access to container registry xxxxxxxxxx.azurecr.io

Pulling the image worked without problems before and then suddenly stopped working.
I can still build and push an image, but I can no longer pull one.

I checked the NSG associated with this ML workspace, but it has no outbound rules that could block the pull (push works, as mentioned). No rule for the AzureFrontDoor service tag is configured.
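
Since the error reports a connection timeout to the registry's token endpoint, one way to narrow this down is to probe the registry host directly from the compute instance. The following is a minimal sketch, not an official diagnostic: the function name `probe_tcp` and the host `myregistry.azurecr.io` are placeholders introduced here, and a successful TCP connect only shows that port 443 is reachable, not that an image pull will succeed. It separates a DNS failure from a blocked or timed-out connection.

```python
import socket
import time


def probe_tcp(host, port=443, timeout=5.0):
    """Resolve `host`, then attempt a TCP connect; return a small result dict.

    `probe_tcp` is a hypothetical helper for illustration only.
    """
    result = {"host": host, "port": port, "resolved": None,
              "connected": False, "error": None}
    try:
        # Resolve first so a DNS failure is distinguishable from a blocked connection.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = infos[0][4][0]
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result

    start = time.monotonic()
    try:
        # A timeout here would be consistent with an outbound rule dropping traffic.
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:  # includes timeouts and refused connections
        result["error"] = f"TCP connect failed: {exc}"
    result["elapsed_s"] = round(time.monotonic() - start, 3)
    return result


# Example call (placeholder host; replace with the real registry login server):
# print(probe_tcp("myregistry.azurecr.io"))
```

If the name resolves but the connect times out, outbound traffic is probably being dropped somewhere between the compute instance and the registry. If resolution itself fails, the problem is more likely DNS than an NSG rule.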

Has anyone come across this 'VNetPLSetupError' error?

Thanks

Azure Container Registry
An Azure service that provides a registry of Docker and Open Container Initiative images.
Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. Andrej Aderhold 1 Reputation point
    2022-10-21T13:07:49.777+00:00

    Hi,

    It's not enabled. Interestingly, the error suddenly disappeared. I'm not sure what caused it, but it only occurred for a couple of days, and there was no change to the infrastructure.

    Thanks for responding.

    Best, Andrej
