"ManagedIdentityCredential.get_token_info failed" in train.py (serverless)

Dominic Archual 1 Reputation point
2024-10-29T01:57:25.8333333+00:00

In my Azure ML pipeline, I have a training step which uses a file called train.py to train my model.

The issue is, whether I use DefaultAzureCredential() or ManagedIdentityCredential(), I get errors similar to the one below:

AzureMLCredential.get_token_info failed: Expecting value: line 1 column 1 (char 0)
ManagedIdentityCredential.get_token_info failed: Expecting value: line 1 column 1 (char 0)

Is there a way to get credentials to pass through from my pipeline (where they work correctly) to the serverless compute that the train.py step runs on?

I found this issue on GitHub which sounds similar, but using the ManagedIdentityCredential() doesn't seem to be resolving the issue for me.

Is there anything else I can try?

ManagedIdentityCredential(client_id="CLIENT_ID_OF_MANAGED_IDENTITY_ASSIGNED_TO_WORKSPACE")

https://github.com/Azure/azure-sdk-for-python/issues/31071

Azure Machine Learning

1 answer

  1. Vinodh247 23,266 Reputation points MVP
    2024-10-29T06:30:25.4166667+00:00

    Hi Dominic Archual,

    Thanks for reaching out to Microsoft Q&A.

    This token error is typically due to a mismatch in the identity context between the pipeline's managed environment and the serverless compute used by train.py.
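
    For context, the text "Expecting value: line 1 column 1 (char 0)" is the message Python's json module produces when it is asked to parse an empty or non-JSON string. One plausible reading (an assumption, not something confirmed in this thread) is that the credential received an empty or non-JSON response from the token endpoint. A minimal reproduction using only the standard library, where decode_error_message is a hypothetical helper name:

```python
import json


def decode_error_message(body: str) -> str:
    """Return the JSONDecodeError message for body, or "" if it parses."""
    try:
        json.loads(body)
    except json.JSONDecodeError as exc:
        return str(exc)
    return ""


# An empty response body yields the same text seen in the credential errors.
print(decode_error_message(""))
```

    If that reading is right, the failure happens before any permission check, so it is worth confirming that the compute actually has an identity attached.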

    I wouldn't call the following a solution, but here are a few steps you can try to resolve this:

    1. Specify Client ID in Managed Identity: As per your current configuration, ensure the client_id is specified when using ManagedIdentityCredential in train.py, and verify that the assigned managed identity on the serverless compute has the necessary permissions for the Azure ML workspace and associated resources.

        from azure.identity import ManagedIdentityCredential

        # Pass the client ID of the user-assigned managed identity explicitly.
        credential = ManagedIdentityCredential(client_id="CLIENT_ID_OF_MANAGED_IDENTITY_ASSIGNED_TO_WORKSPACE")

    2. Use ChainedTokenCredential: Azure's ChainedTokenCredential tries each credential in the order given until one returns a token. This might help if ManagedIdentityCredential fails intermittently.

        from azure.identity import ChainedTokenCredential, ManagedIdentityCredential, DefaultAzureCredential

        # Try the managed identity first, then fall back to DefaultAzureCredential.
        credential = ChainedTokenCredential(
            ManagedIdentityCredential(client_id="CLIENT_ID_OF_MANAGED_IDENTITY_ASSIGNED_TO_WORKSPACE"),
            DefaultAzureCredential(),
        )

    3. Environment Variables: If you rely on DefaultAzureCredential, check whether AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_CLIENT_SECRET are set in the environment that runs train.py on the serverless compute. DefaultAzureCredential reads these through EnvironmentCredential to authenticate as a service principal, which might resolve some token acquisition issues.
    4. Authentication Timeout: If the training step takes a long time to start, authentication might time out. You could try running a pre-check authentication step before initiating train.py to help establish a stable credential context.
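
    The environment-variable check and the pre-check step above could be sketched roughly as follows. This is only a diagnostic sketch: missing_env_vars, precheck, and run_precheck are hypothetical helper names, and the Azure Resource Manager scope is an assumption that you should replace with the scope of whatever resource train.py actually calls.

```python
import os

# Environment variables named in the answer above; EnvironmentCredential reads these.
SERVICE_PRINCIPAL_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET")

# Assumed scope (Azure Resource Manager); swap in the one your code needs.
ARM_SCOPE = "https://management.azure.com/.default"


def missing_env_vars(environ=os.environ):
    """Return the names from SERVICE_PRINCIPAL_VARS that are unset or empty."""
    return [name for name in SERVICE_PRINCIPAL_VARS if not environ.get(name)]


def precheck(credential, scope=ARM_SCOPE):
    """Try to acquire one token; return (ok, detail) instead of raising."""
    try:
        token = credential.get_token(scope)
    except Exception as exc:  # broad on purpose: this is a diagnostic
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"token acquired, expires_on={token.expires_on}"


def run_precheck():
    """Run both checks with a managed identity; call this at the top of train.py."""
    # Imported lazily so the helpers above work without azure-identity installed.
    from azure.identity import ManagedIdentityCredential

    print("unset variables:", missing_env_vars())
    credential = ManagedIdentityCredential(
        client_id="CLIENT_ID_OF_MANAGED_IDENTITY_ASSIGNED_TO_WORKSPACE"
    )
    print(precheck(credential))
```

    Calling run_precheck() at the start of train.py writes the result to the job log, which makes it easier to tell an identity problem apart from a problem in the training code itself.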

    If these steps don't resolve the issue, the GitHub issue you linked may point toward using a different compute type or upgrading the SDK versions, in case the underlying problem is in managed identity support on serverless compute.
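
    Before upgrading, it may help to log which SDK versions the job environment actually has. A small sketch using only the standard library; the package list is an assumption about what train.py depends on, and installed_versions is a hypothetical helper name:

```python
from importlib import metadata

# Packages assumed relevant to this issue; adjust to what train.py imports.
PACKAGES = ("azure-identity", "azure-core", "azure-ai-ml")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

    Running this inside the job shows the versions in the serverless environment, which can differ from the ones installed where the pipeline is submitted.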

     
