Error in Azure ML Run: UserError: Operation returned an invalid status code 'Unauthorized'

Marc Eichenberger 0 Reputation points
2024-11-19T05:04:51.7166667+00:00

I initiated a run on Azure ML that was expected to take approximately 10 days. The run details are as follows:

  • VM Type: Standard-NC12s-v3-LOW (2 GPUs), Low Priority
  • Instances: 1 machine
  • Framework: Python, Detectron2

After a few days, the following error occurred: UserError: Operation returned an invalid status code 'Unauthorized'.

Issue Details:

  • This error is not visible in either the user_logs or the system_logs. Both logs appear normal and do not show any signs of an issue.
  • The run does not make any Azure API calls and is a straightforward "local" training process.
  • After encountering the error, I was able to restart the run and resume training without any issues, which suggests the error might originate from Azure's side.

Request:

  • Could you please investigate the root cause of this error?
  • Are there any known issues with the runtime environment that could lead to such Unauthorized errors?
  • Are there additional logs or diagnostic tools I can use to gain further insights?

Thank you for your support!

Best regards,
Marc

Azure Machine Learning

1 answer

  1. romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator
    2024-11-19T12:07:09.23+00:00

    @Marc Eichenberger As far as I know, this error can occur in your case because the training time is very long: the job may have timed out, and the subsequent retries may then have failed to authenticate. I would expect this to be captured in the logs. Since you cannot find an obvious cause there, you can create a support case to check for any other issues.

    Is there any reason to use a long training time? Can this be decreased or could you use compute that can complete this faster?
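
If the run has to stay long, one way to limit the cost of an unexpected failure is to checkpoint periodically and resume from the last checkpoint, as the asker already did by restarting. The sketch below is a minimal, framework-agnostic illustration using only the Python standard library. The names `save_checkpoint`, `load_checkpoint`, and `train`, and the `stop_after` parameter that simulates an interruption, are hypothetical and not part of Azure ML or Detectron2. In practice, Detectron2's own checkpointing utilities would take the place of these helpers.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so an interrupted write never corrupts the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename within the same directory


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"step": 0, "loss_history": []}
    with open(path) as f:
        return json.load(f)


def train(checkpoint_path, total_steps, checkpoint_every=100, stop_after=None):
    """Hypothetical training loop that resumes from the last checkpoint.

    stop_after simulates an interruption (assumption for demonstration only).
    """
    state = load_checkpoint(checkpoint_path)
    while state["step"] < total_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulated interruption; unsaved progress is lost
        state["step"] += 1
        # Placeholder for a real training step (forward, backward, optimizer).
        state["loss_history"].append(1.0 / state["step"])
        if state["step"] % checkpoint_every == 0 or state["step"] == total_steps:
            save_checkpoint(checkpoint_path, state)
    return state
```

With this pattern, a restart repeats at most `checkpoint_every` steps. For the restart to find the checkpoint, the file has to be written to storage that survives the failure of the compute node.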

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.
