Submission of image segmentation jobs to an existing compute cluster suddenly stopped working in Azure ML workspace

Minh Tran 5 Reputation points
2025-07-06T18:09:33.2733333+00:00

Existing Python code that uses the Azure ML SDK to submit jobs to a compute cluster suddenly stopped working. The code prints the following error:

HttpResponseError: (UserError) The AutoMLJob input is invalid. Compute gpu-cluster not found in workspace my-workspace-name

The compute cluster gpu-cluster does exist, and I have verified that its provisioning state is Succeeded. The last time I was able to submit jobs to this cluster with the same code was June 25, 2025.

When I run az ml compute list, it shows the cluster, so I am not sure what is causing this issue.
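
The CLI check above can be mirrored from the Python side, which helps confirm whether the SDK client is looking at the same workspace as the CLI. The following is a minimal sketch, assuming the azure-ai-ml (v2) SDK with azure-identity installed. The helper names check_compute and build_client are hypothetical, and the placeholder IDs must be replaced with real values.

```python
def check_compute(ml_client, compute_name):
    """Report which workspace the client targets and whether it can see the compute.

    ml_client is expected to be an azure.ai.ml.MLClient (assumption: v2 SDK).
    """
    report = {
        "subscription_id": ml_client.subscription_id,
        "resource_group": ml_client.resource_group_name,
        "workspace": ml_client.workspace_name,
    }
    try:
        compute = ml_client.compute.get(compute_name)
        report["found"] = True
        report["provisioning_state"] = getattr(compute, "provisioning_state", None)
    except Exception as exc:  # e.g. azure.core.exceptions.ResourceNotFoundError
        report["found"] = False
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


def build_client():
    """Create an MLClient; the three placeholder values are assumptions."""
    # Imported lazily so check_compute can be used without the Azure packages.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(
        DefaultAzureCredential(),
        "<subscription-id>",
        "<resource-group>",
        "<workspace-name>",
    )


# Usage (requires Azure credentials):
#   print(check_compute(build_client(), "gpu-cluster"))
```

If this reports found as True with the expected workspace, the client is pointed at the right place and the compute is visible to the SDK, which suggests the failure lies in the job submission path rather than in the local configuration.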

Azure Machine Learning

2 answers

  1. Amira Bedhiafi 35,116 Reputation points Volunteer Moderator
    2025-07-06T20:23:57.81+00:00

    Hello!

    Thank you for posting on Microsoft Learn.

    Have you checked that the workspace object in your Python code points to the correct subscription, resource group, and workspace name?

    Try to print the workspace details in your code:

    print(ws.name, ws.location, ws.resource_group)
    

    You might be authenticating in a different Azure subscription or workspace than where gpu-cluster exists.

    There could also have been a breaking change in azureml-sdk or azure-ai-ml if you recently upgraded, or if a managed compute environment was updated.

    Try to check your current version:

    pip show azure-ai-ml
    pip show azureml-sdk
    

    If you are using azure-ai-ml, verify that you are on a compatible version (>=1.15.0) and that your code imports the client as follows:

    from azure.ai.ml import MLClient
    

    If the above doesn't help, try re-creating the compute cluster under a different name and update your code to reference the new cluster. This can help isolate the issue to a potential internal registration bug.

    If the issue persists, raise a support ticket with Azure (include your subscription ID, workspace name, cluster name, and the timestamp of the error).


  2. Minh Tran 5 Reputation points
    2025-07-07T12:03:33.47+00:00

    This was caused by an errant deployment by the Azure team. Another user had previously reported the same issue (linked below) and submitted a support ticket, which resulted in a rollback of the errant code and resolved the problem.

    https://learn.microsoft.com/en-us/answers/questions/2338361/error-submit-the-automl-job?comment=question&translated=false#newest-question-comment

