How do I check to see if an Auto ML model is actually training in Azure Machine Learning Studio?

Joseph John Borriello III 0 Reputation points
2023-04-24T17:52:56.4933333+00:00

I have an Auto ML job running in Azure Machine Learning Studio (AMLS) that is training its models. The dataset associated with the job is just under 900k rows of 2-column .csv data, so I expect it to take some time. However, the first run just passed the 12-hour mark and not a single child job has completed. Is there a place where I can check whether the job is actually training, stuck, etc.? I'm relatively new to AMLS, so forgive my ignorance of the best ways to debug.

P.S. I'm running 1 node of Standard E8s_v3, clustered at 4 cores and 32 GB of RAM.

Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. YutongTie-MSFT 47,421 Reputation points
    2023-04-25T05:01:13.1433333+00:00

    Hello @Joseph John Borriello III

    Thanks for reaching out to us.

    There are several places in Azure Machine Learning Studio (AMLS) where you can check the status of your Auto ML job and determine if it's progressing as expected. Here are some steps you can take to troubleshoot:

    Check the experiment logs: In the AMLS interface, navigate to the Experiment page and select the experiment you're interested in. From there, click the "Logs" tab (shown as "Outputs + logs" in current versions of the studio) to see the logs generated during the run. These logs show any errors or warnings raised during training.
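The same check can be done from code. The following is a minimal sketch, assuming the Azure ML Python SDK v2 (`azure-ai-ml`) and an authenticated `MLClient`; the helper name `job_snapshot` and every placeholder in angle brackets are illustrative, not part of the original answer.

```python
def job_snapshot(ml_client, job_name):
    """Fetch a job and return the fields most useful for a quick health check.

    Assumes `ml_client` is an azure.ai.ml.MLClient; only `jobs.get` is called.
    """
    job = ml_client.jobs.get(job_name)
    return {
        "name": job.name,
        "status": job.status,
        # studio_url links to the job page in the studio, when the SDK provides it.
        "studio_url": getattr(job, "studio_url", None),
    }


# Usage sketch (needs `pip install azure-ai-ml azure-identity`; IDs are placeholders):
#   from azure.ai.ml import MLClient
#   from azure.identity import DefaultAzureCredential
#   ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
#                        "<resource-group>", "<workspace-name>")
#   print(job_snapshot(ml_client, "<automl-job-name>"))
#   # Download the job's logs and outputs for offline inspection:
#   ml_client.jobs.download("<automl-job-name>", download_path="./job_logs", all=True)
```

A status that stays at "Running" for hours with no new log lines is a hint, not proof, that the job is stuck; the downloaded logs are the place to confirm it.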

    Check the compute status: If you're using a remote compute target to run your Auto ML job, check that the compute resource is running and properly configured. In the AMLS interface, navigate to the Compute page and select the compute target used by the Auto ML job. That page shows the target's current status, along with any associated logs or metrics.
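As a rough programmatic equivalent, the sketch below reads a compute target's details through `MLClient.compute.get` from the Azure ML Python SDK v2. The helper name `describe_compute` is hypothetical, and the attributes are read defensively because they depend on the compute type. Note that `provisioning_state` reports whether the resource was created successfully, not whether its nodes are busy.

```python
def describe_compute(ml_client, compute_name):
    """Return a small summary of a compute target.

    Assumes `ml_client` is an azure.ai.ml.MLClient; only `compute.get` is called.
    """
    compute = ml_client.compute.get(compute_name)
    return {
        "name": compute.name,
        "type": getattr(compute, "type", None),
        "size": getattr(compute, "size", None),
        "provisioning_state": getattr(compute, "provisioning_state", None),
        # Cluster scale limits; None for compute types that do not define them.
        "min_instances": getattr(compute, "min_instances", None),
        "max_instances": getattr(compute, "max_instances", None),
    }


# Usage sketch (the cluster name is a placeholder; `ml_client` is an MLClient):
#   print(describe_compute(ml_client, "<compute-cluster-name>"))
```

If `max_instances` is 1, the child runs of an Auto ML job can only execute one after another on that single node, which is worth keeping in mind when judging how long the job should take.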

    Check the child runs: Auto ML in AMLS creates a child run for each model it trains during the experiment. To check their status, navigate to the experiment page, select the specific run, and click the "Child runs" tab (labelled "Child jobs" in newer versions of the studio). There you can see the status of each individual model and whether it completed successfully.
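To get the same overview without clicking through the studio, child jobs can be listed with `MLClient.jobs.list(parent_job_name=...)` in the Azure ML Python SDK v2. The sketch below counts them by status. The `looks_stalled` rule is only an illustrative heuristic of my own, not an official definition of a stuck job.

```python
from collections import Counter


def child_job_status_counts(ml_client, parent_job_name):
    """Count the child jobs of a parent job by their status string.

    Assumes `ml_client` is an azure.ai.ml.MLClient; only `jobs.list` is called.
    """
    children = ml_client.jobs.list(parent_job_name=parent_job_name)
    return Counter(child.status for child in children)


def looks_stalled(status_counts):
    """Heuristic (an assumption): no child is running and none has finished."""
    running = status_counts.get("Running", 0)
    finished = status_counts.get("Completed", 0) + status_counts.get("Failed", 0)
    return running == 0 and finished == 0


# Usage sketch (the job name is a placeholder; `ml_client` is an MLClient):
#   counts = child_job_status_counts(ml_client, "<automl-job-name>")
#   print(dict(counts), "possibly stalled:", looks_stalled(counts))
```

An empty result can also mean the job is still in its setup phase and has not created any child jobs yet, so the heuristic is best read together with the parent job's logs.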

    Check the experiment metrics: AMLS records metrics that help you evaluate the models being trained. Navigate to the experiment page and select the run you're interested in, then click the "Metrics" tab to see the performance metrics for each child run. This can help you spot models that are performing poorly, which may in turn point to problems with the training data.
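Metrics can also be read through MLflow, which Azure ML supports as a tracking backend. The sketch below assumes the `mlflow` and `azureml-mlflow` packages, a tracking URI already pointed at the workspace, and that child runs carry the standard `mlflow.parentRunId` tag; the helper name `child_run_metrics` is hypothetical.

```python
def child_run_metrics(mlflow_client, experiment_id, parent_run_id):
    """Map each child run ID to its logged metrics.

    Assumes `mlflow_client` is an mlflow.tracking.MlflowClient and that child
    runs are tagged with `mlflow.parentRunId` (an assumption for this sketch).
    """
    runs = mlflow_client.search_runs(
        experiment_ids=[experiment_id],
        filter_string=f"tags.mlflow.parentRunId = '{parent_run_id}'",
    )
    return {run.info.run_id: dict(run.data.metrics) for run in runs}


# Usage sketch (names in angle brackets are placeholders; `ml_client` is an MLClient):
#   import mlflow
#   from mlflow.tracking import MlflowClient
#   workspace = ml_client.workspaces.get(ml_client.workspace_name)
#   mlflow.set_tracking_uri(workspace.mlflow_tracking_uri)
#   client = MlflowClient()
#   parent = client.get_run("<automl-job-name>")
#   print(child_run_metrics(client, parent.info.experiment_id, parent.info.run_id))
```

An empty dictionary means no child run has logged anything yet, which matches the situation described in the question.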

    I hope this helps!

    Regards, Yutong
