Troubleshoot automated ML experiments
APPLIES TO:
Azure CLI ml extension v2 (current)
Python SDK azure-ai-ml v2 (current)
In this guide, learn how to identify and resolve issues in your automated machine learning experiments.
Troubleshoot automated ML for Images and NLP in Studio
If an Automated ML run for Images or NLP fails, use the following steps to understand the error.
- In the studio UI, the AutoML run should have a failure message indicating the reason for failure.
- For more details, go to the child run of this AutoML run. This child run is a HyperDrive run.
- In the "Trials" tab, you can check all the trials done for this HyperDrive run.
- Go to the failed trial runs.
- These runs should have an error message in the "Status" section of the "Overview" tab indicating the reason for failure. Select "See more details" for more information about the failure.
- Check "std_log.txt" in the "Outputs + Logs" tab for detailed logs and exception traces.
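The studio steps above can also be approximated from the Python SDK. The following is a minimal sketch rather than a documented procedure. It assumes that the HyperDrive run and its trials appear as child jobs of the AutoML job, and that `jobs.list(parent_job_name=...)` returns nothing for a job with no children. The names `find_failed_jobs` and `max_depth` are hypothetical helpers introduced here for illustration.

```python
from typing import Any, List


def find_failed_jobs(ml_client: Any, root_job_name: str, max_depth: int = 3) -> List[str]:
    """Return the names of failed jobs found below root_job_name.

    ml_client is expected to behave like azure.ai.ml.MLClient, that is,
    to expose jobs.list(parent_job_name=...). max_depth limits how many
    levels of child jobs are listed (assumption: AutoML job -> HyperDrive
    run -> trials -> pipeline nodes).
    """
    failed: List[str] = []
    seen = set()
    stack = [(root_job_name, 0)]
    while stack:
        name, depth = stack.pop()
        if name in seen or depth >= max_depth:
            continue
        seen.add(name)
        # List the direct children of the current job.
        for child in ml_client.jobs.list(parent_job_name=name):
            if child.status == "Failed":
                failed.append(child.name)
            stack.append((child.name, depth + 1))
    return failed


# Usage sketch (requires azure-ai-ml and azure-identity; the bracketed
# values are placeholders, not real identifiers):
#
# from azure.ai.ml import MLClient
# from azure.identity import DefaultAzureCredential
#
# ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
#                      "<resource-group>", "<workspace-name>")
# for job_name in find_failed_jobs(ml_client, "<automl-job-name>"):
#     # Download all logs and outputs of the failed job for inspection.
#     ml_client.jobs.download(name=job_name, download_path="./logs", all=True)
```

Each level of nesting costs one listing call per job, so a small `max_depth` keeps the number of requests bounded for large sweeps.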
If your Automated ML run uses pipeline runs for trials, follow these steps to understand the error.
- Follow the first four steps above to identify the failed trial run.
- This run should show you the pipeline run, with the failed nodes in the pipeline marked in red.
- Double-click the failed node in the pipeline.
- These runs should have an error message in the "Status" section of the "Overview" tab indicating the reason for failure. Select "See more details" for more information about the failure.
- Check "std_log.txt" in the "Outputs + Logs" tab for detailed logs and exception traces.
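Once "std_log.txt" has been downloaded, the exception traces can be pulled out of it with a few lines of Python. This is a minimal sketch that assumes the log contains standard, unprefixed Python tracebacks (lines starting with `Traceback (most recent call last):`). Logs whose lines carry a timestamp or logger prefix would need a different match. The function name `extract_tracebacks` is hypothetical.

```python
from typing import List


def extract_tracebacks(log_text: str) -> List[str]:
    """Return each Python traceback found in log_text as one string.

    A block starts at a 'Traceback (most recent call last):' line and
    ends at the first following line that is not indented, which is
    normally the 'ExceptionType: message' line.
    """
    blocks: List[str] = []
    current = None
    for line in log_text.splitlines():
        if line.startswith("Traceback (most recent call last):"):
            current = [line]
        elif current is not None:
            current.append(line)
            if not line.startswith((" ", "\t")):
                # Unindented line closes the traceback block.
                blocks.append("\n".join(current))
                current = None
    if current is not None:
        # The log ended in the middle of a traceback; keep what was captured.
        blocks.append("\n".join(current))
    return blocks


# Usage sketch (the path is a placeholder for wherever the log was saved):
#
# with open("std_log.txt", encoding="utf-8") as f:
#     for block in extract_tracebacks(f.read()):
#         print(block, end="\n\n")
```

The last line of each returned block is usually the most useful starting point, because it names the exception type and message.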