Azure Machine Learning exit code 134

Koesnadi Samuel Matthew 0 Reputation points
2023-02-24T10:34:25.3+00:00

I have an Azure ML job with a large training step. The training code itself runs on Azure ML without reporting any error, but the job still ends with exit code 134 and the execution is marked as failed. As I understand it, exit code 134 corresponds to SIGABRT, but it is also possible that the job ran out of memory. I have not checked memory usage yet, but I will. Does anyone have a clearer idea of what causes this?

Azure Machine Learning

1 answer

  1. Ramr-msft 17,836 Reputation points
    2023-02-28T10:11:37.82+00:00

    Exit code 134 means the process was terminated by the SIGABRT signal: shells report 128 plus the signal number, and SIGABRT is signal 6. The signal is usually raised by the process itself calling abort() when it hits an error it cannot handle, such as a failed assertion, an uncaught C++ exception, or heap corruption detected in a native library. (A segmentation fault is a different signal, SIGSEGV, and would appear as exit code 139.) It is also possible that the process ran out of memory, for example when an allocation fails inside a native library and the library aborts.

    To determine the root cause, check the logs and outputs of the run. You can download the logs by navigating to the Jobs tab, selecting the runID for a specific run, selecting Outputs and logs at the top of the page, and then selecting Download all. The logs are saved as a zip archive. You can also download individual log files by choosing the log file and selecting Download.
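
To make the exit-code arithmetic concrete, here is a minimal sketch that maps an exit code to a signal name. It assumes a POSIX system (as on Azure ML Linux compute); the helper name describe_exit_code is made up for illustration and is not part of any Azure SDK.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit code using the common POSIX conventions.

    Shells report 128 + N when a process dies from signal N.
    Python's subprocess module reports -N for the same situation.
    """
    if code < 0:
        signum = -code          # subprocess-style: negative signal number
    elif code > 128:
        signum = code - 128     # shell-style: 128 + signal number
    else:
        return f"exit status {code} (no signal)"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        return f"exit status {code} (unknown signal {signum})"
    return f"terminated by {name} (signal {signum})"


if __name__ == "__main__":
    for code in (134, 137, 139):
        print(code, "->", describe_exit_code(code))
```

If the job had been killed by the Linux out-of-memory killer, the usual symptom is SIGKILL (exit code 137) rather than 134, so a 134 more often points at an abort() inside native code. That is a rule of thumb, not a guarantee.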

    You can also check the available disk space on the compute instance by accessing the terminal and running df -h. If the disk space is low, you can clear some space by removing files/folders. To access the terminal, go to the compute list page or compute instance details page and click on the Terminal link.
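
The same disk check can be done from inside the training script, so the free space ends up in the job logs next to the failure. This is a rough sketch using only the Python standard library; the function name disk_report and the choice of paths are assumptions, so adjust them to wherever your job writes data.

```python
import shutil


def disk_report(path: str = "/") -> str:
    """Return a one-line summary of disk usage for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    gib = 1024 ** 3
    pct_used = (usage.used / usage.total * 100) if usage.total else 0.0
    return (
        f"{path}: {usage.free / gib:.1f} GiB free of "
        f"{usage.total / gib:.1f} GiB ({pct_used:.0f}% used)"
    )


if __name__ == "__main__":
    # Log this at the start of training and again before large checkpoints are written.
    print(disk_report("/"))
    print(disk_report("."))
```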

    It is also possible that the issue is related to the service limits in Azure Machine Learning. For example, if you are hitting the limit of metric names per run because you are formatting variables into the metric name, you can consider using a row metric instead, where one column is the variable value and the second column is the metric value. The number of artifacts per run is limited to 10 million, and the max length of the artifact path is 5,000 characters. Some limits can be increased for individual workspaces. To learn how to increase these limits, see "Manage and increase quotas for resources".
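
As an illustration of the row-metric idea, the sketch below logs one row per (variable, metric) pair under a single metric name instead of creating a new name per variable value. The helper log_metric_rows and the column names are hypothetical. It takes the logging function as a parameter so it can be tried without Azure; in an SDK v1 job, the assumption is that you would pass run.log_row from azureml.core.Run.get_context(), which accepts a metric name plus keyword columns.

```python
from typing import Callable, Iterable, List, Tuple


def log_metric_rows(
    log_row: Callable[..., None],
    name: str,
    points: Iterable[Tuple[float, float]],
) -> int:
    """Log each (variable_value, metric_value) pair as one row under a single metric name.

    Returns the number of rows logged.
    """
    count = 0
    for variable_value, metric_value in points:
        # One metric name, two columns, rather than f"loss_lr_{variable_value}" per value.
        log_row(name, variable=variable_value, value=metric_value)
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in logger for local experimentation; replace with run.log_row inside a job.
    recorded: List[tuple] = []

    def fake_log_row(metric_name: str, **columns: float) -> None:
        recorded.append((metric_name, columns))

    n = log_metric_rows(fake_log_row, "loss_by_learning_rate", [(0.1, 0.92), (0.01, 0.71)])
    print(n, recorded)
```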
