Model training fails with errors

Laimis 1 Reputation point
2021-03-14T19:22:44.87+00:00

Hello, whenever I create a model and try to train it, the run fails. These are the errors I get for the same model on two different compute targets:

  1. AzureMLCompute job failed. UserProcessKilledBySystemSignal: Job failed since the user script received system termination signal usually due to out-of-memory or segfault. Reason: Process Killed with either 6:aborted or 9:killed or 11:segment fault. exit code here is from wrapping bash hence 128 + n Cause: killed TaskIndex: NodeIp: 10.0.0.4 NodeId: tvmps_2b4c1352eab879faa7df6dd68985461ea7ef172338311bc4bd278f1c7c66b3ad_d Reason: Job failed with non-zero exit Code
  2. AmlExceptionMessage:AzureMLCompute job failed. JobFailed: Submitted script failed with a non-zero exit code; see the driver log file for details. Reason: Job failed with non-zero exit Code ModuleExceptionMessage:InvalidTrainingDataset: Dataset contains invalid data for training. Learner type: Binary classifier. Reason: The number of label classes should equal to 2, got 5 classes.
  3. AzureMLCompute job failed. JobFailed: Submitted script failed with a non-zero exit code; see the driver log file for details. Reason: Job failed with non-zero exit Code
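
A note on the first message: "exit code here is from wrapping bash hence 128 + n" refers to the shell convention of reporting a process killed by signal n as exit code 128 + n. The sketch below decodes such codes. It assumes a Linux compute node, and `describe_exit_code` is a hypothetical helper for illustration, not part of Azure ML.

```python
import signal


def describe_exit_code(exit_code: int) -> str:
    """Explain a shell exit code, decoding the 128 + n signal convention."""
    if exit_code == 0:
        return "success"
    if exit_code > 128:
        signum = exit_code - 128
        try:
            # Signal names are platform dependent; Linux is assumed here.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"script exited with error code {exit_code}"


# 128 + 6, 128 + 9 and 128 + 11: the three signals named in the error message.
for code in (134, 137, 139):
    print(code, describe_exit_code(code))
```

An exit code of 137 (signal 9, killed) often goes together with the out-of-memory cause named in the message, so checking the memory of the VM size against the dataset size is a reasonable first step.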
Azure Machine Learning

1 answer

  1. Ramr-msft 17,611 Reputation points
    2021-03-15T10:18:29.553+00:00

    @Laimis Thanks for the question. Could you add more details about the steps you performed and about the compute cluster, so that we can check?
    Are you using the AML Studio Designer to train the model?
    Does the AML storage account restrict access to specific VNETs, and is the compute cluster outside those VNETs?

    Also, please confirm whether you changed the key of your default storage account.

    You can update the storage account key with the command below.
    Change storage account access keys - Azure Machine Learning | Microsoft Learn
    az ml workspace sync-keys -w myworkspace -g myresourcegroup
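
Separately from the storage key, the second error message in the question reports that a binary classifier received 5 label classes. A minimal sketch of a check to run before training, assuming the label column can be loaded as a plain Python list. `check_binary_labels` and the sample labels are hypothetical and not part of Azure ML.

```python
from collections import Counter


def check_binary_labels(labels):
    """Return per-class counts; raise if the label column is not binary."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError(
            f"Binary classifier needs exactly 2 label classes, "
            f"got {len(counts)}: {sorted(counts, key=str)}"
        )
    return counts


# Hypothetical label column with 5 distinct values, as in the reported error.
labels = ["A", "B", "C", "D", "E", "A", "B"]
try:
    check_binary_labels(labels)
except ValueError as err:
    print(err)
```

If the check fails, the usual options are to reduce the labels to two classes or to switch to a multiclass learner.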
