Batch Endpoint Deployment Fails with Status Code 42 for AutoML Model

Kushagra Gupta 0 Reputation points
2025-07-16T12:22:15.7766667+00:00

We are consistently encountering a critical error while deploying a trained Azure AutoML regression model using a batch endpoint created through the Azure Machine Learning Studio (No-Code UI). The deployment fails with the following error:

[Screenshot of the deployment error message attached]

Approach Used:

Model Training: Performed via Azure ML Studio (No-Code UI) using AutoML for a regression task.

Batch Deployment: Deployed model for batch inference using Batch Endpoint wizard within Azure ML Studio (No-Code UI).

Data Files:

Training File: Clean, properly formatted, and successfully used during AutoML training.

Testing File: CSV format, accessible in Azure Blob Storage, and successfully used for evaluation and real-time inference.

Trained Model: AutoML-generated model with strong regression metrics (e.g., R², RMSE) that works correctly with real-time endpoints.

Compute Target: Azure ML Compute Cluster (dedicated, functional, and validated).

Steps Taken:

Trained the model using clean training data via AutoML (No-Code UI).

Created batch endpoint and configured the inference job using the same model and test data.

Batch inference job fails consistently with exit code 42.

Re-verified the data, retrained the model, and repeated deployment — the issue persists.

The same model and test file return correct results when run through a real-time endpoint.

Impact: This issue is blocking critical batch inference workflows for production use. Given that the model and data work flawlessly in real-time mode, the issue appears isolated to the batch endpoint mechanism. A fix or workaround is urgently needed.

Requested Action: Investigate the root cause of the exit code 42 and provide resolution or corrective steps, especially considering the use of the No-Code UI pipeline.

Let us know if any further logs or configuration inputs are required.
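
For reference, the batch run can also be triggered outside the Studio UI with the Azure ML Python SDK v2 (azure-ai-ml), which makes it easier to capture the full driver traceback. This is only a sketch: every resource name below is a placeholder, and the invoke keyword varies slightly between SDK versions.

    # Minimal repro of the failing batch scoring call via the azure-ai-ml (v2) SDK.
    # All names below are placeholders for the workspace/endpoint/data in question.
    from azure.ai.ml import MLClient, Input
    from azure.ai.ml.constants import AssetTypes
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id="<subscription-id>",
        resource_group_name="<resource-group>",
        workspace_name="<workspace-name>",
    )

    # The same CSV in Blob Storage that works for real-time scoring.
    test_data = Input(
        type=AssetTypes.URI_FILE,
        path="azureml://datastores/<datastore>/paths/<folder>/test.csv",
    )

    # Depending on the azure-ai-ml version this keyword is `input=` (singular)
    # or `inputs={"<input-name>": test_data}`.
    job = ml_client.batch_endpoints.invoke(
        endpoint_name="<batch-endpoint-name>",
        deployment_name="<batch-deployment-name>",
        input=test_data,
    )

    # Stream the driver output; the full logs end up under user_logs/ for the job.
    ml_client.jobs.stream(job.name)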
Azure Machine Learning

3 answers

  1. Chakaravarthi Rangarajan Bhargavi 1,200 Reputation points MVP
    2025-07-18T04:40:38.9466667+00:00

    Hi Kushagra Gupta,

    Welcome to Microsoft Q&A Community!

    Thank you for reaching out. Let's address your issue step-by-step.

    From your error log:

    Execution failed. User process exited with status code 42.
    File "driver/amlbi_main.py", line 226, in main
    sys.exit(exitcode_candidate)
    SystemExit: 42
    

    This suggests a user-level failure during model training or deployment in Azure Machine Learning, often due to:

    Malformed or incomplete data labels

    Missing expected fields in your training documents

    Runtime errors in the custom script or configuration

    Could you please confirm whether the steps below have already been performed to improve your custom extraction model, so that we can work through the different methods.

    1. Review and Refine Labeling Strategy

    Ensure all labeled fields exist across samples. If a field is missing on some pages, it may confuse the model.

    Use consistent bounding boxes – irregular labeling introduces noise.

    Label multiple document instances if the same field appears more than once (e.g., header/footer).

    Reference: Labeling best practices

    2. OCR Layer Not Available in Downloaded JSON?

    If you're exporting from Labeling Tool, it may not include OCR content by default.

    To get the OCR + labeled content, use the Analyze API on your labeled/test document through your trained model and set includeTextDetails=true in your request.

    More info: Build custom model
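
    A rough sketch of that call, assuming the v2.1 REST API (the version where the includeTextDetails flag applies); the endpoint, key, model ID, and document URL are placeholders:

        # Sketch: analyze a test document with a custom model and request OCR text details.
        # Endpoint, key, model id and document URL below are placeholders.
        import time
        import requests

        endpoint = "https://<your-resource>.cognitiveservices.azure.com"
        key = "<your-key>"
        model_id = "<custom-model-id>"

        resp = requests.post(
            f"{endpoint}/formrecognizer/v2.1/custom/models/{model_id}/analyze",
            params={"includeTextDetails": "true"},
            headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
            json={"source": "<sas-url-of-test-document>"},
        )
        resp.raise_for_status()
        result_url = resp.headers["Operation-Location"]

        # Poll until the analysis finishes; OCR lines land under analyzeResult.readResults.
        while True:
            result = requests.get(result_url, headers={"Ocp-Apim-Subscription-Key": key}).json()
            if result["status"] in ("succeeded", "failed"):
                break
            time.sleep(2)
        print(result["status"])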

    3. Auto-labeling Fallback

    If auto-labeling is off or failing, review logs or try resetting the label layout and re-assigning entities manually.

    4. Limited Data? Use Prebuilt + Compose Model

    Combine your Custom Extraction model with a Prebuilt Layout or Read model using Compose Model to extract structure reliably with few samples.

    Reference: Choosing model types

    5. AzureML Error: SystemExit 42

    From the error:

    AzureMLCompute job failed

    Please review the full log in:

    user_logs/std_log_0.txt
    

    Look for:

    Import errors

    File path mismatches

    Environment dependency issues

    Make sure:

    • Your compute instance has necessary permissions and access to training files.
    • The training dataset is correctly mounted or uploaded.
    • When labeling or troubleshooting in the Document Intelligence Studio:
      • Prefer the latest stable API version (v4.0 or v3.1).
      • If the UI seems buggy, try re-uploading clean files and creating a new project.
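
    If it helps, the full log folder (including user_logs/std_log_0.txt) can be pulled down locally with the SDK. A small sketch, with the job name and workspace details as placeholders:

        # Download all outputs and logs of the failed batch scoring job to ./job_logs.
        # Job name and workspace details are placeholders.
        from azure.ai.ml import MLClient
        from azure.identity import DefaultAzureCredential

        ml_client = MLClient(
            DefaultAzureCredential(),
            subscription_id="<subscription-id>",
            resource_group_name="<resource-group>",
            workspace_name="<workspace-name>",
        )
        ml_client.jobs.download(name="<batch-scoring-job-name>", download_path="./job_logs", all=True)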

    Let us know if you need further help!

    Best regards,

    Chakravarthi Rangarajan Bhargavi

    If this answer helped, please click "Accept Answer" and upvote to help others in the community!


  2. Kushagra Gupta 0 Reputation points
    2025-07-18T11:05:23.9866667+00:00

    We are still encountering consistent failures when deploying a regression AutoML model using Batch Endpoints via Azure ML Studio (No-Code UI). The deployment fails with exit code 42, even though the same model performs accurately through real-time endpoints using the same test dataset.

    Troubleshooting Steps Already Performed:

    1. Confirmed the test file format is identical to what was used during training and real-time inference (a schema-check sketch follows after this list).

    2. Verified the compute cluster is active, correctly attached, and accessible.

    3. Retried the batch job with fresh deployments and re-uploaded data; the issue persists.

    4. Confirmed the model runs without issue via real-time endpoints on the same test data.

    5. Reviewed the logs, but there is no clear pointer beyond the exit code and the SystemExit. The log excerpt is included further below for your kind perusal.
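
    For completeness, step 1 can be double-checked locally with a quick column/dtype comparison (a sketch; file names and the target column are placeholders):

        # Sanity check: the batch scoring CSV should expose the same feature columns
        # (target excluded) and compatible dtypes as the training file. Placeholders below.
        import pandas as pd

        train = pd.read_csv("training.csv")
        score = pd.read_csv("test_batch.csv")

        feature_cols = [c for c in train.columns if c != "<target-column>"]
        print("missing in scoring file:", sorted(set(feature_cols) - set(score.columns)))
        print("extra in scoring file:  ", sorted(set(score.columns) - set(feature_cols)))

        # dtypes of the shared columns should also line up
        shared = [c for c in feature_cols if c in score.columns]
        print(pd.DataFrame({"train": train[shared].dtypes, "score": score[shared].dtypes}))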


    user_logs/std_log_0.txt:

    Azure Machine Learning Batch Inference Start
    [2025-07-16 11:25:04.298903] No started flag set. Skip creating started flag.
    Azure Machine Learning Batch Inference End
    Cleaning up all outstanding Run operations, waiting 300.0 seconds
    2 items cleaning up...
    Cleanup took 0.1315138339996338 seconds
    Traceback (most recent call last):
      File "driver/amlbi_main.py", line 275, in <module>
        main()
      File "driver/amlbi_main.py", line 226, in main
    

    This issue is blocking our batch inference pipeline. Given that the real-time endpoint executes perfectly with the same model and data, this issue is isolated to the batch endpoint infrastructure.


  3. Vinu Bhambore 0 Reputation points
    2025-08-20T15:27:21.35+00:00

    @MicrosoftSupport: Hello, posting this here because I cannot raise a support ticket anymore. It keeps pointing me to documentation, and the docs don't have a solution to this issue.

    This is still an active issue. I'm facing the same error on our batch endpoints during inference. Please see the screenshot below. In fact, I tested it on some old batch endpoints that worked fine until a few weeks ago and are now failing.

    Also, I cannot edit the scoring script to fix the issue, because it is auto-generated; I deployed the endpoint using the no-code UI.
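
    In case it is useful to others hitting this, a batch deployment with an editable scoring script can be added to the same endpoint using the azure-ai-ml v2 SDK. This is only a rough sketch under that assumption: every name, path, and environment below is a placeholder, and newer SDK versions expose ModelBatchDeployment instead of BatchDeployment.

        # Sketch: add a batch deployment whose scoring script we control, so init()/run()
        # can be instrumented and debugged. All names/paths below are placeholders.
        from azure.ai.ml import MLClient
        from azure.ai.ml.entities import BatchDeployment, CodeConfiguration
        from azure.identity import DefaultAzureCredential

        ml_client = MLClient(
            DefaultAzureCredential(),
            subscription_id="<subscription-id>",
            resource_group_name="<resource-group>",
            workspace_name="<workspace-name>",
        )

        deployment = BatchDeployment(
            name="custom-script-deployment",
            endpoint_name="<batch-endpoint-name>",
            model="azureml:<registered-model-name>@latest",
            # batch_score.py must define init() and run(mini_batch) returning one result per input.
            code_configuration=CodeConfiguration(code="./scoring", scoring_script="batch_score.py"),
            environment="azureml:<environment-name>@latest",
            compute="<compute-cluster-name>",
            instance_count=1,
            mini_batch_size=10,
        )
        ml_client.batch_deployments.begin_create_or_update(deployment).result()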

    Please let me know if there is a support email I can send further details to.

    [Screenshot 2025-08-20 at 11.22.05 AM attached]

