I have trained and registered a price prediction model using Azure AutoML (via the drag-and-drop Designer interface). My test file (stored in Azure Blob Storage) has the exact same schema and column order as the training data, except that it is missing the target 'price' column.
I used the no-code Batch Endpoint functionality to perform batch inference. It automatically generated a score.py script and a custom environment. Even though I explicitly selected the "Append column" option, the output is still saved as a new predictions.csv file with the following structure:
```
Column1  Column2
['Testing_File_Without_Price.csv' '1234.5678']
...
```
What I want instead is:

- The predicted price values should be appended as a new column to the original test file (i.e., restoring the price column).
- The resulting updated file should be saved back to Azure Blob Storage in the same format and structure as the original input file (a concrete sketch of this end state follows below).
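To make the goal concrete, here is a minimal sketch of the end state I am after, expressed as manual post-processing. The connection string, container name, and output blob name are placeholders, and I am assuming the endpoint's predictions come back in the same row order as the input file:

```python
# Minimal sketch of the desired end state, done as manual post-processing.
# Placeholders/assumptions: the connection string, container name, output
# blob name, and that predictions.csv rows match the input row order.
import pandas as pd
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("<container-name>")

# Original test file (without 'price') and the raw batch endpoint output.
test_df = pd.read_csv("Testing_File_Without_Price.csv")
pred_df = pd.read_csv("predictions.csv", header=None, names=["file", "prediction"])

# Restore 'price' by appending the predicted values as a new column.
test_df["price"] = pred_df["prediction"].astype(float).to_numpy()

# Save back to Blob Storage in the same CSV format as the original input.
container.upload_blob(
    name="Testing_File_With_Price.csv",
    data=test_df.to_csv(index=False),
    overwrite=True,
)
```

Of course, I would prefer the batch endpoint to produce this directly rather than having to run a separate merge step after every scoring job.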
Here's what I've already tried, without success:

- Designer Pipeline: Fails due to custom environment requirements (the AutoML model uses MLflow, which requires a custom environment not currently supported in Designer).
- Manual Batch Endpoint Setup: Attempted with a custom score.py and environment, but it crashes during execution (see the sketch after this list).
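For reference, this is roughly the shape of the custom score.py from my manual attempt (a simplified sketch, not the exact file; the MLmodel lookup and the assumption that run() receives a mini-batch as a list of CSV file paths reflect my understanding of batch deployments and may be exactly where it breaks):

```python
# Rough shape of the custom scoring script from my manual attempt.
# Assumptions: AZUREML_MODEL_DIR points at the registered model folder,
# the AutoML model inside it is in MLflow format, and run() receives a
# mini-batch as a list of input file paths.
import glob
import os

import mlflow.pyfunc
import pandas as pd

model = None

def init():
    global model
    model_dir = os.environ["AZUREML_MODEL_DIR"]
    # Locate the MLmodel file that AutoML registered and load it via pyfunc.
    mlmodel_path = glob.glob(
        os.path.join(model_dir, "**", "MLmodel"), recursive=True
    )[0]
    model = mlflow.pyfunc.load_model(os.path.dirname(mlmodel_path))

def run(mini_batch):
    frames = []
    for file_path in mini_batch:
        df = pd.read_csv(file_path)
        # Append the scored values as the restored 'price' column.
        df["price"] = model.predict(df)
        frames.append(df)
    # Returning a DataFrame is meant to make the deployment write full
    # rows instead of (file, prediction) pairs.
    return pd.concat(frames)
```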
How can I correctly perform batch inference where the scored 'price' column is appended to the original input file and the output is saved back to Blob Storage in the same format?