Azure AutoML Batch Inference: Save Predictions in Original Input File Format

Kushagra Gupta 0 Reputation points
2025-06-02T11:41:16.4666667+00:00

I have trained and registered a price prediction model using Azure AutoML (via the drag-and-drop Designer interface). My test file (stored in Azure Blob Storage) has the exact same schema and column order as the training data, except that it is missing the target 'price' column.

I used the no-code Batch Endpoint functionality to perform batch inference. It automatically generated a score.py script and a custom environment. Even though I explicitly selected the "Append column" option, the output is still saved as a new predictions.csv file with the following structure:

    Column1                              Column2
    ['Testing_File_Without_Price.csv'    '1234.5678']
    ...

What I want instead is:

The predicted price values should be appended as a new column to the original test file (i.e., restoring the price column).

The resulting updated file should be saved back to Azure Blob Storage in the same format and structure as the original input file.

Here’s what I’ve already tried without success:

Designer Pipeline: Fails due to custom environment requirements (AutoML model uses MLflow which requires a custom environment not currently supported in Designer).

Manual Batch Endpoint Setup: Attempted using a custom score.py and environment manually, but it crashes during execution.

How can I correctly perform batch inference where the scored 'price' column is appended to the original input file and the output is saved back to Blob Storage in the same format?


Azure Machine Learning
An Azure machine learning service for building and deploying models.

2 answers

  1. Amira Bedhiafi 33,071 Reputation points Volunteer Moderator
    2025-06-02T18:08:18.5366667+00:00

    Hello Kushagra!

    Thank you for posting on Microsoft Learn.

    To append the predicted values from Azure AutoML batch inference to your original input CSV file, and save the updated file back to Azure Blob Storage, you need to work around the default behavior of no-code batch inference. Here's a working approach using a custom scoring script in a batch inference pipeline, fully compatible with MLflow and your custom environment.

    Azure AutoML no-code batch endpoints generate outputs in a prediction metadata format (filename + prediction array), not your original structure.

    Even with "Append column" selected, it just appends the value to a file reference, not to the actual dataframe you uploaded.

    You need to write a custom score.py that:

    • Loads the input CSV (no target column)
    • Loads your AutoML model via MLflow
    • Generates predictions
    • Appends the 'price' column to the original dataframe
    • Saves the result to the expected output path (for example ./outputs/)

    Then package this in a batch pipeline job with custom environment.


  2. JAYA SHANKAR G S 4,035 Reputation points Microsoft External Staff Moderator
    2025-06-05T09:28:28.8333333+00:00

    Hello @Kushagra Gupta,

    When you deploy a batch endpoint with the default scoring script, it only returns the prediction along with the input file name.

    So, for custom output, you need to customize the batch scoring script.

    For your case, use the sample code below.

    
    import glob
    import os
    import pickle
    from pathlib import Path
    from typing import List

    import pandas as pd


    def init():
        global model
        global output_path

        # Paths provided by the batch deployment at runtime
        output_path = os.environ["AZUREML_BI_OUTPUT_PATH"]
        model_path = os.environ["AZUREML_MODEL_DIR"]
        model_file = glob.glob(f"{model_path}/*/*.pkl")[-1]

        with open(model_file, "rb") as file:
            model = pickle.load(file)


    def run(mini_batch: List[str]):
        for file_path in mini_batch:
            data = pd.read_csv(file_path)
            pred = model.predict(data)

            # Append the scored values as a new column
            data["prediction"] = pred

            # Write one output CSV per input file, keeping the original structure
            output_file_name = Path(file_path).stem
            output_file_path = os.path.join(output_path, output_file_name + ".csv")
            data.to_csv(output_file_path, index=False)

        return mini_batch
    

    Here, the script reads each CSV file from the input mini-batch, adds a prediction column with the scored values, and writes the result back out as a CSV file.

    You can alter the above code according to your requirements, and you can also refer to this GitHub notebook for custom batch output.

    Please try the above and let us know in the comments if you face any error or have any query.
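
    For the deployment itself, the key setting is `output_action: summary_only`, which stops the batch job from aggregating the `run()` return values into a single predictions.csv and leaves only the files your script writes. Below is a sketch of the deployment YAML; the names price-batch-endpoint, price-model, custom-env, and cpu-cluster are placeholders for your own assets.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: custom-output-deployment
endpoint_name: price-batch-endpoint      # placeholder: your batch endpoint
model: azureml:price-model@latest        # placeholder: your registered AutoML model
code_configuration:
  code: src                              # folder containing score.py
  scoring_script: score.py
environment: azureml:custom-env@latest   # placeholder: your custom environment
compute: azureml:cpu-cluster             # placeholder: your compute cluster
output_action: summary_only              # do not aggregate run() results into predictions.csv
```

    You can then create the deployment with `az ml batch-deployment create --file deployment.yml`.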

    Thank you

