How to deploy a pipeline to perform batch scoring with preprocessing

APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)

In this article, you'll learn how to deploy an inference (or scoring) pipeline under a batch endpoint. The pipeline performs scoring over a registered model while also reusing a preprocessing component from when the model was trained. Reusing the same preprocessing component ensures that the same preprocessing is applied during scoring.

You'll learn to:

  • Create a pipeline that reuses existing components from the workspace
  • Deploy the pipeline to an endpoint
  • Consume predictions generated by the pipeline

About this example

This example shows you how to reuse preprocessing code and the parameters learned during preprocessing before you use your model for inferencing. By reusing the preprocessing code and learned parameters, we can ensure that the same transformations (such as normalization and feature encoding) that were applied to the input data during training are also applied during inferencing. The model used for inference will perform predictions on tabular data from the UCI Heart Disease Data Set.

A visualization of the pipeline is as follows:

A screenshot of the inference pipeline comprising a scoring component alongside the outputs and prepare component from a training pipeline.

The example in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy/paste YAML and other files, first clone the repo and then change directories to the folder:

git clone --depth 1
cd azureml-examples/cli

The files for this example are in:

cd endpoints/batch/deploy-pipelines/batch-scoring-with-preprocessing

Follow along in Jupyter notebooks

You can follow along with the Python SDK version of this example by opening the sdk-deploy-and-test.ipynb notebook in the cloned repository.


Before following the steps in this article, make sure you have the following prerequisites:

  • An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning.

  • An Azure Machine Learning workspace. If you don't have one, use the steps in the Manage Azure Machine Learning workspaces article to create one.

  • Ensure that you have the following permissions in the workspace:

    • Create or manage batch endpoints and deployments: Use an Owner, Contributor, or Custom role that allows Microsoft.MachineLearningServices/workspaces/batchEndpoints/*.

    • Create ARM deployments in the workspace resource group: Use an Owner, Contributor, or Custom role that allows Microsoft.Resources/deployments/write in the resource group where the workspace is deployed.

  • You need to install the following software to work with Azure Machine Learning:

    The Azure CLI and the ml extension for Azure Machine Learning.

    az extension add -n ml


    Pipeline component deployments for Batch Endpoints were introduced in version 2.7 of the ml extension for Azure CLI. Use az extension update --name ml to get the last version of it.

Connect to your workspace

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, we'll connect to the workspace in which you'll perform deployment tasks.

Pass in the values for your subscription ID, workspace, location, and resource group in the following code:

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Create the inference pipeline

In this section, we'll create all the assets required for our inference pipeline. We'll begin by creating an environment that includes necessary libraries for the pipeline's components. Next, we'll create a compute cluster on which the batch deployment will run. Afterwards, we'll register the components, models, and transformations we need to build our inference pipeline. Finally, we'll build and test the pipeline.

Create the environment

The components in this example will use an environment with the XGBoost and scikit-learn libraries. The environment/conda.yml file contains the environment's configuration:


- conda-forge
- python=3.8.5
- pip
- pip:
  - mlflow
  - azureml-mlflow
  - datasets
  - jobtools
  - cloudpickle==1.6.0
  - dask==2.30.0
  - scikit-learn==1.1.2
  - xgboost==1.3.3
name: mlflow-env

Create the environment as follows:

  1. Define the environment:


    name: xgboost-sklearn-py38
    conda_file: conda.yml
    description: An environment for models built with XGBoost and Scikit-learn.
  2. Create the environment:

    az ml environment create -f environment/xgboost-sklearn-py38.yml

Create a compute cluster

Batch endpoints and deployments run on compute clusters. They can run on any Azure Machine Learning compute cluster that already exists in the workspace. Therefore, multiple batch deployments can share the same compute infrastructure. In this example, we'll work on an Azure Machine Learning compute cluster called batch-cluster. Let's verify that the compute exists on the workspace or create it otherwise.

az ml compute create -n batch-cluster --type amlcompute --min-instances 0 --max-instances 5

Register components and models

We're going to register components, models, and transformations that we need to build our inference pipeline. We can reuse some of these assets for training routines.


In this tutorial, we'll reuse the model and the preprocessing component from an earlier training pipeline. You can see how they were created by following the example How to deploy a training pipeline with batch endpoints.

  1. Register the model to use for prediction:

    az ml model create --name heart-classifier --type mlflow_model --path model
  2. The registered model wasn't trained directly on input data. Instead, the input data was preprocessed (or transformed) before training, using a prepare component. We'll also need to register this component. Register the prepare component:

    az ml component create -f components/prepare/prepare.yml


    After registering the prepare component, you can now reference it from the workspace. For example, azureml:uci_heart_prepare@latest will get the last version of the prepare component.

  3. As part of the data transformations in the prepare component, the input data was normalized to center the predictors and limit their values in the range of [-1, 1]. The transformation parameters were captured in a scikit-learn transformation that we can also register to apply later when we have new data. Register the transformation as follows:

    az ml model create --name heart-classifier-transforms --type custom_model --path transformations
  4. We'll perform inferencing for the registered model, using another component named score that computes the predictions for a given model. We'll reference the component directly from its definition.


    Best practice would be to register the component and reference it from the pipeline. However, in this example, we're going to reference the component directly from its definition to help you see which components are reused from the training pipeline and which ones are new.

Build the pipeline

Now it's time to bind all the elements together. The inference pipeline we'll deploy has two components (steps):

  • preprocess_job: This step reads the input data and returns the prepared data and the applied transformations. The step receives two inputs:
    • data: a folder containing the input data to score
    • transformations: (optional) Path to the transformations that will be applied, if available. When provided, the transformations are read from the model that is indicated at the path. However, if the path isn't provided, then the transformations will be learned from the input data. For inferencing, though, you can't learn the transformation parameters (in this example, the normalization coefficients) from the input data because you need to use the same parameter values that were learned during training. Since this input is optional, the preprocess_job component can be used during training and scoring.
  • score_job: This step will perform inferencing on the transformed data, using the input model. Notice that the component uses an MLflow model to perform inference. Finally, the scores are written back in the same format as they were read.

The pipeline configuration is defined in the pipeline.yml file:


type: pipeline

name: batch_scoring_uci_heart
display_name: Batch Scoring for UCI heart
description: This pipeline demonstrates how to make batch inference using a model from the Heart Disease Data Set problem, where pre and post processing is required as steps. The pre and post processing steps can be components reusable from the training pipeline.

    type: uri_folder
    type: string
    default: append

    type: uri_folder
    mode: upload

    type: command
    component: azureml:uci_heart_prepare@latest
      data: ${{parent.inputs.input_data}}
        path: azureml:heart-classifier-transforms@latest
        type: custom_model
    type: command
    component: components/score/score.yml
      data: ${{}}
        path: azureml:heart-classifier@latest
        type: mlflow_model
      score_mode: ${{parent.inputs.score_mode}}
        mode: upload
        path: ${{parent.outputs.scores}}

A visualization of the pipeline is as follows:

A screenshot of the inference pipeline showing batch scoring with preprocessing.

Test the pipeline

Let's test the pipeline with some sample data. To do that, we'll create a job using the pipeline and the batch-cluster compute cluster created previously.

The following pipeline-job.yml file contains the configuration for the pipeline job:


type: pipeline

display_name: uci-classifier-score-job
description: |-
  This pipeline demonstrate how to make batch inference using a model from the Heart \
  Disease Data Set problem, where pre and post processing is required as steps. The \
  pre and post processing steps can be components reused from the training pipeline.

compute: batch-cluster
component: pipeline.yml
    type: uri_folder
  score_mode: append
    mode: upload

Create the test job:

az ml job create -f pipeline-job.yml --set inputs.input_data.path=data/unlabeled

Create a batch endpoint

  1. Provide a name for the endpoint. A batch endpoint's name needs to be unique in each region since the name is used to construct the invocation URI. To ensure uniqueness, append any trailing characters to the name specified in the following code.

  2. Configure the endpoint:

    The endpoint.yml file contains the endpoint's configuration.


    name: uci-classifier-score
    description: Batch scoring endpoint of the Heart Disease Data Set prediction task.
    auth_mode: aad_token
  3. Create the endpoint:

    az ml batch-endpoint create --name $ENDPOINT_NAME -f endpoint.yml
  4. Query the endpoint URI:

    az ml batch-endpoint show --name $ENDPOINT_NAME

Deploy the pipeline component

To deploy the pipeline component, we have to create a batch deployment. A deployment is a set of resources required for hosting the asset that does the actual work.

  1. Configure the deployment

    The deployment.yml file contains the deployment's configuration. You can check the full batch endpoint YAML schema for extra properties.


    name: uci-classifier-prepros-xgb
    endpoint_name: uci-classifier-batch
    type: pipeline
    component: pipeline.yml
        continue_on_step_failure: false
        default_compute: batch-cluster
  2. Create the deployment

    Run the following code to create a batch deployment under the batch endpoint and set it as the default deployment.

    az ml batch-deployment create --endpoint $ENDPOINT_NAME -f deployment.yml --set-default


    Notice the use of the --set-default flag to indicate that this new deployment is now the default.

  3. Your deployment is ready for use.

Test the deployment

Once the deployment is created, it's ready to receive jobs. Follow these steps to test it:

  1. Our deployment requires that we indicate one data input and one literal input.

    The inputs.yml file contains the definition for the input data asset:


        type: uri_folder
        path: data/unlabeled
        type: string
        default: append
        type: uri_folder
        mode: upload


    To learn more about how to indicate inputs, see Create jobs and input data for batch endpoints.

  2. You can invoke the default deployment as follows:

    JOB_NAME=$(az ml batch-endpoint invoke -n $ENDPOINT_NAME --f inputs.yml --query name -o tsv)
  3. You can monitor the progress of the show and stream the logs using:

    az ml job stream -n $JOB_NAME

Access job output

Once the job is completed, we can access its output. This job contains only one output named scores:

You can download the associated results using az ml job download.

az ml job download --name $JOB_NAME --output-name scores

Read the scored data:

import pandas as pd
import glob

output_files = glob.glob("named-outputs/scores/*.csv")
score = pd.concat((pd.read_csv(f) for f in output_files))

The output looks as follows:

age sex ... thal prediction
0.9338 1 ... 2 0
1.3782 1 ... 3 1
1.3782 1 ... 4 0
-1.954 1 ... 3 0

The output contains the predictions plus the data that was provided to the score component, which was preprocessed. For example, the column age has been normalized, and column thal contains original encoding values. In practice, you probably want to output the prediction only and then concat it with the original values. This work has been left to the reader.

Clean up resources

Once you're done, delete the associated resources from the workspace:

Run the following code to delete the batch endpoint and its underlying deployment. --yes is used to confirm the deletion.

az ml batch-endpoint delete -n $ENDPOINT_NAME --yes

(Optional) Delete compute, unless you plan to reuse your compute cluster with later deployments.

az ml compute delete -n batch-cluster

Next steps