How to deploy a pipeline to perform batch scoring with preprocessing

2024-08-28

APPLIES TO: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

In this article, you'll learn how to deploy an inference (or scoring) pipeline under a batch endpoint. The pipeline performs scoring over a registered model while also reusing a preprocessing component from when the model was trained. Reusing the same preprocessing component ensures that the same preprocessing is applied during scoring.

You'll learn to:

Create a pipeline that reuses existing components from the workspace
Deploy the pipeline to an endpoint
Consume predictions generated by the pipeline

About this example

This example shows you how to reuse preprocessing code and the parameters learned during preprocessing before you use your model for inferencing. By reusing the preprocessing code and learned parameters, we can ensure that the same transformations (such as normalization and feature encoding) that were applied to the input data during training are also applied during inferencing. The model used for inference will perform predictions on tabular data from the UCI Heart Disease Data Set.
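To see why reusing learned parameters matters, consider the following minimal sketch. It isn't the code of the prepare component; it assumes a simple standardization (subtract the mean, divide by the standard deviation) and uses made-up age values purely for illustration.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Learn centering and scaling parameters from a column of values."""
    return {"mean": mean(values), "std": pstdev(values)}


def apply_standardizer(values, params):
    """Apply previously learned parameters; nothing is re-learned here."""
    return [(v - params["mean"]) / params["std"] for v in values]


# Hypothetical ages seen during training and at scoring time.
training_ages = [40, 50, 60]
scoring_ages = [60, 70]

# Parameters are learned once, at training time, and reused for scoring.
params = fit_standardizer(training_ages)
reused = apply_standardizer(scoring_ages, params)

# Re-learning the parameters from the scoring batch gives different values
# for the same ages, so the model would see inputs it was never trained on.
relearned = apply_standardizer(scoring_ages, fit_standardizer(scoring_ages))

print("reused parameters:   ", reused)
print("relearned parameters:", relearned)
```

With reused parameters, an age of 60 maps to the same number it had during training. With relearned parameters, the same age maps to a different number that depends on whatever else is in the scoring batch.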

A visualization of the pipeline is as follows:

The example in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy or paste YAML and other files, use the following commands to clone the repository and go to the folder for your coding language:

Azure CLI
Python

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/cli

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/sdk/python

The files for this example are in:

cd endpoints/batch/deploy-pipelines/batch-scoring-with-preprocessing

Follow along in Jupyter notebooks

You can follow along with the Python SDK version of this example by opening the sdk-deploy-and-test.ipynb notebook in the cloned repository.

Prerequisites

An Azure subscription. If you don't have an Azure subscription, create a free account before you begin.
An Azure Machine Learning workspace. To create a workspace, see Manage Azure Machine Learning workspaces.
The following permissions in the Azure Machine Learning workspace:
- For creating or managing batch endpoints and deployments: Use an Owner, Contributor, or custom role that has been assigned the Microsoft.MachineLearningServices/workspaces/batchEndpoints/* permissions.
- For creating Azure Resource Manager deployments in the workspace resource group: Use an Owner, Contributor, or custom role that has been assigned the Microsoft.Resources/deployments/write permission in the resource group where the workspace is deployed.
The Azure Machine Learning CLI or the Azure Machine Learning SDK for Python:
- Azure CLI
- Python
With the Azure CLI installed, run the following command to add the ml extension for Azure Machine Learning:
```
az extension add -n ml
```
Pipeline component deployments for batch endpoints are introduced in version 2.7 of the ml extension for the Azure CLI. Use the az extension update --name ml command to get the latest version.
Run the following command to install the Azure Machine Learning SDK for Python:
```
pip install azure-ai-ml
```
The ModelBatchDeployment and PipelineComponentBatchDeployment classes are introduced in version 1.7.0 of the SDK. Use the pip install -U azure-ai-ml command to get the latest version.
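If you want to confirm the SDK version from code before you continue, a check along these lines might help. It's a sketch that uses only the Python standard library; the helper names are illustrative, and the simple parser handles only dotted numeric versions such as 1.7.0 (pre-release suffixes are ignored).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.7.0' into (1, 7, 0); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_sdk_version(package="azure-ai-ml"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_sdk_version()
if found is None:
    print("azure-ai-ml is not installed")
else:
    supported = meets_minimum(found, "1.7.0")
    print(f"azure-ai-ml {found}: pipeline component deployments supported = {supported}")
```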

Connect to your workspace

The workspace is the top-level resource for Azure Machine Learning. It provides a centralized place to work with all artifacts you create when you use Azure Machine Learning. In this section, you connect to the workspace where you perform your deployment tasks.

Azure CLI
Python

In the following commands, enter your subscription ID, workspace name, resource group name, and location:

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Import the required libraries:

from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.entities import BatchEndpoint, ModelBatchDeployment, ModelBatchDeploymentSettings, PipelineComponentBatchDeployment, Model, AmlCompute, Data, BatchRetrySettings, CodeConfiguration, Environment
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

Configure the workspace details and get a handle to the workspace:

In the following code, enter your subscription ID, resource group name, and workspace name:

subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
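If you'd rather not hard-code the workspace details in a script, one option is to read them from environment variables. The following sketch assumes three hypothetical variable names (AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, and AZURE_ML_WORKSPACE); they aren't defined by Azure Machine Learning, so pick whatever names suit your setup.

```python
import os

# Hypothetical variable names; choose whatever fits your environment.
REQUIRED_VARIABLES = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group": "AZURE_RESOURCE_GROUP",
    "workspace": "AZURE_ML_WORKSPACE",
}


def workspace_settings(environ=None):
    """Collect workspace settings from environment variables.

    Raises KeyError naming every missing variable, so a misconfiguration
    surfaces before any call to Azure is made.
    """
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED_VARIABLES.values() if not environ.get(var)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {key: environ[var] for key, var in REQUIRED_VARIABLES.items()}


# Demonstration with an in-memory mapping instead of the real environment.
settings = workspace_settings(
    {
        "AZURE_SUBSCRIPTION_ID": "<subscription>",
        "AZURE_RESOURCE_GROUP": "<resource-group>",
        "AZURE_ML_WORKSPACE": "<workspace>",
    }
)
print(settings)
```

The resulting values can then be passed to MLClient in the same order as in the code above: subscription ID, resource group, and workspace.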

Create the inference pipeline

In this section, we'll create all the assets required for our inference pipeline. We'll begin by creating an environment that includes necessary libraries for the pipeline's components. Next, we'll create a compute cluster on which the batch deployment will run. Afterwards, we'll register the components, models, and transformations we need to build our inference pipeline. Finally, we'll build and test the pipeline.

Create the environment

The components in this example will use an environment with the XGBoost and scikit-learn libraries. The environment/conda.yml file contains the environment's configuration:

environment/conda.yml

channels:
- conda-forge
dependencies:
- python=3.8.5
- pip
- pip:
  - mlflow
  - azureml-mlflow
  - datasets
  - jobtools
  - cloudpickle==1.6.0
  - dask==2023.2.0
  - scikit-learn==1.1.2
  - xgboost==1.3.3
name: mlflow-env

Create the environment as follows:

Define the environment:

Azure CLI
Python

environment/xgboost-sklearn-py38.yml

$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: xgboost-sklearn-py38
image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
conda_file: conda.yml
description: An environment for models built with XGBoost and Scikit-learn.

environment = Environment(
    name="xgboost-sklearn-py38",
    description="An environment for models built with XGBoost and Scikit-learn.",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    conda_file="environment/conda.yml",
)

Create the environment:

Azure CLI
Python

az ml environment create -f environment/xgboost-sklearn-py38.yml

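from azure.core.exceptions import ResourceExistsError

try:
    ml_client.environments.create_or_update(environment)
except ResourceExistsError:
    # The environment is already registered; reuse it.
    pass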

Create a compute cluster

Batch endpoints and deployments run on compute clusters. They can run on any Azure Machine Learning compute cluster that already exists in the workspace. Therefore, multiple batch deployments can share the same compute infrastructure. In this example, we'll work on an Azure Machine Learning compute cluster called batch-cluster. Verify that the compute exists in the workspace, or create it if it doesn't.

Azure CLI
Python

az ml compute create -n batch-cluster --type amlcompute --min-instances 0 --max-instances 5

compute_name = "batch-cluster"
if not any(c.name == compute_name for c in ml_client.compute.list()):
    compute_cluster = AmlCompute(
        name=compute_name,
        description="Batch endpoints compute cluster",
        min_instances=0,
        max_instances=5,
    )
    ml_client.begin_create_or_update(compute_cluster).result()

Register components and models

We're going to register components, models, and transformations that we need to build our inference pipeline. We can reuse some of these assets for training routines.

Tip

In this tutorial, we'll reuse the model and the preprocessing component from an earlier training pipeline. You can see how they were created by following the example How to deploy a training pipeline with batch endpoints.

Azure CLI
Python

az ml model create --name heart-classifier --type mlflow_model --path model

model_name = "heart-classifier"
model_local_path = "model"

model = ml_client.models.create_or_update(
    Model(name=model_name, path=model_local_path, type=AssetTypes.MLFLOW_MODEL)
)

The registered model wasn't trained directly on input data. Instead, the input data was preprocessed (or transformed) before training, using a prepare component. We'll also need to register this component. Register the prepare component:
- Azure CLI
- Python
```
az ml component create -f components/prepare/prepare.yml
```
```
prepare_data = load_component(source="components/prepare/prepare.yml")

ml_client.components.create_or_update(prepare_data)
```
Tip

After you register the prepare component, you can reference it from the workspace. For example, azureml:uci_heart_prepare@latest gets the latest version of the prepare component.
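To make the reference syntax concrete, the following sketch splits a reference string into its parts. It's illustrative only: the function name is hypothetical, and it covers just the two forms azureml:<name>@<label> and azureml:<name>:<version>, not every form that Azure Machine Learning accepts.

```python
def parse_asset_reference(reference):
    """Split 'azureml:<name>@<label>' or 'azureml:<name>:<version>' into parts."""
    prefix = "azureml:"
    if not reference.startswith(prefix):
        raise ValueError(f"expected a reference starting with {prefix!r}: {reference!r}")
    body = reference[len(prefix):]
    if "@" in body:
        # Label form, for example '@latest'.
        name, label = body.split("@", 1)
        return {"name": name, "label": label, "version": None}
    if ":" in body:
        # Explicit version form, for example ':1'.
        name, version = body.split(":", 1)
        return {"name": name, "label": None, "version": version}
    return {"name": body, "label": None, "version": None}


print(parse_asset_reference("azureml:uci_heart_prepare@latest"))
print(parse_asset_reference("azureml:heart-classifier:1"))
```

In the Python SDK, the name and label correspond to the arguments of a call such as ml_client.components.get("uci_heart_prepare", label="latest").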

As part of the data transformations in the prepare component, the input data was normalized to center the predictors and limit their values in the range of [-1, 1]. The transformation parameters were captured in a scikit-learn transformation that we can also register to apply later when we have new data. Register the transformation as follows:

Azure CLI
Python

az ml model create --name heart-classifier-transforms --type custom_model --path transformations

transformation_name = "heart-classifier-transforms"
transformation_local_path = "transformations"

transformations = ml_client.models.create_or_update(
    Model(
        name=transformation_name,
        path=transformation_local_path,
        type=AssetTypes.CUSTOM_MODEL,
    )
)

We'll perform inferencing for the registered model, using another component named score that computes the predictions for a given model. We'll reference the component directly from its definition.

Tip

Best practice would be to register the component and reference it from the pipeline. However, in this example, we're going to reference the component directly from its definition to help you see which components are reused from the training pipeline and which ones are new.

Build the pipeline

Now it's time to bind all the elements together. The inference pipeline we'll deploy has two components (steps):

preprocess_job: This step reads the input data and returns the prepared data and the applied transformations. The step receives two inputs:
- data: a folder containing the input data to score
- transformations: (optional) Path to the transformations that will be applied, if available. When provided, the transformations are read from the model that is indicated at the path. However, if the path isn't provided, then the transformations will be learned from the input data. For inferencing, though, you can't learn the transformation parameters (in this example, the normalization coefficients) from the input data because you need to use the same parameter values that were learned during training. Since this input is optional, the preprocess_job component can be used during training and scoring.
score_job: This step will perform inferencing on the transformed data, using the input model. Notice that the component uses an MLflow model to perform inference. Finally, the scores are written back in the same format as they were read.
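The data flow between the two steps can be sketched with plain Python functions. This isn't the code of the actual components: the centering transformation, the toy model, and the behavior of any score_mode other than append are assumptions made for illustration. The sketch also assumes that append adds the prediction as a new column to the scored data.

```python
def preprocess(rows, transformations=None):
    """Center the 'age' column; learn the offset only if none is supplied."""
    if transformations is None:
        # Training-time behavior: learn the parameter from the data itself.
        ages = [row["age"] for row in rows]
        transformations = {"age_mean": sum(ages) / len(ages)}
    prepared = [{**row, "age": row["age"] - transformations["age_mean"]} for row in rows]
    return prepared, transformations


def score(rows, model, score_mode="append"):
    """Run the model on each row and shape the output according to score_mode."""
    predictions = [model(row) for row in rows]
    if score_mode == "append":
        # Assumed meaning of 'append': keep the input columns, add the prediction.
        return [{**row, "prediction": p} for row, p in zip(rows, predictions)]
    # Hypothetical alternative mode: return the predictions only.
    return [{"prediction": p} for p in predictions]


def toy_model(row):
    """Stand-in for the registered MLflow model."""
    return int(row["age"] > 0)


# Training: no transformations are supplied, so they're learned from the data.
_, learned = preprocess([{"age": 40}, {"age": 60}])

# Scoring: the learned transformations are supplied and applied unchanged.
prepared, _ = preprocess([{"age": 70}], transformations=learned)
scores = score(prepared, toy_model)
print(scores)
```

Because the transformations argument is optional, the same preprocess function serves both training and scoring, which mirrors how the preprocess_job component is reused.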

Azure CLI
Python

The pipeline configuration is defined in the pipeline.yml file:

pipeline.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponent.schema.json
type: pipeline

name: batch_scoring_uci_heart
display_name: Batch Scoring for UCI heart
description: This pipeline demonstrates how to make batch inference using a model from the Heart Disease Data Set problem, where pre and post processing is required as steps. The pre and post processing steps can be components reusable from the training pipeline.

inputs:
  input_data:
    type: uri_folder
  score_mode:
    type: string
    default: append

outputs: 
  scores:
    type: uri_folder
    mode: upload

jobs:
  preprocess_job:
    type: command
    component: azureml:uci_heart_prepare@latest
    inputs:
      data: ${{parent.inputs.input_data}}
      transformations: 
        path: azureml:heart-classifier-transforms@latest
        type: custom_model
    outputs:
      prepared_data:
  
  score_job:
    type: command
    component: components/score/score.yml
    inputs:
      data: ${{parent.jobs.preprocess_job.outputs.prepared_data}}
      model:
        path: azureml:heart-classifier@latest
        type: mlflow_model
      score_mode: ${{parent.inputs.score_mode}}
    outputs:
      scores: 
        mode: upload
        path: ${{parent.outputs.scores}}

prepare_data = ml_client.components.get("uci_heart_prepare", label="latest")
score_data = load_component(source="components/score/score.yml")

Let's build the pipeline in a function:

@pipeline()
def uci_heart_classifier_scorer(
    input_data: Input(type=AssetTypes.URI_FOLDER), score_mode: str
):
    """This pipeline demonstrates how to make batch inference using a model from the Heart Disease Data Set problem, where pre and post processing is required as steps. The pre and post processing steps can be components reusable from the training pipeline."""
    prepared_data = prepare_data(
        data=input_data,
        transformations=Input(type=AssetTypes.CUSTOM_MODEL, path=transformations.id),
    )
    scored_data = score_data(
        data=prepared_data.outputs.prepared_data,
        model=Input(type=AssetTypes.MLFLOW_MODEL, path=model.id),
        score_mode=score_mode,
    )

    return {"scores": scored_data.outputs.scores}

A visualization of the pipeline is as follows:

Test the pipeline

Let's test the pipeline with some sample data. To do that, we'll create a job using the pipeline and the batch-cluster compute cluster created previously.

Azure CLI
Python

The following pipeline-job.yml file contains the configuration for the pipeline job:

pipeline-job.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

display_name: uci-classifier-score-job
description: |-
  This pipeline demonstrates how to make batch inference using a model from the Heart \
  Disease Data Set problem, where pre and post processing is required as steps. The \
  pre and post processing steps can be components reused from the training pipeline.

compute: batch-cluster
component: pipeline.yml
inputs:
  input_data:
    type: uri_folder
  score_mode: append
outputs: 
  scores:
    mode: upload

pipeline_job = uci_heart_classifier_scorer(
    input_data=Input(type="uri_folder", path="data/unlabeled/"), score_mode="append"
)

Now, we'll configure some run settings to run the test:

pipeline_job.settings.default_datastore = "workspaceblobstore"
pipeline_job.settings.default_compute = "batch-cluster"

Create the test job:

Azure CLI
Python

az ml job create -f pipeline-job.yml --set inputs.input_data.path=data/unlabeled

pipeline_job_run = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="uci-heart-score-pipeline"
)
pipeline_job_run

Create a batch endpoint

Provide a name for the endpoint. A batch endpoint's name needs to be unique in each region since the name is used to construct the invocation URI. To ensure uniqueness, append any trailing characters to the name specified in the following code.
- Azure CLI
- Python
```
ENDPOINT_NAME="uci-classifier-score"
```
```
endpoint_name = "uci-classifier-score"
```
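One way to append trailing characters is to generate a short random suffix. The following sketch uses only the Python standard library. The 32-character limit is an assumption about endpoint naming rules, so check the current limits for your region before relying on it.

```python
import secrets
import string


def unique_endpoint_name(base, suffix_length=5, max_length=32):
    """Append a random lowercase alphanumeric suffix to a base endpoint name."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_length))
    name = f"{base}-{suffix}"
    # max_length is an assumed limit, not one confirmed by this article.
    if len(name) > max_length:
        raise ValueError(f"{name!r} is longer than {max_length} characters")
    return name


endpoint_name = unique_endpoint_name("uci-classifier-score")
print(endpoint_name)
```

If you use a generated name, pass the same value wherever the endpoint name appears in the rest of this article.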

Configure the endpoint:

Azure CLI
Python

The endpoint.yml file contains the endpoint's configuration.

endpoint.yml

$schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
name: uci-classifier-score
description: Batch scoring endpoint of the Heart Disease Data Set prediction task.
auth_mode: aad_token

endpoint = BatchEndpoint(
    name=endpoint_name,
    description="Batch scoring endpoint of the Heart Disease Data Set prediction task",
)

Create the endpoint:

Azure CLI
Python

az ml batch-endpoint create --name $ENDPOINT_NAME -f endpoint.yml

ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

Query the endpoint URI:

Azure CLI
Python

az ml batch-endpoint show --name $ENDPOINT_NAME

endpoint = ml_client.batch_endpoints.get(name=endpoint_name)
print(endpoint)

Deploy the pipeline component

To deploy the pipeline component, we have to create a batch deployment. A deployment is a set of resources required for hosting the asset that does the actual work.

Configure the deployment

Azure CLI
Python

The deployment.yml file contains the deployment's configuration. You can check the full batch endpoint YAML schema for extra properties.

deployment.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponentBatchDeployment.schema.json
name: uci-classifier-prepros-xgb
endpoint_name: uci-classifier-batch
type: pipeline
component: pipeline.yml
settings:
    continue_on_step_failure: false
    default_compute: batch-cluster

Our pipeline is defined in a function. To turn it into a component, use the function's component property. Pipeline components are reusable compute graphs that can be included in batch deployments or used to compose more complex pipelines.

pipeline_component = ml_client.components.create_or_update(
    uci_heart_classifier_scorer().component
)

Now we can define the deployment:

deployment = PipelineComponentBatchDeployment(
    name="uci-classifier-prepros-xgb",
    description="A sample deployment with pre and post processing done before and after inference.",
    endpoint_name=endpoint.name,
    component=pipeline_component,
    settings={"continue_on_step_failure": False, "default_compute": compute_name},
)

Create the deployment
- Azure CLI
- Python
Run the following code to create a batch deployment under the batch endpoint and set it as the default deployment.
```
az ml batch-deployment create --endpoint $ENDPOINT_NAME -f deployment.yml --set-default
```
Tip

Notice the use of the --set-default flag to indicate that this new deployment is now the default.
This command will start the deployment creation and return a confirmation response while the deployment creation continues.
```
ml_client.batch_deployments.begin_create_or_update(deployment).result()
```
Once created, let's configure this new deployment as the default one:
```
endpoint = ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
```
Your deployment is ready for use.

Test the deployment

Once the deployment is created, it's ready to receive jobs. Follow these steps to test it:

Our deployment requires that we indicate one data input and one literal input.
- Azure CLI
- Python
The inputs.yml file contains the definitions of the job's inputs and outputs:

inputs.yml
```
inputs:
  input_data:
    type: uri_folder
    path: data/unlabeled
  score_mode:
    type: string
    default: append
outputs:
  scores:
    type: uri_folder
    mode: upload
```
Define the data input and the literal input:
```
input_data = Input(type="uri_folder", path="data/unlabeled/")
score_mode = Input(type="string", default="append")
```
Tip

To learn more about how to indicate inputs, see Create jobs and input data for batch endpoints.
You can invoke the default deployment as follows:
- Azure CLI
- Python
```
JOB_NAME=$(az ml batch-endpoint invoke -n $ENDPOINT_NAME --file inputs.yml --query name -o tsv)
```
Tip

What's the difference between the inputs and input parameter when you invoke an endpoint?

In general, you can use a dictionary inputs = {} parameter with the invoke method to provide an arbitrary number of required inputs to a batch endpoint that contains a model deployment or a pipeline deployment.

For a model deployment, you can use the input parameter as a shorter way to specify the input data location for the deployment. This approach works because a model deployment always takes only one data input.
```
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    inputs={"input_data": input_data, "score_mode": score_mode},
)
```
You can monitor the progress of the job and stream the logs using:
- Azure CLI
- Python
```
az ml job stream -n $JOB_NAME
```
```
ml_client.jobs.get(job.name)
```
To wait for the job to finish, run the following code:
```
ml_client.jobs.stream(name=job.name)
```

Access job output

Once the job is completed, we can access its output. This job contains only one output named scores:

Azure CLI
Python

You can download the associated results using az ml job download.

az ml job download --name $JOB_NAME --output-name scores

Download the result:

ml_client.jobs.download(name=job.name, download_path=".", output_name="scores")

Read the scored data:

import pandas as pd
import glob

output_files = glob.glob("named-outputs/scores/*.csv")
score = pd.concat((pd.read_csv(f) for f in output_files))
score

The output looks as follows:

age	sex	...	thal	prediction
0.9338	1	...	2	0
1.3782	1	...	3	1
1.3782	1	...	4	0
-1.954	1	...	3	0

The output contains the predictions plus the data that was provided to the score component, which was preprocessed. For example, the column age has been normalized, and the column thal contains the original encoding values. In practice, you probably want to output only the prediction and then concatenate it with the original values. This work has been left to the reader.
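As a starting point for that exercise, the following sketch attaches the prediction column to the original, unprocessed rows. It relies on an important assumption: the scored rows appear in the same order as the original rows, so they can be aligned by position. Verify that this holds for your files before using this approach. The sample values are made up, and the sketch uses the standard csv module rather than pandas to stay self-contained.

```python
import csv
import io


def attach_predictions(original_rows, scored_rows, column="prediction"):
    """Add the prediction from each scored row to the original row at the same position."""
    if len(original_rows) != len(scored_rows):
        raise ValueError("row counts differ; can't align rows by position")
    return [
        {**original, column: scored[column]}
        for original, scored in zip(original_rows, scored_rows)
    ]


# Hypothetical file contents: raw input values and the corresponding scored output.
original_csv = "age,thal\n63,fixed\n67,normal\n"
scored_csv = "age,thal,prediction\n0.9338,2,0\n1.3782,3,1\n"

original = list(csv.DictReader(io.StringIO(original_csv)))
scored = list(csv.DictReader(io.StringIO(scored_csv)))

combined = attach_predictions(original, scored)
for row in combined:
    print(row)
```

If the row order isn't guaranteed, carrying an identifier column through the pipeline and joining on it would be a safer design.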

Clean up resources

Once you're done, delete the associated resources from the workspace:

Azure CLI
Python

Run the following code to delete the batch endpoint and its underlying deployment. --yes is used to confirm the deletion.

az ml batch-endpoint delete -n $ENDPOINT_NAME --yes

Delete the endpoint:

ml_client.batch_endpoints.begin_delete(endpoint_name)

(Optional) Delete compute, unless you plan to reuse your compute cluster with later deployments.

Azure CLI
Python

az ml compute delete -n batch-cluster

ml_client.compute.begin_delete(name="batch-cluster")