How to deploy pipelines with batch endpoints

APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)

You can deploy pipeline components under a batch endpoint, which provides a convenient way to operationalize them in Azure Machine Learning. In this article, you'll create a batch deployment that contains a simple pipeline. You'll learn how to:

  • Create and register a pipeline component
  • Create a batch endpoint and deploy a pipeline component
  • Test the deployment

About this example

In this example, we're going to deploy a pipeline component consisting of a simple command job that prints "hello world!". This component requires no inputs or outputs and is the simplest pipeline deployment scenario.

The example in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy/paste YAML and other files, first clone the repo and then change directories to the folder:

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/cli

The files for this example are in:

cd endpoints/batch/deploy-pipelines/hello-batch

Follow along in Jupyter notebooks

You can follow along with the Python SDK version of this example by opening the sdk-deploy-and-test.ipynb notebook in the cloned repository.

Prerequisites

  • An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning.

  • An Azure Machine Learning workspace. To create a workspace, see Manage Azure Machine Learning workspaces.

  • Ensure that you have the following permissions in the Machine Learning workspace:

    • Create or manage batch endpoints and deployments: Use an Owner, Contributor, or Custom role that allows Microsoft.MachineLearningServices/workspaces/batchEndpoints/*.
    • Create Azure Resource Manager deployments in the workspace resource group: Use an Owner, Contributor, or Custom role that allows Microsoft.Resources/deployments/write in the resource group where the workspace is deployed.
  • Install the following software to work with Machine Learning:

    Run the following command to install the Azure CLI and the ml extension for Azure Machine Learning:

    az extension add -n ml
    

    Pipeline component deployments for batch endpoints were introduced in version 2.7 of the ml extension for the Azure CLI. Run az extension update --name ml to get the latest version.


Connect to your workspace

The workspace is the top-level resource for Machine Learning. It provides a centralized place to work with all artifacts you create when you use Machine Learning. In this section, you connect to the workspace where you perform your deployment tasks.

In the following command, enter the values for your subscription ID, workspace, location, and resource group:

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>
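
If you're following along with the Python SDK instead of the CLI, the equivalent connection goes through an MLClient. The following is a minimal sketch; replace the placeholder subscription, resource group, and workspace values with your own:

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Authenticate and connect to the workspace where you'll deploy.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)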

Create the pipeline component

Batch endpoints can deploy either models or pipeline components. Pipeline components are reusable, and you can streamline your MLOps practice by using shared registries to move these components from one workspace to another.

The pipeline component in this example contains a single step that only prints a "hello world" message in the logs. It doesn't require any inputs or outputs.

The hello-component/hello.yml file contains the configuration for the pipeline component:

hello-component/hello.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponent.schema.json
name: hello_batch
display_name: Hello Batch component
version: 1
type: pipeline
jobs:
  main_job:
    type: command
    component:
      code: src
      environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest
      command: >-
        python hello.py
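
The component's code folder (src) contains the hello.py script that the command runs. The script isn't reproduced in this article; a minimal sketch consistent with this example, where the only work is printing a message, would be:

# src/hello.py (minimal sketch; see the repository for the actual file)
print("hello world!")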

Register the component:

az ml component create -f hello-component/hello.yml

Create a batch endpoint

  1. Provide a name for the endpoint. A batch endpoint's name needs to be unique in each region because the name is used to construct the invocation URI. To ensure uniqueness, append a random suffix to the name specified in the following code.

    ENDPOINT_NAME="hello-batch"
    
  2. Configure the endpoint:

    The endpoint.yml file contains the endpoint's configuration.

    endpoint.yml

    $schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
    name: hello-batch
    description: A hello world endpoint for component deployments.
    auth_mode: aad_token
    
  3. Create the endpoint:

    az ml batch-endpoint create --name $ENDPOINT_NAME  -f endpoint.yml
    
  4. Query the endpoint URI:

    az ml batch-endpoint show --name $ENDPOINT_NAME
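
If you're working with the Python SDK, the same endpoint can be created with the BatchEndpoint entity. The following sketch assumes the ml_client connection created earlier; batch endpoints use Microsoft Entra token authentication (aad_token) by default:

from azure.ai.ml.entities import BatchEndpoint

# Define and create the batch endpoint.
endpoint = BatchEndpoint(
    name="hello-batch",
    description="A hello world endpoint for component deployments.",
)
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

# Query the endpoint to inspect its invocation (scoring) URI.
print(ml_client.batch_endpoints.get("hello-batch").scoring_uri)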
    

Deploy the pipeline component

To deploy the pipeline component, we have to create a batch deployment. A deployment is a set of resources required for hosting the asset that does the actual work.

  1. Create a compute cluster. Batch endpoints and deployments run on compute clusters, and they can use any Azure Machine Learning compute cluster that already exists in the workspace, so multiple batch deployments can share the same compute infrastructure. In this example, we'll work on a compute cluster called batch-cluster. Verify that the compute exists in the workspace, or create it with the following command:

    az ml compute create -n batch-cluster --type amlcompute --min-instances 0 --max-instances 5
    
  2. Configure the deployment:

    The deployment.yml file contains the deployment's configuration. You can check the full batch endpoint YAML schema for extra properties.

    deployment.yml

    $schema: https://azuremlschemas.azureedge.net/latest/pipelineComponentBatchDeployment.schema.json
    name: hello-batch-dpl
    endpoint_name: hello-pipeline-batch
    type: pipeline
    component: azureml:hello_batch@latest
    settings:
        default_compute: batch-cluster
    
  3. Create the deployment:

    Run the following code to create a batch deployment under the batch endpoint and set it as the default deployment.

    az ml batch-deployment create --endpoint $ENDPOINT_NAME -f deployment.yml --set-default
    

    Tip

    Notice the use of the --set-default flag to indicate that this new deployment is now the default.

  4. Your deployment is ready for use.
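
If you're following the Python SDK path, the preceding steps map roughly to the sketch below. It assumes the ml_client connection from earlier and that the hello_batch component is already registered:

from azure.ai.ml.entities import AmlCompute, PipelineComponentBatchDeployment

# Create the compute cluster (or update it if it already exists).
# The VM size is an assumption; pick any size available in your region.
compute = AmlCompute(
    name="batch-cluster",
    size="STANDARD_DS3_v2",
    min_instances=0,
    max_instances=5,
)
ml_client.compute.begin_create_or_update(compute).result()

# Create the deployment from the registered pipeline component.
deployment = PipelineComponentBatchDeployment(
    name="hello-batch-dpl",
    endpoint_name="hello-batch",
    component="azureml:hello_batch@latest",
    settings={"default_compute": "batch-cluster"},
)
ml_client.batch_deployments.begin_create_or_update(deployment).result()

# Set the new deployment as the endpoint's default.
endpoint = ml_client.batch_endpoints.get("hello-batch")
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()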

Test the deployment

Once the deployment is created, it's ready to receive jobs. You can invoke the default deployment as follows:

JOB_NAME=$(az ml batch-endpoint invoke -n $ENDPOINT_NAME --query name -o tsv)

Tip

In this example, the pipeline doesn't have inputs or outputs. However, if the pipeline component requires some, they can be indicated at invocation time. To learn about how to indicate inputs and outputs, see Create jobs and input data for batch endpoints or see the tutorial How to deploy a pipeline to perform batch scoring with preprocessing (preview).

You can monitor the progress of the job and stream its logs using:

az ml job stream -n $JOB_NAME
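
With the Python SDK, invoking the default deployment and streaming the job logs looks roughly like this (again assuming the ml_client from earlier):

# Invoke the default deployment and stream the resulting pipeline job's logs.
job = ml_client.batch_endpoints.invoke(endpoint_name="hello-batch")
ml_client.jobs.stream(job.name)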

Clean up resources

Once you're done, delete the associated resources from the workspace:

Run the following command to delete the batch endpoint and its underlying deployment. The --yes flag confirms the deletion.

az ml batch-endpoint delete -n $ENDPOINT_NAME --yes

(Optional) Delete compute, unless you plan to reuse your compute cluster with later deployments.

az ml compute delete -n batch-cluster
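
The SDK equivalent of this cleanup is a pair of delete calls (a sketch, assuming the ml_client from earlier):

# Delete the endpoint and its deployments, then optionally the compute cluster.
ml_client.batch_endpoints.begin_delete(name="hello-batch").result()
ml_client.compute.begin_delete(name="batch-cluster").result()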

Next steps