Set up MLOps with Azure DevOps

APPLIES TO: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

Azure Machine Learning integrates with Azure DevOps pipelines so that you can automate the machine learning lifecycle. Some of the operations you can automate are:

  • Deployment of Azure Machine Learning infrastructure
  • Data preparation (extract, transform, load operations)
  • Training machine learning models with on-demand scale-out and scale-up
  • Deployment of machine learning models as public or private web services
  • Monitoring deployed machine learning models (such as for performance analysis)

In this article, you learn how to use Azure Machine Learning to set up an end-to-end MLOps pipeline that runs a linear regression to predict taxi fares in NYC. The pipeline is made up of components, each serving a different function, which can be registered with the workspace, versioned, and reused with various inputs and outputs. You'll use the recommended Azure architecture for MLOps and the Azure MLOps (v2) solution accelerator to quickly set up an MLOps project in Azure Machine Learning.

Tip

We recommend that you understand the recommended Azure architectures for MLOps before implementing any solution. You'll need to pick the best architecture for your machine learning project.

Prerequisites

Note

Git version 2.27 or newer is required. For more information on installing Git, see https://git-scm.com/downloads and select your operating system.

Important

The CLI commands in this article were tested using Bash. If you use a different shell, you may encounter errors.

Set up authentication with Azure and DevOps

Before you can set up an MLOps project with Azure Machine Learning, you need to set up authentication for Azure DevOps.

Create service principal

This demo requires one or two service principals, depending on how many environments you want to work with (Dev, Prod, or both). You can create them with the following steps:

  1. Launch the Azure Cloud Shell.

    Tip

    The first time you launch the Cloud Shell, you'll be prompted to create a storage account for it.

  2. If prompted, choose Bash as the environment used in the Cloud Shell. You can also change environments in the drop-down on the top navigation bar.

    Screenshot of the cloud shell environment dropdown.

  3. Copy the following bash commands to your computer and update the projectName, subscriptionId, and environment variables with the values for your project. If you're creating both a Dev and Prod environment, you'll need to run this script once for each environment, creating a service principal for each. This command will also grant the Contributor role to the service principal in the subscription provided. This is required for Azure DevOps to properly use resources in that subscription.

    projectName="<your project name>"
    roleName="Contributor"
    subscriptionId="<subscription Id>"
    environment="<Dev|Prod>" #First letter should be capitalized
    servicePrincipalName="Azure-ARM-${environment}-${projectName}"
    # Verify the ID of the active subscription
    echo "Using subscription ID $subscriptionId"
    echo "Creating SP for RBAC with name $servicePrincipalName, with role $roleName and in scopes /subscriptions/$subscriptionId"
    az ad sp create-for-rbac --name "$servicePrincipalName" --role "$roleName" --scopes "/subscriptions/$subscriptionId"
    echo "Please ensure that the information created here is properly saved for future use."
    
  4. Paste your edited commands into the Azure Cloud Shell (Ctrl + Shift + V) and run them.

  5. After running these commands, you'll be presented with information related to the service principal. Save this information to a safe location. It will be used later in the demo to configure Azure DevOps.

    {
       "appId": "<application id>",
       "displayName": "Azure-ARM-dev-Sample_Project_Name",
       "password": "<password>",
       "tenant": "<tenant id>"
    }
    
  6. Repeat Step 3 if you're creating service principals for Dev and Prod environments. For this demo, we'll be creating only one environment, which is Prod.

  7. Close the Cloud Shell once the service principals are created.
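The naming convention used by the script in step 3 can be checked before any Azure call is made. The following is a minimal sketch, assuming a hypothetical helper named build_sp_name (it is not part of the solution accelerator). It rejects environment values other than Dev or Prod and prints the name that the script would pass to az ad sp create-for-rbac.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the solution accelerator): builds the
# service principal name following the Azure-ARM-<environment>-<projectName>
# pattern used in the script from step 3.
build_sp_name() {
  local environment="$1" projectName="$2"
  # The script expects the environment name with a capitalized first letter.
  case "$environment" in
    Dev|Prod) ;;
    *)
      echo "environment must be Dev or Prod, got '$environment'" >&2
      return 1
      ;;
  esac
  if [ -z "$projectName" ]; then
    echo "projectName must not be empty" >&2
    return 1
  fi
  printf 'Azure-ARM-%s-%s\n' "$environment" "$projectName"
}
```

For example, build_sp_name Prod taxi prints Azure-ARM-Prod-taxi, while build_sp_name prod taxi fails because the first letter is not capitalized.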

Set up Azure DevOps

  1. Navigate to Azure DevOps.

  2. Create a new project. Name the project mlopsv2 for this tutorial.

    Screenshot of ADO Project.

  3. In the project under Project Settings (at the bottom left of the project page) select Service Connections.

  4. Select Create Service Connection.

    Screenshot of ADO New Service connection button.

  5. Select Azure Resource Manager, select Next, select Service principal (manual), select Next and select the Scope Level Subscription.

    • Subscription Name - Use the name of the subscription where your service principal is stored.
    • Subscription Id - Use the subscriptionId value you set in the service principal script.
    • Service Principal Id - Use the appId value from the service principal output.
    • Service principal key - Use the password value from the service principal output.
    • Tenant ID - Use the tenant value from the service principal output.
  6. Name the service connection Azure-ARM-Prod.

  7. Select Grant access permission to all pipelines, then select Verify and Save.

The Azure DevOps setup is successfully finished.
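The same service connection can also be created from the command line. The following is a sketch, not the accelerator's own tooling. It assumes that the azure-devops CLI extension is installed, that you are signed in, and that a default organization is configured. The function name create_service_connection is hypothetical. The CLI reads the service principal password from the AZURE_DEVOPS_EXT_AZURE_RM_SERVICE_PRINCIPAL_KEY environment variable, so the function stops early if that variable is not set.

```shell
#!/usr/bin/env bash
# Sketch: create the Azure-ARM-Prod service connection with the Azure DevOps CLI.
# Assumes the azure-devops extension is installed and a default organization
# is configured. create_service_connection is a hypothetical helper name.
create_service_connection() {
  local appId="$1" subscriptionId="$2" subscriptionName="$3" tenantId="$4"
  # The CLI takes the service principal password from this environment
  # variable; refuse to continue if it is missing.
  if [ -z "${AZURE_DEVOPS_EXT_AZURE_RM_SERVICE_PRINCIPAL_KEY:-}" ]; then
    echo "Export AZURE_DEVOPS_EXT_AZURE_RM_SERVICE_PRINCIPAL_KEY with the service principal password first." >&2
    return 1
  fi
  az devops service-endpoint azurerm create \
    --name "Azure-ARM-Prod" \
    --azure-rm-service-principal-id "$appId" \
    --azure-rm-subscription-id "$subscriptionId" \
    --azure-rm-subscription-name "$subscriptionName" \
    --azure-rm-tenant-id "$tenantId" \
    --project "mlopsv2"
}
```

This sketch only creates the connection. Granting access to all pipelines, as described in step 7, still has to be done separately.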

Set up source repository with Azure DevOps

  1. Open the project you created in Azure DevOps.

  2. Open the Repos section and select Import Repository.

    Screenshot of Azure DevOps import repo first time.

  3. Enter https://github.com/Azure/mlops-v2-ado-demo into the Clone URL field. Select Import at the bottom of the page.

    Screenshot of Azure DevOps import MLOps demo repo.

  4. Open the Project settings at the bottom of the left-hand navigation pane.

  5. Under the Repos section, select Repositories. Select the repository you imported in the previous step, then select the Security tab.

  6. Under the User permissions section, select the mlopsv2 Build Service user. Change the Contribute permission to Allow and the Create branch permission to Allow.

    Screenshot of Azure DevOps permissions.

  7. Open the Pipelines section in the left-hand navigation pane and select the three vertical dots next to the Create Pipelines button. Select Manage Security.

    Screenshot of Pipeline security.

  8. Select the mlopsv2 Build Service account for your project under the Users section. Change the Edit build pipeline permission to Allow.

    Screenshot of Add security.

Note

This completes the prerequisites. You can now deploy the solution accelerator.

Deploying infrastructure via Azure DevOps

This step deploys the Azure infrastructure for the project, including the Azure Machine Learning workspace that the training pipeline later runs in.

Tip

Make sure you understand the architectural patterns of the solution accelerator before you check out the MLOps v2 repo and deploy the infrastructure. The examples in this article use the classical ML project type.

Run Azure infrastructure pipeline

  1. Go to your repository, mlops-v2-ado-demo, and select the config-infra-prod.yml file.

    Important

    Make sure you've selected the main branch of the repo.

    Screenshot of Repo in ADO.

    This config file uses the namespace and postfix values in the names of the artifacts to ensure uniqueness. Update the following section of the config with your own values.

     namespace: [5 max random new letters]
     postfix: [4 max random new digits]
     location: eastus
    

    Note

    If you are running a Deep Learning workload such as CV or NLP, ensure your GPU compute is available in your deployment zone.

  2. Select Commit and push code to get these values into the pipeline.

  3. Go to the Pipelines section.

    Screenshot of ADO Pipelines.

  4. Select Create Pipeline.

  5. Select Azure Repos Git.

    Screenshot of ADO Where's your code.

  6. Select the repository that you imported in the previous section, mlops-v2-ado-demo.

  7. Select Existing Azure Pipelines YAML file.

    Screenshot of Azure DevOps Pipeline page on configure step.

  8. Select the main branch and choose mlops/devops-pipelines/cli-ado-deploy-infra.yml, then select Continue.

  9. Run the pipeline; it will take a few minutes to finish. The pipeline should create the following artifacts:

    • A resource group for your workspace, including a Storage Account, Container Registry, Application Insights, Key Vault, and the Azure Machine Learning workspace itself.
    • A compute cluster in the workspace.
  10. The infrastructure for your MLOps project is now deployed.

    Screenshot of ADO Infra Pipeline screen.

    Note

    You can ignore warnings that read "Unable move and reuse existing repository to required location".
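The namespace and postfix values in config-infra-prod.yml only need to be unique, so they can be generated rather than invented by hand. The following is a minimal sketch, assuming hypothetical helpers named random_letters and random_postfix. The use of lowercase letters and the avoidance of a leading zero in the postfix are assumptions made here, not requirements stated by the accelerator.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: produce candidate values for the namespace and
# postfix fields of config-infra-prod.yml.

# Print <count> random lowercase letters (lowercase is an assumption).
random_letters() {
  local count="$1" letters="abcdefghijklmnopqrstuvwxyz" out="" idx
  while [ "${#out}" -lt "$count" ]; do
    idx=$((RANDOM % ${#letters}))
    out="${out}${letters:idx:1}"
  done
  printf '%s\n' "$out"
}

# Print a random four-digit number between 1000 and 9999, which avoids a
# leading zero.
random_postfix() {
  printf '%s\n' "$((RANDOM % 9000 + 1000))"
}

# Print the two lines in the same shape as the config section.
printf 'namespace: %s\n' "$(random_letters 5)"
printf 'postfix: %s\n' "$(random_postfix)"
```

Copy the printed values into the config file, then commit the change as described in step 2.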

Sample Training and Deployment Scenario

The solution accelerator includes code and data for a sample end-to-end machine learning pipeline that runs a linear regression to predict taxi fares in NYC. The pipeline is made up of components, each serving a different function, which can be registered with the workspace, versioned, and reused with various inputs and outputs. Sample pipelines and workflows for the Computer Vision and NLP scenarios have different pipeline steps and deployment steps.

This training pipeline contains the following steps:

Prepare Data

  • This component takes multiple taxi datasets (yellow and green), merges and filters the data, and prepares the train/val and evaluation datasets.
  • Input: Local data under ./data/ (multiple .csv files)
  • Output: Single prepared dataset (.csv) and train/val/test datasets.

Train Model

  • This component trains a Linear Regressor with the training set.
  • Input: Training dataset
  • Output: Trained model (pickle format)

Evaluate Model

  • This component uses the trained model to predict taxi fares on the test set.
  • Input: ML model and Test dataset
  • Output: Performance of the model and a deploy flag that indicates whether to deploy it.
  • This component compares the performance of the model with all previously deployed models on the new test dataset and decides whether to promote the model into production. A model is promoted into production by registering it in the Azure Machine Learning workspace.
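The promotion decision described above can be illustrated with a small sketch. This is not the accelerator's evaluation code. It assumes a hypothetical deploy_flag function, a single higher-is-better score per model (for example, R²), and a rule that the candidate must strictly beat every previously deployed model on the same test set.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the promotion rule: print 1 when the candidate's
# score is strictly higher than the score of every previously deployed
# model, otherwise print 0. Assumes a higher-is-better metric.
deploy_flag() {
  local candidate="$1"
  shift
  local previous
  for previous in "$@"; do
    # awk performs the floating-point comparison, because Bash arithmetic
    # only handles integers.
    if ! awk -v c="$candidate" -v p="$previous" 'BEGIN { exit !(c > p) }'; then
      echo 0
      return 0
    fi
  done
  # No previous model scored as high or higher (or none exist yet).
  echo 1
}
```

With this rule, deploy_flag 0.81 0.75 0.79 prints 1 and deploy_flag 0.81 0.75 0.85 prints 0. A candidate with no previously deployed models to compare against is promoted.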

Register Model

  • This component registers the trained model in the workspace when the deploy flag indicates that it should be promoted.
  • Input: Trained model and the deploy flag.
  • Output: Registered model in Azure Machine Learning.

Deploying model training pipeline

  1. Go to ADO pipelines.

    Screenshot of ADO Pipelines.

  2. Select New Pipeline.

    Screenshot of ADO New Pipeline button.

  3. Select Azure Repos Git.

    Screenshot of ADO Where's your code.

  4. Select the repository that you imported in the previous section, mlopsv2.

  5. Select Existing Azure Pipelines YAML file.

    Screenshot of ADO Pipeline page on configure step.

  6. Select main as a branch and choose /mlops/devops-pipelines/deploy-model-training-pipeline.yml, then select Continue.

  7. Save and run the pipeline.
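The portal steps above can also be approximated with the Azure DevOps CLI. The following is a sketch under several assumptions: the azure-devops extension is installed, a default organization is configured, and the repository is an Azure Repos Git repository in the mlopsv2 project. The helper names pipeline_name_from_yaml and create_pipeline are hypothetical, and naming the pipeline after its YAML file is a choice made here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: create a pipeline from an existing YAML file with the Azure DevOps CLI.
# Assumes the azure-devops extension is installed and a default organization
# is configured. Both function names are hypothetical.

# Derive a pipeline name from the YAML file name, without the .yml extension.
pipeline_name_from_yaml() {
  basename "$1" .yml
}

create_pipeline() {
  local repository="$1" ymlPath="$2"
  # tfsgit is the repository type for Azure Repos Git repositories.
  az pipelines create \
    --name "$(pipeline_name_from_yaml "$ymlPath")" \
    --project "mlopsv2" \
    --repository "$repository" \
    --repository-type tfsgit \
    --branch main \
    --yml-path "$ymlPath"
}
```

For example, pipeline_name_from_yaml /mlops/devops-pipelines/deploy-model-training-pipeline.yml prints deploy-model-training-pipeline, which create_pipeline then uses as the pipeline name.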

Note

At this point, the infrastructure is configured and the Prototyping Loop of the MLOps Architecture is deployed. You're ready to move the trained model to production.

Deploying the Trained model

This scenario includes prebuilt workflows for two approaches to deploying a trained model: batch scoring, or deploying the model to an endpoint for real-time scoring. You can run either or both of these workflows to test the performance of the model in your Azure Machine Learning workspace. This example uses real-time scoring.

Deploy ML model endpoint

  1. Go to ADO pipelines.

    Screenshot of ADO Pipelines.

  2. Select New Pipeline.

    Screenshot of ADO New Pipeline button for endpoint.

  3. Select Azure Repos Git.

    Screenshot of ADO Where's your code.

  4. Select the repository that you imported in the previous section, mlopsv2.

  5. Select Existing Azure Pipelines YAML file.

    Screenshot of Azure DevOps Pipeline page on configure step.

  6. Select main as the branch and choose the Managed Online Endpoint pipeline, /mlops/devops-pipelines/deploy-online-endpoint-pipeline.yml, then select Continue.

  7. Online endpoint names need to be unique, so change taxi-online-$(namespace)$(postfix)$(environment) to another unique name and then select Run. You don't need to change the default unless the run fails.

    Screenshot of Azure DevOps batch deploy script.

    Important

    If the run fails because the online endpoint name already exists, recreate the pipeline as described previously and change [your endpoint-name] to [your endpoint-name (random number)].

  8. When the run completes, you'll see output similar to the following image:

    Screenshot of ADO Pipeline batch run result page.

  9. To test this deployment, go to the Endpoints tab in your Azure Machine Learning workspace, select the endpoint, and select the Test tab. You can use the sample input data located in the cloned repo at /data/taxi-request.json to test the endpoint.
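The endpoint can also be tested from the command line with the Azure CLI ml extension (v2). The following is a sketch in which the function name invoke_endpoint is hypothetical and the endpoint, resource group, and workspace names are values you supply. It assumes you run it from the root of the cloned repo, so that the sample request is found at data/taxi-request.json.

```shell
#!/usr/bin/env bash
# Sketch: send the sample request to the online endpoint with the Azure CLI
# ml extension (v2). invoke_endpoint is a hypothetical helper; the endpoint,
# resource group, and workspace names are values you supply.
invoke_endpoint() {
  local endpointName="$1" resourceGroup="$2" workspaceName="$3"
  local requestFile="${4:-data/taxi-request.json}"
  # Fail early with a clear message if the sample file is not where expected.
  if [ ! -f "$requestFile" ]; then
    echo "Request file not found: $requestFile" >&2
    return 1
  fi
  az ml online-endpoint invoke \
    --name "$endpointName" \
    --resource-group "$resourceGroup" \
    --workspace-name "$workspaceName" \
    --request-file "$requestFile"
}
```

The optional fourth argument overrides the path to the request file if you run the function from a different directory.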

Clean up resources

  1. If you're not going to continue to use your pipeline, delete your Azure DevOps project.
  2. In Azure portal, delete your resource group and Azure Machine Learning instance.
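The resource group can also be deleted from the command line. The following is a sketch in which delete_resource_group is a hypothetical helper and the resource group name is a value you look up yourself, for example in the Azure portal. Deleting a resource group removes every resource in it, so the function refuses to run without an explicit name.

```shell
#!/usr/bin/env bash
# Sketch: delete the resource group created by the infrastructure pipeline.
# delete_resource_group is a hypothetical helper; pass the resource group
# name explicitly.
delete_resource_group() {
  local resourceGroup="$1"
  if [ -z "$resourceGroup" ]; then
    echo "Pass the name of the resource group to delete." >&2
    return 1
  fi
  # --yes skips the confirmation prompt; --no-wait returns without waiting
  # for the deletion to finish.
  az group delete --name "$resourceGroup" --yes --no-wait
}
```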

Next steps