Manage models registry in Azure Machine Learning with MLflow

Azure Machine Learning supports MLflow for model management when connected to a workspace. This approach is a convenient way to support the entire model lifecycle for users familiar with the MLflow client.

This article describes capabilities for managing a model registry with MLflow and how this method compares with other management options.

Prerequisites

  • Install the MLflow SDK mlflow package and the Azure Machine Learning azureml-mlflow plugin for MLflow as follows:

    pip install mlflow azureml-mlflow
    

    Tip

    You can use the mlflow-skinny package, which is a lightweight MLflow package without SQL storage, server, UI, or data science dependencies. This package is recommended for users who primarily need the MLflow tracking and logging capabilities without importing the full suite of features, including deployments.

  • Create an Azure Machine Learning workspace. To create a workspace, see Create resources you need to get started. Review the access permissions you need to perform your MLflow operations in your workspace.

  • To do remote tracking, or track experiments running outside Azure Machine Learning, configure MLflow to point to the tracking URI of your Azure Machine Learning workspace. For more information on how to connect MLflow to your workspace, see Configure MLflow for Azure Machine Learning.

  • The procedures in this article use a client object to refer to the MLflow client.

    Some operations can be executed directly by using the MLflow fluent API, mlflow.<method>. Other operations require an MLflow client to enable communication with Machine Learning in the MLflow protocol. The following code creates an MlflowClient object:

    import mlflow
    
    client = mlflow.tracking.MlflowClient()
    

Limitations

  • Azure Machine Learning doesn't support renaming models.

  • Machine Learning doesn't support deleting the entire model container.

  • Organizational registries aren't supported for model management with MLflow.

  • Model deployment from a specific model stage isn't currently supported in Machine Learning.

  • Cross-workspace operations aren't currently supported in Machine Learning.

Register new models

The model registry offers a convenient, centralized way to manage models in a workspace. Each workspace has its own independent model registry. The following sections demonstrate two ways to register models in the registry by using the MLflow SDK.

Create models from existing run

If you have an MLflow model logged inside a run, and you want to register it in a registry, use the run ID and path where the model is logged. You can query for this information by following the instructions in Manage experiments and runs with MLflow.

mlflow.register_model(f"runs:/{run_id}/{artifact_path}", model_name)

Note

Models can only be registered to the registry in the same workspace where the run was tracked. Cross-workspace operations aren't currently supported in Azure Machine Learning.

Tip

Register models from runs or by using the mlflow.<flavor>.log_model method from inside the run. This approach preserves lineage from the job that generated the asset.
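As a minimal sketch of the URI format used above, the following helper builds the runs:/ URI from a run ID and an artifact path. The name run_model_uri is hypothetical and isn't part of MLflow. The sketch assumes the artifact path is relative to the run's artifact root:

```python
def run_model_uri(run_id: str, artifact_path: str) -> str:
    """Build a runs:/ URI for a model logged at artifact_path inside run_id."""
    # Strip surrounding slashes so the URI doesn't contain empty path segments.
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


# Example usage (requires a tracked run; run_id and the names are placeholders):
# import mlflow
# mlflow.register_model(run_model_uri(run_id, "model"), "my-model")
```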

Create models from assets

If you have a folder that contains an MLflow model (a folder with an MLmodel file), you can register it directly. The model doesn't need to be in the context of a run. For this approach, use the URI scheme file://path/to/model to register MLflow models stored in the local file system.

The following code creates a simple model by using the scikit-learn package and saves the model in MLflow format in local storage:

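import mlflow
from sklearn import linear_model

# Fit a toy linear regression model.
reg = linear_model.LinearRegression()
reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])

# Save the model in MLflow format to a local folder.
mlflow.sklearn.save_model(reg, "./regressor")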

Tip

The save_model() method works in the same way as the log_model() method. While the log_model() method saves the model inside an active run, the save_model() method uses the local file system to save the model.

The following code registers the model by using the local path:

import os

model_local_path = os.path.abspath("./regressor")
mlflow.register_model(f"file://{model_local_path}", "local-model-test")
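The file:// URI can also be built with the standard library, which handles percent-encoding of characters such as spaces in the path. The following sketch uses a hypothetical helper named local_model_uri. Whether the Azure Machine Learning plugin accepts the resulting URI on every platform is an assumption to verify in your environment:

```python
import os
from pathlib import Path


def local_model_uri(model_dir: str) -> str:
    """Return a file:// URI for a local MLflow model folder."""
    # as_uri() requires an absolute path, so resolve it first.
    return Path(os.path.abspath(model_dir)).as_uri()


# Example usage (assumes ./regressor was created with save_model):
# import mlflow
# mlflow.register_model(local_model_uri("./regressor"), "local-model-test")
```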

Query model registries

You can use the MLflow SDK to query and search for models registered in the registry. The following sections demonstrate two ways to query for a model.

Query all models in registry

You can query all registered models in the registry by using the MLflow client.

The following code prints the names of all models in the registry:

for model in client.search_registered_models():
    print(f"{model.name}")

Use the order_by parameter to organize the output by a specific property, such as name, version, creation_timestamp, or last_updated_timestamp:

client.search_registered_models(order_by=["name ASC"])

Note

For MLflow versions earlier than 2.0, use the MlflowClient.list_registered_models() method instead.
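When a registry holds many models, a single call might not return all of them. The following sketch iterates over every page of results. The helper name iter_registered_models is hypothetical, and the sketch assumes that the returned page exposes a token attribute for the next page, as MLflow paged results do. Whether your workspace returns more than one page is also an assumption:

```python
def iter_registered_models(client, order_by=None, page_size=100):
    """Yield registered models one by one, following page tokens."""
    token = None
    while True:
        page = client.search_registered_models(
            max_results=page_size, order_by=order_by, page_token=token
        )
        yield from page
        # A missing or empty token means there are no more pages.
        token = getattr(page, "token", None)
        if not token:
            break


# Example usage with the client object created in the prerequisites:
# for model in iter_registered_models(client, order_by=["name ASC"]):
#     print(model.name)
```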

Get specific model versions

The search_registered_models() method retrieves the model object, which contains all the model versions. To get the last registered model version for a given model, you can use the get_registered_model() method:

client.get_registered_model(model_name)

To get a specific version of a model, use the following code:

client.get_model_version(model_name, version=2)
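To inspect what get_registered_model() returns, you can summarize the versions attached to the model object. This sketch uses a hypothetical helper named summarize_latest_versions and assumes the returned object has a latest_versions list whose items expose version and current_stage, as in the MLflow entity classes:

```python
def summarize_latest_versions(client, model_name):
    """Return (version, stage) pairs for the latest versions of a model."""
    model = client.get_registered_model(model_name)
    return [(mv.version, mv.current_stage) for mv in model.latest_versions]


# Example usage with the client object created in the prerequisites:
# for version, stage in summarize_latest_versions(client, model_name):
#     print(f"version {version} is in stage {stage}")
```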

Load models from registry

You can load models directly from the registry to restore logged model objects. For this task, use the functions mlflow.<flavor>.load_model() or mlflow.pyfunc.load_model() and indicate the URI of the model to load.

These functions accept model URIs in the following formats:

  • models:/<model-name>/latest: Load the last version of the model.
  • models:/<model-name>/<version-number>: Load a specific version of the model.
  • models:/<model-name>/<stage-name>: Load a specific version in a given stage for a model. For more information, see Work with model stages.

To understand the differences between the functions mlflow.<flavor>.load_model() and mlflow.pyfunc.load_model(), see Workflows for loading MLflow models.
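The URI formats listed above can be wrapped in a small helper. The following is a sketch under stated assumptions: the name load_registered_model and its loader parameter are hypothetical, and the default loader is mlflow.pyfunc.load_model, imported only when needed:

```python
def load_registered_model(model_name, selector="latest", loader=None):
    """Load a model from the registry by version number, stage name, or 'latest'."""
    if loader is None:
        # Imported lazily so the helper can be exercised without MLflow installed.
        import mlflow.pyfunc

        loader = mlflow.pyfunc.load_model
    return loader(f"models:/{model_name}/{selector}")


# Example usage (requires a workspace connection and the model's dependencies):
# model = load_registered_model(model_name, 2)          # specific version
# model = load_registered_model(model_name, "Staging")  # model stage
```

Passing a flavor-specific function such as mlflow.sklearn.load_model as the loader returns the native model object instead of the generic pyfunc wrapper.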

Work with model stages

MLflow supports stages for a model to manage the model lifecycle. The model version can transition from one stage to another. Stages are assigned to specific versions of a model. A model can have multiple versions in different stages.

Important

Stages can be accessed only by using the MLflow SDK. They aren't visible in the Azure Machine Learning studio. Stages can't be retrieved by using the Azure Machine Learning SDK, the Azure Machine Learning CLI, or the Azure Machine Learning REST API. Deployment from a specific model stage isn't currently supported.

Query model stages

The following code uses the MLflow client to check all possible stages for a model:

client.get_model_version_stages(model_name, version="latest")

You can see the model versions for each model stage by retrieving the model from the registry. The following code gets the model version that's currently in the Staging stage:

client.get_latest_versions(model_name, stages=["Staging"])

Multiple model versions can be in the same stage at the same time in MLflow. In the previous example, the method returns the latest (most recent) version among all versions for the stage.

Important

In the MLflow SDK, stage names are case sensitive.
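Because stage names are case sensitive, it can help to normalize user input before you pass it to the client. The following sketch maps input to the standard MLflow stage names None, Staging, Production, and Archived. The helper normalize_stage is hypothetical and isn't part of MLflow:

```python
CANONICAL_STAGES = ("None", "Staging", "Production", "Archived")


def normalize_stage(stage: str) -> str:
    """Return the canonical spelling of an MLflow stage name."""
    wanted = stage.strip().lower()
    for canonical in CANONICAL_STAGES:
        if wanted == canonical.lower():
            return canonical
    raise ValueError(
        f"Unknown stage {stage!r}; expected one of {', '.join(CANONICAL_STAGES)}"
    )


# Example usage with the client object created in the prerequisites:
# client.get_latest_versions(model_name, stages=[normalize_stage("staging")])
```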

Transition model version

Transitioning a model version to a particular stage can be done by using the MLflow client:

client.transition_model_version_stage(model_name, version=3, stage="Staging")

When you transition a model version to a particular stage, if the stage already has other model versions, the existing versions remain unchanged. This behavior applies by default.

Another approach is to set the archive_existing_versions=True parameter during the transition. This approach instructs MLflow to move any existing model versions to the stage Archived:

client.transition_model_version_stage(
    model_name, version=3, stage="Staging", archive_existing_versions=True
)
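To confirm the result of a transition, you can query the stage right after the call. The following sketch combines the two client methods shown in this article. The helper name promote_version is hypothetical, and archiving existing versions by default is a design choice of this sketch, not the MLflow default:

```python
def promote_version(client, model_name, version, stage="Staging", archive_existing=True):
    """Move a model version to a stage and return the versions now reported for it."""
    client.transition_model_version_stage(
        model_name,
        version=version,
        stage=stage,
        archive_existing_versions=archive_existing,
    )
    # Query the stage to verify which version it reports after the transition.
    return [mv.version for mv in client.get_latest_versions(model_name, stages=[stage])]


# Example usage with the client object created in the prerequisites:
# print(promote_version(client, model_name, version=3))
```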

Load models from stages

You can load a model in a particular stage directly from Python by using the load_model function and the following URI format. For this method to succeed, all libraries and dependencies must be installed in your working environment.

Load the model from the Staging stage:

model = mlflow.pyfunc.load_model(f"models:/{model_name}/Staging")

Edit and delete models

Editing registered models is supported in both MLflow and Azure Machine Learning, but there are some important differences. The following sections describe some options.

Note

Renaming models isn't supported in Azure Machine Learning because model objects are immutable.

Edit model description and tags

You can edit a model's description and tags by using the MLflow SDK:

client.update_model_version(model_name, version=1, description="My classifier description")

To add or update a tag, use the set_model_version_tag method:

client.set_model_version_tag(model_name, version="1", key="type", value="classification")

To remove a tag, use the delete_model_version_tag method:

client.delete_model_version_tag(model_name, version="1", key="type")
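To apply several tags at once, you can loop over a dictionary. This is a minimal sketch: the helper set_version_tags is hypothetical, and converting the version and the values to strings is a choice made here to match the string arguments used in the examples above:

```python
def set_version_tags(client, model_name, version, tags):
    """Set every key-value pair in tags on one model version."""
    for key, value in tags.items():
        client.set_model_version_tag(
            model_name, version=str(version), key=key, value=str(value)
        )


# Example usage with the client object created in the prerequisites:
# set_version_tags(client, model_name, 1, {"type": "classification", "owner": "team-a"})
```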

Delete model version

You can delete any model version in the registry by using the MLflow client:

client.delete_model_version(model_name, version="2")

Note

Machine Learning doesn't support deleting the entire model container. To achieve this task, delete all model versions for a given model.
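The workaround described in the note can be sketched as a loop over all versions of a model. The helper delete_all_versions is hypothetical. The sketch assumes that search_model_versions accepts a name filter when connected to Azure Machine Learning and that all versions fit in a single page of results:

```python
def delete_all_versions(client, model_name):
    """Delete every version of a model and return the deleted version numbers."""
    deleted = []
    for mv in client.search_model_versions(f"name='{model_name}'"):
        client.delete_model_version(model_name, version=mv.version)
        deleted.append(mv.version)
    return deleted


# Example usage with the client object created in the prerequisites.
# Deletion can't be undone, so review the model name before you run this code:
# print(delete_all_versions(client, "local-model-test"))
```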

Review supported capabilities for managing models

The MLflow client exposes several methods to retrieve and manage models. The following table lists the methods currently supported in MLflow when connected to Azure Machine Learning. The table also compares MLflow with other model management capabilities in Azure Machine Learning.


Feature description | MLflow only | Machine Learning with MLflow | Machine Learning CLI v2 | Machine Learning studio
Register models in MLflow format
Register models not in MLflow format
Register models from runs outputs/artifacts 1 2
Register models from runs outputs/artifacts in a different tracking server/workspace 5 5
Search/list registered models
Retrieving details of registered model's versions
Edit registered model's versions description
Edit registered model's versions tags
Rename registered models 3 3 3
Delete a registered model (container) 3 3 3
Delete a registered model's version
Manage MLflow model stages
Search registered models by name 4
Search registered models by using string comparators LIKE and ILIKE 4
Search registered models by tag 4
Organizational registries support

Table footnotes:

  • 1 Use Uniform Resource Identifiers (URIs) with the format runs:/<run-id>/<path>.
  • 2 Use URIs with the format azureml://jobs/<job-id>/outputs/artifacts/<path>.
  • 3 Registered models are immutable objects in Azure Machine Learning.
  • 4 Use the search box in Azure Machine Learning studio. Partial matching is supported.
  • 5 Use registries to move models across different workspaces and preserve lineage.