Configure MLflow for Azure Machine Learning
Azure Machine Learning workspaces are MLflow-compatible, which means they can act as an MLflow server without any extra configuration. Each workspace has an MLflow tracking URI that MLflow can use to connect to the workspace.
However, if you're working outside of Azure Machine Learning (for example, on your local machine, Azure Synapse Analytics, or Azure Databricks), you need to configure MLflow to point to the workspace. In this article, you'll learn how to configure MLflow to connect to an Azure Machine Learning workspace for tracking, registries, and deployment.
Important
When running on Azure Compute (Azure Machine Learning Notebooks, Jupyter notebooks hosted on Azure Machine Learning Compute Instances, or jobs running on Azure Machine Learning compute clusters) you don't have to configure the tracking URI. It's automatically configured for you.
Prerequisites
You need the following prerequisites to follow this tutorial:
Install the MLflow SDK package mlflow and the Azure Machine Learning plug-in for MLflow, azureml-mlflow:
pip install mlflow azureml-mlflow
Tip
You can use the package mlflow-skinny, which is a lightweight MLflow package without SQL storage, server, UI, or data science dependencies. It is recommended for users who primarily need the tracking and logging capabilities without importing the full suite of MLflow features, including deployments.
You need an Azure Machine Learning workspace. You can create one following this tutorial.
If you're doing remote tracking (tracking experiments running outside Azure Machine Learning), configure MLflow to point to your Azure Machine Learning workspace's tracking URI as explained at Configure MLflow for Azure Machine Learning.
Configure MLflow tracking URI
To connect MLflow to an Azure Machine Learning workspace, you need the tracking URI for the workspace. Each workspace has its own tracking URI, which uses the protocol azureml://.
Get the tracking URI for your workspace:
APPLIES TO:
Azure CLI ml extension v2 (current)
Sign in and configure your workspace:
az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>
You can get the tracking URI using the az ml workspace command:
az ml workspace show --query mlflow_tracking_uri
Configure the tracking URI:
The method set_tracking_uri() points MLflow to that URI.
import mlflow
# mlflow_tracking_uri holds the azureml:// URI obtained in the previous step
mlflow.set_tracking_uri(mlflow_tracking_uri)
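The two steps above can also be chained from a Python session. The following is a minimal sketch, assuming the Azure CLI with the ml extension is installed, you are signed in, and the workspace defaults are already configured. The helper names build_show_command and fetch_tracking_uri are hypothetical, and --output tsv is added so that the CLI prints the bare URI rather than a JSON-quoted string.

```python
import subprocess


def build_show_command():
    """Return the Azure CLI command that prints the workspace tracking URI."""
    return [
        "az", "ml", "workspace", "show",
        "--query", "mlflow_tracking_uri",
        "--output", "tsv",  # plain text instead of a JSON-quoted string
    ]


def fetch_tracking_uri():
    """Run the CLI command and return the tracking URI as a string.

    Assumes `az` is on PATH and the workspace defaults are already set.
    """
    result = subprocess.run(
        build_show_command(), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Usage (requires the Azure CLI and the mlflow package):
#   import mlflow
#   mlflow.set_tracking_uri(fetch_tracking_uri())
```

On some platforms the az executable may need to be invoked through a shell, so treat the subprocess call as a starting point rather than a portable solution.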
Tip
When working in shared environments, like an Azure Databricks cluster, Azure Synapse Analytics cluster, or similar, it is useful to set the environment variable MLFLOW_TRACKING_URI at the cluster level. Doing so automatically points the MLflow tracking URI to Azure Machine Learning for all the sessions running in the cluster, rather than requiring configuration on a per-session basis.
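When the URI comes from the environment, a session can check it before logging anything. The following is a small sketch using only the standard library. The function name resolve_tracking_uri is hypothetical, and the check only verifies that the variable is set and uses the azureml:// protocol; it does not contact the workspace.

```python
import os
from urllib.parse import urlparse


def resolve_tracking_uri(env=None):
    """Return MLFLOW_TRACKING_URI after checking that it is an azureml:// URI."""
    env = os.environ if env is None else env
    uri = env.get("MLFLOW_TRACKING_URI", "")
    if not uri:
        raise RuntimeError("MLFLOW_TRACKING_URI is not set in this session")
    if urlparse(uri).scheme != "azureml":
        raise ValueError(f"Expected an azureml:// URI, got: {uri}")
    return uri
```

Passing a dictionary instead of relying on os.environ makes the check easy to exercise in a test without changing the real environment.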
Configure authentication
Once the tracking URI is set, you also need to configure how to authenticate to the associated workspace. By default, the Azure Machine Learning plugin for MLflow performs interactive authentication by opening the default browser to prompt for credentials.
The Azure Machine Learning plugin for MLflow supports several authentication mechanisms through the package azure-identity, which is installed as a dependency for the plugin azureml-mlflow. The following authentication methods are tried one by one until one of them succeeds:
- Environment: it reads account information specified via environment variables and uses it to authenticate.
- Managed Identity: if the application is deployed to an Azure host with Managed Identity enabled, it authenticates with it.
- Azure CLI: if a user has signed in via the Azure CLI az login command, it authenticates as that user.
- Azure PowerShell: if a user has signed in via Azure PowerShell's Connect-AzAccount command, it authenticates as that user.
- Interactive browser: it interactively authenticates a user via the default browser.
For interactive jobs where there's a user connected to the session, you can rely on Interactive Authentication and hence no further action is required.
Warning
Interactive browser authentication blocks code execution when prompting for credentials. It is not a suitable option for authentication in unattended environments like training jobs. We recommend configuring a different authentication mode for those environments.
For those scenarios where unattended execution is required, you'll have to configure a service principal to communicate with Azure Machine Learning.
import os
os.environ["AZURE_TENANT_ID"] = "<AZURE_TENANT_ID>"
os.environ["AZURE_CLIENT_ID"] = "<AZURE_CLIENT_ID>"
os.environ["AZURE_CLIENT_SECRET"] = "<AZURE_CLIENT_SECRET>"
Tip
When working in shared environments, it is advisable to configure these environment variables at the compute level. As a best practice, manage them as secrets in an instance of Azure Key Vault whenever possible. For instance, in Azure Databricks you can reference secrets in environment variables in the cluster configuration as follows: AZURE_CLIENT_SECRET={{secrets/<scope-name>/<secret-name>}}. See Reference a secret in an environment variable for how to do it in Azure Databricks, or refer to similar documentation for your platform.
If you'd rather use a certificate instead of a secret, you can configure the environment variables AZURE_CLIENT_CERTIFICATE_PATH to the path to a PEM or PKCS12 certificate file (including private key) and AZURE_CLIENT_CERTIFICATE_PASSWORD with the password of the certificate file, if any.
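For unattended jobs, it can help to fail early when the service principal settings are incomplete, rather than falling through to an interactive prompt. The following sketch checks the variables named above; the function name missing_service_principal_settings is hypothetical, and the check only tests for presence, not validity, of the values.

```python
import os


def missing_service_principal_settings(env=None):
    """Return the service principal settings that are absent from the environment."""
    env = os.environ if env is None else env
    missing = [
        name for name in ("AZURE_TENANT_ID", "AZURE_CLIENT_ID") if not env.get(name)
    ]
    # Either a client secret or a certificate path is needed.
    if not (env.get("AZURE_CLIENT_SECRET") or env.get("AZURE_CLIENT_CERTIFICATE_PATH")):
        missing.append("AZURE_CLIENT_SECRET or AZURE_CLIENT_CERTIFICATE_PATH")
    return missing
```

A training script could call this function at startup and stop with a clear message if the returned list is not empty.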
Configure authorization and permission levels
Some default roles, like AzureML Data Scientist or Contributor, are already configured to perform MLflow operations in an Azure Machine Learning workspace. If you use a custom role, you need the following permissions:
To use MLflow tracking:
- Microsoft.MachineLearningServices/workspaces/experiments/*
- Microsoft.MachineLearningServices/workspaces/jobs/*
To use MLflow model registry:
- Microsoft.MachineLearningServices/workspaces/models/*/*
Grant your workspace access to the service principal you created, or to your user account, as explained at Grant access.
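When reviewing a custom role, a rough wildcard comparison can show which of the permissions above are not covered by the role's allowed actions. The sketch below is an approximation under stated assumptions: it treats * as matching any text, compares case-insensitively, and ignores exclusions such as NotActions, so it is not a substitute for Azure's own role evaluation. The names REQUIRED_PERMISSIONS and uncovered_permissions are hypothetical.

```python
from fnmatch import fnmatchcase

# Permissions listed in this article for tracking and the model registry.
REQUIRED_PERMISSIONS = [
    "Microsoft.MachineLearningServices/workspaces/experiments/*",
    "Microsoft.MachineLearningServices/workspaces/jobs/*",
    "Microsoft.MachineLearningServices/workspaces/models/*/*",
]


def uncovered_permissions(granted_actions, required=REQUIRED_PERMISSIONS):
    """Return the required permissions that no granted action pattern covers."""
    granted = [action.lower() for action in granted_actions]
    return [
        permission
        for permission in required
        if not any(fnmatchcase(permission.lower(), pattern) for pattern in granted)
    ]


# A role that only allows experiment operations still lacks two permissions.
print(uncovered_permissions(
    ["Microsoft.MachineLearningServices/workspaces/experiments/*"]
))
```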
Troubleshooting authentication
MLflow will try to authenticate to Azure Machine Learning on the first operation interacting with the service, like mlflow.set_experiment() or mlflow.start_run(). If you find issues or unexpected authentication prompts during the process, you can increase the logging level to get more details about the error:
import logging
logging.getLogger("azure").setLevel(logging.DEBUG)
Set experiment name (optional)
All MLflow runs are logged to the active experiment. By default, runs are logged to an experiment named Default that is automatically created for you. You can configure the experiment where tracking is happening.
Tip
When submitting jobs using Azure Machine Learning CLI v2, you can set the experiment name using the property experiment_name in the YAML definition of the job. You don't have to configure it on your training script. See YAML: display name, experiment name, description, and tags for details.
To configure the experiment you want to work on, use the MLflow command mlflow.set_experiment().
experiment_name = 'experiment_with_mlflow'
mlflow.set_experiment(experiment_name)
Non-public Azure Clouds support
The Azure Machine Learning plugin for MLflow is configured by default to work with the global Azure cloud. However, you can configure the Azure cloud you are using by setting the environment variable AZUREML_CURRENT_CLOUD.
import os
os.environ["AZUREML_CURRENT_CLOUD"] = "AzureChinaCloud"
You can identify the cloud you are using with the following Azure CLI command:
az cloud list
The current cloud has the value IsActive set to True.
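The lookup of the active cloud can also be scripted. The following sketch parses JSON output captured from az cloud list --output json and returns the name of the active entry. It assumes that the JSON form uses the keys name and isActive, and that the cloud name is the value expected by AZUREML_CURRENT_CLOUD; both are assumptions to verify against your CLI version. The sample data is illustrative and the function name active_cloud_name is hypothetical.

```python
import json


def active_cloud_name(az_cloud_list_json):
    """Return the name of the cloud marked active in `az cloud list` JSON output."""
    for cloud in json.loads(az_cloud_list_json):
        if cloud.get("isActive"):  # assumed key name in the JSON output
            return cloud["name"]
    raise LookupError("No active cloud found in the provided output")


# Illustrative sample, trimmed to the two fields this sketch relies on.
sample_output = """
[
  {"name": "AzureCloud", "isActive": false},
  {"name": "AzureChinaCloud", "isActive": true}
]
"""
print(active_cloud_name(sample_output))  # AzureChinaCloud

# Usage:
#   import os
#   os.environ["AZUREML_CURRENT_CLOUD"] = active_cloud_name(sample_output)
```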
Next steps
Now that your environment is connected to your workspace in Azure Machine Learning, you can start to work with it.