Deploy MLflow models

APPLIES TO: Azure CLI ml extension v2 (current)

In this article, learn how to deploy your MLflow model to Azure ML for both real-time and batch inference. Azure ML supports no-code deployment of models created and logged with MLflow. This means that you don't have to provide a scoring script or an environment.

For no-code deployment, Azure Machine Learning:

  • Dynamically installs the Python packages listed in the conda.yaml file. This means the dependencies are installed at container runtime.
  • Provides an MLflow base image/curated environment that contains the following items:


If you are used to deploying models with scoring scripts and custom environments and want to achieve the same functionality with MLflow models, we recommend reading Using MLflow models for no-code deployment.


For information about input formats and limitations in online endpoints, see Considerations when deploying to real-time inference. For more information about the supported file types in batch endpoints, see Considerations when deploying to batch inference.

Deployment tools

There are three workflows for deploying MLflow models to Azure Machine Learning:

Each workflow has different capabilities, particularly around which type of compute they can target. The following table shows them:

Scenario MLflow SDK Azure ML CLI/SDK v2 Azure ML studio
Deploy MLflow models to managed online endpoints
Deploy MLflow models to managed batch endpoints
Deploy MLflow models to ACI/AKS
Deploy MLflow models to ACI/AKS (with a scoring script) 1


  • 1 No-code deployment is not supported when deploying to ACI/AKS from Azure ML studio. We recommend switching to our managed online endpoints instead.

Which option to use?

If you are familiar with MLflow, or your platform supports MLflow natively (like Azure Databricks), and you want to keep using the same set of methods, use the azureml-mlflow plugin. If you are more familiar with the Azure ML CLI v2, want to automate deployments with automation pipelines, or want to keep deployment configuration in a git repository, we recommend the Azure ML CLI v2. If you want to quickly deploy and test models trained with MLflow, use the Azure Machine Learning studio UI deployment.

Deploy using the MLflow plugin

The MLflow plugin azureml-mlflow can deploy models to Azure ML for real-time serving, targeting Azure Kubernetes Service (AKS), Azure Container Instances (ACI), or managed endpoints.


Deploying to managed batch endpoints is not supported in the MLflow plugin at the moment.


  • Install the azureml-mlflow package.
  • If you are running outside an Azure ML compute, configure the MLflow tracking URI or MLflow's registry URI to point to the workspace you are working on. For more information about how to set up the tracking environment, see Track runs using MLflow with Azure Machine Learning.
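As a minimal sketch of the second prerequisite, the snippet below composes a workspace tracking URI and exports it through the MLFLOW_TRACKING_URI environment variable, which MLflow reads when no URI is set explicitly. The URI layout, the helper name build_tracking_uri, and all placeholder values are assumptions for illustration. Copy the actual tracking URI from your workspace rather than relying on this format.

```python
import os


def build_tracking_uri(region, subscription_id, resource_group, workspace):
    """Compose a workspace tracking URI.

    The layout below is an assumption for illustration; verify it against
    the tracking URI reported by your own workspace.
    """
    return (
        f"azureml://{region}.api.azureml.ms/mlflow/v1.0"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace}"
    )


# Placeholder values (hypothetical); replace them with your own.
tracking_uri = build_tracking_uri(
    region="eastus",
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="my-resource-group",
    workspace="my-workspace",
)

# MLflow picks up this environment variable when no tracking URI is set in code.
os.environ["MLFLOW_TRACKING_URI"] = tracking_uri
```

Setting the environment variable keeps the URI out of the training or deployment code, so the same script can run unchanged inside and outside an Azure ML compute.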


  1. Ensure your model is registered in the Azure Machine Learning registry. Deployment of unregistered models is not supported in Azure Machine Learning. You can register a new model using the MLflow SDK:

    mlflow.register_model(f"runs:/{run_id}/{artifact_path}", "sample-sklearn-mlflow-model")
  2. Deployments can be created with either the MLflow Python SDK or the MLflow CLI. In both cases, you can pass a JSON configuration file with the details of the deployment you want. If you don't pass one, a default deployment is created on Azure Container Instances (ACI) with a minimal configuration.

    {
        "instance_type": "Standard_DS2_v2",
        "instance_count": 1
    }


    The full specification of this configuration can be found at Managed online deployment schema (v2).

  3. Save the deployment configuration to a file:

    import json

    # Deployment settings to pass to the deployment client.
    deploy_config = {
        "instance_type": "Standard_DS2_v2",
        "instance_count": 1,
    }

    deployment_config_path = "deployment_config.json"
    with open(deployment_config_path, "w") as outfile:
        json.dump(deploy_config, outfile)
  4. Create a deployment client using the Azure Machine Learning Tracking URI.

    from mlflow.deployments import get_deploy_client
    # Set the tracking uri in the deployment client.
    client = get_deploy_client("<azureml-mlflow-tracking-url>")
  5. Run the deployment

    model_name = "mymodel"
    model_version = 1
    # The "name" parameter is the service name; "<deployment-name>" is a placeholder.
    # If the model is not registered, it gets registered automatically and a name is autogenerated using the "name" parameter below.
    client.create_deployment(
        name="<deployment-name>",
        model_uri=f"models:/{model_name}/{model_version}",
        config={"deploy-config-file": deployment_config_path},
    )

Deploy using Azure ML CLI/SDK (v2)

You can use Azure ML CLI/SDK v2 to deploy models trained and logged with MLflow to managed endpoints (online or batch). Deployment of MLflow models supports no-code deployment, so you don't have to provide a scoring script or an environment, although you can if needed.


Before following the steps in this article, make sure you have the following prerequisites:


This example shows how you can deploy an MLflow model to an online endpoint using CLI (v2).


For MLflow no-code-deployment, testing via local endpoints is currently not supported.

  1. Connect to Azure Machine Learning workspace

    az account set --subscription <subscription>
    az configure --defaults workspace=<workspace> group=<resource-group> location=<location>
  2. Create a YAML configuration file for your endpoint. The following example configures the name and authentication mode of the endpoint:


    name: my-endpoint
    auth_mode: key
  3. Create the endpoint in the Azure Machine Learning workspace from the YAML configuration. The following command expects the shell variable ENDPOINT_NAME to hold the endpoint name:

    az ml online-endpoint create --name $ENDPOINT_NAME -f endpoints/online/mlflow/create-endpoint.yaml
  4. Before going further, we need to register the model we want to deploy. Deployment of unregistered models is not supported in Azure Machine Learning.

    From a training job

    In this example, the model is registered from a previously run job. Assume that the job logged the model with an instruction similar to this:

    mlflow.sklearn.log_model(scikit_model, "model")

    To register the model from a previous run, you need the job name (run ID) in question. For simplicity, assume that you want to register the model trained in the last run submitted to the workspace:

    JOB_NAME=$(az ml job list --query "[0].name" | tr -d '"')

    Then, let's register the model in the registry.

    az ml model create --name "mir-sample-sklearn-mlflow-model" \
                       --type "mlflow_model" \
                       --path "azureml://jobs/$JOB_NAME/outputs/artifacts/model"

    From a local model

    If your model is located in the local file system or compute, then you can register it as follows:

    az ml model create --name "mir-sample-sklearn-mlflow-model" \
                       --type "mlflow_model" \
                       --path "sklearn-diabetes/model"
  5. Once the endpoint is created, we need to create a deployment on it. Remember that endpoints can contain one or multiple deployments and traffic can be configured for each of them. In this example, we are going to create only one deployment to serve all the traffic, named sklearn-deployment.

    Create the deployment YAML file:


    name: sklearn-deployment
    endpoint_name: my-endpoint
    model: azureml:mir-sample-sklearn-mlflow-model@latest
    instance_type: Standard_DS2_v2
    instance_count: 1
  6. Create the deployment and assign all the traffic to it.

    az ml online-deployment create --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/mlflow/sklearn-deployment.yaml --all-traffic
  7. Once the deployment is completed, the service is ready to receive requests. If you are not sure about how to submit requests to the service, see Creating requests.
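Once the deployment is serving traffic, a scoring request can be sketched as follows. This is a minimal illustration using only the Python standard library, not the only way to call the endpoint. The scoring URI and key are placeholders, the helper name build_scoring_request is hypothetical, and the sketch assumes the endpoint uses key authentication (auth_mode: key, as configured earlier) with the key sent as a bearer token. The column names reuse the payload example from the Creating requests section.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, payload):
    """Build a POST request that carries a JSON payload and the endpoint key."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: key-based auth, with the key passed as a bearer token.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Inputs must sit under the "input_data" key (split orientation shown here).
payload = {
    "input_data": {
        "columns": ["age", "sex", "trestbps", "chol", "fbs", "restecg",
                    "thalach", "exang", "oldpeak", "slope", "ca", "thal"],
        "index": [0],
        "data": [[1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 2]],
    }
}

# Placeholder URI and key (hypothetical); take the real values from your endpoint.
request = build_scoring_request(
    "https://my-endpoint.example.invalid/score",
    "<your-endpoint-key>",
    payload,
)

# Uncomment to send the request to a real endpoint:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes the payload and headers easy to inspect before any network call is made.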

Deploy using Azure Machine Learning studio

You can use Azure Machine Learning studio to deploy models to Managed Online Endpoints.


Although deploying to ACI or AKS with Azure Machine Learning studio is possible, the no-code deployment feature is not available for these compute targets. We recommend managed online endpoints, as they provide a superior set of features.

  1. Ensure your model is registered in the Azure Machine Learning registry. Deployment of unregistered models is not supported in Azure Machine Learning. You can register models from files in the local file system or from the output of a job:

    You can register the model directly from the job's output using Azure Machine Learning studio. To do so, navigate to the Outputs + logs tab in the run where your model was trained and select the option Create model.

    Animated gif that demonstrates how to register a model directly from outputs.

  2. From studio, select your workspace and then use the Endpoints page to create the endpoint deployment:

    a. From the Endpoints page, select +Create.

    Screenshot showing create option on the Endpoints UI page.

    b. Provide a name and authentication type for the endpoint, and then select Next.

    c. When selecting a model, select the MLflow model registered previously. Select Next to continue.

    d. When you select a model registered in MLflow format, in the Environment step of the wizard, you don't need a scoring script or an environment.

    Screenshot showing no code and environment needed for MLflow models.

    e. Complete the wizard to deploy the model to the endpoint.

    Screenshot showing NCD review screen.

Considerations when deploying to real time inference

The following input types are supported in Azure ML when deploying models with no-code deployment. See the notes below the table for additional considerations.

Input type Support in MLflow models (serve) Support in Azure ML
JSON-serialized pandas DataFrames in the split orientation
JSON-serialized pandas DataFrames in the records orientation 1
CSV-serialized pandas DataFrames 2
Tensor input format as JSON-serialized lists (tensors) and dictionary of lists (named tensors)
Tensor input formatted as in TF Serving’s API


  • 1 We suggest you use the split orientation instead. The records orientation doesn't guarantee that column ordering is preserved.
  • 2 We suggest you explore batch inference for processing files.

Regardless of the input type used, Azure Machine Learning requires inputs to be provided in a JSON payload, within a dictionary key input_data. This key is not required when serving models with the command mlflow models serve, so payloads can't be used interchangeably between the two.
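The following sketch illustrates the two points above: it converts row records into the split orientation with an explicit column order, and wraps the result in the input_data key. The helper name records_to_split_payload and the sample values are hypothetical, and only the standard library is used.

```python
import json


def records_to_split_payload(records, columns):
    """Convert a list of row dictionaries into a split-orientation payload.

    The column order is passed explicitly, so it does not depend on the
    key order of the individual records.
    """
    data = [[row[name] for name in columns] for row in records]
    return {
        "input_data": {
            "columns": list(columns),
            "index": list(range(len(records))),
            "data": data,
        }
    }


# Hypothetical rows whose keys appear in different orders.
rows = [
    {"age": 63, "sex": 1, "chol": 233},
    {"chol": 250, "age": 37, "sex": 1},
]
payload = records_to_split_payload(rows, columns=["age", "sex", "chol"])
print(json.dumps(payload, indent=2))
```

Fixing the column list in one place means every row is serialized in the order the model signature expects, whatever order the source records use.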

Creating requests

Your inputs should be submitted inside a JSON payload containing a dictionary with key input_data.

Payload example for a JSON-serialized pandas DataFrame in the split orientation

    {
        "input_data": {
            "columns": [
                "age", "sex", "trestbps", "chol", "fbs", "restecg", "thalach", "exang", "oldpeak", "slope", "ca", "thal"
            ],
            "index": [1],
            "data": [
                [1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 2]
            ]
        }
    }

Payload example for a tensor input

    {
        "input_data": [
            [1, 1, 0, 233, 1, 2, 150, 0, 2.3, 3, 0, 2],
            [1, 1, 0, 233, 1, 2, 150, 0, 2.3, 3, 0, 2],
            [1, 1, 0, 233, 1, 2, 150, 0, 2.3, 3, 0, 2],
            [1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 2]
        ]
    }

Payload example for a named-tensor input

    {
        "input_data": {
            "tokens": [
                [0, 655, 85, 5, 23, 84, 23, 52, 856, 5, 23, 1]
            ],
            "mask": [
                [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
            ]
        }
    }


Consider the following limitations when deploying MLflow models to Azure Machine Learning for real-time inference:

  • Spark flavor is not supported at the moment for deployment.
  • Data type mlflow.types.DataType.Binary is not supported as a column type in signatures. For models that work with images, we suggest you use either (a) tensor inputs with the TensorSpec input type, or (b) Base64 encoding with a mlflow.types.DataType.String column type, which is commonly used when binary data needs to be stored and transferred over media.
  • Signatures with tensors of unspecified shape (-1) are currently supported only in the batch-size dimension. For instance, a signature with shape (-1, -1, -1, 3) is not supported but (-1, 300, 300, 3) is.
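The last two limitations can be illustrated with a short sketch. The first helper Base64-encodes raw bytes so they fit in a string column, and the second applies the shape rule stated above to a signature shape. Both helper names and the column name image are hypothetical. This code only mirrors the rules described in this section and is not part of any Azure ML or MLflow API.

```python
import base64


def encode_binary_as_string(raw_bytes):
    """Base64-encode raw bytes so they can travel in a string column."""
    return base64.b64encode(raw_bytes).decode("ascii")


def has_variable_dims_only_in_batch(shape):
    """Return True if -1 appears nowhere except the first (batch) dimension."""
    return all(dim != -1 for dim in shape[1:])


# Stand-in bytes; in practice these would be read from an image file.
raw = b"\x89PNG\r\n\x1a\n"
encoded = encode_binary_as_string(raw)

# The encoded string goes into a string-typed column of the payload.
payload = {
    "input_data": {
        "columns": ["image"],
        "index": [0],
        "data": [[encoded]],
    }
}

supported = has_variable_dims_only_in_batch((-1, 300, 300, 3))
unsupported = has_variable_dims_only_in_batch((-1, -1, -1, 3))
```

With option (b), the model itself has to decode the Base64 string back into bytes before processing the image.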

Considerations when deploying to batch inference

Azure Machine Learning supports no-code deployment for batch inference in managed endpoints. This is a convenient way to deploy models that need to process large amounts of data in batches.

How work is distributed on workers

Work is distributed at the file level, for both structured and unstructured data. As a consequence, only file datasets or URI folders are supported for this feature. Each worker processes batches of Mini batch size files at a time. Further parallelism can be achieved if Max concurrency per instance is increased.
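To make the file-level granularity concrete, the sketch below groups a list of file names into mini-batches of a fixed size. It only illustrates the idea described above. The actual scheduling is done by the service, and the function name and file names are hypothetical.

```python
def split_into_mini_batches(files, mini_batch_size):
    """Group files into consecutive mini-batches of at most mini_batch_size files."""
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be at least 1")
    return [
        files[start:start + mini_batch_size]
        for start in range(0, len(files), mini_batch_size)
    ]


# Hypothetical input folder listing.
files = ["a.csv", "b.csv", "c.csv", "d.csv", "e.csv"]
batches = split_into_mini_batches(files, mini_batch_size=2)
for number, batch in enumerate(batches):
    print(number, batch)
```

Because a single file is never split across workers, one very large file limits parallelism more than many small files do.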


Nested folder structures are not explored during inference. If you are partitioning your data using folders, make sure to flatten the structure beforehand.
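One way to flatten a partitioned folder before submitting it is sketched below. It copies every file under a source folder into a single destination folder and encodes the original relative path in the file name to avoid collisions. The function name, the "__" separator, and the folder names in the usage comment are assumptions for illustration.

```python
import shutil
from pathlib import Path


def flatten_folder(source, destination):
    """Copy all files under source into destination without subfolders.

    The destination must be outside the source folder. Path parts are
    joined with "__" so files from different subfolders keep unique names.
    """
    source = Path(source)
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            relative = path.relative_to(source)
            target = destination / "__".join(relative.parts)
            shutil.copy2(path, target)
            copied.append(target)
    return copied


# Example usage with hypothetical folder names:
# flatten_folder("data/partitioned", "data/flat")
```

A file stored at 2023/01/a.csv under the source folder ends up as 2023__01__a.csv in the destination, so the partition information stays visible in the file name.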

File type support

The following data types are supported for batch inference.

File extension Type returned as model's input Signature requirement
.csv pd.DataFrame ColSpec. If not provided, column typing is not enforced.
.png, .jpg, .jpeg, .tiff, .bmp, .gif np.ndarray TensorSpec. Input is reshaped to match the tensor's shape if available. If no signature is available, tensors of type np.uint8 are inferred.
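The table above can be expressed as a small lookup, which is useful for checking a folder before submitting it for batch inference. This is a sketch only. The function name is hypothetical, and treating extensions case-insensitively is an assumption, not documented behavior.

```python
from pathlib import Path

# Image extensions listed in the table above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}


def expected_input_type(file_name):
    """Return the type name the model receives for a file, or None if unsupported."""
    # Assumption: extensions are compared case-insensitively.
    suffix = Path(file_name).suffix.lower()
    if suffix == ".csv":
        return "pd.DataFrame"
    if suffix in IMAGE_EXTENSIONS:
        return "np.ndarray"
    return None


# Hypothetical file names.
for name in ["patients.csv", "scan.png", "notes.txt"]:
    print(name, expected_input_type(name))
```

Files for which the lookup returns None fall outside the supported types listed in the table.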

Next steps

To learn more, review these articles: