How does Databricks save an ML model into an Azure Storage container?

2023-01-13T04:08:01.9033333+00:00

I am trying to use the mlflow package in Databricks to save a model into Azure Storage.

The Script:

abfss_path='abfss://mlops@dlsgdpeasdev03.dfs.core.windows.net'

project = 'test'

model_version = 'v1.0.1'

model = {model training step}
prefix_model_path = os.path.join(abfss_path, project, model_version)

model_path = prefix_model_path

print(model_path) # abfss://mlops@dlsgdpeasdev03.dfs.core.windows.net/test/v1.0.1

mlflow.sklearn.save_model(model, model_path)

The output message says the model was saved successfully.

When I check the container, the file does not exist, but I am able to load the model back using the same path. That means the model file was saved somewhere in Databricks.
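To confirm what string actually reaches save_model, the path construction can be reproduced anywhere with the standard library (same placeholder values as in the script above):

```python
import os

abfss_path = 'abfss://mlops@dlsgdpeasdev03.dfs.core.windows.net'
project = 'test'
model_version = 'v1.0.1'

# os.path.join keeps the abfss URI intact on POSIX systems
model_path = os.path.join(abfss_path, project, model_version)
print(model_path)
# abfss://mlops@dlsgdpeasdev03.dfs.core.windows.net/test/v1.0.1
```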

I want to know where the model file is in Databricks, and how to save the model directly from a Databricks notebook to Azure Storage.

Thanks

Azure Machine Learning
Azure Databricks

1 answer

  1. HimanshuSinha-msft 17,566 Reputation points Microsoft Employee
    2023-01-16T17:26:27.9033333+00:00

    Hello @Benny Lau, Shui Hong - Group Office,

    Thanks for the ask, and welcome to Microsoft Q&A.

    As I understand it, the ask here is where the model is saved and how you can save it to blob storage.

    As per the documentation here: https://learn.microsoft.com/en-us/azure/databricks/mlflow/models#api-commands

    You have three options, and I assume that your model file is getting stored in DBFS (or on the driver's local disk) on the Azure Databricks cluster.
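    One hedged explanation for the "missing" file: mlflow.sklearn.save_model documents its path argument as a local path, so the abfss:// URI is likely created as a literal local directory on the driver rather than written to the container, which is also why loading from the same path still works. A minimal stdlib sketch of that behavior (no MLflow needed; the temp directory stands in for the driver's working directory):

    ```python
    import os
    import tempfile

    workdir = tempfile.mkdtemp()  # stand-in for the driver's working directory
    uri = 'abfss://mlops@dlsgdpeasdev03.dfs.core.windows.net/test/v1.0.1'

    # A local-only save API treats the URI string as an ordinary path:
    local_path = os.path.join(workdir, uri)
    os.makedirs(local_path)
    print(os.path.isdir(local_path))  # True: the "container path" is just a local directory
    ```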


    Databricks can save a machine learning model to an Azure Storage container using the dbutils.fs module, which provides functions for interacting with the Databricks File System (DBFS) and Azure Storage. Here is an example of how to save a model to an Azure Storage container:

    1. First, you will need to mount the Azure Storage container to DBFS; this can be done using the dbutils.fs.mount function. Note that the OAuth settings below belong to the ABFS driver, so the source must use the abfss:// scheme (with wasbs:// you would authenticate with a storage account key instead).
    dbutils.fs.mount(
      source='abfss://<your-container-name>@<your-storage-account-name>.dfs.core.windows.net/',
      mount_point='/mnt/<your-mount-point>',
      extra_configs={
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": "<your-client-id>",
        "fs.azure.account.oauth2.client.secret": "<your-client-secret>",
        "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<your-tenant-id>/oauth2/token"
      }
    )
    
    
    2. Once the container is mounted, you can use the dbutils.fs.cp function to copy the model from the driver's local file system to the mount point. A saved MLflow model is a directory, so pass recurse=True, and prefix driver-local paths with file:/ so they are not interpreted as DBFS paths.

    dbutils.fs.cp("file:/path/to/local/model", "/mnt/<your-mount-point>/model", recurse=True)

    3. You can also save the model directly to the mounted container path. Note that a scikit-learn model has no .save() method; with MLflow, point save_model at the mount through the /dbfs FUSE path:

    mlflow.sklearn.save_model(model, "/dbfs/mnt/<your-mount-point>/model")

    Note: Be sure to replace the placeholders in the above code with the appropriate values for your use case.
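    Outside a Databricks cluster, the copy in step 2 can be sketched with the standard library (hypothetical temp paths; on a cluster you would use dbutils.fs.cp as shown above):

    ```python
    import os
    import shutil
    import tempfile

    # Stand-in for a locally saved MLflow model directory (hypothetical content).
    src = os.path.join(tempfile.mkdtemp(), "model")
    os.makedirs(src)
    with open(os.path.join(src, "MLmodel"), "w") as f:
        f.write("artifact_path: model\n")

    # Copy the whole directory, as dbutils.fs.cp(..., recurse=True) would.
    dst = os.path.join(tempfile.mkdtemp(), "model")
    shutil.copytree(src, dst)
    print(os.path.exists(os.path.join(dst, "MLmodel")))  # True
    ```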
