Use managed identity to access mlflow models and artifacts

Tobias Quadfasel 75 Reputation points
2025-05-16T06:09:47.9233333+00:00

Hello! I am new to Azure Databricks and have a question: In my current setup, I am running some containerized Python code within an Azure Functions app. In this code, I need to download some models and artifacts stored via MLflow in our Azure Databricks workspace.

Previously, I did this by setting the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables; within my code I just set mlflow.set_tracking_uri("databricks") and everything worked fine. However, the token is a PAT, which I do not like from a security perspective. Ideally, I would like to use the managed identity of the Functions app to authenticate with Databricks. According to the following article, this should be possible: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth/azure-mi-auth
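Concretely, that previous PAT-based setup amounted to the following (the host and token values here are placeholders, not my real ones):

```shell
# Placeholder values - replace with your workspace URL and a personal access token
export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
export DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXX"  # PAT; this is what I want to get rid of
```

and then in Python simply mlflow.set_tracking_uri("databricks").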

So I essentially repeated the steps in the article. Note that I omitted all account-level authorization steps, since workspace-level authorization is enough for my use case.

  • I created a user-assigned managed identity in Azure
  • I assigned the managed identity to the Functions app
  • I added a new Microsoft Entra ID managed service principal in my Azure Databricks workspace, using the client ID of the managed identity as the application ID
  • I created the config file ~/.databrickscfg with a single profile named [AZURE_MI_WORKSPACE], containing the parameters host (my Azure Databricks workspace URL), azure_workspace_resource_id (the resource ID of my Azure Databricks workspace), azure_client_id (the client ID of the managed identity), and azure_tenant_id (my Azure tenant ID), and I set azure_use_msi to true, just as in the config in the referenced article above
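For completeness, the profile in my ~/.databrickscfg looks like this (all values are placeholders):

```ini
[AZURE_MI_WORKSPACE]
host                        = https://adb-1234567890123456.7.azuredatabricks.net
azure_workspace_resource_id = /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Databricks/workspaces/<workspace-name>
azure_client_id             = <managed-identity-client-id>
azure_tenant_id             = <tenant-id>
azure_use_msi               = true
```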

Then, I changed my code to mlflow.set_tracking_uri("databricks://AZURE_MI_WORKSPACE"). The code does read the information from the .databrickscfg file, since I get the output

loading AZURE_MI_WORKSPACE profile from ~/.databrickscfg: host, azure_workspace_resource_id, azure_client_id, azure_use_msi, azure_tenant_id

But when setting the tracking uri, I get the following error:

Reading Databricks credential configuration failed with MLflow tracking URI 'databricks://AZURE_MI_WORKSPACE'. Please ensure that the 'databricks-sdk' PyPI library is installed, the tracking URI is set correctly, and Databricks authentication is properly configured. The tracking URI can be either 'databricks' (using 'DEFAULT' authentication profile) or 'databricks://{profile}'. You can configure Databricks authentication in several ways, for example by specifying environment variables (e.g. DATABRICKS_HOST + DATABRICKS_TOKEN) or logging in using 'databricks auth login'.

Do you have any leads on what could be wrong here? I triple-checked the parameters in the config file and they are definitely correct. I was wondering whether I made some kind of conceptual error and MLflow tracking can't be done via managed identity auth for some reason.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Answer accepted by question author
  1. Sina Salam 26,666 Reputation points Volunteer Moderator
    2025-05-16T12:04:55.6866667+00:00

    Hello Tobias Quadfasel,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are trying to use a managed identity in a way that is not fully supported by MLflow's current authentication flow.

    MLflow does not natively support managed identity authentication via .databrickscfg alone. The steps below align with the best-practice guidance at https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth/azure-mi and are secure, scalable, and recommended for production environments:

    • Install the required libraries with a bash command: pip install azure-identity mlflow databricks-sdk
    • Manually acquire a Microsoft Entra ID token using the managed identity in Python:
           from azure.identity import ManagedIdentityCredential
           import requests

           # Get a token for the Azure Databricks resource
           credential = ManagedIdentityCredential(client_id="<your-client-id>")
           token = credential.get_token("2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default")

           # Use the token in headers for the Databricks REST API
           headers = {
               "Authorization": f"Bearer {token.token}"
           }
      
      • The ID 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d is the well-known application ID of the Azure Databricks resource in Microsoft Entra ID; it is the same across all tenants
    • Then use the REST API to work with artifacts or models. Since MLflow may not accept this token directly, call the MLflow REST API, for example to list a run's artifacts:
           response = requests.get(
               "https://<your-databricks-instance>.azuredatabricks.net/api/2.0/mlflow/artifacts/list",
               headers=headers,
               params={"run_id": "<your-run-id>", "path": "<artifact-path>"}
           )
      

    Alternatively, you can try MLflow with token injection. This is a practical workaround rather than an officially documented, supported method for managed identity with MLflow:

       import os
       import mlflow

       # `token` is the AccessToken acquired above with ManagedIdentityCredential
       os.environ["DATABRICKS_HOST"] = "https://<your-databricks-instance>.azuredatabricks.net"
       os.environ["DATABRICKS_TOKEN"] = token.token  # inject the Entra ID token manually
       mlflow.set_tracking_uri("databricks")
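    One caveat with this workaround: Entra ID access tokens expire (typically after about an hour), so a long-running Functions instance should re-acquire the token before it lapses. A minimal sketch of such a check, assuming the epoch-seconds expires_on field that azure-identity's AccessToken exposes (needs_refresh is a hypothetical helper, not part of any SDK):

```python
import time

def needs_refresh(expires_on, now=None, skew=300.0):
    """Return True when the token is within `skew` seconds of expiry.

    `expires_on` is the token's expiry as epoch seconds (as reported by
    azure-identity's AccessToken); `skew` leaves a safety margin so the
    token is replaced before it actually expires.
    """
    if now is None:
        now = time.time()
    return now >= expires_on - skew
```

    Before each MLflow call (or on a timer), check needs_refresh(token.expires_on) and, when it returns True, call credential.get_token(...) again and re-inject the new token into DATABRICKS_TOKEN.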
    
    

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread here by upvoting and accepting this as an answer if it is helpful.


0 additional answers
