How to access an ADLS Gen2 storage account from an Azure Databricks notebook using the UAMI dbmanagedidentity?

Banerjee, Swarnava 0 Reputation points
2025-05-07T13:43:58.2533333+00:00

I am trying to connect to an ADLS Gen2 storage account from an Azure Databricks notebook using the user-assigned managed identity "dbmanagedidentity".

Please note that "dbmanagedidentity" has the "Storage Blob Data Contributor" role on the ADLS Gen2 storage account.

Can I get detailed steps for achieving this? Any documentation or code snippets would be helpful.

Thanks in advance

Azure Databricks

1 answer

  1. Michele Ariis 335 Reputation points MVP
    2025-05-08T14:48:49.8933333+00:00

    Hi Swarnava, here is how I usually do it by hand:

    Assign UAMI to the cluster

    Go to your Databricks workspace in Azure Portal → “Clusters” → choose the cluster → “Configuration” → “Advanced” → “Identity” tab → enable User Assigned Managed Identity and add dbmanagedidentity.

    Verify on the cluster

    Start the cluster and check the startup logs to confirm that the identity is attached (you should see a message like “Managed identity with client ID … attached”).

    Prepare configs for ABFS

    In the notebook define a dict like this (replace <ACCOUNT>, <CLIENT_ID> and <CONTAINER>):

    configs = {
        # Use OAuth for this storage account
        "fs.azure.account.auth.type.<ACCOUNT>.dfs.core.windows.net": "OAuth",
        # Hadoop ABFS token provider for managed identities
        "fs.azure.account.oauth.provider.type.<ACCOUNT>.dfs.core.windows.net":
            "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider",
        # Client ID of the user-assigned managed identity (dbmanagedidentity)
        "fs.azure.account.oauth2.client.id.<ACCOUNT>.dfs.core.windows.net":
            "<CLIENT_ID>"
    }
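If you need the same settings for more than one storage account, the dict can be generated by a small helper. This is only a sketch: `build_abfs_msi_configs` is a hypothetical name, and it assumes the provider class (`MsiTokenProvider`) and client ID key (`fs.azure.account.oauth2.client.id`) named in the Apache Hadoop ABFS documentation, scoped to one account by appending the account host name.

```python
def build_abfs_msi_configs(account: str, client_id: str) -> dict:
    """Build account-scoped ABFS OAuth settings for a managed identity.

    Hypothetical helper: `account` is the storage account name and
    `client_id` is the client ID of the user-assigned managed identity.
    """
    # Guard against placeholders such as "<ACCOUNT>" being left in place.
    for name, value in (("account", account), ("client_id", client_id)):
        if not value or "<" in value or ">" in value:
            raise ValueError(f"{name} looks like an unfilled placeholder: {value!r}")

    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        # Assumption: class name as given in the Hadoop ABFS documentation.
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
    }
```

Calling `build_abfs_msi_configs("mystorage", client_id)` should then give a dict that can be passed as `extra_configs` when mounting.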

    Mount the filesystem

    dbutils.fs.mount(
        source = "abfss://<CONTAINER>@<ACCOUNT>.dfs.core.windows.net/",
        mount_point = "/mnt/mydata",
        extra_configs = configs
    )
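Re-running the mount cell on a path that is already mounted raises an error, so it can help to check the existing mounts first. A minimal sketch, assuming the notebook's `dbutils` object is passed in; `mount_if_absent` is a hypothetical helper name, while `dbutils.fs.mounts()` and `dbutils.fs.mount()` are the standard Databricks utilities.

```python
def mount_if_absent(dbutils, source: str, mount_point: str, extra_configs: dict) -> bool:
    """Mount `source` at `mount_point` unless that mount point already exists.

    Returns True if a new mount was created, False if it was already there.
    """
    # dbutils.fs.mounts() returns entries that expose a `mountPoint` attribute.
    existing = {m.mountPoint for m in dbutils.fs.mounts()}
    if mount_point in existing:
        return False

    dbutils.fs.mount(
        source=source,
        mount_point=mount_point,
        extra_configs=extra_configs,
    )
    return True


# Usage inside a notebook (placeholders left as in the answer):
# mount_if_absent(
#     dbutils,
#     "abfss://<CONTAINER>@<ACCOUNT>.dfs.core.windows.net/",
#     "/mnt/mydata",
#     configs,
# )
```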

    Use it

    display(dbutils.fs.ls("/mnt/mydata"))
    df = spark.read.parquet("/mnt/mydata/folder/file.parquet")
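If you would rather not create a mount, the same settings can usually be applied to the Spark session and the data read through a full `abfss://` URI. This is a sketch under assumptions: `abfss_uri` and `apply_session_configs` are hypothetical helper names, and whether session-level settings work with a managed identity depends on your cluster setup.

```python
def abfss_uri(container: str, account: str, path: str = "") -> str:
    """Build an abfss:// URI for a path inside an ADLS Gen2 container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def apply_session_configs(spark, configs: dict) -> None:
    """Apply each ABFS setting to the current Spark session."""
    for key, value in configs.items():
        spark.conf.set(key, value)


# Usage inside a notebook, reusing the `configs` dict from the answer:
# apply_session_configs(spark, configs)
# df = spark.read.parquet(abfss_uri("<CONTAINER>", "<ACCOUNT>", "folder/file.parquet"))
```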

    From there, dbmanagedidentity (with the Storage Blob Data Contributor role) authenticates calls to ADLS Gen2 automatically, without keys or secrets.

