Mounting an entire ADLS account on Azure Databricks

Ashima Gupta 1 Reputation point
2020-09-23T11:35:43.993+00:00

Hi,

I want to mount an entire ADLS storage account on Databricks.

I've checked the documentation and can see how to mount a single file system at a time, but I want to mount the entire ADLS account. I have around 70 containers in it and want to mount all of them in one go.

I found the code below for mounting a single file system:

val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "<application-id>",
  "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "<scope-name>", key = "<service-credential-key-name>"),
  "fs.azure.account.oauth2.client.endpoint" -> "https://login.microsoftonline.com/<directory-id>/oauth2/token")

// Optionally, you can add <directory-name> to the source URI of your mount point.
dbutils.fs.mount(
  source = "abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/",
  mountPoint = "/mnt/<mount-name>",
  extraConfigs = configs)

This is how I've mounted a single file system, but now I want to mount the entire ADLS account on Databricks.

Please give me some guidance on how to accomplish this.

Regards,
Ashima


1 answer

  1. MartinJaffer-MSFT 26,041 Reputation points
    2020-09-23T18:32:27.26+00:00

    Hello Ashima and welcome to Microsoft Q&A.

    I think you are referring to mounting ADLS Gen2, not Gen1 or Blob storage. Is this correct? Is Scala your preferred language?

    As far as I know, there is no "bulk mount". However, I think I have a workaround.

    We can use the ADLS APIs to get a list of the file systems, then iterate over the list, mounting each in turn. This is much easier if you use the same credentials for all the operations.

    To retrieve the list, you can use the REST API with a language-appropriate HTTPS library, or one of the SDKs, which are available for Java and Python.
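
    The iterate-and-mount step could look roughly like the sketch below. It is only an illustration under some assumptions: the names `BulkMount`, `MountPlan`, `planMounts` and `mountAll` are made up for this example and are not part of any Databricks API, every file system is mounted at `/mnt/<file-system-name>`, and the same `configs` map from the question works for all of them. The mount call itself is passed in as a function, so the planning logic can be checked outside a notebook.

    ```scala
    import scala.util.Try

    // Hypothetical helper object; none of these names come from Databricks.
    object BulkMount {

      // One file system to mount: its abfss source URI and its target mount point.
      final case class MountPlan(source: String, mountPoint: String)

      // Build one plan per file system, skipping mount points that already exist.
      def planMounts(
          fileSystems: Seq[String],
          storageAccount: String,
          existingMountPoints: Set[String]
      ): Seq[MountPlan] =
        fileSystems
          .map { fs =>
            MountPlan(
              source = s"abfss://${fs}@${storageAccount}.dfs.core.windows.net/",
              mountPoint = s"/mnt/${fs}"
            )
          }
          .filterNot(plan => existingMountPoints.contains(plan.mountPoint))

      // Apply the supplied mount function to each plan in turn. Failures are
      // captured per plan, so one bad container does not stop the other mounts.
      def mountAll(plans: Seq[MountPlan])(mount: MountPlan => Unit): Seq[(MountPlan, Try[Unit])] =
        plans.map(plan => plan -> Try(mount(plan)))
    }

    // Possible wiring in a Databricks notebook, where `dbutils` and the `configs`
    // map from the question are in scope. The names list is hypothetical; in
    // practice it would come from the list call (REST API or SDK).
    //
    //   val names    = Seq("raw", "curated")
    //   val existing = dbutils.fs.mounts().map(_.mountPoint).toSet
    //   val plans    = BulkMount.planMounts(names, "<storage-account-name>", existing)
    //   val results  = BulkMount.mountAll(plans) { p =>
    //     dbutils.fs.mount(source = p.source, mountPoint = p.mountPoint, extraConfigs = configs)
    //   }
    //   results.filter(_._2.isFailure).foreach(println)
    ```

    Skipping mount points that already exist makes the loop safe to re-run after a partial failure. For the names themselves, the Java SDK (usable from Scala, assuming the `azure-storage-file-datalake` library is attached to the cluster) has `DataLakeServiceClient.listFileSystems()`, which should return items exposing `getName`.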

    Thank you.
