Getting unauthorized response with DataLake Gen 2 link

Jeff 1 Reputation point
2022-02-02T13:07:37.803+00:00

I'm having trouble with some simple data queries. I'm using an account key to authorize connections to Data Lake Storage Gen2. First I set the account key (I know storing the key in a notebook isn't best practice, but I'm just trying to get something running):

spark.conf.set(
  "fs.azure.account.key.mystorage.dfs.core.windows.net",
  "031...=")

But the following fails with an error:

df = spark.read.text("abfs://demo@mystorage.dfs.core.windows.net/sample_data.csv")

Py4JJavaError: An error occurred while calling o324.text.
: Operation failed: "This request is not authorized to perform this operation.", 403, HEAD, https://mystorage.dfs.core.windows.net/demo/?upn=false&action=getAccessControl&timeout=90
    at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:241)
    at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:767)
    at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:749)

I have been able to access this data from a Jupyter Notebook running on a VM using a SAS token. Any ideas?
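
For reference, the account-key setup described above can be sketched with two small helpers, so that the account name in the configuration key and in the URI cannot drift apart. The helper names, the secret scope, and the secret key name are hypothetical; `spark` and `dbutils` are the objects a Databricks notebook provides. This does not by itself explain the 403, it only removes the hard-coded key and uses the TLS (`abfss://`) form of the URI.

```python
def account_key_conf_name(account: str) -> str:
    """Spark conf key that holds the access key for one storage account."""
    return f"fs.azure.account.key.{account}.dfs.core.windows.net"


def abfss_uri(container: str, account: str, path: str = "") -> str:
    """Build an abfss:// URI (the TLS variant of abfs://) for a path in a container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# In a Databricks notebook (scope and key names are placeholders):
#   spark.conf.set(account_key_conf_name("mystorage"),
#                  dbutils.secrets.get(scope="demo-scope", key="storage-key"))
#   df = spark.read.text(abfss_uri("demo", "mystorage", "sample_data.csv"))
```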

Azure Data Lake Storage
Azure Databricks

2 answers

  1. ShaikMaheer-MSFT 37,971 Reputation points Microsoft Employee
    2022-02-03T17:21:23.637+00:00

    Hi @Jeff ,

    Thank you for posting your query on the Microsoft Q&A platform.

    You can access ADLS Gen2 using OAuth 2.0 or using SAS. For Blob storage, you can use the account key.

    So if your storage is ADLS Gen2, please consider using the OAuth 2.0 or SAS mechanism.

    See the linked documentation for detailed steps to access ADLS Gen2 using OAuth 2.0.
    See the linked documentation for detailed steps to access ADLS Gen2 using SAS.
    See the linked documentation on accessing Azure Blob storage.

    Hope this helps. Please let us know if you have any further queries.

    --------------

    Please consider hitting Accept Answer. Accepted answers help the community as well.


  2. Jeff 1 Reputation point
    2022-02-04T14:40:15.557+00:00

    For some reason I couldn't comment on your post. I created a new app registration, generated a client secret, and then gave the app the "Contributor" role on the storage account. I then tried using OAuth 2.0:

    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": "xxx...",
        "fs.azure.account.oauth2.client.secret": "xxx...",
        "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/xxx.../oauth2/token",
        "fs.azure.createRemoteFileSystemDuringInitialization": "true",
    }

    dbutils.fs.mount(
        source="abfss://demo@mystorage.dfs.core.windows.net/",
        mount_point="/mnt/demo",
        extra_configs=configs,
    )
    

    This still throws an exception:

    ExecutionError: An error occurred while calling o298.mount.
    : Operation failed: "This request is not authorized to perform this operation.", 403, PUT, https://mystorage.dfs.core.windows.net/demo?resource=filesystem, AuthorizationFailure, "This request is not authorized to perform this operation. RequestId:4838ca85-d01f-0025-55d2-194358000000 Time:2022-02-04T14:20:09.0319452Z"
        at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:241)
        at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsClient.createFilesystem(AbfsClient.java:186)
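
The trace above fails inside `createFilesystem` (the `PUT ...?resource=filesystem` request), and the `fs.azure.createRemoteFileSystemDuringInitialization` flag appears to be what enables that call. Since the `demo` container already exists, one thing to try is building the same configuration with that flag left off. The sketch below is a minimal helper under that assumption; the function name and its parameters are hypothetical, and the values passed in would normally come from a secret scope. It may also be worth checking that the app has a data-plane role such as "Storage Blob Data Contributor", because "Contributor" on its own is a management role and does not grant access to the data.

```python
def oauth_mount_configs(client_id, client_secret, tenant_id, create_container=False):
    """Build extra_configs for dbutils.fs.mount using a service principal.

    create_container=False omits the flag that asks the driver to create the
    container (file system) during initialization.
    """
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }
    if create_container:
        configs["fs.azure.createRemoteFileSystemDuringInitialization"] = "true"
    return configs


# In a Databricks notebook (argument values are placeholders):
#   dbutils.fs.mount(
#       source="abfss://demo@mystorage.dfs.core.windows.net/",
#       mount_point="/mnt/demo",
#       extra_configs=oauth_mount_configs(client_id, client_secret, tenant_id),
#   )
```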
    

    As for the other method, the SAS documentation page doesn't explain how to set up SAS authentication in Python.
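
For the SAS route, a minimal Python sketch follows. It assumes the fixed SAS token settings that Databricks documents for the ABFS driver (`FixedSASTokenProvider`), which may depend on the runtime version; the helper name is hypothetical, and stripping a leading `?` from the token is a defensive assumption for tokens copied from the portal.

```python
def sas_spark_conf(account: str, sas_token: str) -> dict:
    """Return the Spark conf entries for fixed-SAS-token access to one account."""
    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "SAS",
        f"fs.azure.sas.token.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
        # Tokens copied from the portal often start with "?"; drop it.
        f"fs.azure.sas.fixed.token.{host}": sas_token.lstrip("?"),
    }


# In a Databricks notebook (the token would normally come from a secret scope):
#   for key, value in sas_spark_conf("mystorage", sas_token).items():
#       spark.conf.set(key, value)
#   df = spark.read.text("abfss://demo@mystorage.dfs.core.windows.net/sample_data.csv")
```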