Can't access blob storage using Hadoop SDK and Azure MSI authentication

Sergey Shabalov 46 Reputation points
2021-11-01T13:07:04.473+00:00

I use this simple code:

        conf.set("fs.defaultFS","wasbs://sshcont01@sshblobhierarchyoff.blob.core.windows.net");
        conf.set("fs.azure.ssl.channel.mode","Default_JSSE");
        conf.set("fs.azure.account.oauth.provider.type","org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider");
        conf.set("fs.azure.account.auth.type","OAuth");
        conf.set("fs.azure.account.oauth2.msi.tenant","My Tenant ID");
        conf.set("fs.azure.account.oauth2.client.id","My Client ID");
        FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf);

but I get:

org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.AzureException: No credentials found for account sshblobhierarchyoff.blob.core.windows.net in the configuration, and its container sshcont01 is not accessible using anonymous credentials. Please check if the container exists first. If it is not publicly available, you have to provide account credentials.
Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1098)
Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:547)
Error [18:01:18] at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1379)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:288)

But if I set the access level to "Container" for this container, it works. We expect MSI to provide access to the container without making it public.

Azure Data Lake Storage
Microsoft Entra ID

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 76,511 Reputation points Microsoft Employee
    2021-11-03T10:42:10.48+00:00

    Hello @Sergey Shabalov,

    Thanks for the question and for using the Microsoft Q&A platform.

    This is expected behaviour when the container's access level is set to private and you use an Azure Storage account with Hadoop.

    Note: For private containers in storage accounts that are not connected to a cluster, you can't access the blobs unless you define the storage account in the Hadoop configuration, that is, in the core-site.xml file.
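
A minimal sketch of what defining the storage account in the Hadoop configuration can look like for the WASB driver, assuming shared-key authentication. The property name `fs.azure.account.key.<account>.blob.core.windows.net` is the one documented for hadoop-azure. The class `WasbKeyConfig`, the method name and the placeholder key value are hypothetical. The sketch only builds the property map with the JDK, so it runs without Hadoop on the classpath; each entry would then be passed to `Configuration.set(key, value)` or written as a `<property>` in core-site.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WasbKeyConfig {

    // Builds the properties the WASB driver reads for shared-key access.
    // Hypothetical helper: apply each entry with Configuration.set(key, value).
    public static Map<String, String> wasbSharedKeyProperties(
            String account, String container, String accountKey) {
        String host = account + ".blob.core.windows.net";
        Map<String, String> props = new LinkedHashMap<>();
        // Default file system: wasbs://<container>@<account>.blob.core.windows.net
        props.put("fs.defaultFS", "wasbs://" + container + "@" + host);
        // Account key, looked up by the WASB driver under the full account host name
        props.put("fs.azure.account.key." + host, accountKey);
        return props;
    }

    public static void main(String[] args) {
        // Account and container names are taken from the question; the key is a placeholder.
        Map<String, String> props = wasbSharedKeyProperties(
                "sshblobhierarchyoff", "sshcont01", "<storage-account-access-key>");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

With a key defined this way the container can stay private, because the driver no longer falls back to anonymous access.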

    To illustrate, I created three containers as follows:

    146191-image.png

    If you access these containers from HDInsight, you get the same error message for the Private and Blob public access levels, and the expected output only for the Container public access level.

    146157-image.png

    For more details, refer “HDInsight Storage architecture” and “Hadoop Azure Support: Azure Blob Storage”.

    Hope this helps. Please let us know if you have any further queries.


1 additional answer

  1. Vashisht, Tanishka 0 Reputation points
    2023-10-09T04:22:07.7333333+00:00

    But after setting the access level to Container, I am not able to write files to the container. I get the following error:

    [ERROR] org.apache.hadoop.fs.azure.AzureException: Uploads to to public accounts using anonymous access is prohibited.

    Here I am trying to connect to Hadoop using service principal authentication:

    configMap.set("fs.defaultFS", "wasbs://" + container + "@" + accountName + ".blob." + endPointSuffix + "/");
    configMap.set("fs.azure.account.auth.type", "OAuth");
    configMap.set("fs.azure.account.oauth2.client.endpoint", "https://login.microsoftonline.com/" + tenantId + "/oauth2/token");
    configMap.set("fs.azure.account.oauth2.client.id", clientId);
    configMap.set("fs.azure.account.oauth2.client.secret", clientSecret);
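
Both snippets in this thread combine a `wasbs://` URI with `fs.azure.account.oauth*` properties. Those OAuth properties are documented for the ABFS driver (`abfss://` URIs on the `dfs.core.windows.net` endpoint), and the stack trace in the question shows the WASB classes reporting that no credentials were found. The following is a minimal sketch of the ABFS form of the service principal configuration, not a confirmed fix: the class `AbfsOAuthConfig` and its method are hypothetical, and whether the ABFS endpoint suits a given storage account and Hadoop version is an assumption that this thread does not confirm. It builds the property map with the JDK only; each entry would be applied with `Configuration.set(key, value)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AbfsOAuthConfig {

    // Builds the ABFS properties for OAuth with a service principal (client credentials).
    // Hypothetical helper: apply each entry with Configuration.set(key, value).
    public static Map<String, String> abfsClientCredentialProperties(
            String account, String container,
            String tenantId, String clientId, String clientSecret) {
        Map<String, String> props = new LinkedHashMap<>();
        // ABFS URIs use the abfss scheme and the dfs endpoint instead of wasbs and blob
        props.put("fs.defaultFS",
                "abfss://" + container + "@" + account + ".dfs.core.windows.net");
        props.put("fs.azure.account.auth.type", "OAuth");
        // Token provider for the client-credentials (service principal) flow
        props.put("fs.azure.account.oauth.provider.type",
                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider");
        props.put("fs.azure.account.oauth2.client.endpoint",
                "https://login.microsoftonline.com/" + tenantId + "/oauth2/token");
        props.put("fs.azure.account.oauth2.client.id", clientId);
        props.put("fs.azure.account.oauth2.client.secret", clientSecret);
        return props;
    }

    public static void main(String[] args) {
        // All argument values below are placeholders.
        Map<String, String> props = abfsClientCredentialProperties(
                "<account>", "<container>", "<tenant-id>", "<client-id>", "<client-secret>");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

For a managed identity, as in the original question, the same shape would presumably apply with `org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider` as the provider type and `fs.azure.account.oauth2.msi.tenant` plus `fs.azure.account.oauth2.client.id` in place of the endpoint and secret.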