SergeyShabalov-3062 asked PRADEEPCHEEKATLA-MSFT commented

Can't access blob storage using Hadoop SDK and Azure MSI authentication

I use this simple code:

         conf.set("fs.defaultFS","wasbs://sshcont01@sshblobhierarchyoff.blob.core.windows.net");
         conf.set("fs.azure.ssl.channel.mode","Default_JSSE");
         conf.set("fs.azure.account.oauth.provider.type","org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider");
         conf.set("fs.azure.account.auth.type","OAuth");
         conf.set("fs.azure.account.oauth2.msi.tenant","My Tenant ID");
         conf.set("fs.azure.account.oauth2.client.id","My Client ID");
         FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf);

but I get:

org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.AzureException: No credentials found for account sshblobhierarchyoff.blob.core.windows.net in the configuration, and its container sshcont01 is not accessible using anonymous credentials. Please check if the container exists first. If it is not publicly available, you have to provide account credentials.
Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1098)
Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:547)
Error [18:01:18] at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1379)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:288)

But if I set the container's access level to "Container", the call succeeds. We expected MSI to provide access to the container without making it public.

azure-data-lake-storage, azure-managed-identity

1 Answer

PRADEEPCHEEKATLA-MSFT answered PRADEEPCHEEKATLA-MSFT commented

Hello @SergeyShabalov-3062,

Thanks for the question and for using the MS Q&A platform.

This is expected behaviour when the container's access level is set to private and you use an Azure Storage account with Hadoop.

Note: For private containers in storage accounts that are NOT connected to a cluster, you can't access the blobs in the containers unless you define the storage account in the Hadoop configuration, i.e., the core-site.xml file.
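As a rough illustration of what "defining the storage account in the Hadoop configuration" can look like, here is a minimal sketch that builds the relevant properties as a plain map. The property name pattern `fs.azure.account.key.<account>.blob.core.windows.net` comes from the Hadoop Azure (WASB) documentation; the class name `WasbAccountKeyConfig`, its methods, and the account and container values in the usage note are hypothetical and only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: collects the settings the WASB driver reads for account-key access.
// The class and method names are hypothetical; the property key pattern is the
// one described in the Hadoop Azure (WASB) documentation.
final class WasbAccountKeyConfig {

    // Property that carries the storage account access key for one account.
    static String accountKeyProperty(String account) {
        return "fs.azure.account.key." + account + ".blob.core.windows.net";
    }

    // Default file system URI for a container, using the secure wasbs scheme.
    static String wasbsUri(String container, String account) {
        return "wasbs://" + container + "@" + account + ".blob.core.windows.net";
    }

    // Properties to copy into core-site.xml or into a Hadoop Configuration object.
    static Map<String, String> properties(String container, String account, String accessKey) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", wasbsUri(container, account));
        props.put(accountKeyProperty(account), accessKey);
        return props;
    }
}
```

With hadoop-azure on the classpath, the map could be applied with something like `WasbAccountKeyConfig.properties("mycontainer", "myaccount", key).forEach(conf::set)` before calling `FileSystem.get(conf)`. In practice the key should come from a secret store rather than source code.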

To illustrate, I have created three containers as follows:

146191-image.png

If you access the containers from HDInsight, you get the same error message for the Private and Blob public access levels, and the desired output only for the Container public access level.

146157-image.png

For more details, refer “HDInsight Storage architecture” and “Hadoop Azure Support: Azure Blob Storage”.

Hope this helps. Please let us know if you have any further queries.



But if I do the same for Data Lake Gen2 storage, I have access to all containers regardless of their access level:
146256-image.png


Why are there such strange differences between Blob and Data Lake storage?


Hello @SergeyShabalov-3062,

This is expected behaviour with Blob storage.

Azure Data Lake Storage Gen2 implements an access control model that supports both Azure role-based access control (Azure RBAC) and POSIX-like access control lists (ACLs).

When "Allow blob public access" is enabled, you can configure container ACLs to allow anonymous access to blobs within the storage account. When it is disabled, no anonymous access to blobs within the storage account is permitted, regardless of the underlying ACL configuration.
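For comparison, here is a minimal sketch of the Data Lake Gen2 style of configuration with a managed identity. The thread does not show the code that was used against Gen2, so this is an assumption: it reuses the OAuth property keys from the question and pairs them with an `abfss://` URI on the `dfs.core.windows.net` endpoint. The token provider class in the question lives in the `org.apache.hadoop.fs.azurebfs` package, which suggests these settings are read by the ABFS driver rather than by the WASB driver that appears in the stack trace. The class name `AbfsMsiConfig` and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: collects ABFS OAuth (managed identity) settings as plain properties.
// The class and method names are hypothetical; the property keys are the ones
// used in the question.
final class AbfsMsiConfig {

    // ABFS URIs use the abfss scheme and the dfs endpoint, not wasbs and blob.
    static String abfssUri(String container, String account) {
        return "abfss://" + container + "@" + account + ".dfs.core.windows.net";
    }

    // Properties to copy into a Hadoop Configuration before FileSystem.get(conf).
    static Map<String, String> properties(String container, String account,
                                          String tenantId, String clientId) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", abfssUri(container, account));
        props.put("fs.azure.account.auth.type", "OAuth");
        props.put("fs.azure.account.oauth.provider.type",
                  "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider");
        props.put("fs.azure.account.oauth2.msi.tenant", tenantId);
        props.put("fs.azure.account.oauth2.client.id", clientId);
        return props;
    }
}
```

With this style of access, authorization depends on the Azure RBAC role assignments and ACLs granted to the managed identity, so the container's public access level should not be what grants or denies access.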
