VICTORSPILCHUK-0256 asked SergeyShabalov-3062 published

How to access blob storage using Hadoop SDK and Azure MSI authentication

Hi all!

I am trying to use the org.apache.hadoop.fs.FileSystem.get(Config...) method to access Azure storage accounts.

For Azure Data Lake Gen2 I use a URI like:

    abfs://mydlfilesystem@mydlaccount.dfs.core.windows.net/my_path

and set these properties:

fs.defaultFS = "abfs://mydlfilesystem@mydlaccount.dfs.core.windows.net/my_path"
fs.adl.oauth2.access.token.provider.type = "ClientCredential"
fs.azure.ssl.channel.mode = "Default_JSSE"
fs.azure.account.oauth.provider.type = "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider"
fs.azure.account.auth.type = "OAuth"
fs.azure.account.oauth2.msi.tenant = "MyTenantId"
fs.azure.account.oauth2.client.id = "MyClientId"

This works properly.
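
For reference, the working Data Lake Gen2 setup above can be sketched in plain Java as follows. This is only a sketch under assumptions: the class name `AbfsMsiConfigSketch` and its two helper methods are hypothetical, the tenant and client IDs are the placeholders from the list above, and the `fs.adl.*` key is left out on the assumption that it belongs to the separate adl:// connector. The code has no Hadoop dependency; it only assembles the URI and the key/value pairs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AbfsMsiConfigSketch {

    // Builds the abfs:// URI for a Data Lake Gen2 filesystem and account.
    static String abfsUri(String fileSystem, String account, String path) {
        return "abfs://" + fileSystem + "@" + account + ".dfs.core.windows.net" + path;
    }

    // Collects the ABFS/MSI keys from the question into an ordered map.
    static Map<String, String> msiProperties(String tenantId, String clientId) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.azure.ssl.channel.mode", "Default_JSSE");
        props.put("fs.azure.account.auth.type", "OAuth");
        props.put("fs.azure.account.oauth.provider.type",
                "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider");
        props.put("fs.azure.account.oauth2.msi.tenant", tenantId);
        props.put("fs.azure.account.oauth2.client.id", clientId);
        return props;
    }

    public static void main(String[] args) {
        String uri = abfsUri("mydlfilesystem", "mydlaccount", "/my_path");
        System.out.println("fs.defaultFS = " + uri);
        msiProperties("MyTenantId", "MyClientId")
                .forEach((k, v) -> System.out.println(k + " = " + v));
        // With hadoop-azure on the classpath, each pair would be copied into an
        // org.apache.hadoop.conf.Configuration with conf.set(key, value)
        // before calling FileSystem.get(conf).
    }
}
```

Keeping the keys in one map makes it easy to compare exactly which settings are passed for the abfs:// case and for the wasb:// case.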

But when I try to access Azure Blob Storage, I use a URI like:

    wasb://mycont@mybsaccount.dfs.core.windows.net/my_path

with the same config properties, and I get:

org.apache.hadoop.fs.azure.AzureException: No credentials found for account mycont@mybsaccount.dfs.core.windows.net in the configuration, and its container mycont is not accessible using anonymous credentials. Please check if the container exists first. If it is not publicly available, you have to provide account credentials.


I would like to know how to access Blob Storage using MSI authentication via Hadoop SDK.

Thanks,
Sergiy






azure-blob-storage

@VICTORSPILCHUK-0256 Just to clarify: have you looked at https://hadoop.apache.org/docs/current/hadoop-azure/index.html ?

Your storage location should be wasb://container@storageacctname.blob.core.windows.net/xyxyxy_path
You may also refer to the suggestion mentioned in this SO thread and let us know the status.

Looking forward to your response!



I tried all the possible variants, such as:

 wasb://container@storageacctname.blob.core.windows.net/xyxyxy_path
 wasbs://container@storageacctname.blob.core.windows.net/xyxyxy_path
 wasb://container@storageacctname.dfs.core.windows.net/xyxyxy_path
 wasbs://container@storageacctname.dfs.core.windows.net/xyxyxy_path

but got the same result:

org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.AzureException: No credentials found for account sshblobhierarchyon.dfs.core.windows.net in the configuration, and its container sshcont01 is not accessible using anonymous credentials. Please check if the container exists first. If it is not publicly available, you have to provide account credentials

As for the "SO thread": it is not relevant to MSI authentication.

It seems that Azure Blob Storage with MSI is not compatible with the Hadoop WASB module 3.3.0 :(





@VICTORSPILCHUK-0256 Firstly, apologies for the delay in responding and for any inconvenience this issue may have caused. If the issue still persists, I would like to work with you offline for a closer look and to provide quick, specialized assistance. Please send an email with the subject line “Attn:subm” to AzCommunity[at]Microsoft[dot]com, referencing this thread and the Azure subscription ID, and I will follow up with you.

Thank you for your cooperation on this matter. I look forward to your reply.


1 Answer

SergeyShabalov-3062 answered SergeyShabalov-3062 published

Hello Sumarigo.

I have to resume our development using Hadoop + Azure Blob Storage.
The issue was described in:

https://docs.microsoft.com/en-us/answers/questions/151715/how-to-access-blob-storage-using-hadoop-sdk-and-az.html

I created a refined, simplified piece of code:

         // conf is an org.apache.hadoop.conf.Configuration
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS","wasbs://sshcont01@sshblobhierarchyoff.blob.core.windows.net");
         conf.set("fs.azure.ssl.channel.mode","Default_JSSE");
         conf.set("fs.azure.account.oauth.provider.type","org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider");
         conf.set("fs.azure.account.auth.type","OAuth");
         conf.set("fs.azure.account.oauth2.msi.tenant","435f6b6d-8ec9-44e7-a0eb-86178d0c18eb");
         conf.set("fs.azure.account.oauth2.client.id","0a2e18d3-56f4-4253-8730-e3c7dff5664a");
         FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf);

but I still get:

org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.AzureException: No credentials found for account sshblobhierarchyoff.blob.core.windows.net in the configuration, and its container sshcont01 is not accessible using anonymous credentials. Please check if the container exists first. If it is not publicly available, you have to provide account credentials.
Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1098)
Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:547)
Error [18:01:18] at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1379)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:288)

However, if I set "access level: container" for this container, the call succeeds. We assumed that MSI should grant access to the container without making it public.
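
One detail visible in the stack trace is that the wasbs:// URI is handled by classes in the org.apache.hadoop.fs.azure package, while the configured token provider lives in org.apache.hadoop.fs.azurebfs.oauth2. The sketch below only illustrates that mismatch. It assumes that wasb/wasbs map to the org.apache.hadoop.fs.azure package and abfs/abfss to org.apache.hadoop.fs.azurebfs; the class and method names are hypothetical, and the package comparison is a diagnostic hint, not how Hadoop itself selects credentials.

```java
import java.net.URI;

public class SchemeToDriverSketch {

    // Assumed mapping from URI scheme to the hadoop-azure package that handles it.
    static String driverPackage(String scheme) {
        switch (scheme) {
            case "wasb":
            case "wasbs":
                return "org.apache.hadoop.fs.azure";
            case "abfs":
            case "abfss":
                return "org.apache.hadoop.fs.azurebfs";
            default:
                return "unknown";
        }
    }

    // True when the token provider class sits under the driver's package.
    static boolean providerMatchesDriver(String providerClass, String driverPackage) {
        return providerClass.startsWith(driverPackage + ".");
    }

    public static void main(String[] args) {
        String provider = "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider";
        String[] uris = {
            "wasbs://sshcont01@sshblobhierarchyoff.blob.core.windows.net",
            "abfs://mydlfilesystem@mydlaccount.dfs.core.windows.net/my_path"
        };
        for (String text : uris) {
            URI uri = URI.create(text);
            String pkg = driverPackage(uri.getScheme());
            // getUserInfo() is the container, getHost() is the account endpoint.
            System.out.println(uri.getScheme() + " container=" + uri.getUserInfo()
                    + " host=" + uri.getHost() + " driver=" + pkg
                    + " providerMatches=" + providerMatchesDriver(provider, pkg));
        }
    }
}
```

If the assumption holds, the OAuth/MSI keys would only be meaningful for the abfs:// scheme, which would be consistent with the "No credentials found" error for wasbs://.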

Could you clarify this?

Thanks
Sergiy
