How to access blob storage using Hadoop SDK and Azure MSI authentication

VICTOR SPILCHUK 1 Reputation point
2020-11-04T20:35:25.56+00:00

Hi all!

I am trying to use the org.apache.hadoop.fs.FileSystem.get(Config...) method to access Azure storage.

For Azure Data Lake Storage Gen2, I use a URI like:

   abfs://******@mydlaccount.dfs.core.windows.net/my_path

and set properties:

fs.defaultFS = "abfs://******@mydlaccount.dfs.core.windows.net/my_path"
fs.adl.oauth2.access.token.provider.type = "ClientCredential"
fs.azure.ssl.channel.mode = "Default_JSSE"
fs.azure.account.oauth.provider.type = "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider"
fs.azure.account.auth.type = "OAuth"
fs.azure.account.oauth2.msi.tenant = "MyTenantId"
fs.azure.account.oauth2.client.id = "MyClientId"
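
For reference, the properties listed above can be gathered in one place before they are applied to a Hadoop configuration. The sketch below is only an illustration: the class name `AbfsMsiSettings` and its `build` method are hypothetical, the keys and values are copied from this post, and a plain `Map` stands in for `org.apache.hadoop.conf.Configuration` so that the snippet has no Hadoop dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Hadoop SDK. */
final class AbfsMsiSettings {

    /**
     * Collects the property set quoted in the post for an abfs:// file system.
     * All three arguments are placeholders supplied by the caller.
     */
    static Map<String, String> build(String fileSystemUri, String tenantId, String clientId) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", fileSystemUri);
        props.put("fs.adl.oauth2.access.token.provider.type", "ClientCredential");
        props.put("fs.azure.ssl.channel.mode", "Default_JSSE");
        props.put("fs.azure.account.oauth.provider.type",
                "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider");
        props.put("fs.azure.account.auth.type", "OAuth");
        props.put("fs.azure.account.oauth2.msi.tenant", tenantId);
        props.put("fs.azure.account.oauth2.client.id", clientId);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values taken from the post; replace them with real ones.
        Map<String, String> props = build(
                "abfs://******@mydlaccount.dfs.core.windows.net/my_path",
                "MyTenantId",
                "MyClientId");
        // With Hadoop on the classpath, each entry would be applied with conf.set(key, value).
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Keeping the settings in one method makes it easier to compare the exact set of keys used for each storage account.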

This works properly.

But when I try to access Azure Blob Storage, I use a URI like:

   wasb://******@mybsaccount.dfs.core.windows.net/my_path

with the same config properties, and I get:

org.apache.hadoop.fs.azure.AzureException: No credentials found for account ******@mybsaccount.dfs.core.windows.net in the configuration, and its container mycont is not accessible using anonymous credentials. Please check if the container exists first. If it is not publicly available, you have to provide account credentials.
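
One thing worth checking in the URI above is the scheme/endpoint pairing. As far as the hadoop-azure documentation describes it, `wasb`/`wasbs` URIs address the `blob.core.windows.net` endpoint, while `abfs`/`abfss` URIs address `dfs.core.windows.net`. The same documentation describes the wasb connector as reading an account key or a SAS token from account-specific configuration keys, which would be consistent with the "No credentials found" message. The sketch below uses only `java.net.URI`. The class and method names are hypothetical, the container name `mycont` is taken from the error message, and the key patterns are an assumption to verify against the hadoop-azure version in use.

```java
import java.net.URI;

/** Hypothetical diagnostics helper for illustration; not part of the Hadoop SDK. */
final class StorageUriCheck {

    /** Endpoint suffix conventionally paired with a scheme, or null if the scheme is unknown. */
    static String expectedSuffix(String scheme) {
        if (scheme == null) {
            return null;
        }
        switch (scheme) {
            case "wasb":
            case "wasbs":
                return ".blob.core.windows.net";
            case "abfs":
            case "abfss":
                return ".dfs.core.windows.net";
            default:
                return null;
        }
    }

    /** True when the URI host ends with the suffix expected for its scheme. */
    static boolean schemeMatchesEndpoint(String uri) {
        URI parsed = URI.create(uri);
        String suffix = expectedSuffix(parsed.getScheme());
        String host = parsed.getHost();
        return suffix != null && host != null && host.endsWith(suffix);
    }

    /** Assumed key under which the wasb connector looks for a storage account key. */
    static String accountKeyProperty(String accountHost) {
        return "fs.azure.account.key." + accountHost;
    }

    /** Assumed key under which the wasb connector looks for a container SAS token. */
    static String sasProperty(String container, String accountHost) {
        return "fs.azure.sas." + container + "." + accountHost;
    }

    public static void main(String[] args) {
        String uri = "wasb://mycont@mybsaccount.dfs.core.windows.net/my_path";
        System.out.println("scheme matches endpoint: " + schemeMatchesEndpoint(uri));
        System.out.println(accountKeyProperty("mybsaccount.blob.core.windows.net"));
        System.out.println(sasProperty("mycont", "mybsaccount.blob.core.windows.net"));
    }
}
```

If neither of these keys is present in the configuration, a lookup along these lines would find nothing, regardless of which `fs.azure.account.oauth*` properties are set.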

I would like to know how to access Blob Storage using MSI authentication via the Hadoop SDK.

Thanks,
Sergiy

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

1 answer

  1. Sergey Shabalov 46 Reputation points
    2021-10-28T16:41:25.61+00:00

    Hello Sumarigo.

    I have to resume our development using Hadoop + Azure Blob Storage.
    The issue was described in:

    https://learn.microsoft.com/en-us/answers/questions/151715/how-to-access-blob-storage-using-hadoop-sdk-and-az.html

    I created a refined, simplified code sample:

            // conf is an org.apache.hadoop.conf.Configuration instance
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS","wasbs://******@sshblobhierarchyoff.blob.core.windows.net");
            conf.set("fs.azure.ssl.channel.mode","Default_JSSE");
            conf.set("fs.azure.account.oauth.provider.type","org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider");
            conf.set("fs.azure.account.auth.type","OAuth");
            conf.set("fs.azure.account.oauth2.msi.tenant","435f6b6d-8ec9-44e7-a0eb-86178d0c18eb");
            conf.set("fs.azure.account.oauth2.client.id","0a2e18d3-56f4-4253-8730-e3c7dff5664a");
            FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf);

    but I still get:

    org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.AzureException: No credentials found for account sshblobhierarchyoff.blob.core.windows.net in the configuration, and its container sshcont01 is not accessible using anonymous credentials. Please check if the container exists first. If it is not publicly available, you have to provide account credentials.
    Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1098)
    Error [18:01:18] at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:547)
    Error [18:01:18] at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1379)
    Error [18:01:18] at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
    Error [18:01:18] at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
    Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
    Error [18:01:18] at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
    Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
    Error [18:01:18] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:288)

    But if I set "access level: container" for this container, the call succeeds. We assumed that MSI should grant access to the container without making it public.

    Could you clarify this?

    Thanks
    Sergiy
