Hello @Federico Sardo
Thanks for the question and for using the Microsoft Q&A platform.
Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob Storage.
Case 1: Access the default storage account created along with the HDInsight cluster.
There are several ways you can access the files in Data Lake Storage Gen2 from an HDInsight cluster.
- Using the fully qualified name. With this approach, you provide the full path to the file that you want to access:
abfs://<containername>@<accountname>.dfs.core.windows.net/<file.path>/
- Using the shortened path format. With this approach, you replace the path up to the cluster root with:
abfs:///<file.path>/
- Using the relative path. With this approach, you only provide the relative path to the file that you want to access:
/<file.path>/
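As a minimal sketch of how the three forms relate, the following Python helper builds each one from the same pieces. The function name abfs_paths and the sample container, account, and file names are hypothetical and only for illustration. The shortened and relative forms are resolved against the cluster's default storage.

```python
def abfs_paths(container: str, account: str, file_path: str) -> dict:
    """Return the three ways to address one file on the cluster's default storage.

    All argument values used below are illustrative placeholders, not real resources.
    """
    file_path = file_path.strip("/")  # normalise so the slashes are added in one place
    return {
        "fully_qualified": f"abfs://{container}@{account}.dfs.core.windows.net/{file_path}",
        "shortened": f"abfs:///{file_path}",
        "relative": f"/{file_path}",
    }


paths = abfs_paths("mycontainer", "myaccount", "example/data/sample.log")
for name, value in paths.items():
    print(f"{name}: {value}")
```

Any of the three strings can be passed wherever the cluster expects a path, for example as the argument to hdfs dfs -ls.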
Case 2: Access an additional storage account.
If you want to access data residing on external storage, you will have to add that storage account as additional storage in the HDInsight cluster.
Steps to add storage accounts to the existing clusters via Ambari UI:
Step 1: From a web browser, navigate to https://CLUSTERNAME.azurehdinsight.net, where CLUSTERNAME is the name of your cluster.
Step 2: Navigate to HDFS --> Configs --> Advanced, then scroll down to Custom core-site.
Step 3: Select Add Property and enter your storage account name and key in the following manner:
Key => fs.azure.account.key.(storage_account).blob.core.windows.net
Value => (Storage Access Key)
Step 4: Observe the keys that begin with fs.azure.account.key. The storage account name is part of each key name.
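To illustrate the naming pattern from Steps 3 and 4, here is a small Python sketch that builds the property name for a storage account and reads the account names back out of a set of core-site properties. The helper names and the sample dictionary are hypothetical. The snippet does not talk to Ambari; it only shows how the key is composed.

```python
PREFIX = "fs.azure.account.key."
SUFFIX = ".blob.core.windows.net"


def account_key_property(storage_account: str) -> str:
    """Build the Custom core-site property name for a storage account."""
    return f"{PREFIX}{storage_account}{SUFFIX}"


def storage_accounts(core_site: dict) -> list:
    """Extract account names from properties that begin with fs.azure.account.key."""
    names = []
    for key in core_site:
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            names.append(key[len(PREFIX):-len(SUFFIX)])
    return sorted(names)


# Hypothetical core-site contents; the values stand in for real access keys.
sample_core_site = {
    account_key_property("mystorage"): "<Storage Access Key>",
    account_key_property("extrastorage"): "<Storage Access Key>",
    "fs.defaultFS": "abfs://mycontainer@myaccount.dfs.core.windows.net",
}

print(account_key_property("mystorage"))
print(storage_accounts(sample_core_site))
```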
For more details, refer to "Use Azure Data Lake Storage Gen2 with Azure HDInsight clusters" and "Add additional storage accounts to HDInsight".
Hope this helps. Do let us know if you have any further queries.
Please don’t forget to Accept Answer wherever the information provided helps you; this can be beneficial to other community members.