Kumari s - Thanks for the question and for using the MS Q&A platform.
Azure Data Lake Storage Gen2 supports the following authorization mechanisms:
- Shared Key authorization
- Shared access signature (SAS) authorization
- Role-based access control (Azure RBAC)
- Attribute-based access control (Azure ABAC)
- Access control lists (ACL)
For more details, refer to Access control model in Azure Data Lake Storage Gen2.
Yes, it is possible to achieve granular access control for individual users or roles at the container level in Azure Data Lake Storage Gen2 while still using a service principal to access the storage accounts from Databricks.
One approach is to use ADLS Gen2 access control lists (ACLs) to grant access at the container level. ACLs let you assign permissions to individual users or groups on specific directories and files within a container. You can create a service principal, grant it the permissions it needs on the storage account, and then use ACLs to scope access to specific containers, directories, or files. This way, you control who has access to each dataset and ensure access is granted only to the users or roles that need it.
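As an illustration, ACLs can be managed with the Azure CLI. The account name, container, and object ID below are placeholders, and `az storage fs access set` replaces the full ACL on the target path, so include any existing entries you want to keep:

```shell
# Grant read + execute on the root of the "sales-data" container to one
# Azure AD user (object ID is a placeholder). Requires sufficient rights
# on the target (e.g. Storage Blob Data Owner).
az storage fs access set \
    --account-name mystorageaccount \
    --file-system sales-data \
    --path / \
    --acl "user::rwx,group::r-x,other::---,user:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:r-x" \
    --auth-mode login
```

To apply an ACL entry to existing files and subdirectories as well, `az storage fs access update-recursive` can be used instead.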
Here are the general steps to achieve this:
- Create a service principal in Azure Active Directory and grant it the necessary permissions to access the storage account.
- Create an Azure Data Lake Storage Gen2 account and set up containers and folders to organize the datasets.
- Use the Azure portal or Azure CLI to set up Access Control Lists (ACLs) for the containers and folders, granting access to individual users or roles.
- In the Databricks workspace, create a secret scope for the service principal credentials.
- Use the secret scope to access the service principal credentials in your notebooks and jobs.
- Use the dbutils.fs utilities in Databricks to read and write data in the Data Lake Gen2 account.
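The last three steps can be sketched in a Databricks notebook as follows. The secret scope name, key names, storage account, and container are placeholders; this assumes the standard Spark OAuth configuration for ADLS Gen2 with a service principal:

```python
# Retrieve the service principal credentials from a Databricks secret
# scope ("adls-scope" and the key names are placeholders).
client_id     = dbutils.secrets.get(scope="adls-scope", key="sp-client-id")
client_secret = dbutils.secrets.get(scope="adls-scope", key="sp-client-secret")
tenant_id     = dbutils.secrets.get(scope="adls-scope", key="sp-tenant-id")

storage_account = "mystorageaccount"  # placeholder

# Configure Spark to authenticate to ADLS Gen2 via OAuth 2.0
# using the service principal's client credentials.
spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net",
               "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net",
               client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net",
               client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# List a container's contents with dbutils.fs -- this succeeds only if
# the service principal's RBAC role and/or ACLs permit it on this path.
files = dbutils.fs.ls(f"abfss://sales-data@{storage_account}.dfs.core.windows.net/")
```

Note that `dbutils` and `spark` are provided by the Databricks notebook runtime, so this sketch runs only inside a Databricks workspace.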
By using this approach, you can ensure that each user or role only has access to the specific containers, files, or folders they need, while still allowing the Databricks workspace to access the data via the service principal.
Hope this helps. If this answers your query, do click Accept Answer and Yes for "Was this answer helpful". And if you have any further queries, do let us know.