Use .NET to manage ACLs in Azure Data Lake Storage Gen2

This article shows you how to use .NET to get, set, and update the access control lists of directories and files.

ACL inheritance is already available for new child items that are created under a parent directory. But you can also add, update, and remove ACLs recursively on the existing child items of a parent directory without having to make these changes individually for each child item.


Prerequisites

  • An Azure subscription. See Get Azure free trial.

  • A storage account that has hierarchical namespace (HNS) enabled. Follow these instructions to create one.

  • Azure CLI version 2.6.0 or higher.

  • One of the following security permissions:

    • A provisioned Microsoft Entra ID security principal that has been assigned the Storage Blob Data Owner role, scoped to the target container, storage account, parent resource group, or subscription.

    • Owning user of the target container or directory to which you plan to apply ACL settings. To set ACLs recursively, this includes all child items in the target container or directory.

    • Storage account key.

Set up your project

To get started, install the Azure.Storage.Files.DataLake NuGet package.

  1. Open a command window (for example, Windows PowerShell).

  2. From your project directory, install the Azure.Storage.Files.DataLake preview package by using the dotnet add package command.

    dotnet add package Azure.Storage.Files.DataLake -v 12.6.0 -s https://pkgs.dev.azure.com/azure-sdk/public/_packaging/azure-sdk-for-net/nuget/v3/index.json
    

    Then, add these using statements to the top of your code file.

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Azure;
    using Azure.Core;
    using Azure.Storage;
    using Azure.Storage.Files.DataLake;
    using Azure.Storage.Files.DataLake.Models;
    

Connect to the account

To use the snippets in this article, you'll need to create a DataLakeServiceClient instance that represents the storage account.

Connect by using Microsoft Entra ID

Note

If you're using Microsoft Entra ID to authorize access, then make sure that your security principal has been assigned the Storage Blob Data Owner role. To learn more about how ACL permissions are applied and the effects of changing them, see Access control model in Azure Data Lake Storage Gen2.

You can use the Azure identity client library for .NET to authenticate your application with Microsoft Entra ID.

After you install the Azure.Identity package, add this using statement to the top of your code file.

using Azure.Identity;

First, you'll have to assign one of the following Azure role-based access control (Azure RBAC) roles to your security principal:

Role | ACL setting capability
Storage Blob Data Owner | All directories and files in the account.
Storage Blob Data Contributor | Only directories and files owned by the security principal.

Next, create a DataLakeServiceClient instance and pass in a new instance of the DefaultAzureCredential class.

public static DataLakeServiceClient GetDataLakeServiceClient(string accountName)
{
    string dfsUri = $"https://{accountName}.dfs.core.windows.net";

    DataLakeServiceClient dataLakeServiceClient = new DataLakeServiceClient(
        new Uri(dfsUri),
        new DefaultAzureCredential());

    return dataLakeServiceClient;
}

To learn more about using DefaultAzureCredential to authorize access to data, see How to authenticate .NET applications with Azure services.

Connect by using an account key

You can authorize access to data using your account access keys (Shared Key). This example creates a DataLakeServiceClient instance that is authorized with the account key.

public static DataLakeServiceClient GetDataLakeServiceClient(string accountName, string accountKey)
{
    StorageSharedKeyCredential sharedKeyCredential =
        new StorageSharedKeyCredential(accountName, accountKey);

    string dfsUri = $"https://{accountName}.dfs.core.windows.net";

    DataLakeServiceClient dataLakeServiceClient = new DataLakeServiceClient(
        new Uri(dfsUri),
        sharedKeyCredential);

    return dataLakeServiceClient;
}

Caution

Authorization with Shared Key is not recommended as it may be less secure. For optimal security, disable authorization via Shared Key for your storage account, as described in Prevent Shared Key authorization for an Azure Storage account.

Use of access keys and connection strings should be limited to initial proof of concept apps or development prototypes that don't access production or sensitive data. Otherwise, the token-based authentication classes available in the Azure SDK should always be preferred when authenticating to Azure resources.

Microsoft recommends that clients use either Microsoft Entra ID or a shared access signature (SAS) to authorize access to data in Azure Storage. For more information, see Authorize operations for data access.

Set ACLs

When you set an ACL, you replace the entire ACL, including all of its entries. If you want to change the permission level of a security principal or add a new security principal to the ACL without affecting other existing entries, update the ACL instead. To update an ACL instead of replacing it, see the Update ACLs section of this article.

If you choose to set the ACL, you must add an entry for the owning user, an entry for the owning group, and an entry for all other users. To learn more about the owning user, the owning group, and all other users, see Users and identities.
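As a minimal sketch of that requirement, the helper below checks that a short-form ACL string contains the three mandatory entries before you pass it to a set operation. The class and method names (AclStringChecks, HasRequiredEntries) are hypothetical and are not part of the Azure SDK. The check relies only on the short-form syntax used in this article (user::, group::, other::).

```csharp
using System;
using System.Linq;

public static class AclStringChecks
{
    // Hypothetical helper (not an SDK API): returns true when the short-form
    // ACL string contains entries for the owning user (user::), the owning
    // group (group::), and all other users (other::).
    public static bool HasRequiredEntries(string acl)
    {
        string[] entries = acl.Split(',')
            .Select(e => e.Trim())
            .ToArray();

        // Named entries such as "user:<object-id>:r-x" and default entries
        // such as "default:user::rwx" don't match these prefixes.
        return entries.Any(e => e.StartsWith("user::", StringComparison.Ordinal))
            && entries.Any(e => e.StartsWith("group::", StringComparison.Ordinal))
            && entries.Any(e => e.StartsWith("other::", StringComparison.Ordinal));
    }
}
```

For example, HasRequiredEntries("user::rwx,group::r-x,other::rw-") returns true, while a string that omits the other:: entry returns false.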

This section shows you how to:

  • Set the ACL of a directory
  • Set the ACL of a file
  • Set ACLs recursively

Set the ACL of a directory

Get the access control list (ACL) of a directory by calling the DataLakeDirectoryClient.GetAccessControlAsync method and set the ACL by calling the DataLakeDirectoryClient.SetAccessControlList method.

This example gets and sets the ACL of a directory named my-directory. The string user::rwx,group::r-x,other::rw- gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read and write permission.

public async Task ManageDirectoryACLs(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    // Get the current ACL and print each entry.
    PathAccessControl directoryAccessControl =
        await directoryClient.GetAccessControlAsync();

    foreach (var item in directoryAccessControl.AccessControlList)
    {
        Console.WriteLine(item.ToString());
    }

    // Replace the entire ACL with the entries in the short-form string.
    IList<PathAccessControlItem> accessControlList
        = PathAccessControlExtensions.ParseAccessControlList
        ("user::rwx,group::r-x,other::rw-");

    await directoryClient.SetAccessControlListAsync(accessControlList);
}

You can also get and set the ACL of the root directory of a container. To get the root directory, pass an empty string ("") into the DataLakeFileSystemClient.GetDirectoryClient method.
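To make the short-form notation easier to read, the following sketch decodes each three-character permission triplet into words. It is an illustration only: AclStringDecoder and its methods are hypothetical names, the code doesn't use the Azure SDK, and it assumes plain r, w, and x characters (it doesn't handle special bits).

```csharp
using System;
using System.Collections.Generic;

public static class AclStringDecoder
{
    // Hypothetical helper: turns a triplet such as "r-x" into words.
    public static string DescribePermissions(string triplet)
    {
        if (triplet == null || triplet.Length != 3)
        {
            throw new ArgumentException(
                "Expected exactly three characters, for example \"r-x\".",
                nameof(triplet));
        }

        var granted = new List<string>();
        if (triplet[0] == 'r') granted.Add("read");
        if (triplet[1] == 'w') granted.Add("write");
        if (triplet[2] == 'x') granted.Add("execute");

        return granted.Count == 0 ? "none" : string.Join(", ", granted);
    }

    // Describes every entry of a short-form ACL string.
    public static IEnumerable<string> Describe(string acl)
    {
        foreach (string rawEntry in acl.Split(','))
        {
            string entry = rawEntry.Trim();

            // The permission triplet is the text after the last colon.
            string permissions = entry.Substring(entry.LastIndexOf(':') + 1);
            yield return entry + ": " + DescribePermissions(permissions);
        }
    }
}
```

With this sketch, DescribePermissions("r-x") returns "read, execute" and DescribePermissions("rw-") returns "read, write", which matches the description of the example string above.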

Set the ACL of a file

Get the access control list (ACL) of a file by calling the DataLakeFileClient.GetAccessControlAsync method and set the ACL by calling the DataLakeFileClient.SetAccessControlList method.

This example gets and sets the ACL of a file named my-file.txt. The string user::rwx,group::r-x,other::rw- gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read and write permission.

public async Task ManageFileACLs(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    DataLakeFileClient fileClient =
        directoryClient.GetFileClient("my-file.txt");

    // Get the current ACL and print each entry.
    PathAccessControl fileAccessControl =
        await fileClient.GetAccessControlAsync();

    foreach (var item in fileAccessControl.AccessControlList)
    {
        Console.WriteLine(item.ToString());
    }

    // Replace the entire ACL with the entries in the short-form string.
    IList<PathAccessControlItem> accessControlList
        = PathAccessControlExtensions.ParseAccessControlList
        ("user::rwx,group::r-x,other::rw-");

    await fileClient.SetAccessControlListAsync(accessControlList);
}

Set ACLs recursively

Set ACLs recursively by calling the DataLakeDirectoryClient.SetAccessControlRecursiveAsync method. Pass this method a List of PathAccessControlItem. Each PathAccessControlItem defines an ACL entry.

If you want to set a default ACL entry, set the DefaultScope property of the PathAccessControlItem to true.

This example sets the ACL of a directory named my-parent-directory. This method accepts a boolean parameter named isDefaultScope that specifies whether to set the default ACL. That parameter is used in the constructor of the PathAccessControlItem. The entries of the ACL give the owning user read, write, and execute permissions, give the owning group only read and execute permissions, and give all others no access. The last ACL entry in this example gives a specific user with the object ID xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx read and execute permissions.

public async Task SetACLRecursively(DataLakeServiceClient serviceClient, bool isDefaultScope)
{
    DataLakeDirectoryClient directoryClient =
        serviceClient.GetFileSystemClient("my-container").
            GetDirectoryClient("my-parent-directory");

    List<PathAccessControlItem> accessControlList =
        new List<PathAccessControlItem>()
    {
        // Owning user: read, write, and execute.
        new PathAccessControlItem(AccessControlType.User,
            RolePermissions.Read |
            RolePermissions.Write |
            RolePermissions.Execute, isDefaultScope),

        // Owning group: read and execute.
        new PathAccessControlItem(AccessControlType.Group,
            RolePermissions.Read |
            RolePermissions.Execute, isDefaultScope),

        // All other users: no access.
        new PathAccessControlItem(AccessControlType.Other,
            RolePermissions.None, isDefaultScope),

        // A specific user, identified by object ID: read and execute.
        new PathAccessControlItem(AccessControlType.User,
            RolePermissions.Read |
            RolePermissions.Execute, isDefaultScope,
            entityId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"),
    };

    await directoryClient.SetAccessControlRecursiveAsync
        (accessControlList, null);
}

Update ACLs

When you update an ACL, you modify the ACL instead of replacing it. For example, you can add a new security principal to the ACL without affecting other security principals listed in the ACL. To replace the ACL instead of updating it, see the Set ACLs section of this article.
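The difference between setting and updating can be pictured with a small in-memory model. This is a conceptual sketch only, not how the service or the SDK is implemented. It assumes each ACL entry is identified by a key such as "user:" or "user:<object-id>" and maps to a permission triplet. The names AclSemanticsSketch, Set, and Update are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class AclSemanticsSketch
{
    // Set: the incoming entries replace the whole ACL, so any existing
    // entry that isn't supplied again is lost.
    public static Dictionary<string, string> Set(
        Dictionary<string, string> existing,
        Dictionary<string, string> incoming)
    {
        return new Dictionary<string, string>(incoming);
    }

    // Update: start from the existing ACL, then add or overwrite only
    // the entries that were supplied. Other entries are left unchanged.
    public static Dictionary<string, string> Update(
        Dictionary<string, string> existing,
        Dictionary<string, string> incoming)
    {
        var merged = new Dictionary<string, string>(existing);
        foreach (KeyValuePair<string, string> pair in incoming)
        {
            merged[pair.Key] = pair.Value;
        }
        return merged;
    }
}
```

In this model, updating an ACL of three entries with one new named-user entry yields four entries, whereas setting it with that single entry leaves only one.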

This section shows you how to:

  • Update an ACL
  • Update ACLs recursively

Update an ACL

First, get the ACL of a directory by calling the DataLakeDirectoryClient.GetAccessControlAsync method. Copy the list of ACL entries to a new List of PathAccessControlItem objects. Then locate the entry that you want to update and replace it in the list. Set the ACL by calling the DataLakeDirectoryClient.SetAccessControlList method.

This example updates the root ACL of a container by replacing the ACL entry for all other users.

public async Task UpdateDirectoryACLs(DataLakeFileSystemClient fileSystemClient)
{
    // An empty string refers to the root directory of the container.
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("");

    PathAccessControl directoryAccessControl =
        await directoryClient.GetAccessControlAsync();

    // Copy the existing entries into a new, modifiable list.
    List<PathAccessControlItem> accessControlListUpdate
        = new List<PathAccessControlItem>(directoryAccessControl.AccessControlList);

    int index = -1;

    foreach (var item in accessControlListUpdate)
    {
        if (item.AccessControlType == AccessControlType.Other)
        {
            index = accessControlListUpdate.IndexOf(item);
            break;
        }
    }

    if (index > -1)
    {
        // Replace the entry for all other users, then write the ACL back.
        accessControlListUpdate[index] = new PathAccessControlItem(
            AccessControlType.Other,
            RolePermissions.Read |
            RolePermissions.Execute);

        await directoryClient.SetAccessControlListAsync(accessControlListUpdate);
    }
}

Update ACLs recursively

To update an ACL recursively, create a new ACL object with the ACL entry that you want to update, and then use that object in the update ACL operation. Don't get the existing ACL; just provide the ACL entries to be updated.

Update an ACL recursively by calling the DataLakeDirectoryClient.UpdateAccessControlRecursiveAsync method. Pass this method a List of PathAccessControlItem. Each PathAccessControlItem defines an ACL entry.

If you want to update a default ACL entry, set the DefaultScope property of the PathAccessControlItem to true.

This example updates an ACL entry with write permission. This method accepts a boolean parameter named isDefaultScope that specifies whether to update the default ACL. That parameter is used in the constructor of the PathAccessControlItem.

public async Task UpdateACLsRecursively(DataLakeServiceClient serviceClient, bool isDefaultScope)
{
    DataLakeDirectoryClient directoryClient =
        serviceClient.GetFileSystemClient("my-container").
            GetDirectoryClient("my-parent-directory");

    List<PathAccessControlItem> accessControlListUpdate =
        new List<PathAccessControlItem>()
    {
        new PathAccessControlItem(AccessControlType.User,
            RolePermissions.Read |
            RolePermissions.Write |
            RolePermissions.Execute, isDefaultScope,
            entityId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"),
    };

    await directoryClient.UpdateAccessControlRecursiveAsync
        (accessControlListUpdate, null);
}

Remove ACL entries

You can remove one or more ACL entries. This section shows you how to:

  • Remove an ACL entry
  • Remove ACL entries recursively

Remove an ACL entry

First, get the ACL of a directory by calling the DataLakeDirectoryClient.GetAccessControlAsync method. Copy the list of ACL entries to a new List of PathAccessControlItem objects. Then locate the entry that you want to remove and call the Remove method of the collection. Set the updated ACL by calling the DataLakeDirectoryClient.SetAccessControlList method.

This example updates the root ACL of a container by removing the ACL entry for the user with the object ID xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.

public async Task RemoveDirectoryACLEntry
    (DataLakeFileSystemClient fileSystemClient)
{
    // An empty string refers to the root directory of the container.
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("");

    PathAccessControl directoryAccessControl =
        await directoryClient.GetAccessControlAsync();

    // Copy the existing entries into a new, modifiable list.
    List<PathAccessControlItem> accessControlListUpdate
        = new List<PathAccessControlItem>(directoryAccessControl.AccessControlList);

    PathAccessControlItem entryToRemove = null;

    foreach (var item in accessControlListUpdate)
    {
        if (item.EntityId == "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
        {
            entryToRemove = item;
            break;
        }
    }

    if (entryToRemove != null)
    {
        // Drop the entry, then write the remaining entries back.
        accessControlListUpdate.Remove(entryToRemove);
        await directoryClient.SetAccessControlListAsync(accessControlListUpdate);
    }
}

Remove ACL entries recursively

To remove ACL entries recursively, create a new ACL object for the ACL entry to be removed, and then use that object in the remove ACL operation. Don't get the existing ACL; just provide the ACL entries to be removed.

Remove ACL entries by calling the DataLakeDirectoryClient.RemoveAccessControlRecursiveAsync method. Pass this method a List of RemovePathAccessControlItem. Each RemovePathAccessControlItem identifies an ACL entry to remove.

If you want to remove a default ACL entry, set the DefaultScope property of the RemovePathAccessControlItem to true.

This example removes an ACL entry from the ACL of the directory named my-parent-directory. This method accepts a boolean parameter named isDefaultScope that specifies whether to remove the entry from the default ACL. That parameter is used in the constructor of the RemovePathAccessControlItem.

public async Task RemoveACLsRecursively(DataLakeServiceClient serviceClient, bool isDefaultScope)
{
    DataLakeDirectoryClient directoryClient =
        serviceClient.GetFileSystemClient("my-container").
            GetDirectoryClient("my-parent-directory");

    List<RemovePathAccessControlItem> accessControlListForRemoval =
        new List<RemovePathAccessControlItem>()
    {
        new RemovePathAccessControlItem(AccessControlType.User, isDefaultScope,
            entityId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"),
    };

    await directoryClient.RemoveAccessControlRecursiveAsync
        (accessControlListForRemoval, null);
}

Recover from failures

You might encounter runtime or permission errors when modifying ACLs recursively. For runtime errors, restart the process from the beginning. Permission errors can occur if the security principal doesn't have sufficient permission to modify the ACL of a directory or file that is in the directory hierarchy being modified. Address the permission issue, and then choose to either resume the process from the point of failure by using a continuation token, or restart the process from the beginning. You don't have to use the continuation token if you prefer to restart from the beginning. You can reapply ACL entries without any negative impact.

This example returns a continuation token in the event of a failure. The application can call this example method again after the error has been addressed, and pass in the continuation token. If this example method is called for the first time, the application can pass in a value of null for the continuation token parameter.

public async Task<string> ResumeAsync(DataLakeServiceClient serviceClient,
    DataLakeDirectoryClient directoryClient,
    List<PathAccessControlItem> accessControlList,
    string continuationToken)
{
    try
    {
        var accessControlChangeResult =
            await directoryClient.SetAccessControlRecursiveAsync(
                accessControlList, continuationToken: continuationToken, options: null);

        if (accessControlChangeResult.Value.Counters.FailedChangesCount > 0)
        {
            continuationToken =
                accessControlChangeResult.Value.ContinuationToken;
        }

        return continuationToken;
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
        return continuationToken;
    }

}

If you want the process to complete without being interrupted by permission errors, pass in an AccessControlChangeOptions object and set the ContinueOnFailure property of that object to true.

This example sets ACL entries recursively. If this code encounters a permission error, it records that failure and continues execution. This example prints the number of failures to the console.

public async Task ContinueOnFailureAsync(DataLakeServiceClient serviceClient,
    DataLakeDirectoryClient directoryClient,
    List<PathAccessControlItem> accessControlList)
{
    var accessControlChangeResult =
        await directoryClient.SetAccessControlRecursiveAsync(
            accessControlList, null, new AccessControlChangeOptions()
            { ContinueOnFailure = true });

    var counters = accessControlChangeResult.Value.Counters;

    Console.WriteLine("Number of directories changed: " +
        counters.ChangedDirectoriesCount.ToString());

    Console.WriteLine("Number of files changed: " +
        counters.ChangedFilesCount.ToString());

    Console.WriteLine("Number of failures: " +
        counters.FailedChangesCount.ToString());
}

Best practices

This section provides you some best practice guidelines for setting ACLs recursively.

Handling runtime errors

A runtime error can occur for many reasons (for example, an outage or a client connectivity issue). If you encounter a runtime error, restart the recursive ACL process. ACLs can be reapplied to items without causing a negative impact.

Handling permission errors (403)

If you encounter an access control exception while running a recursive ACL process, your Microsoft Entra security principal might not have sufficient permission to apply an ACL to one or more of the child items in the directory hierarchy. When a permission error occurs, the process stops and a continuation token is provided. Fix the permission issue, and then use the continuation token to process the remaining dataset. The directories and files that have already been successfully processed won't have to be processed again. You can also choose to restart the recursive ACL process. ACLs can be reapplied to items without causing a negative impact.

Credentials

We recommend that you provision a Microsoft Entra security principal that has been assigned the Storage Blob Data Owner role in the scope of the target storage account or container.

Performance

To reduce latency, we recommend that you run the recursive ACL process in an Azure Virtual Machine (VM) that is located in the same region as your storage account.

ACL limits

The maximum number of ACL entries that you can apply to a directory or file is 32 entries in the access ACL and 32 entries in the default ACL. For more information, see Access control in Azure Data Lake Storage Gen2.
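A simple pre-check against these limits might look like the following sketch. It is not an SDK feature. The names AclLimitCheck and IsWithinLimits are hypothetical, and the code assumes a short-form ACL string in which default entries carry the default: prefix. The service remains the authority on how entries are counted.

```csharp
using System;
using System.Linq;

public static class AclLimitCheck
{
    // Limit stated in this article: 32 access entries and 32 default entries.
    public const int MaxEntriesPerAcl = 32;

    // Hypothetical helper: counts access and default entries separately
    // and reports whether both counts are within the limit.
    public static bool IsWithinLimits(string acl)
    {
        string[] entries = acl.Split(',')
            .Select(e => e.Trim())
            .Where(e => e.Length > 0)
            .ToArray();

        int defaultCount = entries.Count(
            e => e.StartsWith("default:", StringComparison.Ordinal));
        int accessCount = entries.Length - defaultCount;

        return accessCount <= MaxEntriesPerAcl
            && defaultCount <= MaxEntriesPerAcl;
    }
}
```

Running a check like this before a recursive operation can surface an oversized ACL early, instead of after the request has been sent.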

See also