Use Java to manage ACLs in Azure Data Lake Storage Gen2

This article shows you how to use Java to get, set, and update the access control lists of directories and files.

ACL inheritance is already available for new child items that are created under a parent directory. But you can also add, update, and remove ACLs recursively on the existing child items of a parent directory without having to make these changes individually for each child item.

Package (Maven) | Samples | API reference | Gen1 to Gen2 mapping | Give Feedback

Prerequisites

  • An Azure subscription. See Get Azure free trial.

  • A storage account that has hierarchical namespace (HNS) enabled. Follow these instructions to create one.

  • Azure CLI version 2.6.0 or higher.

  • One of the following security permissions:

    • A provisioned Azure Active Directory (AD) security principal that has been assigned the Storage Blob Data Owner role in the scope of the either the target container, parent resource group or subscription.

    • Owning user of the target container or directory to which you plan to apply ACL settings. To set ACLs recursively, this includes all child items in the target container or directory.

    • Storage account key.

Set up your project

To get started, open this page and find the latest version of the Java library. Then, open the pom.xml file in your text editor. Add a dependency element that references that version.

If you plan to authenticate your client application by using Azure Active Directory (AD), then add a dependency to the Azure Secret Client Library. See Adding the Secret Client Library package to your project.

Next, add these imports statements to your code file.

import com.azure.storage.common.StorageSharedKeyCredential;
import com.azure.storage.file.datalake.DataLakeDirectoryClient;
import com.azure.storage.file.datalake.DataLakeFileClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;
import com.azure.storage.file.datalake.models.ListPathsOptions;
import com.azure.storage.file.datalake.models.PathItem;
import com.azure.storage.file.datalake.models.AccessControlChangeCounters;
import com.azure.storage.file.datalake.models.AccessControlChangeResult;
import com.azure.storage.file.datalake.models.AccessControlType;
import com.azure.storage.file.datalake.models.PathAccessControl;
import com.azure.storage.file.datalake.models.PathAccessControlEntry;
import com.azure.storage.file.datalake.models.PathPermissions;
import com.azure.storage.file.datalake.models.PathRemoveAccessControlEntry;
import com.azure.storage.file.datalake.models.RolePermissions;
import com.azure.storage.file.datalake.options.PathSetAccessControlRecursiveOptions;

Connect to the account

To use the snippets in this article, you'll need to create a DataLakeServiceClient instance that represents the storage account.

Connect by using Azure Active Directory (Azure AD)

You can use the Azure identity client library for Java to authenticate your application with Azure AD.

Get a client ID, a client secret, and a tenant ID. To do this, see Acquire a token from Azure AD for authorizing requests from a client application. As part of that process, you'll have to assign one of the following Azure role-based access control (Azure RBAC) roles to your security principal.

Role ACL setting capability
Storage Blob Data Owner All directories and files in the account.
Storage Blob Data Contributor Only directories and files owned by the security principal.

This example creates a DataLakeServiceClient instance by using a client ID, a client secret, and a tenant ID.

static public DataLakeServiceClient GetDataLakeServiceClient
    (String accountName, String clientId, String ClientSecret, String tenantID){

    String endpoint = "https://" + accountName + ".dfs.core.windows.net";
    
    ClientSecretCredential clientSecretCredential = new ClientSecretCredentialBuilder()
    .clientId(clientId)
    .clientSecret(ClientSecret)
    .tenantId(tenantID)
    .build();
       
    DataLakeServiceClientBuilder builder = new DataLakeServiceClientBuilder();
    return builder.credential(clientSecretCredential).endpoint(endpoint).buildClient();
}

Note

For more examples, see the Azure identity client library for Java documentation.

Connect by using an account key

This is the easiest way to connect to an account.

This example creates a DataLakeServiceClient instance by using an account key.

static public DataLakeServiceClient GetDataLakeServiceClient
(String accountName, String accountKey){

    StorageSharedKeyCredential sharedKeyCredential =
        new StorageSharedKeyCredential(accountName, accountKey);

    DataLakeServiceClientBuilder builder = new DataLakeServiceClientBuilder();

    builder.credential(sharedKeyCredential);
    builder.endpoint("https://" + accountName + ".dfs.core.windows.net");

    return builder.buildClient();
}

Set ACLs

When you set an ACL, you replace the entire ACL including all of it's entries. If you want to change the permission level of a security principal or add a new security principal to the ACL without affecting other existing entries, you should update the ACL instead. To update an ACL instead of replace it, see the Update ACLs section of this article.

If you choose to set the ACL, you must add an entry for the owning user, an entry for the owning group, and an entry for all other users. To learn more about the owning user, the owning group, and all other users, see Users and identities.

This section shows you how to:

  • Set the ACL of a directory
  • Set the ACL of a file
  • Set ACLs recursively

Set the ACL of a directory

This example gets and then sets the ACL of a directory named my-directory. This example gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read access.

  public void ManageDirectoryACLs(DataLakeFileSystemClient fileSystemClient){

      DataLakeDirectoryClient directoryClient =
        fileSystemClient.getDirectoryClient("");

      PathAccessControl directoryAccessControl =
          directoryClient.getAccessControl();

      List<PathAccessControlEntry> pathPermissions = directoryAccessControl.getAccessControlList();
     
      System.out.println(PathAccessControlEntry.serializeList(pathPermissions));
           
      RolePermissions groupPermission = new RolePermissions();
      groupPermission.setExecutePermission(true).setReadPermission(true);

      RolePermissions ownerPermission = new RolePermissions();
      ownerPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(true);

      RolePermissions otherPermission = new RolePermissions();
      otherPermission.setReadPermission(true);

      PathPermissions permissions = new PathPermissions();

      permissions.setGroup(groupPermission);
      permissions.setOwner(ownerPermission);
      permissions.setOther(otherPermission);

      directoryClient.setPermissions(permissions, null, null);

      pathPermissions = directoryClient.getAccessControl().getAccessControlList();
   
      System.out.println(PathAccessControlEntry.serializeList(pathPermissions));

  }

You can also get and set the ACL of the root directory of a container. To get the root directory, pass an empty string ("") into the DataLakeFileSystemClient.getDirectoryClient method.

Set the ACL of a file

This example gets and then sets the ACL of a file named upload-file.txt. This example gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read access.

 public void ManageFileACLs(DataLakeFileSystemClient fileSystemClient){

     DataLakeDirectoryClient directoryClient =
       fileSystemClient.getDirectoryClient("my-directory");

     DataLakeFileClient fileClient = 
       directoryClient.getFileClient("uploaded-file.txt");

     PathAccessControl fileAccessControl =
         fileClient.getAccessControl();

   List<PathAccessControlEntry> pathPermissions = fileAccessControl.getAccessControlList();
  
   System.out.println(PathAccessControlEntry.serializeList(pathPermissions));
        
   RolePermissions groupPermission = new RolePermissions();
   groupPermission.setExecutePermission(true).setReadPermission(true);

   RolePermissions ownerPermission = new RolePermissions();
   ownerPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(true);

   RolePermissions otherPermission = new RolePermissions();
   otherPermission.setReadPermission(true);

   PathPermissions permissions = new PathPermissions();

   permissions.setGroup(groupPermission);
   permissions.setOwner(ownerPermission);
   permissions.setOther(otherPermission);

   fileClient.setPermissions(permissions, null, null);

   pathPermissions = fileClient.getAccessControl().getAccessControlList();

   System.out.println(PathAccessControlEntry.serializeList(pathPermissions));

 }

Set ACLs recursively

Set ACLs recursively by calling the DataLakeDirectoryClient.setAccessControlRecursive method. Pass this method a List of PathAccessControlEntry objects. Each PathAccessControlEntry defines an ACL entry.

If you want to set a default ACL entry, then you can call the setDefaultScope method of the PathAccessControlEntry and pass in a value of true.

This example sets the ACL of a directory named my-parent-directory. This method accepts a boolean parameter named isDefaultScope that specifies whether to set the default ACL. That parameter is used in each call to the setDefaultScope method of the PathAccessControlEntry. The entries of the ACL give the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others no access. The last ACL entry in this example gives a specific user with the object ID "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" read and execute permissions.

public void SetACLRecursively(DataLakeFileSystemClient fileSystemClient, Boolean isDefaultScope){
    
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.getDirectoryClient("my-parent-directory");

    List<PathAccessControlEntry> pathAccessControlEntries = 
        new ArrayList<PathAccessControlEntry>();

    // Create owner entry.
    PathAccessControlEntry ownerEntry = new PathAccessControlEntry();

    RolePermissions ownerPermission = new RolePermissions();
    ownerPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(true);

    ownerEntry.setDefaultScope(isDefaultScope);
    ownerEntry.setAccessControlType(AccessControlType.USER);
    ownerEntry.setPermissions(ownerPermission);

    pathAccessControlEntries.add(ownerEntry);

    // Create group entry.
    PathAccessControlEntry groupEntry = new PathAccessControlEntry();

    RolePermissions groupPermission = new RolePermissions();
    groupPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(false);

    groupEntry.setDefaultScope(isDefaultScope);
    groupEntry.setAccessControlType(AccessControlType.GROUP);
    groupEntry.setPermissions(groupPermission);

    pathAccessControlEntries.add(groupEntry);

    // Create other entry.
    PathAccessControlEntry otherEntry = new PathAccessControlEntry();

    RolePermissions otherPermission = new RolePermissions();
    otherPermission.setExecutePermission(false).setReadPermission(false).setWritePermission(false);

    otherEntry.setDefaultScope(isDefaultScope);
    otherEntry.setAccessControlType(AccessControlType.OTHER);
    otherEntry.setPermissions(otherPermission);

    pathAccessControlEntries.add(otherEntry);

    // Create named user entry.
    PathAccessControlEntry userEntry = new PathAccessControlEntry();

    RolePermissions userPermission = new RolePermissions();
    userPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(false);

    userEntry.setDefaultScope(isDefaultScope);
    userEntry.setAccessControlType(AccessControlType.USER);
    userEntry.setEntityId("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");
    userEntry.setPermissions(userPermission);    
    
    pathAccessControlEntries.add(userEntry);
    
    directoryClient.setAccessControlRecursive(pathAccessControlEntries);        

}

Update ACLs

When you update an ACL, you modify the ACL instead of replacing the ACL. For example, you can add a new security principal to the ACL without affecting other security principals listed in the ACL. To replace the ACL instead of update it, see the Set ACLs section of this article.

This section shows you how to:

  • Update an ACL
  • Update ACLs recursively

Update an ACL

First, get the ACL of a directory by calling the PathAccessControl.getAccessControlList method. Copy the list of ACL entries to a new List object of type PathAccessControlListEntry. Then locate the entry that you want to update and replace it in the list. Set the ACL by calling the DataLakeDirectoryClient.setAccessControlList method.

This example updates the ACL of a directory named my-parent-directory by replacing the entry for all other users.

   public void UpdateACL(DataLakeFileSystemClient fileSystemClient, Boolean isDefaultScope){

       DataLakeDirectoryClient directoryClient =
       fileSystemClient.getDirectoryClient("my-parent-directory");

       List<PathAccessControlEntry> pathAccessControlEntries = 
           directoryClient.getAccessControl().getAccessControlList();

       int index = -1;

       for (PathAccessControlEntry pathAccessControlEntry : pathAccessControlEntries){
           
           if (pathAccessControlEntry.getAccessControlType() == AccessControlType.OTHER){
               index = pathAccessControlEntries.indexOf(pathAccessControlEntry);
               break;
           }
       }

       if (index > -1){

       PathAccessControlEntry userEntry = new PathAccessControlEntry();

       RolePermissions userPermission = new RolePermissions();
       userPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(true);

       userEntry.setDefaultScope(isDefaultScope);
       userEntry.setAccessControlType(AccessControlType.OTHER);
       userEntry.setPermissions(userPermission);  
       
       pathAccessControlEntries.set(index, userEntry);
       }

       directoryClient.setAccessControlList(pathAccessControlEntries, 
           directoryClient.getAccessControl().getGroup(), 
           directoryClient.getAccessControl().getOwner());
   
   }

You can also get and set the ACL of the root directory of a container. To get the root directory, pass an empty string ("") into the DataLakeFileSystemClient.getDirectoryClient method.

Update ACLs recursively

To update an ACL recursively, create a new ACL object with the ACL entry that you want to update, and then use that object in update ACL operation. Do not get the existing ACL, just provide ACL entries to be updated.

Update ACLs recursively by calling the DataLakeDirectoryClient.updateAccessControlRecursive method. Pass this method a List of PathAccessControlEntry objects. Each PathAccessControlEntry defines an ACL entry.

If you want to update a default ACL entry, then you can the setDefaultScope method of the PathAccessControlEntry and pass in a value of true.

This example updates an ACL entry with write permission. This method accepts a boolean parameter named isDefaultScope that specifies whether to update the default ACL. That parameter is used in the call to the setDefaultScope method of the PathAccessControlEntry.

public void UpdateACLRecursively(DataLakeFileSystemClient fileSystemClient, Boolean isDefaultScope){

    DataLakeDirectoryClient directoryClient =
    fileSystemClient.getDirectoryClient("my-parent-directory");

    List<PathAccessControlEntry> pathAccessControlEntries = 
        new ArrayList<PathAccessControlEntry>();

    // Create named user entry.
    PathAccessControlEntry userEntry = new PathAccessControlEntry();

    RolePermissions userPermission = new RolePermissions();
    userPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(true);

    userEntry.setDefaultScope(isDefaultScope);
    userEntry.setAccessControlType(AccessControlType.USER);
    userEntry.setEntityId("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");
    userEntry.setPermissions(userPermission);    
    
    pathAccessControlEntries.add(userEntry);
    
    directoryClient.updateAccessControlRecursive(pathAccessControlEntries);        

}

Remove ACL entries

You can remove one or more ACL entries. This section shows you how to:

  • Remove an ACL entry
  • Remove ACL entries recursively

Remove an ACL entry

First, get the ACL of a directory by calling the PathAccessControl.getAccessControlList method. Copy the list of ACL entries to a new List object of type PathAccessControlListEntry. Then locate the entry that you want to remove and call the Remove method of the List object. Set the updated ACL by calling the DataLakeDirectoryClient.setAccessControlList method.

 public void RemoveACLEntry(DataLakeFileSystemClient fileSystemClient, Boolean isDefaultScope){

     DataLakeDirectoryClient directoryClient =
     fileSystemClient.getDirectoryClient("my-parent-directory");

     List<PathAccessControlEntry> pathAccessControlEntries = 
         directoryClient.getAccessControl().getAccessControlList();

     PathAccessControlEntry entryToRemove = null;

     for (PathAccessControlEntry pathAccessControlEntry : pathAccessControlEntries){
         
         if (pathAccessControlEntry.getEntityId() != null){

             if (pathAccessControlEntry.getEntityId().equals("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")){
                 entryToRemove = pathAccessControlEntry;
                 break;
             }
         }

     }

     if (entryToRemove != null){
         
         pathAccessControlEntries.remove(entryToRemove);

         directoryClient.setAccessControlList(pathAccessControlEntries, 
         directoryClient.getAccessControl().getGroup(), 
         directoryClient.getAccessControl().getOwner());   
     }

 
 }

Remove ACL entries recursively

To remove ACL entries recursively, create a new ACL object for ACL entry to be removed, and then use that object in remove ACL operation. Do not get the existing ACL, just provide the ACL entries to be removed.

Remove ACL entries by calling the DataLakeDirectoryClient.removeAccessControlRecursive method. Pass this method a List of PathAccessControlEntry objects. Each PathAccessControlEntry defines an ACL entry.

If you want to remove a default ACL entry, then you can the setDefaultScope method of the PathAccessControlEntry and pass in a value of true.

This example removes an ACL entry from the ACL of the directory named my-parent-directory. This method accepts a boolean parameter named isDefaultScope that specifies whether to remove the entry from the default ACL. That parameter is used in the call to the setDefaultScope method of the PathAccessControlEntry.

  public void RemoveACLEntryRecursively(DataLakeFileSystemClient fileSystemClient, Boolean isDefaultScope){

      DataLakeDirectoryClient directoryClient =
      fileSystemClient.getDirectoryClient("my-parent-directory");

      List<PathRemoveAccessControlEntry> pathRemoveAccessControlEntries = 
          new ArrayList<PathRemoveAccessControlEntry>();

      // Create named user entry.
      PathRemoveAccessControlEntry userEntry = new PathRemoveAccessControlEntry();

      RolePermissions userPermission = new RolePermissions();
      userPermission.setExecutePermission(true).setReadPermission(true).setWritePermission(true);

      userEntry.setDefaultScope(isDefaultScope);
      userEntry.setAccessControlType(AccessControlType.USER);
      userEntry.setEntityId("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"); 
      
      pathRemoveAccessControlEntries.add(userEntry);
      
      directoryClient.removeAccessControlRecursive(pathRemoveAccessControlEntries);      
  
  }

Recover from failures

You might encounter runtime or permission errors. For runtime errors, restart the process from the beginning. Permission errors can occur if the security principal doesn't have sufficient permission to modify the ACL of a directory or file that is in the directory hierarchy being modified. Address the permission issue, and then choose to either resume the process from the point of failure by using a continuation token, or restart the process from beginning. You don't have to use the continuation token if you prefer to restart from the beginning. You can reapply ACL entries without any negative impact.

This example returns a continuation token in the event of a failure. The application can call this example method again after the error has been addressed, and pass in the continuation token. If this example method is called for the first time, the application can pass in a value of null for the continuation token parameter.

public String ResumeSetACLRecursively(DataLakeFileSystemClient fileSystemClient,
    DataLakeDirectoryClient directoryClient,
    List<PathAccessControlEntry> accessControlList, 
    String continuationToken){

    try{
        PathSetAccessControlRecursiveOptions options = new PathSetAccessControlRecursiveOptions(accessControlList);
        
        options.setContinuationToken(continuationToken);
    
       Response<AccessControlChangeResult> accessControlChangeResult =  
          directoryClient.setAccessControlRecursiveWithResponse(options, null, null);

       if (accessControlChangeResult.getValue().getCounters().getFailedChangesCount() > 0)
       {
          continuationToken =
              accessControlChangeResult.getValue().getContinuationToken();
       }
    
       return continuationToken;

    }
    catch(Exception ex){
    
        System.out.println(ex.toString());
        return continuationToken;
    }


}

If you want the process to complete uninterrupted by permission errors, you can specify that.

To ensure that the process completes uninterrupted, call the setContinueOnFailure method of a PathSetAccessControlRecursiveOptions object and pass in a value of true.

This example sets ACL entries recursively. If this code encounters a permission error, it records that failure and continues execution. This example prints the number of failures to the console.

public void ContinueOnFailure(DataLakeFileSystemClient fileSystemClient,
    DataLakeDirectoryClient directoryClient,
    List<PathAccessControlEntry> accessControlList){
    
    PathSetAccessControlRecursiveOptions options = 
       new PathSetAccessControlRecursiveOptions(accessControlList);
        
    options.setContinueOnFailure(true);
    
    Response<AccessControlChangeResult> accessControlChangeResult =  
        directoryClient.setAccessControlRecursiveWithResponse(options, null, null);

    AccessControlChangeCounters counters = accessControlChangeResult.getValue().getCounters();

    System.out.println("Number of directories changes: " + 
        counters.getChangedDirectoriesCount());

    System.out.println("Number of files changed: " + 
        counters.getChangedDirectoriesCount());

    System.out.println("Number of failures: " + 
        counters.getChangedDirectoriesCount());
}

Best practices

This section provides you some best practice guidelines for setting ACLs recursively.

Handling runtime errors

A runtime error can occur for many reasons (For example: an outage or a client connectivity issue). If you encounter a runtime error, restart the recursive ACL process. ACLs can be reapplied to items without causing a negative impact.

Handling permission errors (403)

If you encounter an access control exception while running a recursive ACL process, your AD security principal might not have sufficient permission to apply an ACL to one or more of the child items in the directory hierarchy. When a permission error occurs, the process stops and a continuation token is provided. Fix the permission issue, and then use the continuation token to process the remaining dataset. The directories and files that have already been successfully processed won't have to be processed again. You can also choose to restart the recursive ACL process. ACLs can be reapplied to items without causing a negative impact.

Credentials

We recommend that you provision an Azure AD security principal that has been assigned the Storage Blob Data Owner role in the scope of the target storage account or container.

Performance

To reduce latency, we recommend that you run the recursive ACL process in an Azure Virtual Machine (VM) that is located in the same region as your storage account.

ACL limits

The maximum number of ACLs that you can apply to a directory or file is 32 access ACLs and 32 default ACLs. For more information, see Access control in Azure Data Lake Storage Gen2.

See also