Manage HDFS permissions for SQL Server 2019 Big Data Clusters

Applies to: SQL Server 2019 (15.x)

Important

The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform and the software will continue to be maintained through SQL Server cumulative updates until that time. For more information, see the announcement blog post and Big data options on the Microsoft SQL Server platform.

As a file system, HDFS is similar to Linux-based file systems that use the POSIX model for file permissions. In addition to the traditional POSIX permissions model, HDFS also supports POSIX access control lists (ACLs). For more information, see the Apache Hadoop article about ACLs.

The following sections provide examples of how to use the Azure Data CLI (azdata) to manage HDFS file and directory permissions.

Prerequisites

HDFS shell

The hdfs shell capability in Azure Data CLI (azdata) lets you issue commands directly in a shell to manage HDFS permissions on files and directories. The underlying mechanism uses WebHDFS calls to issue the commands.

The following command opens the shell.

azdata bdc hdfs shell

To access the help for hdfs shell and understand how to issue commands, run the following command once the shell is active.

[hdfs] ?

The following example shows how to create a directory named sales, list directories, and then modify the ACL on sales to give a named user, bob, read, write, and execute access.

[hdfs] mkdir sales
[hdfs] ls
rwxr-xr-x  hdfs bdcadmins        0 Oct 09 18:02 system/
rwxrwxr-x admin bdcadmins        0 Oct 10 16:47 sales/
--xrwxrwxrwx  hdfs bdcadmins        0 Oct 09 18:03 tmp/
rwxrwxrwx  hdfs bdcadmins        0 Oct 09 17:59 user/

[hdfs] acl modify  '/sales/' 'user:bob:rwx'
acl modify: Change completed.
[hdfs] acl status  '/sales/'
{
  "AclStatus": {
    "entries": [
      "user:bob:rwx",
      "group::r-x"
    ],
    "group": "bdcadmins",
    "owner": "admin",
    "permission": "775",
    "stickyBit": false
  }
}

Create a directory in HDFS using Azure Data CLI (azdata)

Create a directory called data in path /sales.

azdata bdc hdfs mkdir --path '/sales/data'

Change owner of a directory or file

Change the owner of the directory data in HDFS, making alice the owning user and salesgroup the owning group. To change the owner, you must be an owner.

azdata bdc hdfs chown --owner alice --group 'salesgroup' --path '/sales/data'
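The mkdir and chown steps above can be combined into a small script. The following is a minimal sketch, not an official pattern: the run and provision_dir functions and the DRY_RUN variable are hypothetical names introduced here, and only the azdata commands and flags already shown in this article are used. By default the script prints the commands instead of running them, so it can be reviewed before it touches a cluster.

```shell
# Hypothetical wrapper: print the command unless DRY_RUN is set to 0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: create an HDFS directory, then set its owner and group.
# $1 = HDFS path, $2 = owning user, $3 = owning group (example values below).
provision_dir() {
  run azdata bdc hdfs mkdir --path "$1" &&
  run azdata bdc hdfs chown --owner "$2" --group "$3" --path "$1"
}

provision_dir /sales/data alice salesgroup
```

In dry-run mode the printed commands are not re-quoted, so paths that contain spaces would need quoting before being pasted into a terminal. Setting DRY_RUN=0 runs the commands and assumes an active azdata session against the cluster.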

Change permissions of a file or directory with chmod

Use chmod to change permissions on files and directories (for the owner, the owning group, and others). For more information, see changing permissions on a Linux file system. In HDFS, the pattern is the same. For example:

azdata bdc hdfs chmod --permission 750 --path /sales/data
azdata bdc hdfs chmod --permission 775 --path /sales/data/file.txt
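To make the octal values easier to read, the following sketch converts a three-digit permission such as 750 into the symbolic form that appears in directory listings. The mode_to_symbolic function is a hypothetical local helper, not part of azdata, and it handles only the three basic permission digits.

```shell
# Hypothetical helper (not part of azdata): print the symbolic form of a
# three-digit octal permission. Each digit maps to one rwx triplet.
mode_to_symbolic() (
  result=""
  for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
    case $digit in
      0) bits="---" ;;
      1) bits="--x" ;;
      2) bits="-w-" ;;
      3) bits="-wx" ;;
      4) bits="r--" ;;
      5) bits="r-x" ;;
      6) bits="rw-" ;;
      7) bits="rwx" ;;
      *) echo "invalid octal digit: $digit" >&2; exit 1 ;;
    esac
    result="$result$bits"
  done
  printf '%s\n' "$result"
)

mode_to_symbolic 750   # prints rwxr-x---
mode_to_symbolic 775   # prints rwxrwxr-x
```

Read left to right, the three triplets apply to the owner, the owning group, and others.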

Set sticky bit on directories

You can set the sticky bit on directories to prevent unintentional file deletion or relocation. The sticky bit limits permission to delete or move a file in the directory to the superuser, the directory owner, or the file owner. The setting does not affect the files themselves. The following example sets the sticky bit on the directory users by prefixing the permissions with a 1.

azdata bdc hdfs chmod --path /sales/users --permission 1750
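Because the permission pattern follows the Linux model, the effect of the leading 1 can be previewed on a local file system. The sketch below is a local illustration only: it uses the standard chmod on a temporary directory, not azdata or HDFS, and assumes a Linux or other POSIX system. The variable names are illustrative.

```shell
# Local illustration only: regular chmod on a temporary directory,
# not azdata and not HDFS.
demo_dir=$(mktemp -d)
chmod 1750 "$demo_dir"

# The first ten characters of `ls -ld` are the file type and mode bits.
mode_string=$(ls -ld "$demo_dir" | cut -c1-10)
echo "$mode_string"   # prints drwxr-x--T

rmdir "$demo_dir"
```

The trailing T marks the sticky bit. It is uppercase because mode 750 gives others no execute permission; with a mode such as 1777 it would appear as a lowercase t.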

Set ACLs on files and directories

To set ACLs on files and directories in HDFS, use the Azure Data CLI (azdata) commands.

The following example sets the ACL on a directory and gives the named user tom read, write, and execute access to that directory.

Note

When you use the set command, provide the full ACL spec, including the entries for the owning user, the owning group, and others.

azdata bdc hdfs acl set --path '/sales' --aclspec  'user::rw-,user:tom:rwx,group::rw-,other::rw-'
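Since the set command expects the full ACL spec, it can help to check a spec before sending it. The following is a minimal sketch of such a check; has_base_entries is a hypothetical local helper, not an azdata feature, and it only verifies that unnamed user, group, and other entries are present. It does not validate the permission strings themselves.

```shell
# Hypothetical helper (not part of azdata): succeed only if the ACL spec
# contains unnamed entries for the owning user, owning group, and others.
has_base_entries() (
  spec=",$1,"
  for prefix in user:: group:: other::; do
    case $spec in
      *",$prefix"*) ;;   # entry found, check the next one
      *) echo "missing $prefix entry" >&2; exit 1 ;;
    esac
  done
)

spec='user::rw-,user:tom:rwx,group::rw-,other::rw-'
if has_base_entries "$spec"; then
  # Print the command for review instead of running it.
  echo "azdata bdc hdfs acl set --path '/sales' --aclspec '$spec'"
fi
```

A spec such as 'user:tom:rwx' on its own fails the check, because a named-user entry does not replace the entry for the owning user.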

Default ACL on directories

A default ACL enables subdirectories to inherit permissions from the parent directory. Only directories can have a default ACL. When a new file or subdirectory is created, it automatically inherits the default ACL of its parent as its own access ACL. A new subdirectory also copies it into its own default ACL. In this way, the default ACL is inherited down through arbitrarily deep directory levels as new subdirectories are created.

The following example sets a default ACL by using azdata.

azdata bdc hdfs acl set --path '/sales' --aclspec  'user::rw-,user:tom:rwx,group::rw-,other::rw-,default:group::rw-,default:user::rw-,default:other::rw-'
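Long ACL specs with default entries are easy to mistype. The following sketch builds one from a shorter access spec by appending a default: copy of every entry. The with_default_entries function is a hypothetical local helper, not part of azdata, and mirroring every access entry as a default is an assumption made for illustration; a real directory may need different defaults.

```shell
# Hypothetical helper (not part of azdata): append a default: copy of every
# entry in a comma-separated access ACL spec.
with_default_entries() (
  access=$1
  defaults=$(printf '%s' "$access" | sed 's/^/default:/; s/,/,default:/g')
  printf '%s,%s\n' "$access" "$defaults"
)

with_default_entries 'user::rw-,group::rw-,other::rw-'
# prints user::rw-,group::rw-,other::rw-,default:user::rw-,default:group::rw-,default:other::rw-
```

The resulting string can then be passed as the --aclspec value of the acl set command shown above.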