Share via


Create and manage Postgres roles for Azure Databricks identities

Important

This feature is in Public Preview in the following regions: westus, westus2, eastus, eastus2, centralus, southcentralus, northeurope, westeurope, australiaeast, brazilsouth, canadacentral, centralindia, southeastasia, uksouth.

A Postgres role for the Lakebase database instance owner’s Azure Databricks identity is created automatically. Initially, only the owner of the instance can log in and access the instance through Postgres. To allow other Azure Databricks identities to log in to the database instance, the owner needs to create corresponding Postgres roles.

Initially, only the owner of the instance can log in and access the instance through Postgres. To allow other Azure Databricks identities to log in to the database instance, the Azure Databricks identity must have a corresponding Postgres role.

This page describes how to add and manage additional Azure Databricks identity-based roles in PostgreSQL.

Create Postgres roles and grant privileges for Azure Databricks identities

Create Postgres roles using the Azure Databricks UI or PostgreSQL queries.

Note

Role management actions are governed by the permissions granted on the database instance. Ensure you have the appropriate level of access before attempting to manage roles.

UI

Users with CAN USE permission on the database instance can view the existing Postgres roles associated with Azure Databricks identities or add a role for their own identity to the instance.

Users with CAN MANAGE can additionally create roles for other Azure Databricks identities, including with additional privileges, and drop roles for any Azure Databricks identity.

You can assign additional permissions to any desired users, groups, or service principals in the Database instances overview page.

  1. Click Compute in the workspace sidebar.

  2. Click Database instances.

  3. Click the Permissions tab.

  4. Click Add PostgreSQL role in the upper-right side.

  5. For Workspace identity, enter a user, group, or service principal and select the Azure Databricks identity. You must select a Azure Databricks identity that doesn’t already have a Postgres role in the instance.

  6. Select a Role membership. If you have CAN MANAGE permission on the database instance, you can add membership to the databricks_superuser role and enable some role attributes.

  7. Select which PostgreSQL attributes to grant to the new role.

    • CREATEDB: grants permission to create new databases
    • CREATEROLE: grants permission to create new roles
    • BYPASS RLS: grants permission to bypass all row-level security in the instance
  8. Click Confirm.

PostgreSQL

Before creating new Postgres roles, verify that you meet the following requirements:

  • You must have the CREATE and CREATE ROLE permissions on the database.
  • You must authenticate and log in to Postgres as a Azure Databricks identity (user, service principal, or group). Native Postgres authenticated sessions cannot create Azure Databricks roles.
  • Your authentication token must be valid and not expired at the time of role creation.

Use the databricks_create_role function to add and create Azure Databricks identity-based PostgreSQL roles. The custom PostgreSQL extension databricks_auth provides the databricks_create_role function.

  1. Create the databricks_auth extension. Each Postgres database must have its own extension.

    CREATE EXTENSION IF NOT EXISTS databricks_auth;
    
  2. Use the databricks_create_role function to add and create new Postgres roles for Azure Databricks identities. The role must not already exist. If a role with the same name exists, drop it before creating the Azure Databricks identity-based role.

    SELECT databricks_create_role('identity_name', 'identity_type');
    

    The identity_name and identity_type parameters depend on the Azure Databricks identity type:

    • Azure Databricks User:

      • identity_name: Email of the user e.g. myuser@databricks.com
      • identity_type: USER
      SELECT databricks_create_role('myuser@databricks.com','USER');
      
    • Azure Databricks Service Principal:

      • identity_name: Application ID of Service Principal e.g. 8c01cfb1-62c9-4a09-88a8-e195f4b01b08
      • identity_type: SERVICE_PRINCIPAL
      SELECT databricks_create_role('8c01cfb1-62c9-4a09-88a8-e195f4b01b08','SERVICE_PRINCIPAL');
      
    • Azure Databricks Group:

      • identity_name: Name of the group (case sensitive): My Group 123
      • identity_type: GROUP
      SELECT databricks_create_role('My Group 123','GROUP');
      
  3. A role created using databricks_create_role only has privileges granted to PUBLIC after creation. To grant or revoke additional privileges, use the standard Postgres privilege management commands GRANT and REVOKE.

    Give the role read permission to access a table.

    GRANT SELECT ON "my_schema"."my_table" TO <role-name>;
    

    Remove write access to a table from the role.

    REVOKE INSERT, UPDATE, DELETE ON TABLE "my_schema"."my_table" FROM <role-name>;
    

    Revoke all access to a database from the role.

    REVOKE CONNECT ON DATABASE "example_database" FROM <role-name>;
    

View Azure Databricks identity roles

UI

You can see which users, groups, and service principals has a corresponding Postgres role in the Database instances overview page.

  1. Click Compute in the workspace sidebar.
  2. Click Database instances.
  3. Click the Permissions tab.

PostgreSQL

Use PostgreSQL queries to list all the Azure Databricks identity roles created by default and from using the databricks_create_role function, use the databricks_list_roles function in the databricks_auth extension. This lists all Azure Databricks users, service principals, and groups added to authenticate as Postgres roles.

CREATE EXTENSION IF NOT EXISTS databricks_auth;
SELECT * from databricks_list_roles;

Use PostgreSQL queries to check Postgres permissions

To read from or write to a table in Postgres, users require the following permissions:

  • CONNECT on the database
  • USAGE on the schema
  • USAGE on the table

Run the following SQL statement to create a function named pg_temp.check_permissions that checks a user's permissions, including inherited permissions.

CREATE OR REPLACE FUNCTION pg_temp.check_permissions(TEXT, TEXT, TEXT, TEXT)
RETURNS TABLE(database_connect BOOLEAN, schema_usage BOOLEAN, table_select BOOLEAN,
              table_insert BOOLEAN, table_update BOOLEAN, table_delete BOOLEAN) AS $$
SELECT
  has_database_privilege($1, $2, 'CONNECT'),
  has_schema_privilege($1, $3, 'USAGE'),
  has_table_privilege($1, $4, 'SELECT'),
  has_table_privilege($1, $4, 'INSERT'),
  has_table_privilege($1, $4, 'UPDATE'),
  has_table_privilege($1, $4, 'DELETE')
$$ LANGUAGE sql
;

To use the function, substitute a Azure Databricks username for <your_user>, your instance name for databricks_postgres, and your table name for <my_table> in the following query:

SELECT * FROM pg_temp.check_permissions('<your_user>', 'databricks_postgres', 'public', '<my_table>');

The following SQL statement creates a function named pg_temp.make_owner_acl that returns a detailed view of all granted permissions, excluding inherited permissions, for all users:

CREATE OR REPLACE FUNCTION pg_temp.make_owner_acl(owner_oid OID)
RETURNS TABLE(grantor OID, grantee OID, privilege_type TEXT, is_grantable BOOLEAN) AS $$
SELECT owner_oid, owner_oid,'OWNER'::TEXT, True
$$ LANGUAGE SQL;

CREATE OR REPLACE FUNCTION pg_temp.get_all_permissions(TEXT, TEXT, TEXT)
RETURNS TABLE(type TEXT, name TEXT, role TEXT, acl TEXT, can_grant TEXT) AS $$
SELECT type, name,
  CASE WHEN grantee = 0 THEN 'PUBLIC' ELSE pg_get_userbyid(grantee) END AS role, privilege_type,
  CASE WHEN privilege_type = 'OWNER' THEN 'YES, ALL' WHEN is_grantable THEN 'YES' ELSE 'NO' END AS can_grant
FROM (
  SELECT 'DATABASE' type, datname as name, (pg_temp.make_owner_acl(datdba)).*
    FROM pg_database
    WHERE datname = $1
  UNION ALL
  SELECT 'DATABASE' type, datname as name, (aclexplode(datacl)).*
    FROM pg_database
    WHERE datname = $1
  UNION ALL
  SELECT 'SCHEMA' type, nspname as name, (pg_temp.make_owner_acl(nspowner)).*
    FROM pg_namespace
    WHERE nspname = $2
  UNION ALL
  SELECT 'SCHEMA' type, nspname as name, (aclexplode(nspacl)).*
    FROM pg_namespace
    WHERE nspname = $2
  UNION ALL
  SELECT 'TABLE' type, relname as name, (pg_temp.make_owner_acl(relowner)).*
    FROM pg_class
    WHERE relname = $3
  UNION ALL
  SELECT 'TABLE' type, relname as name, (aclexplode(relacl)).*
    FROM pg_class
    WHERE relname = $3
  )
$$ LANGUAGE SQL
;

To use the function, substitute your instance name for databricks_postgres and your table name for <my_table> in the following query:

SELECT * FROM pg_temp.get_all_permissions('databricks_postgres', 'public', '<mytable>');

In Postgres, a role can be a member of another role, and the membership specifies whether the permissions are inherited from the parent role. To see all the roles that a certain role is part of, use the following SQL statement to create the SQL function pg_temp.get_inherited_roles:

CREATE OR REPLACE FUNCTION pg_temp.get_inherited_roles(
  role_name TEXT
)
RETURNS TABLE(inherited_roles TEXT, member_via TEXT, inherits_permissions TEXT) AS $$
  WITH RECURSIVE role_tree AS (
    SELECT
        m.roleid,
        pg_get_userbyid(m.roleid) rolname,
         'DIRECT' COLLATE "C" as member_via,
        m.inherit_option as inherits_permissions
    FROM pg_auth_members m
    WHERE m.member = (SELECT oid FROM pg_roles WHERE rolname = $1)
    UNION ALL
    SELECT
        m.roleid,
        pg_get_userbyid(m.roleid) rolname,
        rt.rolname::text as member_via,
        (rt.inherits_permissions AND m.inherit_option) as inherits_permissions
    FROM pg_auth_members m
    JOIN role_tree rt ON m.member = rt.roleid
  )
  SELECT
      rolname AS inherited_roles,
      member_via,
      CASE WHEN inherits_permissions THEN 'YES' ELSE 'NO' END as inherits_permissions
  FROM role_tree
  GROUP BY inherited_roles, member_via, inherits_permissions
  ORDER BY inherits_permissions DESC
$$ LANGUAGE sql
;

To use the function, substitute a Azure Databricks username for <your_user>:

SELECT * FROM pg_temp.get_inherited_roles('<your role>');

To see the admin of a role, use the following SQL query and substitute the role name for <target_role>:

SELECT pg_get_userbyid(m.member) admin
FROM pg_auth_members m
WHERE m.roleid = (SELECT oid FROM pg_roles WHERE rolname = '<target_role>')
  AND m.admin_option = true;

Authenticate as Azure Databricks identity-based Postgres role

After a Postgres role is created for a Azure Databricks identity, it can be used to login to Postgres based on the identity type.

Authenticate as Azure Databricks user or service principal

Authenticating to Postgres with a Azure Databricks user or service principal is the same as authenticating to Postgres with an OAuth token.

Authenticate as Azure Databricks group

Groups and group memberships are not synced from Azure Databricks into Postgres and neither are Unity Catalog permissions. However, after a Azure Databricks group is added into Postgres, any Azure Databricks user in the group can log in as the group using the user's password. This allows you to manage permissions at the group level in Postgres. Any direct or indirect member (user or service principal) of the Azure Databricks group identity can authenticate to Postgres and log in as the Azure Databricks group Postgres role.

When authenticating as a group identity using a user or service principal token, the group membership is only validated at the time of authentication. Any previously open connection with a group member token will not be closed if the member is removed from the group after authentication. Any new connection request from a removed group member will be rejected during authentication.

export PGPASSWORD='<OAuth token of a group member>'
export GROUPROLENAME = <pg-case-sensitive-group-role-name>

psql -h $HOSTNAME -p 5432 -d databricks_postgres -U $GROUPROLENAME

Only groups assigned to the Azure Databricks workspace of the database instance are supported for group-based Postgres login. To learn how to assign a group to a workspace, see Assign a group to a workspace.

Drop a Azure Databricks identity-based Postgres role

UI

Dropping a role cannot be undone. You can recreate a role, but any object ownership reassignment is non-reversible without dropping the new role that owns reassigned objects.

  1. Click Compute in the workspace sidebar.
  2. Click Database instances.
  3. Click the Permissions tab.
  4. For the role identity you want to drop, on the rightmost side, click Kebab menu icon..
  5. Click Drop role.
  6. If you need to drop a role that owns objects, turn on Reassign owned objects. This will reassign all reassignable owned objects (databases, schemas and tables) to the other role and then drop any non-reassignable owned objects.
  7. Click Confirm.

PostgreSQL

A Azure Databricks identity-based Postgres role can be dropped and removed the same as any other Postgres role. For more details, see the PostgreSQL documentation on dropping roles. After a Azure Databricks identity-based role is dropped, it cannot be used for token-based authentication and accessing Postgres.