Applies to: Databricks SQL Databricks Runtime Unity Catalog only
Unity Catalog and the built-in Azure Databricks Hive metastore use default locations for managed tables. Unity Catalog introduces several new securable objects to grant privileges to data in cloud object storage.
An external location is a securable object that combines a storage path with a storage credential that authorizes access to that path.
An external location’s creator is its initial owner. An external location’s owner and users with the MANAGE privilege can modify the external location’s name, URI, and storage credential.
After an external location is created, you can grant access to it to account-level principals (users and groups).
A user or group with permission to use an external location can access any storage path within the location’s path without direct access to the storage credential.
To further refine access control you can use GRANT on external tables to encapsulate access to individual files within an external location.
The storage path of any external location cannot be contained within another external location’s storage path, or within an external table’s storage path using an explicit storage credential.
Warning
If a schema (database) is registered in your workspace-level Hive metastore, dropping that schema using the CASCADE option causes all files in that schema location to be deleted recursively, regardless of the table type (managed or external).
If the schema is registered to a Unity Catalog metastore, the files for Unity Catalog managed tables are deleted recursively. However, the files for external tables are not deleted. You must manage those files using the cloud storage provider directly.
Therefore, to avoid accidental data loss, you should never register a schema in a Hive metastore to a location with existing data. Nor should you create new external tables in a location managed by Hive metastore schemas or containing Unity Catalog managed tables.
Graphical Representation of relationships
The following diagram describes the relationship between:
storage credentials
external locations
external tables
storage paths
IAM entities
Azure service accounts
Examples
SQL
-- Grant `finance` user permission to create external location on `my_azure_storage_cred` storage credential, and then create an external location on the specific path to which `my_azure_storage_cred` has access
> GRANTCREATEEXTERNAL LOCATION ONSTORAGE CREDENTIAL `my_azure_storage_cred`TO`finance`
> CREATEEXTERNAL LOCATION `finance_loc`URL'abfss://container@storageaccount.dfs.core.windows.net/depts/finance'WITH (CREDENTIAL `my_azure_storage_cred`)
COMMENT'finance';
-- Grant read, write, and create table access to the finance location to `finance` user
> GRANTREAD FILES, WRITE FILES, CREATEEXTERNALTABLEONEXTERNAL LOCATION `finance_loc`TO`finance`;
-- `finance` can read from any storage path under abfss://depts/finance but nowhere else
> SELECTcount(1) FROM`delta`.`abfss://container@storageaccount.dfs.core.windows.net/depts/finance`WITH (CREDENTIAL my_azure_storage_cred);
100
> SELECTcount(1) FROM`delta`.`abfss://container@storageaccount.dfs.core.windows.net/depts/hr/employees`WITH (CREDENTIAL my_azure_storage_cred);
Error
-- `finance` can create an external table over specific object within the `finance_loc` location
> CREATETABLE main.default.sec_filings LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/depts/finance/sec_filings';
-- Cannot list files under an external table with a user that doesn't have SELECT permission on it
> LIST 'abfss://container@storageaccount.dfs.core.windows.net/depts/finance/sec_filings'
Error
> LIST 'abfss://container@storageaccount.dfs.core.windows.net/depts/finance/sec_filings/_delta_log'
Error
Demonstrate understanding of common data engineering tasks to implement and manage data engineering workloads on Microsoft Azure, using a number of Azure services.