Unity Catalog is a unified data and AI governance solution built directly into the Azure Databricks platform. This is an overview of key concepts in Unity Catalog and how to use Unity Catalog to govern data.
Key pillars of Unity Catalog include the following:
- Unified access control: Unity Catalog offers a single interface for managing permissions on tables, files, models, and other objects.
- Data discovery: Unity Catalog empowers users to find and understand data assets through a searchable interface enriched with tags, descriptions, and metadata.
- Automated lineage tracking: Automatically track the flow of data and how it's transformed from source to final views and dashboards.
- Auditing: Maintain a complete record of all data access and system activity to satisfy security requirements and regulatory compliance.
- Data quality monitoring: Proactively track the health of your data assets with built-in profiling and alerts that catch anomalies before they reach downstream consumers.
- Secure data sharing: Securely exchange live data across organizations and clouds using the open Delta Sharing protocol, eliminating the need for complex ETL or data copying.
Unity Catalog is also available as an open-source implementation. See the announcement blog and the public Unity Catalog GitHub repo.
The Unity Catalog object model
In Unity Catalog, every asset you govern is modeled as an object. More specifically, these objects are called securable objects in Unity Catalog. You can use access control policies and metadata such as tags to govern these securable objects.
Securable objects live within the Unity Catalog object model hierarchy, rooted at a special object called the metastore. Under it, data assets such as tables, views, volumes, functions, and models follow a three-level namespace (catalog.schema.object). Other objects, such as storage credentials, external locations, connections, and shares, sit directly under the metastore.

This hierarchy is the foundation of how Unity Catalog organizes assets and enforces governance. To understand the Unity Catalog object model and each securable object in more detail, see Unity Catalog securable objects reference. To understand how the permissions model works in the context of the Unity Catalog object model, see Unity Catalog permissions model concepts.
Admin roles
Administrators are responsible for overseeing governance in Unity Catalog. Following are the different levels of admin roles and their default privileges:
- Account admins can create metastores, link workspaces to metastores, add users, and assign privileges on metastores.
- Workspace admins can add users to a workspace, and manage many workspace-specific objects like jobs and notebooks. Depending on the workspace, workspace admins can also have many privileges on the metastore that is attached to the workspace.
- Metastore admin is an optional role whose holders can manage table and volume storage at the metastore level. The role is also convenient if you want to manage data centrally across multiple workspaces in a region.
For more information, see Admin privileges in Unity Catalog.
Granting and revoking access to securable objects
Privileged users can grant and revoke access to securable objects at any level in the hierarchy, including the metastore itself. Access to an object implicitly grants the same access to all children of that object, unless access is revoked.
You can use typical ANSI SQL commands to grant and revoke access to objects in Unity Catalog. For example:
```sql
GRANT CREATE TABLE ON SCHEMA mycatalog.myschema TO `finance-team`;
```
You can also use Catalog Explorer, the Databricks CLI, and REST APIs to manage object permissions.

Metastore admins, owners of an object, and users with the MANAGE privilege on an object can grant and revoke access. To learn how to manage privileges in Unity Catalog, see Manage privileges in Unity Catalog.
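As a sketch, delegation and revocation use the same SQL grammar as the example above (the object and group names here are hypothetical):

```sql
-- Owners, metastore admins, or holders of MANAGE on an object can delegate
-- administration by granting MANAGE to another principal.
GRANT MANAGE ON TABLE mycatalog.myschema.mytable TO `data-stewards`;

-- Revoking a privilege mirrors the grant syntax.
REVOKE SELECT ON SCHEMA mycatalog.myschema FROM `interns`;
```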
Default access to database objects in Unity Catalog
Unity Catalog operates on the principle of least privilege, where users have the minimum access they need to perform their required tasks. When a workspace is created, non-admin users have access only to the automatically-provisioned Workspace catalog, which makes this catalog a convenient place for users to try out the process of creating and accessing database objects in Unity Catalog. See Workspace catalog privileges.
Managed versus external tables and volumes
Tables and volumes can be managed or external.
- Managed tables are fully managed by Unity Catalog, which means that Unity Catalog manages both the governance and the underlying data files for each managed table. Managed tables are stored in a Unity Catalog-managed location in your cloud storage, which you can define at the metastore, catalog, or schema level. Managed tables use the Delta Lake or Apache Iceberg format.
- External tables are tables whose access from Azure Databricks is managed by Unity Catalog, but whose data lifecycle and file layout are managed using your cloud provider and other data platforms. Typically you use external tables to register large amounts of your existing data in Azure Databricks, or if you also require write access to the data using tools outside of Azure Databricks. External tables are supported in multiple data formats. Once an external table is registered in a Unity Catalog metastore, you can manage and audit Azure Databricks access to it, and work with it, just as you can with managed tables.
- Managed volumes are fully managed by Unity Catalog, which means that Unity Catalog manages access to the volume's storage location in your cloud provider account. When you create a managed volume, it is automatically stored in the managed storage location assigned to the containing schema.
- External volumes represent existing data in storage locations that are managed outside of Azure Databricks, but registered in Unity Catalog to control and audit access from within Azure Databricks. When you create an external volume in Azure Databricks, you specify its location, which must be on a path that is defined in a Unity Catalog external location.
Databricks recommends managed tables and volumes for most use-cases, because they allow you to take full advantage of Unity Catalog governance capabilities and performance optimizations. For information about typical use-cases for external tables and volumes, see Managed and external tables and Managed and external volumes.
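The managed-versus-external distinction can be sketched in SQL. These statements are illustrative; the catalog, schema, and storage path names are hypothetical, and the external paths must fall under a Unity Catalog external location you can access:

```sql
-- Managed table: Unity Catalog governs both the metadata and the data files.
CREATE TABLE sales.default.orders (order_id BIGINT, amount DECIMAL(10, 2));

-- External table: Unity Catalog governs access, but the files live at a
-- path whose lifecycle you manage. Parquet is one of the supported formats.
CREATE TABLE sales.default.orders_raw
USING PARQUET
LOCATION 'abfss://landing@mystorage.dfs.core.windows.net/orders/';

-- External volume: registers existing non-tabular files for governed access.
CREATE EXTERNAL VOLUME sales.default.order_files
LOCATION 'abfss://landing@mystorage.dfs.core.windows.net/order-files/';
```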
See also:
- Unity Catalog managed tables in Azure Databricks for Delta Lake and Apache Iceberg
- Work with external tables
- Managed versus external volumes
Cloud storage and data isolation
Unity Catalog uses cloud storage in two primary ways:
- Managed storage: default locations for managed tables and managed volumes (unstructured, non-tabular data) that you create in Azure Databricks. These managed storage locations can be defined at the metastore, catalog, or schema level. You create managed storage locations in your cloud provider, but their lifecycle is fully managed by Unity Catalog.
- Storage locations where external tables and volumes are stored. These are tables and volumes whose access from Azure Databricks is managed by Unity Catalog, but whose data lifecycle and file layout are managed using your cloud provider and other data platforms. Typically you use external tables or volumes to register large amounts of your existing data in Azure Databricks, or if you also require write access to the data using tools outside of Azure Databricks.
Governing access to cloud storage using external locations
Both managed storage locations and storage locations where external tables and volumes are stored use external location securable objects to manage access from Azure Databricks. External location objects reference a cloud storage path and the storage credential required to access it. Storage credentials are themselves Unity Catalog securable objects that register the credentials required to access a particular storage path. Together, these securables ensure that access to storage is controlled and tracked by Unity Catalog.
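A minimal sketch of defining and granting access to an external location, assuming a storage credential named `azure_mi_cred` has already been created (the location name, path, and group are hypothetical):

```sql
-- An external location pairs a cloud storage path with the credential
-- required to access it.
CREATE EXTERNAL LOCATION finance_landing
URL 'abfss://landing@mystorage.dfs.core.windows.net/finance/'
WITH (STORAGE CREDENTIAL azure_mi_cred);

-- Access to storage under this path is then governed like any other
-- securable object.
GRANT READ FILES ON EXTERNAL LOCATION finance_landing TO `finance-team`;
```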
The diagram below shows how external locations reference storage credentials and cloud storage locations.

In this diagram:
- Each external location references a storage credential and a cloud storage location.
- Multiple external locations can reference the same storage credential. Storage credential 1 grants access to everything under the path `bucket/tables/*`, so both External location A and External location B reference it.
For more information, see How does Unity Catalog govern access to cloud storage?.
Managed storage location hierarchy
The level at which you define managed storage in Unity Catalog depends on your preferred data isolation model. Your organization may require that certain types of data be stored within specific accounts or buckets in your cloud tenant.
Unity Catalog gives you the ability to configure managed storage locations at the metastore, catalog, or schema level to satisfy such requirements.
For example, let's say your organization has a company compliance policy that requires production data relating to human resources to reside in the container abfss://mycompany-hr-prod@storage-account.dfs.core.windows.net. In Unity Catalog, you can achieve this requirement by setting a location on a catalog level, creating a catalog called, for example hr_prod, and assigning the location abfss://mycompany-hr-prod@storage-account.dfs.core.windows.net/unity-catalog to it. This means that managed tables or volumes created in the hr_prod catalog (for example, using CREATE TABLE hr_prod.default.table …) store their data in abfss://mycompany-hr-prod@storage-account.dfs.core.windows.net/unity-catalog. Optionally, you can choose to provide schema-level locations to organize data within the hr_prod catalog at a more granular level.
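The `hr_prod` example can be sketched in SQL, assuming an external location covering this container already exists (the `payroll` schema is a hypothetical illustration of schema-level isolation):

```sql
-- Catalog-level managed storage: managed tables and volumes created in
-- hr_prod store their data under this path by default.
CREATE CATALOG hr_prod
MANAGED LOCATION 'abfss://mycompany-hr-prod@storage-account.dfs.core.windows.net/unity-catalog';

-- Optional: a schema-level location for more granular organization
-- within the catalog.
CREATE SCHEMA hr_prod.payroll
MANAGED LOCATION 'abfss://mycompany-hr-prod@storage-account.dfs.core.windows.net/unity-catalog/payroll';
```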
If storage isolation is not required for some catalogs, you can optionally set a storage location at the metastore level. This location serves as a default location for managed tables and volumes in catalogs and schemas that don't have assigned storage. Typically, however, Databricks recommends that you assign separate managed storage locations for each catalog.
The system evaluates the hierarchy of storage locations from schema to catalog to metastore.
For example, if a table myCatalog.mySchema.myTable is created in my-region-metastore, the table storage location is determined according to the following rule:
- If a location has been provided for `mySchema`, the table is stored there.
- If not, and a location has been provided on `myCatalog`, the table is stored there.
- Finally, if no location has been provided on `myCatalog`, the table is stored in the location associated with `my-region-metastore`.

For more information, see Specify a managed storage location in Unity Catalog.
Environment isolation using workspace-catalog binding
By default, catalog owners (and metastore admins, if they are defined for the account) can make a catalog accessible to users in multiple workspaces attached to the same Unity Catalog metastore.
Organizational and compliance requirements often specify that you keep certain data, like personal data, accessible only in certain environments. You may also want to keep production data isolated from development environments or ensure that certain data sets and domains are never joined together.
In Azure Databricks, the workspace is the primary data processing environment, and catalogs are the primary data domain. Unity Catalog lets metastore admins, catalog owners, and users with the MANAGE permission assign, or "bind," catalogs to specific workspaces. These environment-aware bindings ensure that only certain catalogs are available within a workspace, regardless of the privileges on data objects granted to a user. For example, you might bind a production catalog only to production workspaces, or restrict a catalog containing personal data to a dedicated workspace, so that those kinds of data are processed only in those environments. This is known as workspace-catalog binding. See Limit catalog access to specific workspaces.

Note
For increased data isolation, you can also bind cloud storage access and cloud service access to specific workspaces. See (Optional) Assign a storage credential to specific workspaces, (Optional) Assign an external location to specific workspaces, and (Optional) Assign a service credential to specific workspaces.
How do I set up Unity Catalog for my organization?
To use Unity Catalog, your Azure Databricks workspace must be enabled for Unity Catalog, which means that the workspace is attached to a Unity Catalog metastore.
How does a workspace get attached to a metastore? It depends on the account and the workspace:
- Typically, when you create an Azure Databricks workspace in a region for the first time, the metastore is created automatically and attached to the workspace.
- For some older accounts, an account admin must create the metastore and assign the workspaces in that region to the metastore. For instructions, see Create a Unity Catalog metastore.
- If an account already has a metastore assigned for a region, an account admin can decide whether to attach the metastore automatically to all new workspaces in that region. See Enable a metastore to be automatically assigned to new workspaces.
Whether or not your workspace was enabled for Unity Catalog automatically, the following steps are also required to get started with Unity Catalog:
- Create catalogs and schemas to contain database objects like tables and volumes.
- Create managed storage locations to store the managed tables and volumes in these catalogs and schemas.
- Grant user access to catalogs, schemas, and database objects.
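The steps above can be sketched in SQL. The catalog, schema, and group names are hypothetical, and the managed-storage step is covered in the sections above:

```sql
-- 1. Create a catalog and a schema to contain database objects.
CREATE CATALOG IF NOT EXISTS analytics;
CREATE SCHEMA IF NOT EXISTS analytics.reporting;

-- 2. Grant access: USE privileges on the containers make them visible,
--    then object-level privileges grant access to the data itself.
GRANT USE CATALOG ON CATALOG analytics TO `data-consumers`;
GRANT USE SCHEMA ON SCHEMA analytics.reporting TO `data-consumers`;
GRANT SELECT ON SCHEMA analytics.reporting TO `data-consumers`;
```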
Workspaces that are automatically enabled for Unity Catalog provision a workspace catalog with broad privileges granted to all workspace users. This catalog is a convenient starting point for trying out Unity Catalog.
For detailed setup instructions, see Get started with Unity Catalog.
Upgrading an existing workspace to Unity Catalog
To learn how to upgrade a non-Unity Catalog workspace to Unity Catalog, see Upgrade an Azure Databricks workspace to Unity Catalog.
Unity Catalog requirements and restrictions
Unity Catalog requires specific types of compute and file formats, described below. Also listed below are some Azure Databricks features that are not fully supported in Unity Catalog on all Databricks Runtime versions.
Region support
All regions support Unity Catalog. For details, see Azure Databricks regions.
Compute requirements
Unity Catalog is supported on clusters that run Databricks Runtime 11.3 LTS or above. Unity Catalog is supported by default on all SQL warehouse compute versions.
Clusters running on earlier versions of Databricks Runtime do not provide support for all Unity Catalog GA features and functionality.
To access data in Unity Catalog, clusters must be configured with the correct access mode. Unity Catalog is secure by default. If a cluster is not configured with standard or dedicated access mode, the cluster can't access data in Unity Catalog. See Access modes.
For detailed information about Unity Catalog functionality changes in each Databricks Runtime version, see the release notes.
File format support
Unity Catalog supports the following table formats:
- Managed tables must use the `delta` or `iceberg` table format.
- External tables can use `delta`, `CSV`, `JSON`, `avro`, `parquet`, `ORC`, or `text`.
Limitations
Unity Catalog has the following limitations. Some of these are specific to older Databricks Runtime versions and compute access modes.
Structured Streaming workloads have additional limitations, depending on Databricks Runtime and access mode. See Standard compute requirements and limitations and Dedicated compute requirements and limitations.
Databricks releases new functionality that shrinks this list regularly.
- Groups that were previously created in a workspace (that is, workspace-level groups) cannot be used in Unity Catalog `GRANT` statements. This is to ensure a consistent view of groups that can span across workspaces. To use groups in `GRANT` statements, create your groups at the account level and update any automation for principal or group management (such as SCIM, Okta and Microsoft Entra ID connectors, and Terraform) to reference account endpoints instead of workspace endpoints. See Group sources.
- Workloads in R do not support the use of dynamic views for row-level or column-level security on compute running Databricks Runtime 15.3 and below. Use a dedicated compute resource running Databricks Runtime 15.4 LTS or above for workloads in R that query dynamic views. Such workloads also require a workspace that is enabled for serverless compute. For details, see Fine-grained access control on dedicated compute.
- Shallow clones are unsupported in Unity Catalog on compute running Databricks Runtime 12.2 LTS and below. You can use shallow clones to create managed tables on Databricks Runtime 13.3 LTS and above. You cannot use them to create external tables, regardless of Databricks Runtime version. See Shallow clone for Unity Catalog tables.
- Bucketing is not supported for Unity Catalog tables. Commands that try to create a bucketed table in Unity Catalog throw an exception.
- Writing to the same path or Delta Lake table from workspaces in multiple regions can lead to unreliable performance if some clusters access Unity Catalog and others do not.
- Manipulating partitions for external tables using commands like `ALTER TABLE ADD PARTITION` requires partition metadata logging to be enabled. See Partition discovery for external tables.
- When using overwrite mode for tables not in Delta format, the user must have the CREATE TABLE privilege on the parent schema and must either be the owner of the existing object or have the MODIFY privilege on the object.
- Python UDFs are not supported in Databricks Runtime 12.2 LTS and below. This includes UDAFs, UDTFs, and Pandas on Spark (`applyInPandas` and `mapInPandas`). Python scalar UDFs are supported in Databricks Runtime 13.3 LTS and above.
- Scala UDFs are not supported in Databricks Runtime 14.1 and below on compute with standard access mode. Scala scalar UDFs are supported in Databricks Runtime 14.2 and above on compute with standard access mode.
- Standard Scala thread pools are not supported. Instead, use the special thread pools in `org.apache.spark.util.ThreadUtils`, for example, `org.apache.spark.util.ThreadUtils.newDaemonFixedThreadPool`. However, the following thread pools in `ThreadUtils` are not supported: `ThreadUtils.newForkJoinPool` and any `ScheduledExecutorService` thread pool.
- Azure diagnostic logs only log Unity Catalog events at the workspace level. To view account-level actions, you must use the audit log system table. See Audit log system table reference.
Models registered in Unity Catalog have additional limitations. See Limitations.
Resource quotas
Unity Catalog enforces resource quotas on all securable objects. These quotas are listed in Resource limits. If you expect to exceed these resource limits, contact your Azure Databricks account team.
You can monitor your quota usage using the Unity Catalog resource quotas APIs. See Monitor your usage of Unity Catalog resource quotas.