Create clusters & SQL warehouses with Unity Catalog access
This article shows how to create an Azure Databricks cluster or SQL warehouse that can access data in Unity Catalog.
SQL warehouses are used to run Databricks SQL workloads, such as queries, dashboards, and visualizations. SQL warehouses allow you to access Unity Catalog data and run Unity Catalog-specific commands by default, as long as your workspace is attached to a Unity Catalog metastore.
Clusters are used to run workloads using notebooks or automated jobs. To create a cluster that can access Unity Catalog, the workspace you are creating the cluster in must be attached to a Unity Catalog metastore and must use a Unity-Catalog-capable access mode (shared or single user).
You can work with data in Unity Catalog using either of these compute resources, depending on your environment: SQL warehouses for the SQL editor and clusters for notebooks.
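Whichever compute resource you use, Unity Catalog objects are addressed with a three-level namespace: `catalog.schema.table`. The sketch below builds a fully qualified query; the catalog, schema, and table names are hypothetical examples, not objects that exist by default.

```python
# Minimal sketch of Unity Catalog's three-level namespace.
# The names below are hypothetical, not objects that exist by default.
catalog, schema, table = "main", "sales", "orders"
fq_name = f"{catalog}.{schema}.{table}"          # catalog.schema.table
query = f"SELECT * FROM {fq_name} LIMIT 10"

# In a notebook on a Unity-Catalog-capable cluster you would run: spark.sql(query)
# In the SQL editor, you would run the same statement against a SQL warehouse.
print(query)
```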
What is cluster access mode?
When you create any cluster in Azure Databricks, you must select an access mode suited to the type of workload you want to run on the cluster. Unity Catalog enforces security using cluster access modes: if a cluster is not configured with one of the Unity-Catalog-capable access modes (shared or single user), it can't access data in Unity Catalog.
The following table lists all available access modes:
| Access mode | Visible to user | UC support | Supported languages | Notes |
|---|---|---|---|---|
| Single user | Always | Yes | Python, SQL, Scala, R | Can be assigned to and used by a single user. See single user limitations. |
| Shared | Always (Premium plan required) | Yes | Python (on Databricks Runtime 11.3 LTS and above), SQL, Scala (on Unity Catalog-enabled clusters using Databricks Runtime 13.3 and above) | Can be used by multiple users, with data isolation among users. See shared limitations. |
| No Isolation Shared | Admins can hide this cluster type by enforcing user isolation on the admin settings page. | No | Python, SQL, Scala, R | There is a related account-level setting for No Isolation Shared clusters. |
| Custom | Hidden (for all new clusters) | No | Python, SQL, Scala, R | This option is shown only if you have existing clusters without a specified access mode. |
You can upgrade an existing cluster to meet the requirements of Unity Catalog by setting its access mode to Single User or Shared. Additional access mode limitations apply to Structured Streaming on Unity Catalog; see Structured Streaming support.
Important
Access mode in the Clusters API is not supported.
Do init scripts and libraries work with Unity Catalog?
In Databricks Runtime 13.3 LTS and above, init scripts and libraries are supported on all access modes. You can add libraries and init scripts to the allowlist in Unity Catalog. This allows users to leverage these artifacts on compute configured with shared access mode. See Allowlist libraries and init scripts on shared compute.
For a reference on requirements and support for libraries and init scripts, see Compute compatibility with libraries and init scripts.
Single user access mode limitations
- To read from a view, you must have `SELECT` on all referenced tables and views.
- Dynamic views are not supported.
- When used with credential passthrough, Unity Catalog features are disabled.
- You cannot use a single user cluster to query tables created by a Unity Catalog-enabled Delta Live Tables pipeline, including streaming tables and materialized views created in Databricks SQL. To query tables created by a Delta Live Tables pipeline, you must use a shared cluster running Databricks Runtime 13.1 or above.
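Because reading a view requires `SELECT` on every table and view it references, you may need to grant that privilege on each referenced object explicitly. A minimal sketch that generates standard Unity Catalog `GRANT` statements; the object and principal names are hypothetical.

```python
# Build GRANT statements for every object a view references.
# Object and principal names here are hypothetical examples.
referenced = ["main.sales.orders", "main.sales.customers"]
principal = "`data-analysts`"  # group name, backquoted as in Databricks SQL
grants = [f"GRANT SELECT ON TABLE {obj} TO {principal}" for obj in referenced]
for stmt in grants:
    # Run each statement with spark.sql(stmt) or in the SQL editor.
    print(stmt)
```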
Shared access mode limitations
- When used with credential passthrough, Unity Catalog features are disabled.
- Custom containers are not supported.
- Spark-submit jobs are not supported.
- Databricks Runtime ML is not supported.
- You cannot use R, RDD APIs, or clients that read data directly from cloud storage, such as DBUtils.
- You can use Scala only on Databricks Runtime 13.3 and above.
The following limitations exist for user-defined functions (UDFs):
- Hive and Scala UDFs are not supported.
- In Databricks Runtime 13.1 and below, you cannot use Python UDFs, including UDAFs, UDTFs, and Pandas on Spark (`applyInPandas` and `mapInPandas`). In Databricks Runtime 13.2 and above, Python UDFs are supported.
- See User-defined functions (UDFs) in Unity Catalog.
- Commands on cluster nodes run as a low-privilege user that is forbidden from accessing sensitive parts of the filesystem. In Databricks Runtime 11.3 and below, you can create network connections only to ports 80 and 443.
- You cannot connect to the instance metadata service or to Azure WireServer.

Attempts to get around these restrictions will fail. These restrictions are in place so that users can't access unauthorized data through the cluster.
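To illustrate the Pandas on Spark limitation above: `applyInPandas` runs a pandas function once per group of a Spark DataFrame, which on a shared cluster requires Databricks Runtime 13.2 or above. The sketch below reproduces the same per-group logic with plain pandas (no Spark required), assuming pandas is installed; the column names are made up.

```python
import pandas as pd

# The per-group function that applyInPandas would run on each group of a
# Spark DataFrame. Here it is applied with plain pandas for illustration.
def subtract_mean(pdf: pd.DataFrame) -> pd.DataFrame:
    return pdf.assign(v=pdf["v"] - pdf["v"].mean())

df = pd.DataFrame({"id": [1, 1, 2], "v": [1.0, 2.0, 3.0]})

# Spark equivalent (needs Databricks Runtime 13.2+ on shared access mode):
#   sdf.groupBy("id").applyInPandas(subtract_mean, schema="id long, v double")
result = pd.concat(subtract_mean(g) for _, g in df.groupby("id"))
print(sorted(result["v"].tolist()))  # [-0.5, 0.0, 0.5]
```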
Requirements
- Your Azure Databricks account must be on the Premium plan.
- You must have permission to create a cluster. See Configure cluster creation entitlement.
Create a cluster that can access Unity Catalog
A cluster is designed for running workloads such as notebooks and automated jobs.
To create a cluster that can access Unity Catalog, the workspace must be attached to a Unity Catalog metastore.
Databricks Runtime requirements
Unity Catalog requires clusters that run Databricks Runtime 11.3 LTS or above.
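A Databricks Runtime version string, as it appears in a cluster configuration's `spark_version` field (for example, `13.3.x-scala2.12`), encodes the runtime version in its first two components. A small sketch of validating one against the 11.3 minimum; the helper name is made up.

```python
# Hypothetical helper: check whether a Databricks Runtime version string
# (e.g. "13.3.x-scala2.12") meets the 11.3 LTS minimum that Unity Catalog requires.
def meets_uc_minimum(spark_version: str, minimum=(11, 3)) -> bool:
    major, minor = spark_version.split(".")[:2]
    return (int(major), int(minor)) >= minimum

print(meets_uc_minimum("13.3.x-scala2.12"))  # True
print(meets_uc_minimum("10.4.x-scala2.12"))  # False
```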
Steps
To create a cluster:
In the sidebar, click New > Cluster.
Choose the access mode you want to use.
For clusters that run on standard Databricks Runtime versions, select either Single User or Shared access mode to connect to Unity Catalog. If you use Databricks Runtime for Machine Learning, you must select Single User access mode to connect to Unity Catalog. See What is cluster access mode?
Select a Databricks Runtime version of 11.3 LTS or above.
Complete your cluster configuration and click Create Cluster.
When the cluster is available, it will be able to run workloads that use Unity Catalog.
Create a SQL warehouse that can access Unity Catalog
A SQL warehouse is required to run workloads in Databricks SQL, such as queries, dashboards, and visualizations. By default, all SQL warehouses can connect to Unity Catalog. See Configure SQL warehouses for specific configuration options.