Enable OneLake catalog federation

Important

This feature is in Beta. Workspace admins can control access by enabling OneLake Read Federation on the Previews page. See Manage Azure Databricks previews.

After enabling the feature, you must restart your compute cluster or SQL warehouse.

This article shows how to read data in OneLake using catalog federation, which allows Unity Catalog queries to run directly against OneLake storage.

OneLake federation enables you to analyze data stored in your Fabric Lakehouse or Warehouse without copying it, bringing the analytics and AI/BI capabilities of Azure Databricks directly to your OneLake data. Data access is read-only.

Before you begin

You must meet the following requirements to run federated queries on OneLake using catalog federation:

Workspace requirements:

  • Workspace enabled for Unity Catalog.

Compute requirements:

  • Network connectivity from your compute resource to the target database systems. See Networking recommendations for Lakehouse Federation.
  • Azure Databricks compute must use Databricks Runtime 18.0 or above and Standard access mode. Dedicated access mode isn't supported.
  • SQL warehouses must be pro and must run version 2025.35 or above. Serverless SQL warehouses aren't supported.

Permissions required:

  • To create a connection, you must be a metastore admin or a user with the CREATE CONNECTION and CREATE STORAGE CREDENTIAL privileges on the Unity Catalog metastore attached to the workspace.
  • To create a foreign catalog, you must have the CREATE CATALOG permission on the metastore and be either the owner of the connection or have the CREATE FOREIGN CATALOG privilege on the connection.

Additional permission requirements are specified in each task-based section that follows.

  • You must have permissions to create resources in Azure, configure access in Fabric, and manage Unity Catalog in Azure Databricks.
  • Supported authentication methods:
    • Azure Managed Identity via an Access Connector for Azure Databricks
    • Azure service principal

The following Fabric data items are supported:

  • Fabric Lakehouse
  • Fabric Warehouse

Set up catalog federation

The following steps guide you through creating the connection and foreign catalog for OneLake federation.

Step 1: Create an access connector in Azure

The Access Connector for Azure Databricks provides a managed identity that Azure Databricks uses to authenticate to OneLake.

  1. In the Azure portal, search for and create a new Access Connector for Azure Databricks resource.

  2. Follow the prompts to create the connector. The resource is created with a system-assigned managed identity.

  3. Record the Resource ID of the newly created connector. You need this ID when creating the Unity Catalog storage credential.

    The resource ID is in the format:

    /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Databricks/accessConnectors/<connector-name>
    

For more information about creating access connectors and using managed identities, see Use Azure managed identities in Unity Catalog to access storage.

Step 2: Grant permissions in Fabric

Grant the managed identity or service principal permission to read the Fabric data.

  1. In the Fabric portal, navigate to the workspace that contains your Lakehouse or Warehouse data items.
  2. In the workspace, click the Workspace settings gear icon, then click Manage access.
  3. Click Add people or groups.
  4. Search for and select the managed identity or service principal. The name should match the access connector you created earlier.
  5. Assign the identity the Member role at minimum. The Contributor and Admin roles also work.
  6. Click Add.
  7. Verify that the identity appears in the access list with the appropriate role. Permissions on individual Lakehouse and Warehouse items are inherited from the workspace-level role.

Step 3: Create a storage credential

Create a storage credential in Unity Catalog that references the access connector you created earlier.

  1. In your Azure Databricks workspace, click Catalog.
  2. At the top of the Catalog pane, click the plus icon and select Create a credential from the menu.
  3. In the Create a new credential modal, for Credential Type, choose Azure Managed Identity.
  4. For Credential name, enter a name for the storage credential (for example, onelake_storage_cred).
  5. For Access connector ID, enter the resource ID of the access connector you created earlier.
  6. (Optional) Add a comment.
  7. Click Create.

Step 4: Create a Unity Catalog connection

Create a Unity Catalog connection that uses the storage credential to access OneLake.

  1. In your Azure Databricks workspace, click Catalog.
  2. At the top of the Catalog pane, click the plus icon and select Create a connection from the menu.
  3. On the Connection basics page, enter a Connection name (for example, onelake_connection).
  4. Select a Connection type of OneLake.
  5. (Optional) Add a comment.
  6. Click Next.
  7. On the Connection details page, for Credential, select the storage credential you created in the previous step (for example, onelake_storage_cred).
  8. For Workspace, enter the workspace ID of the Fabric workspace that contains your data items.
  9. Click Create connection.

After the connection is created, you can close the dialog. If you prefer to script the connection, see the sketch below.
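
Unity Catalog also exposes connections through SQL. The following is a minimal sketch of what an equivalent statement could look like, assuming a connection type named onelake and option keys credential and workspaceId; these names are assumptions, not confirmed syntax for this Beta feature, so verify them against a connection created through the UI before scripting:

-- Hypothetical SQL equivalent of the UI steps above. CREATE CONNECTION is
-- standard Unity Catalog SQL, but the OneLake type name and the option keys
-- below are assumptions for this Beta feature.
CREATE CONNECTION onelake_connection TYPE onelake
OPTIONS (
  credential 'onelake_storage_cred',    -- storage credential from Step 3
  workspaceId '<fabric-workspace-id>'   -- ID of your Fabric workspace
);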

Step 5: Create a foreign catalog

A foreign catalog links a specific Fabric data item to a catalog in Unity Catalog.

Get the Fabric data item ID

  1. In the Fabric portal, navigate to the target Lakehouse or Warehouse.

  2. Copy the Data Item ID, which is a GUID (for example, f089354e-8366-4e18-aea3-4cb4a3a50b48).

    You can find this GUID in the Fabric UI or by copying it from your browser URL when you navigate to the Lakehouse or Warehouse:

    https://app.powerbi.com/groups/<workspace-id>/lakehouses/<data-item-id>?experience=power-bi
    

Create the catalog

  1. In your Databricks workspace, click Catalog.
  2. At the top of the Catalog pane, click the plus icon and select Create a catalog from the menu.
  3. In the Create a new catalog dialog, enter a name for the catalog (for example, fabric_sales).
  4. Select a Type of Foreign.
  5. Select the Connection you created in Step 4 (for example, onelake_connection).
  6. For Data item, enter the data item ID you copied from the Fabric portal.
  7. (Optional) Click Test connection to validate your configuration.
  8. Click Create.

The catalog syncs automatically, making the Fabric tables available immediately.
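
If you prefer to script this step, CREATE FOREIGN CATALOG ... USING CONNECTION is standard Unity Catalog SQL. A minimal sketch follows; the option key used to pass the Fabric data item ID is an assumption for this Beta feature, so confirm it against a catalog created through the UI:

-- Hypothetical SQL equivalent of the steps above. The dataItemId option key
-- is an assumption; the connection and catalog names match the examples in
-- this article, and the GUID is the example data item ID from Step 5.
CREATE FOREIGN CATALOG fabric_sales
USING CONNECTION onelake_connection
OPTIONS (dataItemId 'f089354e-8366-4e18-aea3-4cb4a3a50b48');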

Grant permissions on federated tables

After setting up catalog federation, users must have the appropriate Unity Catalog permissions to access federated tables:

  • All users need USE CATALOG and USE SCHEMA permissions on the catalog and schema respectively.
  • To read from a federated table, users need the SELECT permission, as shown in the example below.
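
For example, the following standard Unity Catalog GRANT statements give a group read access to the example catalog used in this article. The group name data_analysts is a placeholder; substitute your own principal:

-- Grant read access on the example foreign catalog to a placeholder group.
GRANT USE CATALOG ON CATALOG fabric_sales TO `data_analysts`;
GRANT USE SCHEMA ON SCHEMA fabric_sales.silver TO `data_analysts`;
GRANT SELECT ON TABLE fabric_sales.silver.customer_details TO `data_analysts`;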

For more information about Unity Catalog privileges and how to grant them, see Manage privileges in Unity Catalog.

Query OneLake data

After the setup is complete, you can find and query OneLake data in Unity Catalog.

Browse the catalog

  1. In your Databricks workspace, navigate to Catalog Explorer.
  2. Locate the catalog you created (for example, fabric_sales).
  3. Expand the catalog to see the synchronized schemas and tables from the Fabric Lakehouse or Warehouse. You can also list these objects in SQL, as shown below.
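
The following standard SQL commands list the same objects from a notebook or the SQL editor, using the example catalog and schema names from this article:

-- List the schemas and tables synchronized from the Fabric data item.
SHOW SCHEMAS IN fabric_sales;
SHOW TABLES IN fabric_sales.silver;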

Run queries

Use the three-part naming convention (catalog.schema.table) in Databricks SQL or notebooks:

-- Count the rows in a federated table.
SELECT COUNT(*)
FROM fabric_sales.silver.customer_details;

-- List the top 10 customers by total purchases.
SELECT
  customer_id,
  customer_name,
  total_purchases
FROM fabric_sales.silver.customer_details
WHERE total_purchases > 1000
ORDER BY total_purchases DESC
LIMIT 10;

Limitations

OneLake federation has the following limitations:

  • Read-only access: Only SELECT queries are supported. Write operations are not available.
  • Authentication: Azure Managed Identity and Azure service principal are the only supported authentication methods.
  • Supported data items: Only Fabric Lakehouse and Warehouse items are supported.
  • Compute requirements: You must use Databricks Runtime 18.0 or above with Standard access mode. Dedicated access mode and serverless compute aren't supported.