Customer-managed keys for Unity Catalog

Customer-managed keys (CMK) for Unity Catalog let you protect data managed by Azure Databricks with your own encryption keys. You can configure encryption at the catalog level, using a separate key for each catalog based on data sensitivity or compliance requirements.

For information about CMK for managed services and workspace storage, see Customer-managed keys for encryption.

What is CMK for Unity Catalog?

CMK for Unity Catalog lets you protect data in Unity Catalog catalogs backed by default storage using your own encryption keys from Azure Key Vault.

Azure Databricks encrypts all data at rest by default using Azure Databricks-managed keys. For more granular control, CMK lets you configure a separate customer-managed key for specific catalogs. To deny access to the data in those catalogs, revoke the key in Azure Key Vault.

Benefits of CMK for Unity Catalog

  • Granular encryption control: Manage encryption at the catalog level, allowing different catalogs to use different encryption keys based on data sensitivity or compliance requirements.
  • Multi-key protection: CMK secures your data against access at the storage layer. Data can be accessed only from authorized workspaces, subject to fine-grained Unity Catalog policies.
  • Compliance and audit: Meet regulatory requirements for customer-controlled encryption keys and maintain audit trails of key access and usage.
  • Key revocation: You retain full ownership of your data. Revoking access to the CMK in Azure Key Vault makes the data encrypted with that key inaccessible.
  • Centralized key management: Manage all encryption keys through Azure Key Vault, consistent with your existing security practices.

How CMK for Unity Catalog works

CMK for Unity Catalog on Azure uses Azure Key Vault keys and catalog-level encryption settings to enforce customer-controlled encryption. The following components are central to CMK for Unity Catalog on Azure:

  • Azure Key Vault keys: You create and manage encryption keys in Azure Key Vault. These keys are part of a multi-key encryption hierarchy that Azure Databricks uses to protect data in Unity Catalog catalogs.
  • Direct key reference: You reference your Key Vault key directly on the catalog using the key URI and tenant ID. No account-level CMK configuration object is required.
  • Azure tenant ID: You must provide your Azure tenant ID to allow Azure Databricks to access your Key Vault key.
  • Catalog-level encryption: You configure encryption directly on individual catalogs using Catalog Explorer or the Unity Catalog API. When you create or update a catalog with CMK settings, Azure Databricks encrypts all data written to that catalog using your customer-managed key. This applies only to catalogs backed by default storage.
  • Dynamic enforcement: When data is written to a CMK-protected catalog, Azure Databricks uses your Key Vault key to encrypt the data. When data is read, Azure Databricks requests decryption from Azure Key Vault. If you revoke Azure Databricks access to the key, decryption fails and data becomes inaccessible.

Limitations

  • You can configure this feature only in Catalog Explorer or with the REST API. Terraform support isn't available.
  • This feature only applies to catalogs backed by default storage. It doesn't apply to catalogs with external storage locations.

Prerequisites

Before you configure CMK for Unity Catalog on Azure, verify that you have the following:

  • Azure Key Vault key: You must have an existing key in Azure Key Vault. Follow the Azure Key Vault quickstart guide to create a key if needed. Copy the key URI, which has the format: https://<vault-name>.vault.azure.net/keys/<key-name>/<key-version>.
  • Azure tenant ID: You need your Azure tenant ID, which you can find in the Azure portal.
  • Unity Catalog permissions: To create a catalog with CMK, you must have the CREATE CATALOG and USE CATALOG privileges in Unity Catalog. To update an existing catalog with CMK, you must have MANAGE on the catalog or own the catalog.
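
A malformed key URI is an easy mistake to make before calling the API, so a quick local check can help. The following is a minimal sketch, assuming only the URI format given in the prerequisites above. The function name is_versioned_key_uri is hypothetical and not part of any Azure or Databricks tool, and the check is purely textual: it doesn't confirm that the key exists.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the argument looks like
# https://<vault-name>.vault.azure.net/keys/<key-name>/<key-version>
is_versioned_key_uri() {
  uri=$1
  # Require the https scheme, the vault host suffix, and two path segments after /keys/.
  case "$uri" in
    https://*.vault.azure.net/keys/*/*) ;;
    *) return 1 ;;
  esac
  # Drop everything up to and including /keys/, leaving <key-name>/<key-version>.
  rest=${uri#https://*.vault.azure.net/keys/}
  name=${rest%%/*}
  version=${rest#*/}
  # Both segments must be non-empty.
  [ -n "$name" ] && [ -n "$version" ] || return 1
  # The version must be the last segment (no further slashes).
  case "$version" in
    */*) return 1 ;;
  esac
  return 0
}
```

If you use the Azure CLI, az account show --query tenantId --output tsv prints the tenant ID of the currently selected subscription, which is an alternative to looking it up in the Azure portal.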

Configure CMK for Unity Catalog

To configure customer-managed keys for Unity Catalog catalogs on Azure, either create a new catalog with CMK (Step 1) or add CMK to an existing catalog (Step 2).

Step 1: Create a new catalog with CMK

Permissions required: CREATE CATALOG in Unity Catalog

To create a new catalog with CMK protection, use the Unity Catalog API:

curl -X POST \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  https://<workspace_url>/api/2.1/unity-catalog/catalogs \
  -d '{
    "name": "<catalog_name>",
    "comment": "Catalog with customer-managed encryption",
    "storage_mode": "DEFAULT_STORAGE",
    "encryption_settings": {
      "azure_key_vault_key_id": "https://<vault-name>.vault.azure.net/keys/<key-name>/<key-version>",
      "azure_encryption_settings": {
        "azure_tenant_id": "<tenant-id>"
      }
    }
  }'

Replace the following values:

  • <workspace_url>: Your Azure Databricks workspace URL
  • <api_token>: Your Azure Databricks personal access token
  • <catalog_name>: The name for your new catalog (for example, finance_data or customer_pii)
  • <vault-name>: Your Azure Key Vault name
  • <key-name>: Your key name in Azure Key Vault
  • <key-version>: The specific version of your key
  • <tenant-id>: Your Azure tenant ID
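
If you create catalogs from a script, it can help to separate building the request body from sending it. The sketch below wraps the request shown above in two shell functions. The function names and the WORKSPACE_URL and API_TOKEN environment variables are assumptions made for this example, and the optional comment field is omitted. The payload builder doesn't escape JSON, so it assumes the values contain no double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: prints the JSON body for a CMK-protected catalog.
# $1 = catalog name, $2 = Key Vault key URI, $3 = Azure tenant ID
build_cmk_catalog_payload() {
  printf '{"name":"%s","storage_mode":"DEFAULT_STORAGE","encryption_settings":{"azure_key_vault_key_id":"%s","azure_encryption_settings":{"azure_tenant_id":"%s"}}}' "$1" "$2" "$3"
}

# Hypothetical helper: sends the create request with the same endpoint and
# headers as the curl command above. Expects WORKSPACE_URL and API_TOKEN to
# be set in the environment; --fail makes curl exit non-zero on HTTP errors.
create_cmk_catalog() {
  curl --fail --silent --show-error -X POST \
    -H "Authorization: Bearer ${API_TOKEN}" \
    -H "Content-Type: application/json" \
    "https://${WORKSPACE_URL}/api/2.1/unity-catalog/catalogs" \
    -d "$(build_cmk_catalog_payload "$1" "$2" "$3")"
}
```

With WORKSPACE_URL and API_TOKEN exported, a call takes the form create_cmk_catalog finance_data "<key-uri>" "<tenant-id>".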

Step 2: Update an existing catalog with CMK

Permissions required: MANAGE on the catalog or ownership of the catalog

To add or change CMK protection on an existing catalog that uses default storage:

  1. In Catalog Explorer, click the catalog name.
  2. Click the Details tab.
  3. Under Advanced, click Encryption settings.
  4. In the dialog, select your customer-managed key.
  5. Click Save.

You can change the key associated with a catalog at any time by repeating these steps. You can't disable CMK after it's enabled on a catalog.

Important

When you add CMK to an existing catalog, Azure Databricks encrypts only new data written to the catalog with your customer-managed key. Azure Databricks-managed keys continue to encrypt existing data. To encrypt all data with your customer-managed key, you must rewrite the existing data.

Verify CMK configuration

To verify that your catalog is configured with CMK, use the Unity Catalog API to get the catalog details:

curl -X GET \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  "https://<workspace_url>/api/2.1/unity-catalog/catalogs/<catalog_name>"

The response includes the encryption_settings field for catalogs configured with CMK:

{
  "name": "<catalog_name>",
  "storage_mode": "DEFAULT_STORAGE",
  "encryption_settings": {
    "customer_managed_key_id": "<cmk-id>"
  }
}
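
In an automated check, you might only need a yes or no answer. The following sketch tests whether a response body read from standard input mentions the encryption_settings field. The function name is hypothetical, and the test is a plain text match rather than real JSON parsing, so treat it as a rough smoke test.

```shell
#!/bin/sh
# Hypothetical helper: reads a catalog response body on stdin and succeeds
# if it contains an "encryption_settings" field name (textual match only).
has_encryption_settings() {
  grep -q '"encryption_settings"'
}
```

For example, you can pipe the output of the GET request above into has_encryption_settings and branch on its exit status.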

Revoke access to encrypted data

To deny Azure Databricks access to data encrypted with your customer-managed key, disable your key in Azure Key Vault:

  1. In the Azure portal, go to your Key Vault.
  2. Locate your key and disable it.

After you disable the key, Azure Databricks can no longer decrypt data in catalogs that use this key. Any attempt to read data from these catalogs fails with a decryption error.

There might be a delay between the time you disable the key and when data access is denied.

To restore access, re-enable the key in Azure Key Vault.
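
As an alternative to the portal, the Azure CLI can disable and re-enable a key with az keyvault key set-attributes. The wrapper below is a sketch under assumptions: the function name and the DRY_RUN variable are hypothetical conveniences for this example, and it assumes you are already signed in with az login and have permission to update the key. Without a --version argument the command targets the current version of the key, so if your catalog references a different version you might need to add --version.

```shell
#!/bin/sh
# Hypothetical helper: enables or disables a Key Vault key.
# $1 = vault name, $2 = key name, $3 = true (enable) or false (disable)
# Set DRY_RUN to a non-empty value to print the command instead of running it.
set_key_enabled() {
  case "$3" in
    true|false) ;;
    *) echo "third argument must be true or false" >&2; return 2 ;;
  esac
  # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
  ${DRY_RUN:+echo} az keyvault key set-attributes \
    --vault-name "$1" --name "$2" --enabled "$3"
}
```

Calling set_key_enabled with false disables the key, and calling it again with true re-enables it to restore access.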