Enable customer-managed keys for managed services

Note

This feature requires the Premium Plan.

For additional control of your data, you can add your own key to protect and control access to some types of data. Azure Databricks has three customer-managed key features for different types of data and locations. To compare them, see Customer-managed keys for encryption.

Managed services data in the Azure Databricks control plane is encrypted at rest. You can add a customer-managed key for managed services to help protect and control access to the following types of encrypted data:

After you add customer-managed key encryption for a workspace, Azure Databricks uses your key to control access to the key that encrypts future write operations to your workspace’s managed services data. Existing data is not re-encrypted. The data encryption key is cached in memory for several read and write operations and evicted from memory at a regular interval. New requests for that data require another request to your cloud service’s key management system. If you delete or revoke your key, reading or writing to the protected data fails at the end of the cache time interval.

You can rotate (update) the customer-managed key at a later time. See Rotate the key.

Important

After you run the key rotation command, you must keep your old KMS key available to Azure Databricks for 24 hours.

Note

This feature does not encrypt data stored outside of the control plane. To encrypt data in your workspace’s DBFS root storage, see Configure customer-managed keys for DBFS root.

Step 1: Set up a Key Vault

You must create an Azure Key Vault instance and set its permissions. You can do this through the Azure portal, CLI, or APIs.

These instructions offer details for two deployment options:

Step 1a: Use the Azure portal

  1. Create or select a Key Vault:
    • To create a Key Vault, go to the Azure portal page for creating a Key Vault. Click + Create. Enter the resource group name, Key Vault name, region, and pricing tier. Click Review + create and then click Create.
    • To use an existing Key Vault, copy its Key Vault name for the next step.
  2. Get the object ID of the AzureDatabricks application:
    1. In the Azure portal, go to Azure Active Directory.
    2. Select Enterprise Applications from the sidebar menu.
    3. Search for AzureDatabricks and click the Enterprise Application in the results.
    4. From Properties, copy the object ID.
  3. Add an access policy to Key Vault using the Azure portal:
    1. Navigate to the Azure Key Vault that you will use to configure customer-managed keys for managed services for your workspace.

    2. Click the Access policies tab from the left-side panel.

    3. Select the Create button found at the top of the page.

    4. Under the Key permissions section in the Permissions tab, enable Get, Unwrap Key, and Wrap Key.

    5. Click Next.

    6. On the Principal tab, type AzureDatabricks and scroll to the first Enterprise Application result that has an Application ID of 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d and select it.

    7. Continue to the Review + create tab and click Create.

Step 1b: Use the Azure CLI

Use the Azure CLI to complete the following instructions.

  1. Create a Key Vault or select an existing Key Vault.

    • To create a Key Vault, replace the items in brackets with your region, Key Vault name, and resource group name in this Azure CLI command:

      az keyvault create --location <region> \
                         --name <key-vault-name> \
                         --resource-group <resource-group-name>
      
    • To use an existing Key Vault, copy the Key Vault name for the next step.

  2. Get the object ID of the AzureDatabricks application with the Azure CLI:

    az ad sp show --id "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d" \
                  --query "id" \
                  --output tsv
    
  3. Make sure that you are using the correct Azure subscription. Replace <subscription-id> with your subscription ID:

    az account set --subscription <subscription-id>
    
  4. Add an access policy to Key Vault with the following command. Replace <key-vault-name> with the vault name from step 1 and replace <object-id> with the object ID of the AzureDatabricks application from step 2.

    az keyvault set-policy --name <key-vault-name> \
                           --key-permissions get wrapKey unwrapKey \
                           --object-id <object-id>
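
Steps 2 and 4 above can be chained so that the object ID does not have to be copied by hand. The following is a minimal sketch, not an official script: the function name grant_databricks_key_access is hypothetical, and it assumes you have already signed in and selected the subscription as in step 3.

```shell
# Hypothetical helper: look up the AzureDatabricks object ID and grant it
# the key permissions listed in step 4.
# $1: name of the Key Vault
grant_databricks_key_access() (
  vault_name="$1"
  app_id="2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"   # AzureDatabricks application ID (step 2)

  # Step 2: resolve the object ID of the AzureDatabricks application.
  object_id="$(az ad sp show --id "$app_id" --query "id" --output tsv)" || exit 1
  if [ -z "$object_id" ]; then
    echo "Could not resolve the AzureDatabricks object ID" >&2
    exit 1
  fi

  # Step 4: add the access policy for that object ID.
  az keyvault set-policy --name "$vault_name" \
                         --key-permissions get wrapKey unwrapKey \
                         --object-id "$object_id"
)
```

For example, grant_databricks_key_access <key-vault-name> runs both steps against the named vault.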
    

Step 2: Create or select a key

Create a key in the Key Vault. The key type must be RSA; the RSA key size and whether the key is HSM-backed do not matter. The Key Vault must be in the same Azure tenant as your Azure Databricks workspace. You can use the Azure portal, the Azure CLI, or other tooling.

To create the key with the Azure CLI, run this command:

az keyvault key create --name <key-name> \
                       --vault-name <key-vault-name>

Make note of the following values, which you can get from the key ID in the kid property in the response. You will use them in subsequent steps:

  • Key vault URL: The beginning part of the key ID that includes the Key Vault name. It has the form https://<key-vault-name>.vault.azure.net.
  • Key name: Name of your key.
  • Key version: Version of the key.

The full key ID has the form <key-vault-URL>/keys/<key-name>/<key-version>.
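
Because the key ID follows this fixed form, the three values can be split out with shell parameter expansion instead of copying them by hand. The sketch below uses a made-up key ID for illustration; the variable names are arbitrary.

```shell
# Split a full key ID (the kid value) into the three values used in later steps.
# The sample ID below is made up for illustration.
key_id="https://example-vault.vault.azure.net/keys/example-key/0123456789abcdef"

key_version="${key_id##*/}"                  # text after the last slash
without_version="${key_id%/*}"               # drop "/<key-version>"
key_name="${without_version##*/}"            # text after the remaining last slash
key_vault_url="${without_version%/keys/*}"   # drop "/keys/<key-name>"

printf 'Key vault URL: %s\nKey name:      %s\nKey version:   %s\n' \
  "$key_vault_url" "$key_name" "$key_version"
```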

If instead you use an existing key, get and copy these values for your key so you can use them in the next steps. Check to confirm that your existing key is enabled before proceeding.
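
One way to check that an existing key is enabled from the command line is sketched below. It assumes that the attributes.enabled field in the output of az keyvault key show reflects the key's state; the function name key_is_enabled is hypothetical.

```shell
# Hypothetical helper: succeed only if the latest version of the key is enabled.
# $1: Key Vault name, $2: key name
key_is_enabled() (
  vault_name="$1"
  key_name="$2"
  # Lowercase the value so that "True" and "true" are treated the same.
  enabled="$(az keyvault key show --vault-name "$vault_name" --name "$key_name" \
                                  --query "attributes.enabled" --output tsv \
             | tr '[:upper:]' '[:lower:]')"
  [ "$enabled" = "true" ]
)
```

For example: key_is_enabled <key-vault-name> <key-name> && echo "Key is enabled".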

Step 3: Add a key to a workspace

You can deploy a new workspace with a customer-managed key for managed services or add a customer-managed key to an existing workspace. You can do both with ARM templates, Azure portal, CLI, or other tools.

This section includes details for the following two deployment options:

Step 3a: Use the Azure portal without a template

  1. Go to the Azure portal homepage.

  2. Click Create a resource in the top left corner of the page.

  3. Within the search bar, type Azure Databricks and click the Azure Databricks option.

  4. Click Create in the Azure Databricks widget.

  5. Enter values for the input fields on the Basics and Networking tabs.

  6. After you reach the Encryption tab:

    • For creating a workspace, enable Use your own key in the Managed Services section.
    • For updating a workspace, enable Managed Services.
  7. Set the encryption fields.

    • In the Key Identifier field, paste the Key Identifier of your Azure Key Vault key.
    • In the Subscription dropdown, select the subscription name of your Azure Key Vault key.
  8. Complete the remaining tabs and click Review + Create (for new workspace) or Save (for updating a workspace).

Step 3b: Use an ARM template

The following ARM template creates a new workspace with a customer-managed key, using the API version 2023-02-01 for resource Microsoft.Databricks/workspaces. Save this text locally to a file named databricks-cmk-template.json.

Important

This example template does not include all possible Azure Databricks features, such as providing your own VNet in which to deploy the workspace. If you already use a template, merge this template’s extra parameters, resources, and outputs into your existing template.

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "workspaceName": {
      "type": "string",
      "metadata": {
        "description": "The name of the Azure Databricks workspace to create."
      }
    },
    "pricingTier": {
      "type": "string",
      "defaultValue": "premium",
      "allowedValues": [
        "standard",
        "premium"
      ],
      "metadata": {
        "description": "The pricing tier of workspace."
      }
    },
    "location": {
      "type": "string",
      "defaultValue": "[resourceGroup().location]",
      "metadata": {
        "description": "Location for all resources."
      }
    },
    "apiVersion": {
      "type": "string",
      "defaultValue": "2023-02-01",
      "allowedValues":[
        "2023-02-01",
        "2021-04-01-preview"
      ],
      "metadata": {
        "description": "The api version to create the workspace resources"
      }
    },
    "keyvaultUri": {
      "type": "string",
      "metadata": {
        "description": "The Key Vault URI for customer-managed key for managed services"
      }
    },
    "keyName": {
      "type": "string",
      "metadata": {
        "description": "The key name used for customer-managed key for managed services"
      }
    },
    "keyVersion": {
      "type": "string",
      "metadata": {
        "description": "The key version used for customer-managed key for managed services"
      }
    }
  },
  "variables": {
    "managedResourceGroupName": "[concat('databricks-rg-', parameters('workspaceName'), '-', uniqueString(parameters('workspaceName'), resourceGroup().id))]"
  },
  "resources": [
    {
      "type": "Microsoft.Databricks/workspaces",
      "name": "[parameters('workspaceName')]",
      "location": "[parameters('location')]",
      "apiVersion": "[parameters('apiVersion')]",
      "sku": {
        "name": "[parameters('pricingTier')]"
      },
      "properties": {
        "ManagedResourceGroupId": "[concat(subscription().id, '/resourceGroups/', variables('managedResourceGroupName'))]",
        "encryption": {
          "entities": {
             "managedServices": {
                "keySource": "Microsoft.Keyvault",
                "keyVaultProperties": {
                   "keyVaultUri": "[parameters('keyvaultUri')]",
                   "keyName": "[parameters('keyName')]",
                   "keyVersion": "[parameters('keyVersion')]"
                }
             }
          }
        }
      }
    }
  ],
  "outputs": {
    "workspace": {
      "type": "object",
      "value": "[reference(resourceId('Microsoft.Databricks/workspaces', parameters('workspaceName')))]"
    }
  }
}

If you use another template already, you can merge this template’s parameters, resources, and outputs into your existing template.

To use this template to create or update a workspace, choose one of these deployment options:

Use the Azure CLI to create a workspace with an ARM template

To create a new workspace with Azure CLI, run the following command:

az deployment group create --resource-group <resource-group-name>  \
                           --template-file <file-name>.json \
                           --parameters workspaceName=<new-workspace-name> \
                           keyvaultUri=<keyvaultUrl> \
                           keyName=<keyName> keyVersion=<keyVersion>

Use the Azure CLI to update a workspace with an ARM template

To update an existing workspace to use a customer-managed key (or to rotate the existing key) using the Azure CLI:

  1. If the ARM template that deployed the workspace does not already include customer-managed keys, add them from the template earlier in this article:

    1. Add the resources.properties.encryption section from the template.
    2. In the parameters section, add the three parameters keyvaultUri, keyName, and keyVersion from the template.
  2. Run the same command as for creating a new workspace. As long as the resource group name and the workspace name are identical to your existing workspace, this command updates the existing workspace rather than creating a new workspace.

    az deployment group create --resource-group <existing-resource-group-name>  \
                               --template-file <file-name>.json \
                               --parameters workspaceName=<existing-workspace-name> \
                               keyvaultUri=<keyvaultUrl> \
                               keyName=<keyName> keyVersion=<keyVersion>
    

    Important

    • Other than changes in the key-related parameters, use the same parameters that were used for creating the workspace.
    • After you run the key rotation command, you must keep your old KMS key available to Azure Databricks for 24 hours.
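
Because the same command either creates or updates a workspace depending on the names you pass, it can help to preview a deployment before running it. The following sketch uses the validate and what-if subcommands of az deployment group with the same arguments as the command above. The function name preview_cmk_deployment is hypothetical.

```shell
# Hypothetical helper: validate the template, then preview the changes,
# without deploying anything.
# $1: resource group   $2: template file   $3: workspace name
# $4: Key Vault URL    $5: key name        $6: key version
preview_cmk_deployment() (
  resource_group="$1"; template_file="$2"; workspace_name="$3"
  key_vault_url="$4"; key_name="$5"; key_version="$6"

  for subcommand in validate what-if; do
    az deployment group "$subcommand" \
      --resource-group "$resource_group" \
      --template-file "$template_file" \
      --parameters workspaceName="$workspace_name" \
                   keyvaultUri="$key_vault_url" \
                   keyName="$key_name" keyVersion="$key_version" || exit 1
  done
)
```

When you update an existing workspace, the what-if output should report a modification to the workspace resource rather than a new resource. If it reports a creation, check the resource group name and workspace name.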

Use the Azure portal to create or update a workspace

To use the template in the Azure portal to create or update a workspace:

  1. Go to the Custom deployment page.

  2. Click Build your own template in the editor.

  3. Paste in the JSON.

  4. Click Save.

  5. Fill in the parameters.

    To update an existing workspace, use the same parameters that you used to create the workspace. To add a key for the first time, add the three key-related parameters. To rotate the key, change some or all of the key-related parameters. Ensure the resource group name and the workspace name are identical to your existing workspace. If they are the same, this deployment updates the existing workspace rather than creating a new workspace.

    Important

    Other than changes in the key-related parameters, use the same parameters that were used for creating the workspace.

  6. Click Review + Create.

  7. If there are no validation issues, click Create.

    Important

    After you run the key rotation command, you must keep your old KMS key available to Azure Databricks for 24 hours.

For more details, see the Azure article Quickstart: Create and deploy ARM templates by using the Azure portal.

Step 4 (optional): Export and re-import existing notebooks

After you initially add a key for managed services for an existing workspace, only future write operations use your key. Existing data is not re-encrypted.

You can export all notebooks and then re-import them so the key that encrypts the data is protected and controlled by your key. You can use the Export and Import Workspace APIs.
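
As an illustration, the sketch below exports one notebook in SOURCE format through the Workspace API and imports it again at the same path with curl. It is a sketch under assumptions, not a complete migration tool: the function name reimport_notebook is hypothetical, DATABRICKS_HOST and DATABRICKS_TOKEN are environment variables that you must set to your workspace URL and a valid token, and the notebook language (for example PYTHON or SQL) is passed in by the caller. Try it on a copy of a notebook before running it broadly, because the import overwrites the existing notebook.

```shell
# Hypothetical helper: export a notebook and import it again at the same path
# so that its content is rewritten under the current key.
# $1: workspace path of the notebook   $2: notebook language (for example PYTHON)
reimport_notebook() (
  notebook_path="$1"
  language="$2"
  host="${DATABRICKS_HOST:?set DATABRICKS_HOST to your workspace URL}"
  token="${DATABRICKS_TOKEN:?set DATABRICKS_TOKEN to a valid token}"

  tmp_file="$(mktemp)" || exit 1
  trap 'rm -f "$tmp_file"' EXIT   # remove the temporary copy on exit

  # Export the raw notebook source to the temporary file.
  curl --silent --show-error --fail --get \
       --header "Authorization: Bearer ${token}" \
       --data-urlencode "path=${notebook_path}" \
       --data-urlencode "format=SOURCE" \
       --data-urlencode "direct_download=true" \
       --output "$tmp_file" \
       "${host}/api/2.0/workspace/export" || exit 1

  # Import the same source back over the existing notebook.
  curl --silent --show-error --fail \
       --header "Authorization: Bearer ${token}" \
       --form-string "path=${notebook_path}" \
       --form-string "format=SOURCE" \
       --form-string "language=${language}" \
       --form-string "overwrite=true" \
       --form "content=@${tmp_file}" \
       "${host}/api/2.0/workspace/import"
)
```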

Rotate the key

If you are already using a customer-managed key for managed services, you can update the workspace with a new key version, or an entirely new key. This is called key rotation.

  1. Create a new key or rotate your existing key in the Key Vault. See Step 1: Set up a Key Vault.

    Important

    Ensure the new key has the proper permission.

  2. Confirm that your template has the correct API version. It must be equal to or higher than 2021-04-01-preview.

  3. Update the workspace:

    Important

    After you run the key rotation command, you must keep your old KMS key available to Azure Databricks for 24 hours.

    • To use the Azure portal, apply the template using the Custom deployment tool. See Use the Azure portal to create or update a workspace. Ensure that you use the same values for the resource group name and the workspace name so it updates the existing workspace, rather than creating a new workspace.

    • To use the Azure CLI, run the following command. Ensure that you use the same values for the resource group name and the workspace name so it updates the existing workspace, rather than creating a new workspace.

      Important

      Other than changes in the key-related parameters, use the same parameters that were used for creating the workspace.

      az deployment group create --resource-group <existing-resource-group-name>  \
                                 --template-file <file-name>.json \
                                 --parameters workspaceName=<existing-workspace-name> \
                                 keyvaultUri=<keyvaultUrl> \
                                 keyName=<keyName> keyVersion=<keyVersion>
      
  4. Optionally export and re-import existing notebooks to ensure all existing notebooks use your new key.
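
When you rotate to a new version of the same key, you need that version's identifier for the keyVersion parameter. The sketch below reads it with az keyvault key show, which returns the latest version when no version is specified. The function name latest_key_version is hypothetical.

```shell
# Hypothetical helper: print the version identifier of the latest key version.
# $1: Key Vault name, $2: key name
latest_key_version() (
  vault_name="$1"
  key_name="$2"
  kid="$(az keyvault key show --vault-name "$vault_name" --name "$key_name" \
                              --query "key.kid" --output tsv)" || exit 1
  [ -n "$kid" ] || exit 1
  # The key ID ends in /<key-version>, so keep the text after the last slash.
  printf '%s\n' "${kid##*/}"
)
```

The printed value can then be passed as keyVersion=<keyVersion> in the deployment command in step 3.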

Troubleshooting

Accidental deletion of a key

If you delete your key in Azure Key Vault, workspace logins start failing and Azure Databricks can no longer read any notebooks. To avoid this, we recommend that you enable soft delete. This option ensures that if a key is deleted, it can be recovered within a 30-day period. If soft delete is enabled, you can recover the key to resolve the issue.
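
The following sketch shows one way to check the soft delete setting and to recover a deleted key with the Azure CLI. The function names are hypothetical, and the sketch assumes the vault reports the setting in properties.enableSoftDelete.

```shell
# Hypothetical helper: succeed only if the vault reports soft delete as enabled.
# $1: Key Vault name
soft_delete_enabled() (
  value="$(az keyvault show --name "$1" \
                            --query "properties.enableSoftDelete" --output tsv \
           | tr '[:upper:]' '[:lower:]')"
  [ "$value" = "true" ]
)

# Hypothetical helper: recover a soft-deleted key.
# $1: Key Vault name, $2: key name
recover_deleted_key() (
  az keyvault key recover --vault-name "$1" --name "$2"
)
```

To see which keys can still be recovered, run az keyvault key list-deleted --vault-name <key-vault-name>.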

Lost keys are unrecoverable

If you lose your key and cannot recover it, all the notebook data encrypted by the key is unrecoverable.

Key update failure due to Key Vault permissions

If you have trouble creating or updating your workspace, check whether your Key Vault has the correct permissions. The error that Azure returns might not identify this as the root cause. The required permissions are get, wrapKey, and unwrapKey. See Step 1: Set up a Key Vault.
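
To check the permissions from the command line, you can read the access policy for the AzureDatabricks object ID and compare it with the required list. This is a minimal sketch: both function names are hypothetical, and it assumes the vault uses access policies as set up in Step 1. The comparison ignores letter case and treats a policy that grants all as sufficient.

```shell
# Hypothetical helper: print each required key permission that is missing.
# $1: whitespace-separated key permissions granted to the object ID
missing_key_permissions() (
  granted=" $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr '\n\t' '  ') "
  for required in get wrapKey unwrapKey; do
    needle="$(printf '%s' "$required" | tr '[:upper:]' '[:lower:]')"
    case "$granted" in
      *" all "*|*" $needle "*) ;;        # permission is present
      *) printf '%s\n' "$required" ;;    # permission is missing
    esac
  done
)

# Hypothetical helper: list the key permissions that the vault's access
# policies grant to an object ID.
# $1: Key Vault name, $2: object ID of the AzureDatabricks application
granted_key_permissions() (
  az keyvault show --name "$1" \
    --query "properties.accessPolicies[?objectId=='$2'].permissions.keys[]" \
    --output tsv
)
```

For example, missing_key_permissions "$(granted_key_permissions <key-vault-name> <object-id>)" prints nothing when all three permissions are in place.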