Configure double encryption for DBFS root
Note
This feature is available only in the Premium plan.
Databricks File System (DBFS) is a distributed file system mounted into an Azure Databricks workspace and available on Azure Databricks clusters. DBFS is implemented as a storage account in your Azure Databricks workspace’s managed resource group. The default location in DBFS is known as the DBFS root.
Azure Storage automatically encrypts all data in the workspace storage account, including DBFS root storage, at the service level using 256-bit AES encryption. AES-256 is one of the strongest block ciphers available, and Azure Storage encryption is FIPS 140-2 compliant. If you require a higher level of assurance that your data is secure, you can also enable 256-bit AES encryption at the Azure Storage infrastructure level. When infrastructure encryption is enabled, data in the storage account is encrypted twice, once at the service level and once at the infrastructure level, with two different encryption algorithms and two different keys. Double encryption protects against a scenario in which one of the encryption algorithms or keys is compromised: the additional layer of encryption continues to protect your data.
This article describes how to create a workspace that adds infrastructure encryption (and therefore double encryption) for a workspace storage account. You must enable infrastructure encryption at workspace creation; you cannot add infrastructure encryption to an existing workspace.
Requirements
Create a workspace with double encryption using the Azure portal
Follow the instructions for creating a workspace using the Azure portal in Quickstart: Run a Spark job on Azure Databricks Workspace using the Azure portal, adding these steps:
In PowerShell, run the following commands, which will allow you to enable infrastructure encryption in the Azure portal.
Register-AzProviderFeature -ProviderNamespace Microsoft.Storage -FeatureName AllowRequireInfraStructureEncryption
Get-AzProviderFeature -ProviderNamespace Microsoft.Storage -FeatureName AllowRequireInfraStructureEncryption
On the Create an Azure Databricks workspace page (Create a resource > Analytics > Azure Databricks), click the Advanced tab.
Next to Enable Infrastructure Encryption, select Yes.
When you have finished your workspace configuration and created the workspace, verify that infrastructure encryption is enabled.
In the resource page for the Azure Databricks workspace, go to the sidebar menu and select Settings > Encryption. Confirm that Enable Infrastructure Encryption is selected.
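If you work from the Azure CLI rather than PowerShell, the feature registration in the first step can probably be done with the generic az feature commands. The following is a minimal sketch, assuming az is installed and signed in to the right subscription, and that the feature name is the same one used in the PowerShell commands above. The function name register_infra_encryption_feature is hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: registers the storage feature flag named in this
# article and then prints its registration state.
# Assumes the Azure CLI (az) is installed and already logged in.
register_infra_encryption_feature() {
  # Request registration of the feature for the current subscription.
  az feature register \
    --namespace Microsoft.Storage \
    --name AllowRequireInfraStructureEncryption

  # Print only the registration state of the feature.
  az feature show \
    --namespace Microsoft.Storage \
    --name AllowRequireInfraStructureEncryption \
    --query properties.state --output tsv
}

# Usage (not run automatically):
#   register_infra_encryption_feature
```

Registration may not complete immediately, so you may need to run the az feature show command again until the state is reported as registered.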
Create a workspace with double encryption using PowerShell
Follow the instructions in Quickstart: Create an Azure Databricks workspace using PowerShell, adding the option -RequireInfrastructureEncryption to the command you run in the Create an Azure Databricks workspace step.
For example,
New-AzDatabricksWorkspace -Name databricks-test -ResourceGroupName testgroup -Location eastus -ManagedResourceGroupName databricks-group -Sku premium -RequireInfrastructureEncryption
After your workspace is created, verify that infrastructure encryption is enabled by running:
Get-AzDatabricksWorkspace -Name <workspace-name> -ResourceGroupName <resource-group> | fl
RequireInfrastructureEncryption should be set to true.
For more information about PowerShell cmdlets for Azure Databricks workspaces, see the Az.Databricks module reference.
Create a workspace with double encryption using the Azure CLI
When you create a workspace using the Azure CLI, include the option --require-infrastructure-encryption.
For example,
az databricks workspace create --name <workspace-name> --location <workspace-location> --resource-group <resource-group> --sku premium --require-infrastructure-encryption
After your workspace is created, verify that infrastructure encryption is enabled by running:
az databricks workspace show --name <workspace-name> --resource-group <resource-group>
The requireInfrastructureEncryption field should be present in the encryption property and set to true.
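The check above can be scripted instead of read by eye. The following is a minimal sketch, not an official tool: the function name infra_encryption_enabled is hypothetical, and it uses a plain text match rather than a JSON parser. Because the exact nesting of the field in the command output is an assumption here, the match accepts either a bare true value or a wrapped object of the form {"value": true}.

```shell
#!/bin/sh
# Hypothetical helper: reads JSON on stdin and exits 0 if the
# requireInfrastructureEncryption field is set to true.
# Assumption: the field appears either as  "requireInfrastructureEncryption": true
# or as  "requireInfrastructureEncryption": { ..., "value": true }.
infra_encryption_enabled() {
  # Remove all whitespace so pretty-printed JSON becomes one line,
  # then look for the field followed by a true value.
  tr -d '[:space:]' |
    grep -Eq '"requireInfrastructureEncryption":(true|\{[^{}]*"value":true)'
}

# Usage (not run automatically; requires the Azure CLI):
#   az databricks workspace show --name <workspace-name> \
#     --resource-group <resource-group> | infra_encryption_enabled \
#     && echo "infrastructure encryption enabled"
```

A text match like this is only a quick sanity check. If the output shape of your CLI version is known, a JSON-aware query is more robust.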
For more information about Azure CLI commands for Azure Databricks workspaces, see the az databricks workspace command reference.