Authentication for Databricks Asset Bundles
This article describes how to configure authentication for Databricks Asset Bundles. See What are Databricks Asset Bundles?
You deploy and run Databricks Asset Bundles within the context of two types of authentication scenarios, attended and unattended:
- Attended authentication scenarios are manual workflows, for example, using your web browser on your local machine to log in to your target Azure Databricks workspace when prompted by the Databricks CLI.
- Unattended authentication scenarios are automated workflows, for example, CI/CD workflows that run in systems such as GitHub.
The following sections recommend the Azure Databricks authentication types and settings to use for Databricks Asset Bundles, based on these two types of authentication scenarios.
Attended authentication
For attended authentication scenarios with Databricks Asset Bundles, Databricks recommends that you use the following Azure Databricks authentication types, in the following order of preference:
- OAuth user-to-machine (U2M) authentication for your Azure Databricks user account in the target workspace.
- Azure Databricks personal access token authentication for a token that is associated with your Azure Databricks user account for the target workspace.
For more information about these Azure Databricks authentication types, see Supported Azure Databricks authentication types.
For storing authentication settings for attended authentication scenarios, Databricks recommends that you use Azure Databricks configuration profiles on your local development machine. Configuration profiles enable you to quickly switch among different Azure Databricks authentication contexts to do rapid local development among multiple Azure Databricks workspaces. With profiles, you can use the --profile or -p options to specify a particular profile when running the bundle validate, deploy, run, and destroy commands with the Databricks CLI.
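As a rough illustration of selecting a profile per command, the following Python sketch builds the argument list for a bundle command that uses the --profile option. The helper name bundle_command and the profile name DEV are hypothetical. The sketch only constructs the command and does not invoke the Databricks CLI.

```python
from typing import List

# Bundle subcommands named in this article.
BUNDLE_SUBCOMMANDS = ("validate", "deploy", "run", "destroy")


def bundle_command(subcommand: str, profile: str, *extra: str) -> List[str]:
    """Build the argument list for `databricks bundle <subcommand> --profile <profile>`.

    Any extra arguments (for example, the key of a resource to run) are
    placed between the subcommand and the --profile option.
    """
    if subcommand not in BUNDLE_SUBCOMMANDS:
        raise ValueError(f"unsupported bundle subcommand: {subcommand}")
    if not profile:
        raise ValueError("profile name must not be empty")
    return ["databricks", "bundle", subcommand, *extra, "--profile", profile]


# "DEV" is a hypothetical profile name defined on the local machine.
print(" ".join(bundle_command("validate", "DEV")))
```

To actually run the command, you could pass the returned list to subprocess.run on a machine where the Databricks CLI is installed.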
Databricks supports, but does not recommend, the use of the profile mapping within the workspace mapping to specify the profile to use for each target workspace in your bundle configuration files. Hard-coded mappings make your bundle configuration files less reusable across projects.
Unattended authentication
For unattended authentication scenarios with Databricks Asset Bundles, Databricks recommends that you use the following Azure Databricks authentication types, in the following order of preference:
- Azure managed identities authentication with an Azure managed identity registered with an Azure virtual machine, if this setup is supported by your CI/CD system.
- OAuth machine-to-machine (M2M) authentication for an Azure Databricks managed service principal in the target workspace.
- Microsoft Entra ID service principal authentication for a Microsoft Entra ID managed service principal in the target workspace.
For more information about these Azure Databricks authentication types, see Supported Azure Databricks authentication types.
For unattended authentication scenarios, Databricks recommends using environment variables to store Azure Databricks authentication settings in your target CI/CD system. This is because CI/CD systems are typically optimized to work with authentication settings stored in environment variables. These CI/CD systems often don’t work with other approaches, such as Azure Databricks configuration profiles, or they might work with profiles in unexpected or insecure ways.
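Before a CI/CD job calls the Databricks CLI, it can help to fail fast when a required variable is missing. The following Python sketch shows one way to do that. The variable names are the ones commonly documented for these authentication types, but treat them as assumptions and confirm them against the “Environment” section of the article for each authentication type. The helper missing_env_vars and the dictionary keys are hypothetical, and Azure managed identities authentication is left out of the sketch.

```python
import os
from typing import Dict, List, Mapping

# Assumed variable names per authentication type; verify them against the
# "Environment" section of the article for the authentication type you use.
REQUIRED_ENV_VARS: Dict[str, List[str]] = {
    "oauth-m2m": [
        "DATABRICKS_HOST",
        "DATABRICKS_CLIENT_ID",
        "DATABRICKS_CLIENT_SECRET",
    ],
    "entra-service-principal": [
        "DATABRICKS_HOST",
        "ARM_TENANT_ID",
        "ARM_CLIENT_ID",
        "ARM_CLIENT_SECRET",
    ],
    "pat": [
        "DATABRICKS_HOST",
        "DATABRICKS_TOKEN",
    ],
}


def missing_env_vars(auth_type: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty for auth_type."""
    return [name for name in REQUIRED_ENV_VARS[auth_type] if not env.get(name)]


# Example use in a CI/CD step:
#   missing = missing_env_vars("oauth-m2m")
#   if missing:
#       raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

The check reports only variable names, never their values, so it is safe to print its result in build logs.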
For Databricks Asset Bundles projects used in CI/CD systems designed to work with multiple Azure Databricks workspaces (for example, three separate but related development, staging, and production workspaces), Databricks recommends that you use service principals for authentication and that you give one service principal access to all participating workspaces. This enables you to use the same environment variables across all of the project’s workspaces without frequently changing those variables’ original settings.
Databricks supports, but does not recommend, the use of hard-coded, authentication-related settings in the workspace mapping for target workspaces in your bundle configuration files. Hard-coded settings make your bundle configuration files less reusable across projects and risk unnecessarily exposing sensitive information such as service principal IDs.
For unattended authentication scenarios, you must also install the Databricks CLI on the associated compute resources, as follows:
- For manual installation, see Install or update the Databricks CLI.
- For GitHub, see Run a CI/CD workflow with a Databricks Asset Bundle and GitHub Actions.
- For other CI/CD systems, see Install or update the Databricks CLI and your CI/CD system provider’s documentation.
Azure managed identities authentication
To set up Azure managed identities authentication, see Azure managed identities authentication.
The list of environment variables to set for unattended authentication is in the workspace-level operations coverage in the “Environment” section of Azure managed identities authentication. To set environment variables, see the documentation for your operating system or CI/CD system provider.
OAuth machine-to-machine (M2M) authentication
To set up OAuth M2M authentication, see OAuth machine-to-machine (M2M) authentication.
The list of environment variables to set for unattended authentication is in the workspace-level operations coverage in the “Environment” section of OAuth machine-to-machine (M2M) authentication. To set environment variables, see the documentation for your operating system or CI/CD system provider.
Microsoft Entra ID service principal authentication
To set up Microsoft Entra ID service principal authentication, see Microsoft Entra ID service principal authentication.
The list of environment variables to set for unattended authentication is in the workspace-level operations coverage in the “Environment” section of Microsoft Entra ID service principal authentication. To set environment variables, see the documentation for your operating system or CI/CD system provider.
Azure CLI authentication
To set up Azure CLI authentication, see Azure CLI authentication.
For attended authentication scenarios, to create an Azure Databricks configuration profile, see the “Profile” section in Azure CLI authentication.
OAuth user-to-machine (U2M) authentication
To set up OAuth U2M authentication, see the “CLI” section in OAuth user-to-machine (U2M) authentication.
For attended authentication scenarios, completing the instructions in the “CLI” section of OAuth user-to-machine (U2M) authentication automatically creates an Azure Databricks configuration profile for you.
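To confirm which configuration profiles exist after you log in, you can inspect the configuration profile file. The following Python sketch assumes that profiles are stored as INI-style sections in a .databrickscfg file in your home directory, which is the usual default location but is not stated in this article. The helper name profile_names and the sample host values are hypothetical.

```python
import configparser
from typing import List


def profile_names(config_text: str) -> List[str]:
    """Return the profile names found in .databrickscfg-style INI text."""
    # Disable interpolation so that '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    names = parser.sections()
    # configparser handles [DEFAULT] specially and omits it from sections().
    if parser.defaults():
        names = ["DEFAULT"] + names
    return names


# Hypothetical file content with two profiles.
SAMPLE = """
[DEFAULT]
host = https://adb-1111111111111111.1.azuredatabricks.net

[DEV]
host = https://adb-2222222222222222.2.azuredatabricks.net
"""

print(profile_names(SAMPLE))

# To read a real file instead (path is an assumption):
#   from pathlib import Path
#   print(profile_names((Path.home() / ".databrickscfg").read_text()))
```

Any name that the function returns can then be passed to the --profile option of the bundle commands.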
Azure Databricks personal access token authentication
To create an Azure Databricks personal access token, see Azure Databricks personal access token authentication.
For attended authentication scenarios, to create an Azure Databricks configuration profile, see the “CLI” section in Azure Databricks personal access token authentication.
The list of environment variables to set for unattended authentication is in the workspace-level operations coverage in the “Environment” section of Azure Databricks personal access token authentication. To set environment variables, see the documentation for your operating system or CI/CD system provider.