Authentication for Azure Databricks automation - overview
In Azure Databricks, authentication refers to verifying an Azure Databricks identity (such as a user, service principal, or group), or an Azure managed identity. Azure Databricks uses credentials (such as an access token) to verify the identity.
After Azure Databricks verifies the caller’s identity, Azure Databricks then uses a process called authorization to determine whether the verified identity has sufficient access permissions to perform the specified action on the resource at the given location. This article includes details only about authentication. It does not include details about authorization or access permissions; see Authentication and access control.
When a tool makes an automation or API request, it includes credentials that authenticate an identity with Azure Databricks. This article describes typical ways to create, store, and pass credentials and related information that Azure Databricks needs to authenticate and authorize requests. To learn which credential types, related information, and storage mechanisms are supported by your tools, SDKs, scripts, and apps, see Supported authentication types by Azure Databricks tool or SDK or your provider’s documentation.
Common tasks for Azure Databricks authentication
Use the following instructions to complete common tasks for Azure Databricks authentication.
To complete this task… | Follow the instructions in this article |
---|---|
Create an Azure Databricks user that you can use for authenticating at the Azure Databricks account level. | Manage users in your account |
Create an Azure Databricks user that you can use for authenticating with a specific Azure Databricks workspace. | Manage users in your workspace |
Create an Azure Databricks personal access token for an Azure Databricks user. (This Azure Databricks personal access token can be used only for authenticating with its associated Azure Databricks workspace.) | Azure Databricks personal access tokens for workspace users |
Create an Azure Databricks managed service principal, and then add that Azure Databricks managed service principal to an Azure Databricks account, a specific Azure Databricks workspace, or both. You can then use this service principal for authenticating at the Azure Databricks account level, with a specific Azure Databricks workspace, or both. | Manage service principals |
Create an Azure Databricks configuration profile. | Azure Databricks configuration profiles |
Create an Azure Databricks group, and add Azure Databricks users and Azure service principals to that group, for more robust authorization. | Manage account groups using the account console, Manage account groups using the workspace admin settings page |
Supported Azure Databricks authentication types
Azure Databricks provides several ways to authenticate Azure Databricks users, service principals, and Azure managed identities, as follows:
Authentication type | Details |
---|---|
Azure managed identities authentication | * Azure managed identities authentication uses managed identities for Azure resources for authentication. See What are managed identities for Azure resources?. * Azure managed identities use Microsoft Entra ID tokens for authentication credentials. These tokens are managed internally within Microsoft systems. You cannot access these tokens. * Azure managed identities authentication must be initiated from a resource that supports Azure managed identities, such as an Azure virtual machine (Azure VM). * For additional technical details, see Azure managed identities authentication. |
OAuth machine-to-machine (M2M) authentication | * OAuth M2M authentication uses service principals for authentication. It can be used with Azure Databricks managed service principals or Microsoft Entra ID managed service principals. * OAuth M2M authentication uses short-lived (one hour) Azure Databricks OAuth access tokens for authentication credentials. * Expired Azure Databricks OAuth access tokens can be automatically refreshed by participating Azure Databricks tools and SDKs. See Supported authentication types by Azure Databricks tool or SDK and Databricks client unified authentication. * Databricks recommends that you use OAuth M2M authentication for unattended authentication scenarios. These scenarios include fully automated and CI/CD workflows, where you cannot use your web browser to authenticate with Azure Databricks in real time. * Databricks recommends that you use Azure managed identities authentication, if your target Azure Databricks tool or SDK supports it, instead of OAuth M2M authentication. This is because Azure managed identities authentication does not expose credentials. * Databricks recommends that you use Microsoft Entra ID service principal authentication, instead of OAuth M2M authentication, in cases where you must use Microsoft Entra ID tokens for authentication credentials. For example, you might need to authenticate with Azure Databricks and other Azure resources at the same time, which requires Microsoft Entra ID tokens. * For additional technical details, see Use a service principal to authenticate with Azure Databricks. |
OAuth user-to-machine (U2M) authentication | * OAuth U2M authentication uses Azure Databricks users for authentication. * OAuth U2M authentication uses short-lived (one hour) Azure Databricks OAuth access tokens for authentication credentials. * Participating Azure Databricks tools and SDKs can automatically refresh expired OAuth access tokens. See Supported authentication types by Azure Databricks tool or SDK and Databricks client unified authentication. * OAuth U2M authentication is suitable for attended authentication scenarios. These scenarios include manual and rapid development workflows, where you use your web browser to authenticate with Azure Databricks in real time, when prompted. * Databricks recommends that you use Azure managed identities authentication, if your target Azure Databricks tool or SDK supports it, instead of OAuth U2M authentication. This is because Azure managed identities authentication does not expose credentials. * For additional technical details, see OAuth user-to-machine (U2M) authentication. |
Microsoft Entra ID service principal authentication | * Microsoft Entra ID service principal authentication uses Microsoft Entra ID service principals for authentication. It cannot be used with Azure Databricks managed service principals. * Microsoft Entra ID service principal authentication uses short-lived (typically one hour) Microsoft Entra ID tokens for authentication credentials. * Expired Microsoft Entra ID tokens can be automatically refreshed by participating Azure Databricks tools and SDKs. See Supported authentication types by Azure Databricks tool or SDK and Databricks client unified authentication. * Databricks recommends that you use Azure managed identities authentication, if your target Azure Databricks tool or SDK supports it, instead of Microsoft Entra ID service principal authentication. This is because Azure managed identities authentication does not expose credentials. * If you cannot use Azure managed identities authentication, Databricks recommends that you use OAuth M2M authentication instead of Microsoft Entra ID service principal authentication. * Databricks recommends that you use Microsoft Entra ID service principal authentication in cases where you must use Microsoft Entra ID tokens for authentication credentials. For example, you might need to authenticate with Azure Databricks and other Azure resources at the same time, which requires Microsoft Entra ID tokens. * For additional technical details, see Microsoft Entra ID service principal authentication. |
Azure CLI authentication | * Azure CLI authentication uses the Azure CLI along with Azure Databricks users or Microsoft Entra ID managed service principals for authentication. * Azure CLI authentication uses short-lived (typically one hour) Microsoft Entra ID tokens for authentication credentials. * Participating Azure Databricks tools and SDKs can automatically refresh expired Microsoft Entra ID tokens. See Supported authentication types by Azure Databricks tool or SDK and Databricks client unified authentication. * Databricks recommends Azure managed identities authentication, if your target Azure Databricks tool or SDK supports it, instead of Azure CLI authentication. Azure managed identities authentication uses Azure managed identities instead of Azure Databricks users or Microsoft Entra ID managed service principals. Azure managed identities are more secure than either, because Azure managed identities authentication does not expose credentials. See What are managed identities for Azure resources?. * Databricks recommends that you use Azure CLI authentication in cases where you must use Microsoft Entra ID tokens for authentication credentials. For example, you might need to authenticate with Azure Databricks and other Azure resources at the same time, which requires Microsoft Entra ID tokens. * Azure CLI authentication is suitable for attended authentication scenarios. These scenarios include manual and rapid development workflows, where you use the Azure CLI to authenticate with Azure Databricks in real time. * For additional technical details, see Azure CLI authentication. |
Azure Databricks personal access token authentication | * Azure Databricks personal access token authentication uses Azure Databricks users for authentication. * Azure Databricks personal access token authentication uses short-lived or long-lived strings for authentication credentials. These access tokens can be set to expire in as little as one day or less, or they can be set to never expire. * Expired Azure Databricks personal access tokens cannot be refreshed. * Databricks does not recommend Azure Databricks personal access tokens (especially long-lived access tokens) for authentication credentials, as they are less secure than Microsoft Entra ID or Azure Databricks OAuth access tokens. * Databricks recommends Azure managed identities authentication, if your target Azure Databricks tool or SDK supports it, instead of Azure Databricks personal access token authentication. Azure managed identities authentication uses Azure managed identities instead of Azure Databricks users, and Azure managed identities are more secure than Azure Databricks users. See What are managed identities for Azure resources?. * If you cannot use Azure managed identities authentication, Databricks recommends that you use Azure CLI authentication instead of Azure Databricks personal access token authentication. * For additional technical details, see Azure Databricks personal access token authentication. |
Supported authentication types by Azure Databricks tool or SDK
Azure Databricks tools and SDKs that work with one or more supported Azure Databricks authentication types include the following:
Azure Databricks account and workspace REST APIs
Databricks organizes its Databricks REST API into two categories of APIs: account APIs and workspace APIs. Each of these categories requires different sets of information to authenticate the target Azure Databricks identity. Also, each supported Databricks authentication type requires additional information that uniquely identifies the target Azure Databricks identity.
For instance, to authenticate an Azure Databricks identity for calling Azure Databricks account-level API operations, you must provide:
- The target Azure Databricks account console URL, which is typically https://accounts.azuredatabricks.net.
- The target Azure Databricks account ID. See Locate your account ID.
- Information that uniquely identifies the target Azure Databricks identity for the target Databricks authentication type. For the specific information to provide, see the section later in this article for that authentication type.
To authenticate an Azure Databricks identity for calling Azure Databricks workspace-level API operations, you must provide:
- The target Azure Databricks per-workspace URL, for example https://adb-1234567890123456.7.azuredatabricks.net.
- Information that uniquely identifies the target Azure Databricks identity for the target Databricks authentication type. For the specific information to provide, see the section later in this article for that authentication type.
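As a rough illustration of the two cases above, the following Python sketch builds (but does not send) one account-level and one workspace-level request. It assumes the credential is passed as a bearer token in the Authorization header. The two API paths, the placeholder account ID, the placeholder token, and the helper name build_request are illustrative assumptions, not values taken from this article.

```python
import urllib.request


def build_request(host: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that carries a bearer token."""
    url = host.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values; replace them with your own before sending anything.
ACCOUNT_HOST = "https://accounts.azuredatabricks.net"
WORKSPACE_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
ACCOUNT_ID = "00000000-0000-0000-0000-000000000000"  # hypothetical account ID
TOKEN = "example-token"  # hypothetical credential

# Account-level call: accounts host plus the account ID in the path (assumed path).
account_request = build_request(
    ACCOUNT_HOST, f"/api/2.0/accounts/{ACCOUNT_ID}/scim/v2/Users", TOKEN
)

# Workspace-level call: per-workspace URL only (assumed path).
workspace_request = build_request(WORKSPACE_HOST, "/api/2.0/clusters/list", TOKEN)

print(account_request.full_url)
print(workspace_request.full_url)
```

To actually send a request, you could pass it to urllib.request.urlopen. The sketch stops short of that so it can be run without real credentials.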
Databricks client unified authentication
Databricks provides a consolidated and consistent architectural and programmatic approach to authentication, known as Databricks client unified authentication. This approach helps make setting up and automating authentication with Databricks more centralized and predictable. It enables you to configure Databricks authentication once and then use that configuration across multiple Databricks tools and SDKs without further authentication configuration changes.
Participating Databricks tools and SDKs include:
- The Databricks CLI
- The Databricks Terraform provider
- Databricks Connect
- The Databricks extension for Visual Studio Code
- The Databricks SDK for Python
- The Databricks SDK for Java
- The Databricks SDK for Go
All participating tools and SDKs accept special environment variables and Azure Databricks configuration profiles for authentication. The Databricks Terraform provider and the Databricks SDKs for Python, Java, and Go also accept direct configuration of authentication settings within code. For details, see Supported authentication types by Azure Databricks tool or SDK or the tool’s or SDK’s documentation.
Default order of evaluation for client unified authentication methods and credentials
Whenever a participating tool or SDK needs to authenticate with Azure Databricks, the tool or SDK tries the following types of authentication in the following order by default. When the tool or SDK succeeds with the type of authentication that it tries, the tool or SDK stops trying to authenticate with the remaining authentication types. To force an SDK to authenticate with a specific authentication type, set the Config API’s Databricks authentication type field.
- Azure Databricks personal access token authentication
- OAuth machine-to-machine (M2M) authentication (see Use a service principal to authenticate with Azure Databricks)
- OAuth user-to-machine (U2M) authentication
- Azure managed identities authentication
- Microsoft Entra ID service principal authentication
- Azure CLI authentication
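The "try each type in order and stop at the first that works" behavior can be sketched in Python as follows. This is a simplified illustration, not the tools' actual implementation: the predicates that decide whether a type is usable are hypothetical, the pat label for personal access tokens is an assumption, and the two types that depend on external login state (OAuth U2M and the Azure CLI) are not modeled.

```python
from typing import Callable, Mapping, Optional

Settings = Mapping[str, str]

# (auth type label, "looks configured?" predicate) in the default order above.
# The predicates are hypothetical simplifications for illustration only.
DEFAULT_ORDER: list[tuple[str, Callable[[Settings], bool]]] = [
    ("pat", lambda s: "token" in s),  # label "pat" is an assumption
    ("oauth-m2m", lambda s: "client_id" in s and "client_secret" in s),
    ("databricks-cli", lambda s: False),  # OAuth U2M needs browser login state; not modeled
    (
        "azure-msi",
        lambda s: s.get("azure_use_msi") == "true" and "azure_workspace_resource_id" in s,
    ),
    (
        "azure-client-secret",
        lambda s: all(
            k in s for k in ("azure_client_id", "azure_client_secret", "azure_tenant_id")
        ),
    ),
    ("azure-cli", lambda s: False),  # needs cached az login state; not modeled
]


def pick_auth_type(settings: Settings, forced: Optional[str] = None) -> Optional[str]:
    """Return the first auth type whose settings are present, or the forced one."""
    if forced is not None:
        # Mirrors setting the authentication type field explicitly.
        return forced
    for label, looks_configured in DEFAULT_ORDER:
        if looks_configured(settings):
            return label  # stop at the first match
    return None


both = {"token": "example", "client_id": "id", "client_secret": "secret"}
print(pick_auth_type(both))
print(pick_auth_type(both, forced="oauth-m2m"))
```

Because personal access token authentication is tried first, settings that contain both a token and OAuth M2M fields resolve to the token unless a type is forced.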
For each authentication type that the participating tool or SDK tries, the tool or SDK tries to find authentication credentials in the following locations, in the following order. When the tool or SDK succeeds in finding authentication credentials that can be used, the tool or SDK stops trying to find authentication credentials in the remaining locations.
- Credential-related Config API fields (for SDKs). To set Config fields, see Supported authentication types by Azure Databricks tool or SDK or the SDK’s reference documentation.
- Credential-related environment variables. To set environment variables, see Supported authentication types by Azure Databricks tool or SDK and your operating system’s documentation.
- Credential-related fields in the DEFAULT configuration profile within the .databrickscfg file. To set configuration profile fields, see Supported authentication types by Azure Databricks tool or SDK and Azure Databricks configuration profiles.
- Any related authentication credentials that are cached by the Azure CLI. See Azure CLI.
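The lookup order above can be sketched in Python for a single setting. This is a minimal illustration under stated assumptions, not the tools' real code: the function name find_setting is hypothetical, only the host and token fields are mapped, and the Azure CLI cache (the last location) is not modeled.

```python
import configparser
from typing import Mapping, Optional

# Field name -> environment variable name, as listed in this article's tables.
ENV_NAMES = {"host": "DATABRICKS_HOST", "token": "DATABRICKS_TOKEN"}


def find_setting(
    field: str,
    explicit: Mapping[str, str],
    environ: Mapping[str, str],
    cfg_text: str,
) -> Optional[str]:
    """Return the first value found for a field, following the order above."""
    # 1. Fields set directly in code (the Config API in the SDKs).
    if explicit.get(field):
        return explicit[field]

    # 2. Environment variables.
    env_name = ENV_NAMES.get(field)
    if env_name and environ.get(env_name):
        return environ[env_name]

    # 3. The DEFAULT profile in the .databrickscfg file. An unused default_section
    #    name makes configparser treat [DEFAULT] as an ordinary section.
    parser = configparser.ConfigParser(default_section="__unused__", interpolation=None)
    parser.read_string(cfg_text)
    if parser.has_section("DEFAULT"):
        return parser["DEFAULT"].get(field)
    return None


cfg = "[DEFAULT]\nhost = https://adb-1234567890123456.7.azuredatabricks.net\n"
print(find_setting("host", {}, {"DATABRICKS_HOST": "https://example.invalid"}, cfg))
print(find_setting("host", {}, {}, cfg))
```

In real use, environ would be os.environ and cfg_text would be the contents of the .databrickscfg file. They are passed in as arguments here so the order is easy to see and test.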
To provide maximum portability for your code, Databricks recommends that you create a custom configuration profile within the .databrickscfg file, add the required fields for your target Databricks authentication type to the custom configuration profile, and then set the DATABRICKS_CONFIG_PROFILE environment variable to the name of the custom configuration profile. For more information, see Supported authentication types by Azure Databricks tool or SDK.
Environment variables and fields for client unified authentication
The following tables list the names and descriptions of the supported environment variables and fields for Databricks client unified authentication. In the following tables:
- Environment variable, where applicable, is the name of the environment variable. To set environment variables, see Supported authentication types by Azure Databricks tool or SDK and your operating system’s documentation.
- .databrickscfg field, where applicable, is the name of the field within an Azure Databricks configuration profiles file or Databricks Terraform configuration. To set .databrickscfg fields, see Supported authentication types by Azure Databricks tool or SDK and Azure Databricks configuration profiles.
- Terraform field, where applicable, is the name of the field within a Databricks Terraform configuration. To set Databricks Terraform fields, see Authentication in the Databricks Terraform provider documentation.
- Config field is the name of the field within the Config API for the specified SDK. To use the Config API, see Supported authentication types by Azure Databricks tool or SDK or the SDK’s reference documentation.
General host, token, and account ID environment variables and fields
Common name | Description | Environment variable | .databrickscfg field, Terraform field | Config field |
---|---|---|---|---|
Azure Databricks host | (String) The Azure Databricks host URL for either the Azure Databricks workspace endpoint or the Azure Databricks accounts endpoint. | DATABRICKS_HOST | host | host (Python), setHost (Java), Host (Go) |
Azure Databricks token | (String) The Azure Databricks personal access token or Microsoft Entra ID token. | DATABRICKS_TOKEN | token | token (Python), setToken (Java), Token (Go) |
Azure Databricks account ID | (String) The Azure Databricks account ID for the Azure Databricks account endpoint. Only has effect when the Azure Databricks host is also set to https://accounts.azuredatabricks.net. | DATABRICKS_ACCOUNT_ID | account_id | account_id (Python), setAccountID (Java), AccountID (Go) |
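The note that the account ID only has an effect when the host is the accounts endpoint can be expressed as a small check. The following Python sketch reads the two environment variables from the table and applies that rule. The helper name is_account_level is hypothetical, and the comparison is a simplification of whatever the tools actually do.

```python
import os
from typing import Optional

ACCOUNTS_HOST = "https://accounts.azuredatabricks.net"


def is_account_level(host: str, account_id: Optional[str]) -> bool:
    """True only when an account ID is set and the host is the accounts endpoint."""
    return bool(account_id) and host.rstrip("/") == ACCOUNTS_HOST


# Read the values the same way a script might, from the environment.
host = os.environ.get("DATABRICKS_HOST", "")
account_id = os.environ.get("DATABRICKS_ACCOUNT_ID")
print(is_account_level(host, account_id))
```

With a per-workspace URL as the host, the function returns False even if DATABRICKS_ACCOUNT_ID is set, which matches the "only has effect" wording in the table.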
Azure-specific environment variables and fields
Common name | Description | Environment variable | .databrickscfg field, Terraform field | Config field |
---|---|---|---|---|
Azure client ID | (String) The Microsoft Entra ID service principal’s application ID. Use with Azure managed identities authentication and Microsoft Entra ID service principal authentication. | ARM_CLIENT_ID | azure_client_id | azure_client_id (Python), setAzureClientID (Java), AzureClientID (Go) |
Azure client secret | (String) The Microsoft Entra ID service principal’s client secret. Use with Microsoft Entra ID service principal authentication. | ARM_CLIENT_SECRET | azure_client_secret | azure_client_secret (Python), setAzureClientSecret (Java), AzureClientSecret (Go) |
Client ID | (String) The client ID of the Azure Databricks managed service principal or Microsoft Entra ID managed service principal. Use with OAuth M2M authentication. | DATABRICKS_CLIENT_ID | client_id | client_id (Python), setClientId (Java), ClientId (Go) |
Client secret | (String) The client secret of the Azure Databricks managed service principal or Microsoft Entra ID managed service principal. Use with OAuth M2M authentication. | DATABRICKS_CLIENT_SECRET | client_secret | client_secret (Python), setClientSecret (Java), ClientSecret (Go) |
Azure environment | (String) The Azure environment type. Defaults to PUBLIC. | ARM_ENVIRONMENT | azure_environment | azure_environment (Python), setAzureEnvironment (Java), AzureEnvironment (Go) |
Azure tenant ID | (String) The Microsoft Entra ID service principal’s tenant ID. | ARM_TENANT_ID | azure_tenant_id | azure_tenant_id (Python), setAzureTenantID (Java), AzureTenantID (Go) |
Azure use MSI | (Boolean) True to use Azure Managed Service Identity passwordless authentication flow for service principals. Requires the Azure resource ID to also be set. | ARM_USE_MSI | azure_use_msi | AzureUseMSI (Go) |
Azure resource ID | (String) The Azure Resource Manager ID for the Azure Databricks workspace. | DATABRICKS_AZURE_RESOURCE_ID | azure_workspace_resource_id | azure_workspace_resource_id (Python), setAzureResourceID (Java), AzureResourceID (Go) |
.databrickscfg-specific environment variables and fields
Use these environment variables or fields to specify non-default settings for .databrickscfg. See also Azure Databricks configuration profiles.
Common name | Description | Environment variable | Terraform field | Config field |
---|---|---|---|---|
.databrickscfg file path | (String) A non-default path to the .databrickscfg file. | DATABRICKS_CONFIG_FILE | config_file | config_file (Python), setConfigFile (Java), ConfigFile (Go) |
.databrickscfg default profile | (String) The default named profile to use, other than DEFAULT. | DATABRICKS_CONFIG_PROFILE | profile | profile (Python), setProfile (Java), Profile (Go) |
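A script that reads configuration profiles itself might resolve these two settings as in the following Python sketch. It assumes the defaults described in this article (a .databrickscfg file in the user home folder and a profile named DEFAULT). The function name resolve_profile_location is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Tuple


def resolve_profile_location(environ: Mapping[str, str]) -> Tuple[Path, str]:
    """Return the configuration file path and profile name to use."""
    # A non-default path, if set, wins over ~/.databrickscfg.
    config_file = environ.get("DATABRICKS_CONFIG_FILE") or str(
        Path.home() / ".databrickscfg"
    )
    # A non-default profile name, if set, wins over DEFAULT.
    profile = environ.get("DATABRICKS_CONFIG_PROFILE") or "DEFAULT"
    return Path(config_file).expanduser(), profile


path, profile = resolve_profile_location(os.environ)
print(path, profile)
```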
Authentication type field
Use this environment variable or field to force an SDK to use a specific type of Databricks authentication.
Common name | Description | Terraform field | Config field |
---|---|---|---|
Databricks authentication type | (String) When multiple authentication attributes are available in the environment, use the authentication type specified by this argument. | auth_type | auth_type (Python), setAuthType (Java), AuthType (Go) |
Supported Databricks authentication type field values include:
- oauth-m2m: Use a service principal to authenticate with Azure Databricks
- databricks-cli: OAuth user-to-machine (U2M) authentication
- azure-msi: Azure managed identities authentication
- azure-client-secret: Microsoft Entra ID service principal authentication
- azure-cli: Azure CLI authentication
Azure Databricks configuration profiles
An Azure Databricks configuration profile (sometimes referred to as a configuration profile, a config profile, or simply a profile) contains settings and other information that Azure Databricks needs to authenticate. Azure Databricks configuration profiles are stored in Azure Databricks configuration profiles files for your tools, SDKs, scripts, and apps to use. To learn whether Azure Databricks configuration profiles are supported by your tools, SDKs, scripts, and apps, see your provider’s documentation. All participating tools and SDKs that implement Databricks client unified authentication support Azure Databricks configuration profiles. For more information, see Supported authentication types by Azure Databricks tool or SDK.
To create an Azure Databricks configuration profiles file:
- Use your favorite text editor to create a file named .databrickscfg in your ~ (your user home) folder on Unix, Linux, or macOS, or your %USERPROFILE% (your user home) folder on Windows, if you do not already have one. Do not forget the dot (.) at the beginning of the file name. Add the following contents to this file:
[<some-unique-name-for-this-configuration-profile>]
<field-name> = <field-value>
- In the preceding contents, replace the following values, and then save the file:
  - <some-unique-name-for-this-configuration-profile> with a unique name for the configuration profile, such as DEFAULT, DEVELOPMENT, PRODUCTION, or similar. You can have multiple configuration profiles in the same .databrickscfg file, but each configuration profile must have a unique name within this file.
  - <field-name> and <field-value> with the name and a value for one of the required fields for the target Databricks authentication type. For the specific information to provide, see the section earlier in this article for that authentication type.
- Add a <field-name> and <field-value> pair for each of the additional required fields for the target Databricks authentication type.
For example, for Azure Databricks personal access token authentication, the .databrickscfg file might look like this:
[DEFAULT]
host = https://adb-1234567890123456.7.azuredatabricks.net
token = dapi123...
To create additional configuration profiles, specify different profile names within the same .databrickscfg file. For example, to specify separate Azure Databricks workspaces, each with their own Azure Databricks personal access token:
[DEFAULT]
host = https://adb-1234567890123456.7.azuredatabricks.net
token = dapi123...
[DEVELOPMENT]
host = https://adb-2345678901234567.8.azuredatabricks.net
token = dapi234...
You can also specify different profile names within the .databrickscfg file for Azure Databricks accounts and different Databricks authentication types, for example:
[DEFAULT]
host = https://adb-1234567890123456.7.azuredatabricks.net
token = dapi123...
[DEVELOPMENT]
azure_workspace_resource_id = /subscriptions/bc0cd1.../resourceGroups/my-resource-group/providers/Microsoft.Databricks/workspaces/my-workspace
azure_tenant_id = bc0cd1...
azure_client_id = fa0cd1...
azure_client_secret = aBC1D~...
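Because the file uses an INI-style layout, a script can inspect it with Python's standard configparser module. The following sketch parses a sample with two profiles, using the same placeholder values as the examples above. The function name load_profiles is hypothetical, and this is not how the participating tools and SDKs are documented to read the file.

```python
import configparser

# Sample contents that mirror the two-workspace example above.
SAMPLE = """\
[DEFAULT]
host = https://adb-1234567890123456.7.azuredatabricks.net
token = dapi123...

[DEVELOPMENT]
host = https://adb-2345678901234567.8.azuredatabricks.net
token = dapi234...
"""


def load_profiles(text: str) -> dict:
    """Parse profile text into {profile name: {field name: field value}}."""
    # By default configparser copies [DEFAULT] values into every other section.
    # An unused default_section name keeps each profile's fields separate.
    parser = configparser.ConfigParser(default_section="__unused__", interpolation=None)
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


profiles = load_profiles(SAMPLE)
for name, fields in profiles.items():
    # Print only the host; avoid echoing tokens to the terminal or logs.
    print(name, "->", fields["host"])
```

To read a real file instead of the sample string, you could pass the file's text to load_profiles, for example the result of Path.home().joinpath(".databrickscfg").read_text().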
ODBC DSNs
In ODBC, a data source name (DSN) is a symbolic name that tools, SDKs, scripts, and apps use to request a connection to an ODBC data source. A DSN stores connection details such as the path to an ODBC driver, networking details, authentication credentials, and database details. To learn whether ODBC DSNs are supported by your tools, scripts, and apps, see your provider’s documentation.
To install and configure the Databricks ODBC Driver and create an ODBC DSN for Azure Databricks, see Databricks ODBC Driver.
JDBC connection URLs
In JDBC, a connection URL is a symbolic URL that tools, SDKs, scripts, and apps use to request a connection to a JDBC data source. A connection URL stores connection details such as networking details, authentication credentials, database details, and JDBC driver capabilities. To learn whether JDBC connection URLs are supported by your tools, SDKs, scripts, and apps, see your provider’s documentation.
To install and configure the Databricks JDBC Driver and create a JDBC connection URL for Azure Databricks, see Databricks JDBC Driver.
Microsoft Entra ID (formerly Azure Active Directory) tokens
Microsoft Entra ID (formerly Azure Active Directory) tokens are one of the most well-supported types of credentials for Azure Databricks, both at the Azure Databricks workspace and account levels.
Note
Some tools, SDKs, scripts, and apps only support Azure Databricks personal access token authentication and not Microsoft Entra ID tokens. To learn whether Microsoft Entra ID tokens are supported by your tools, SDKs, scripts, and apps, see Supported authentication types by Azure Databricks tool or SDK or your provider’s documentation.
Also, some tools, SDKs, scripts, and apps support Azure Databricks OAuth tokens in addition to, or instead of, Microsoft Entra ID tokens for Azure Databricks authentication. To learn whether Azure Databricks OAuth tokens are supported by your tools, SDKs, scripts, and apps, see Supported authentication types by Azure Databricks tool or SDK or your provider’s documentation.
Microsoft Entra ID token authentication for users
Databricks does not recommend that you create Microsoft Entra ID tokens for Azure Databricks users manually. This is because each Microsoft Entra ID token is short-lived, typically expiring within one hour. After this time, you must manually generate a replacement Microsoft Entra ID token. Instead, use one of the participating tools or SDKs that implement the Databricks client unified authentication standard. These tools and SDKs automatically generate and replace expired Microsoft Entra ID tokens for you, leveraging Azure CLI authentication.
If you must manually create a Microsoft Entra ID token for an Azure Databricks user, see:
- Get Microsoft Entra ID (formerly Azure Active Directory) tokens for users by using the Azure CLI
- Get Microsoft Entra ID (formerly Azure Active Directory) tokens for users by using MSAL
Microsoft Entra ID token authentication for service principals
Databricks does not recommend that you create Microsoft Entra ID tokens for Microsoft Entra ID service principals manually. This is because each Microsoft Entra ID token is short-lived, typically expiring within one hour. After this time, you must manually generate a replacement Microsoft Entra ID token. Instead, use one of the participating tools or SDKs that implement the Databricks client unified authentication standard. These tools and SDKs automatically generate and replace expired Microsoft Entra ID tokens for you, leveraging the following Databricks authentication types:
- Azure managed identities authentication
- Microsoft Entra ID service principal authentication
- Azure CLI authentication
If you must manually create a Microsoft Entra ID token for a Microsoft Entra ID service principal, see:
- Get a Microsoft Entra ID access token with the Microsoft identity platform REST API
- Get a Microsoft Entra ID access token with the Azure CLI
Azure CLI
The Azure CLI enables you to authenticate with Azure Databricks through PowerShell, through your terminal for Linux or macOS, or through your Command Prompt for Windows. To learn whether the Azure CLI is supported by your tools, SDKs, scripts, and apps, see Supported authentication types by Azure Databricks tool or SDK or your provider’s documentation.
To use the Azure CLI to authenticate with Azure Databricks manually, run the az login command:
az login
To authenticate by using a Microsoft Entra ID service principal, see Azure CLI login with a Microsoft Entra ID service principal.
To authenticate by using an Azure Databricks user account, see Azure CLI login with an Azure Databricks user account.
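After az login succeeds, a script can ask the Azure CLI for a Microsoft Entra ID token. The following Python sketch shows one way to do this by running az account get-access-token and reading the accessToken field from its JSON output. The resource ID used here is an assumption (it is not stated in this article), so verify it against the Azure Databricks documentation, and the function names are hypothetical.

```python
import json
import subprocess

# Assumption: the resource ID that identifies Azure Databricks to Microsoft
# Entra ID. Verify this value before relying on it.
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def token_command(resource: str = DATABRICKS_RESOURCE_ID) -> list:
    """Build the Azure CLI command that requests a token for a resource."""
    return [
        "az", "account", "get-access-token",
        "--resource", resource,
        "--output", "json",
    ]


def parse_token(az_output: str) -> str:
    """Extract the token string from the Azure CLI's JSON output."""
    return json.loads(az_output)["accessToken"]


def fetch_token() -> str:
    """Run the Azure CLI and return the token. Requires a prior az login."""
    result = subprocess.run(
        token_command(), capture_output=True, text=True, check=True
    )
    return parse_token(result.stdout)
```

Calling fetch_token() returns a short-lived token that expires as described earlier in this article, so tools that refresh tokens automatically are usually the better choice.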