Authentication for Azure Databricks automation

In Azure Databricks, authentication refers to verifying an Azure Databricks identity (such as a user, service principal, or group). Azure Databricks uses credentials (such as an access token or a username and password) to verify the identity.

After Azure Databricks verifies the caller’s identity, it uses a separate process called authorization to determine whether the verified identity has sufficient permission to perform the specified action on the resource at the given location. This article does not include details about authorization.

When a tool makes an automation or API request, it includes credentials that authenticate an identity with Azure Databricks. This article describes typical ways to create, store, and pass credentials and related information that Azure Databricks needs to authenticate and authorize requests. To learn which credential types, related information, and storage mechanisms are supported by your tools, scripts, and apps, see your provider’s documentation.

Azure Databricks personal access tokens

Azure Databricks personal access tokens are one of the most widely supported types of credentials, especially for resources and operations at the Azure Databricks workspace level. Most storage mechanisms for credentials and related information, such as environment variables and configuration profiles, accept an Azure Databricks personal access token. Although an Azure Databricks workspace can have multiple personal access tokens, each personal access token works for only a single Azure Databricks workspace.

Note

Azure Databricks supports Azure AD tokens in addition to Azure Databricks personal access tokens. To learn whether Azure AD tokens are supported by your tools, scripts, and apps, see your provider’s documentation.

To create an Azure Databricks personal access token for an Azure Databricks user, do the following:

  1. In your Azure Databricks workspace, click your Azure Databricks username in the top bar, and then select User Settings from the drop-down menu.
  2. On the Access tokens tab, click Generate new token.
  3. (Optional) Enter a comment that helps you to identify this token in the future, and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty.
  4. Click Generate.
  5. Copy the displayed token, and then click Done.

Important

Be sure to save the copied token in a secure location. If you lose the copied token, you cannot regenerate that exact same token; you must repeat this procedure to create a new token. Databricks also recommends that you immediately delete the lost token from your workspace by clicking the X next to the token on the Access tokens tab.
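As an illustration of how a script might pass a personal access token with an API request, the following minimal Python sketch builds (but does not send) a request that carries the token as a bearer token in the Authorization header. The function name build_request and the choice of the /api/2.0/clusters/list endpoint are assumptions made for this example only; the host and token values are the placeholders used elsewhere in this article.

```python
import urllib.request


def build_request(host: str, token: str,
                  path: str = "/api/2.0/clusters/list") -> urllib.request.Request:
    """Build (but do not send) a REST API request authenticated with a token."""
    # Join the per-workspace URL and the API path without doubling the slash.
    url = host.rstrip("/") + path
    # The token is passed in the Authorization header as a bearer token.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    # Placeholder values; replace them with your own workspace URL and token.
    req = build_request(
        "https://adb-1234567890123456.7.azuredatabricks.net",
        "dapi12345678901234567890123456789012",
    )
    print(req.full_url)
    # To actually send the request: urllib.request.urlopen(req)
```

In real automation, avoid hard-coding the token in the script; read it from a secure store, an environment variable, or a configuration profile instead.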

Managing personal access tokens

For information about enabling and disabling all Azure Databricks personal access tokens for a workspace, controlling who can use tokens in a workspace, setting a maximum lifetime for tokens in a workspace, and other token management operations for a workspace, see Manage personal access tokens.

Azure AD tokens

Azure Active Directory (Azure AD) tokens are one of the most widely supported types of credentials for Azure Databricks, especially for resources and operations at the Azure Databricks workspace level. Most storage mechanisms for credentials and related information for Azure Databricks, such as environment variables and configuration profiles, accept an Azure AD token.

Note

Some tools, scripts, and apps only support Azure Databricks personal access tokens and not Azure AD tokens. To learn whether Azure AD tokens are supported by your tools, scripts, and apps, see your provider’s documentation.

Azure AD tokens for users

To create an Azure AD token for an Azure Databricks user, see:

Azure AD tokens for service principals

To create an Azure AD token for an Azure AD service principal instead of an Azure Databricks user, see:

Environment variables

Products that Azure Databricks supports, and a few third-party products that work with Azure Databricks, support some of the following environment variables. To learn which of these environment variables are supported by your tools, scripts, and apps, see your provider’s documentation. To create, change, and delete environment variables, see your operating system’s documentation.

Environment variable
ARM_CLIENT_ID

The client ID for an Azure AD enterprise application (service principal).

Applies to the Databricks Terraform provider only.
ARM_CLIENT_SECRET

The client secret for an Azure AD enterprise application (service principal).

Applies to the Databricks Terraform provider only.
ARM_ENVIRONMENT

The Azure environment identifier.

Allowed values: china, german, public, usgovernment

Default: public

Applies to the Databricks Terraform provider only.
ARM_TENANT_ID

The tenant ID for an Azure AD enterprise application (service principal).

Applies to the Databricks Terraform provider only.
ARM_USE_MSI

Whether to use Azure managed service identity authentication.

Applies to the Databricks Terraform provider only.
AZURE_SP_APPLICATION_ID

The client ID for an Azure AD enterprise application (service principal).

Applies to GitHub Actions developed for Databricks only.
AZURE_SP_CLIENT_SECRET

The client secret for an Azure AD enterprise application (service principal).

Applies to GitHub Actions developed for Databricks only.
AZURE_SP_TENANT_ID

The tenant ID for an Azure AD enterprise application (service principal).

Applies to GitHub Actions developed for Databricks only.
DATABRICKS_AAD_TOKEN

The value of an Azure AD token.

Applies to the Databricks CLI only.
DATABRICKS_ACCOUNT_ID

The ID of an Azure Databricks account.

Applies to the Databricks Terraform provider only.
DATABRICKS_ADDRESS

The URL of an Azure Databricks workspace.

Example: https://adb-1234567890123456.7.azuredatabricks.net

Applies to Databricks Connect only.
DATABRICKS_API_TOKEN

The value of an Azure Databricks personal access token or an Azure AD token.

Applies to Databricks Connect only.
DATABRICKS_CLUSTER_ID

The ID of an Azure Databricks cluster.

Applies to Databricks Connect only.
DATABRICKS_CONFIG_FILE

The full path to an Azure Databricks configuration profiles file.

Default: ~/.databrickscfg for Unix, Linux, and macOS; %USERPROFILE%\.databrickscfg for Windows
DATABRICKS_CONFIG_PROFILE

The name of an Azure Databricks configuration profile.

Default: DEFAULT
DATABRICKS_DEBUG_HEADERS

Whether debug HTTP headers of requests made by the provider are output.

Default: false

Applies to the Databricks Terraform provider only.
DATABRICKS_DEBUG_TRUNCATE_BYTES

The maximum length of JSON fields in HTTP requests and responses; fields longer than this limit are truncated.

Default: 96

Applies to the Databricks Terraform provider only.
DATABRICKS_DSN

The data source name (DSN) connection string to an Azure Databricks compute resource.

Applies to the Databricks SQL Driver for Go only.
DATABRICKS_HOST

The URL of an Azure Databricks workspace.

Example: https://adb-1234567890123456.7.azuredatabricks.net
DATABRICKS_ORG_ID

The organization ID of an Azure Databricks workspace.

Applies to Databricks Connect only.
DATABRICKS_PASSWORD

The password of an Azure Databricks workspace user.
DATABRICKS_PORT

The port number to communicate with an Azure Databricks cluster.

Applies to Databricks Connect only.
DATABRICKS_RATE_LIMIT

The maximum number of requests per second.

Default: 15

Applies to the Databricks Terraform provider only.
DATABRICKS_TOKEN

The value of an Azure Databricks personal access token or an Azure AD token.
DATABRICKS_USERNAME

The username of an Azure Databricks workspace user.
DBSQLCLI_ACCESS_TOKEN

The value of an Azure Databricks personal access token.

Applies to the Databricks SQL CLI only.
DBSQLCLI_HOST_NAME

The value of the Server hostname field for a Databricks SQL warehouse.

Example: adb-1234567890123456.7.azuredatabricks.net

Applies to the Databricks SQL CLI only.
DBSQLCLI_HTTP_PATH

The value of the HTTP path field for a Databricks SQL warehouse.

Example: /sql/1.0/warehouses/1abc2d3456e7f890a

Applies to the Databricks SQL CLI only.
PERSONAL_ACCESS_TOKEN

The value of an Azure Databricks personal access token.

Applies to the Apache Airflow integration with Databricks only.
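As a rough sketch of how a script might consume some of these environment variables, the following Python function reads DATABRICKS_HOST and DATABRICKS_TOKEN and applies the defaults listed above for DATABRICKS_CONFIG_FILE and DATABRICKS_CONFIG_PROFILE. The function name settings_from_env and the returned dictionary keys are hypothetical; which variables a given tool actually honors depends on that tool.

```python
import os
from pathlib import Path
from typing import Mapping


def settings_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connection settings from the environment variables listed above."""
    required = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["DATABRICKS_HOST"].rstrip("/"),
        "token": env["DATABRICKS_TOKEN"],
        # Defaults mirror the table: .databrickscfg in the user home folder,
        # and a configuration profile named DEFAULT.
        "config_file": env.get("DATABRICKS_CONFIG_FILE",
                               str(Path.home() / ".databrickscfg")),
        "profile": env.get("DATABRICKS_CONFIG_PROFILE", "DEFAULT"),
    }


if __name__ == "__main__":
    settings = settings_from_env()
    # Print everything except the token, which should never be logged.
    print({key: value for key, value in settings.items() if key != "token"})
```

Passing the environment in as a parameter keeps the function easy to test with a plain dictionary instead of the real process environment.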

Configuration profiles

An Azure Databricks configuration profile contains settings and other information that Azure Databricks needs to authenticate. Azure Databricks configuration profiles are stored in Azure Databricks configuration profiles files for your tools, scripts, and apps to use. To learn whether Azure Databricks configuration profiles are supported by your tools, scripts, and apps, see your provider’s documentation.

You can create a configuration profiles file by using the Databricks CLI, or you can create a configuration profiles file manually.

Use the Databricks CLI to create a configuration profiles file

To use the Databricks CLI to create a configuration profiles file, do the following. You can use either an Azure Databricks personal access token or an Azure AD token.

Note

This approach creates a configuration profiles file with a configuration profile named DEFAULT in the new file. If a configuration profiles file already exists, the file’s DEFAULT configuration profile is overwritten with the new data. To create a configuration profile with a different name, use the --profile option followed by a name for the new configuration profile, for example databricks configure --token --profile DEV or databricks configure --aad-token --profile DEV.

Use a personal access token

To create an Azure Databricks configuration profiles file that uses an Azure Databricks personal access token:

  1. Run the following command with the Databricks CLI:

    databricks configure --token
    
  2. When prompted, enter your per-workspace URL, for example https://adb-1234567890123456.7.azuredatabricks.net. Then press Enter.

  3. When prompted, enter your Azure Databricks personal access token, and then press Enter.

If a file named .databrickscfg does not already exist, the Databricks CLI creates it in your ~ (your user home) folder on Unix, Linux, or macOS, or your %USERPROFILE% (your user home) folder on Windows. In this file, the Databricks CLI creates a configuration profile named DEFAULT, if it does not already exist, and adds the information that you entered to this configuration profile.

Use an Azure AD token

To create an Azure Databricks configuration profiles file that uses an Azure Active Directory (Azure AD) token:

  1. Set the environment variable DATABRICKS_AAD_TOKEN to the value of your Azure AD token. To create, change, and delete environment variables, see your operating system’s documentation.

  2. Run the following command with the Databricks CLI:

    databricks configure --aad-token
    
  3. When prompted, enter your per-workspace URL, for example https://adb-1234567890123456.7.azuredatabricks.net. Then press Enter.

If a file named .databrickscfg does not already exist, the Databricks CLI creates it in your ~ (your user home) folder on Unix, Linux, or macOS, or your %USERPROFILE% (your user home) folder on Windows. In this file, the Databricks CLI creates a configuration profile named DEFAULT, if it does not already exist, and adds the information that you entered to this configuration profile.
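If you drive the Azure AD token steps above from a script, one possible approach is to set DATABRICKS_AAD_TOKEN only for the child process that runs the Databricks CLI, rather than for your whole session. The following Python sketch assembles the command and environment using the options shown in this article; the function name aad_configure_invocation is hypothetical, and the sketch does not answer the CLI’s prompt for the per-workspace URL.

```python
import os
import subprocess
from typing import Optional


def aad_configure_invocation(aad_token: str, profile: Optional[str] = None):
    """Return the argv and environment for `databricks configure --aad-token`."""
    argv = ["databricks", "configure", "--aad-token"]
    if profile:
        # Write to a named configuration profile instead of DEFAULT.
        argv += ["--profile", profile]
    # Copy the current environment and add the token for the child process only.
    env = dict(os.environ, DATABRICKS_AAD_TOKEN=aad_token)
    return argv, env


if __name__ == "__main__":
    argv, env = aad_configure_invocation("<your-azure-ad-token>", profile="DEV")
    # Requires the Databricks CLI to be installed; the CLI still prompts
    # for the per-workspace URL on the terminal.
    subprocess.run(argv, env=env, check=True)
```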

Create a configuration profiles file manually

To manually create an Azure Databricks configuration profiles file:

  1. Use your favorite text editor to create a file named .databrickscfg in your ~ (your user home) folder on Unix, Linux, or macOS, or your %USERPROFILE% (your user home) folder on Windows. Do not forget the dot (.) at the beginning of the file name. Add the following contents to this file:

    [<DEFAULT>]
    host = <your-per-workspace-url>
    token = <your-personal-access-token-or-azure-ad-token>
    
  2. In the preceding contents, replace the following values, and then save the file:

    • <DEFAULT> with a unique name for the configuration profile, such as DEFAULT, DEV, PROD, or similar.
    • <your-per-workspace-url> with your per-workspace URL, for example https://adb-1234567890123456.7.azuredatabricks.net.
    • <your-personal-access-token-or-azure-ad-token> with your Azure Databricks personal access token or Azure Active Directory (Azure AD) token.

    For example, the .databrickscfg file might look like this:

    [DEFAULT]
    host = https://adb-1234567890123456.7.azuredatabricks.net
    token = dapi12345678901234567890123456789012
    

    Tip

    You can create additional configuration profiles by specifying different profile names within the same .databrickscfg file, for example:

    [DEFAULT]
    host = https://adb-1234567890123456.7.azuredatabricks.net
    token = dapi12345678901234567890123456789012
    
    [DEV]
    host = https://adb-2345678901234567.8.azuredatabricks.net
    token = dapi23456789012345678901234567890123
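Because the configuration profiles file shown above uses an INI-style layout, a script can likely read it with Python’s standard configparser module. The following is a minimal sketch under that assumption; the function name read_profile and the SAMPLE text are hypothetical, and the values are the placeholders from this article.

```python
import configparser


def read_profile(text: str, name: str = "DEFAULT") -> dict:
    """Return the host and token stored under one configuration profile."""
    # Disable interpolation so that a '%' character in a value is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # configparser treats DEFAULT specially: has_section() does not report it.
    if name != parser.default_section and not parser.has_section(name):
        raise KeyError(f"configuration profile {name!r} not found")
    section = parser[name]
    return {"host": section["host"], "token": section["token"]}


SAMPLE = """\
[DEFAULT]
host = https://adb-1234567890123456.7.azuredatabricks.net
token = dapi12345678901234567890123456789012

[DEV]
host = https://adb-2345678901234567.8.azuredatabricks.net
token = dapi23456789012345678901234567890123
"""

if __name__ == "__main__":
    print(read_profile(SAMPLE, "DEV")["host"])
```

Note that configparser lets other sections inherit any key they omit from the DEFAULT section. That is a behavior of this Python module, not necessarily of the tools that read .databrickscfg, so give each profile its own host and token.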
    

ODBC DSNs

In ODBC, a data source name (DSN) is a symbolic name that tools, scripts, and apps use to request a connection to an ODBC data source. A DSN stores connection details such as the path to an ODBC driver, networking details, authentication credentials, and database details. To learn whether ODBC DSNs are supported by your tools, scripts, and apps, see your provider’s documentation.

To install and configure the Databricks ODBC Driver and create an ODBC DSN for Azure Databricks, see ODBC driver.

JDBC connection URLs

In JDBC, a connection URL is a symbolic URL that tools, scripts, and apps use to request a connection to a JDBC data source. A connection URL stores connection details such as networking details, authentication credentials, database details, and JDBC driver capabilities. To learn whether JDBC connection URLs are supported by your tools, scripts, and apps, see your provider’s documentation.

To install and configure the Databricks JDBC Driver and create a JDBC connection URL for Azure Databricks, see JDBC driver.

Azure CLI

The Azure CLI enables you to authenticate with Azure Databricks through PowerShell, through your terminal for Linux or macOS, or through your Command Prompt for Windows. To learn whether the Azure CLI is supported by your tools, scripts, and apps, see your provider’s documentation.

To use the Azure CLI to authenticate with Azure Databricks, run the az login command, and then follow the on-screen prompts:

    az login

For more detailed authentication options, see Sign in with Azure CLI.
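After you sign in, a script can ask the Azure CLI for an Azure AD token to use with Azure Databricks. The following Python sketch is one way to do this, under stated assumptions: it relies on the az account get-access-token command, and on the application ID 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d, which is commonly documented as the resource ID for Azure Databricks but does not appear elsewhere in this article, so verify it for your environment. The function names are hypothetical.

```python
import json
import subprocess

# Assumption: the application (resource) ID commonly documented for Azure Databricks.
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

AZ_COMMAND = [
    "az", "account", "get-access-token",
    "--resource", DATABRICKS_RESOURCE_ID,
    "--output", "json",
]


def parse_access_token(az_output: str) -> str:
    """Extract the token from the JSON that the Azure CLI prints."""
    return json.loads(az_output)["accessToken"]


def fetch_access_token() -> str:
    """Run the Azure CLI (after `az login`) and return an Azure AD token."""
    # On Windows, the az executable is a .cmd file, so this call might need
    # shell=True or the full path to az.cmd.
    result = subprocess.run(AZ_COMMAND, capture_output=True, text=True, check=True)
    return parse_access_token(result.stdout)


if __name__ == "__main__":
    token = fetch_access_token()
    # Never print the token itself; report only that one was obtained.
    print(f"Obtained an Azure AD token of length {len(token)}")
```

The returned value can then be used wherever this article accepts an Azure AD token, for example as the value of DATABRICKS_AAD_TOKEN.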