Enable workload identity federation for AWS IAM workloads

AWS workloads can authenticate to Azure Databricks without long-term secrets by exchanging an AWS-signed OIDC identity token. The recommended path calls the AWS STS GetWebIdentityToken API, which works anywhere the workload has AWS credentials.

Use cases

  • Lambda functions calling Azure Databricks APIs (triggering jobs, querying SQL warehouses)
  • EC2/ECS-based ETL pipelines authenticating to Azure Databricks without secrets
  • EKS-based ML workloads accessing model serving endpoints
  • Cross-account patterns where workloads in one AWS account federate into an Azure Databricks account managed by another team
  • Security posture improvement by eliminating long-lived Azure Databricks PATs or secrets from AWS Secrets Manager

AWS prerequisites

Complete the following steps to enable workload identity federation for AWS IAM workloads.

Step 1: Enable AWS IAM outbound identity federation

Enable outbound identity federation on your AWS account:

import boto3
boto3.client('iam').enable_outbound_web_identity_federation()

You can also enable this in the IAM Console under Account Settings > Enable "Outbound web identity federation".

Step 2: Grant sts:GetWebIdentityToken permission

Grant the workload's IAM role sts:GetWebIdentityToken permission:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["sts:GetWebIdentityToken"],
      "Resource": "*",
      "Condition": {
        "ForAllValues:StringEquals": {
          "sts:IdentityTokenAudience": "databricks"
        },
        "NumericLessThanEquals": {
          "sts:DurationSeconds": 300
        }
      }
    }
  ]
}

Note

The audience condition ensures the role can only request tokens targeted at Azure Databricks. The duration condition limits token lifetime to 300 seconds. You can adjust the duration up to 3600 seconds based on your workload needs, but shorter lifetimes are recommended.
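If you adjust the duration, it may help to generate the policy document rather than edit it by hand. The following is a minimal sketch, not an official tool: the helper name build_token_policy is hypothetical, and the only limit it enforces is the 3600-second upper bound mentioned in the note (no lower bound is stated here, so the sketch only requires a positive value).

```python
import json

MAX_DURATION_SECONDS = 3600  # upper bound stated in the note above


def build_token_policy(audience: str = "databricks", max_duration: int = 300) -> dict:
    """Return the Step 2 IAM policy document with a configurable duration cap.

    Hypothetical helper for illustration; the structure mirrors the JSON above.
    """
    if not 0 < max_duration <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"max_duration must be positive and at most {MAX_DURATION_SECONDS} seconds"
        )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sts:GetWebIdentityToken"],
                "Resource": "*",
                "Condition": {
                    # Only tokens addressed to this audience may be requested.
                    "ForAllValues:StringEquals": {"sts:IdentityTokenAudience": audience},
                    # Requests for longer-lived tokens are denied.
                    "NumericLessThanEquals": {"sts:DurationSeconds": max_duration},
                },
            }
        ],
    }


# Example: allow tokens that live for up to 15 minutes.
print(json.dumps(build_token_policy(max_duration=900), indent=2))
```

The printed JSON can be attached to the workload's IAM role in the same way as the hand-written policy above.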

Step 3: Note the account-specific issuer URL

Retrieve the account-specific issuer URL:

import boto3
info = boto3.client('iam').get_outbound_web_identity_federation_info()
print(info['IssuerUrl'])  # https://<uuid>.tokens.sts.global.api.aws

Create a federation policy

Important

Azure Databricks federation policies are created at the account level (not workspace level). The Azure Databricks CLI host must be set to https://accounts.azuredatabricks.net and the user must be an account admin.

Create a workload identity federation policy using the Databricks CLI. Set the issuer to your account-specific issuer URL from Step 3. For detailed instructions, see Configure a service principal federation policy.

databricks account service-principal-federation-policy create ${SP_ID} --json '{
  "oidc_policy": {
    "issuer": "https://<uuid>.tokens.sts.global.api.aws",
    "audiences": ["databricks"],
    "subject": "arn:aws:iam::<account-id>:role/<workload-role-name>"
  }
}'
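When a token exchange fails, a common cause is a mismatch between the token's claims and the federation policy. The sketch below decodes a token's payload and compares it with the oidc_policy fields shown above. It makes several assumptions: the token carries the standard iss, aud, and sub claims, the subject is the role ARN, and an audience match means at least one shared value. The function names and the sample values (issuer host, account ID, role name) are hypothetical, and the decoding does not verify the signature, so use it for troubleshooting only.

```python
import base64
import json


def decode_claims(jwt: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def policy_mismatches(claims: dict, oidc_policy: dict) -> list:
    """Return the names of policy fields that the token's claims do not satisfy."""
    aud = claims.get("aud", [])
    token_audiences = [aud] if isinstance(aud, str) else list(aud)
    problems = []
    if claims.get("iss") != oidc_policy["issuer"]:
        problems.append("issuer")
    if not set(token_audiences) & set(oidc_policy["audiences"]):
        problems.append("audiences")
    if claims.get("sub") != oidc_policy["subject"]:
        problems.append("subject")
    return problems


def _segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JWT segment (demo only)."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# Made-up policy and unsigned token, for demonstration only.
example_policy = {
    "issuer": "https://example.tokens.sts.global.api.aws",
    "audiences": ["databricks"],
    "subject": "arn:aws:iam::123456789012:role/example-role",
}
example_token = ".".join([
    _segment({"alg": "RS256", "typ": "JWT"}),
    _segment({
        "iss": example_policy["issuer"],
        "aud": "databricks",
        "sub": example_policy["subject"],
    }),
    "signature-placeholder",
])

print(policy_mismatches(decode_claims(example_token), example_policy))
```

In practice you would pass the WebIdentityToken value returned by AWS STS instead of the made-up token; an empty list means the three fields line up with the policy.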

Authenticate to Databricks

After you create the federation policy, use the Databricks SDK to authenticate your AWS workloads. The following example uses the SDK's IdTokenSource pattern to retrieve an AWS STS token and exchange it for an Azure Databricks OAuth token.

import boto3
from databricks.sdk import WorkspaceClient
from databricks.sdk import oidc
from databricks.sdk.core import Config, credentials_strategy, oidc_credentials_provider


class AwsStsTokenSource(oidc.IdTokenSource):
    def __init__(self, audience="databricks", region="us-east-1"):
        self._audience = audience
        self._region = region

    def id_token(self) -> oidc.IdToken:
        sts = boto3.client("sts", region_name=self._region)
        resp = sts.get_web_identity_token(
            Audience=[self._audience],
            SigningAlgorithm="RS256",
            DurationSeconds=300,
        )
        return oidc.IdToken(jwt=resp["WebIdentityToken"])


@credentials_strategy("aws-sts-wif", [])
def aws_sts_wif_strategy(cfg: Config):
    return oidc_credentials_provider(cfg, AwsStsTokenSource())


w = WorkspaceClient(
    host="https://adb-<workspace-id>.<random-number>.azuredatabricks.net",
    client_id="<service-principal-uuid>",
    credentials_strategy=aws_sts_wif_strategy
)
# No secrets needed
clusters = w.clusters.list()

Note

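A token duration of 300 seconds is recommended. You can increase it up to 3600 seconds based on workload needs, provided the sts:DurationSeconds condition in the IAM policy from Step 2 allows the requested value.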

For a manual token exchange example, see Authenticate with an identity provider token.
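As a rough illustration of what the manual exchange involves, the sketch below builds (but does not send) an OAuth 2.0 token exchange request using only the Python standard library. The grant_type, subject_token, and subject_token_type parameters follow RFC 8693; the /oidc/v1/token path, the client_id field, and the all-apis scope are assumptions that you should confirm against the linked article. The function name and the sample host are hypothetical.

```python
import urllib.parse
import urllib.request


def build_exchange_request(host: str, client_id: str, id_token: str) -> urllib.request.Request:
    """Build a token exchange POST request (RFC 8693 form parameters).

    The endpoint path and scope value are assumptions, not confirmed by this page.
    """
    form = urllib.parse.urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "subject_token": id_token,   # the JWT returned by AWS STS
        "client_id": client_id,      # the service principal's application ID
        "scope": "all-apis",
    }).encode()
    return urllib.request.Request(
        f"{host.rstrip('/')}/oidc/v1/token",
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
req = build_exchange_request(
    "https://example.azuredatabricks.net",
    "<service-principal-uuid>",
    "<aws-sts-jwt>",
)
print(req.get_method(), req.full_url)
```

Sending the request with urllib.request.urlopen(req) would, under these assumptions, return a JSON body containing the OAuth access token. The SDK example above performs the equivalent exchange for you, so the manual route is mainly useful for debugging or for environments where the SDK is unavailable.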

AWS documentation references