Load data using COPY INTO with temporary credentials

If your Azure Databricks cluster or SQL warehouse doesn’t have permissions to read your source files, you can use temporary credentials to access data from external cloud object storage and load files into a Delta Lake table.

Depending on how your organization manages your cloud security, you might need to ask a cloud administrator or power user to provide you with credentials. For more information, see Generate temporary credentials for ingestion.

Specify temporary credentials or encryption options to access data

Note

Credential and encryption options are available in Databricks Runtime 10.2 and above.

COPY INTO supports:

  • Azure SAS tokens to read data from ADLS Gen2 and Azure Blob Storage. Azure Blob Storage temporary tokens are at the container level, whereas ADLS Gen2 tokens can be at the directory level in addition to the container level. Databricks recommends using directory-level SAS tokens when possible. The SAS token must have “Read”, “List”, and “Permissions” permissions.
  • AWS STS tokens to read data from AWS S3. Your tokens should have the “s3:GetObject*”, “s3:ListBucket”, and “s3:GetBucketLocation” permissions.

Warning

To avoid misuse or exposure of temporary credentials, Databricks recommends that you set expiration horizons that are just long enough to complete the task.

COPY INTO supports loading encrypted data from AWS S3. To load encrypted data, provide the type of encryption and the key to decrypt the data.

Load data using temporary credentials

The following examples load data from S3 and from ADLS Gen2 using temporary credentials to provide access to the source data.

COPY INTO my_json_data
FROM 's3://my-bucket/jsonData' WITH (
  CREDENTIAL (AWS_ACCESS_KEY = '...', AWS_SECRET_KEY = '...', AWS_SESSION_TOKEN = '...')
)
FILEFORMAT = JSON

COPY INTO my_json_data
FROM 'abfss://container@storageAccount.dfs.core.windows.net/jsonData' WITH (
  CREDENTIAL (AZURE_SAS_TOKEN = '...')
)
FILEFORMAT = JSON

Load encrypted data

The following example loads data from S3 using customer-provided encryption keys.

COPY INTO my_json_data
FROM 's3://my-bucket/jsonData' WITH (
  ENCRYPTION (TYPE = 'AWS_SSE_C', MASTER_KEY = '...')
)
FILEFORMAT = JSON
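
Credential and encryption options can also appear together in a single WITH clause when the source requires both. The following is a minimal sketch assuming that combination follows the same syntax as the separate examples above; the keys, token, and bucket path are placeholders:

COPY INTO my_json_data
FROM 's3://my-bucket/jsonData' WITH (
  -- Sketch: combines the CREDENTIAL and ENCRYPTION clauses shown above; all values are placeholders
  CREDENTIAL (AWS_ACCESS_KEY = '...', AWS_SECRET_KEY = '...', AWS_SESSION_TOKEN = '...')
  ENCRYPTION (TYPE = 'AWS_SSE_C', MASTER_KEY = '...')
)
FILEFORMAT = JSON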

Load JSON data using credentials for source and target

The following example loads JSON data from a file on Azure into the external Delta table called my_json_data. This table must be created before COPY INTO can be executed (see the sketch after this example). The command uses one existing credential to write to the external Delta table and another to read from the ABFSS location.

COPY INTO my_json_data WITH (CREDENTIAL target_credential)
  FROM 'abfss://container@storageAccount.dfs.core.windows.net/base/path' WITH (CREDENTIAL source_credential)
  FILEFORMAT = JSON
  FILES = ('f.json')
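
As noted above, the target table must exist before COPY INTO runs. A minimal sketch of creating such an external Delta table follows; the column list, storage path, and use of the named target_credential are assumptions for illustration, not part of the original example:

-- Sketch only: the schema, LOCATION path, and credential usage below are assumed for illustration
CREATE TABLE IF NOT EXISTS my_json_data (
  id BIGINT,
  value STRING
)
LOCATION 'abfss://container@storageAccount.dfs.core.windows.net/delta/my_json_data'
WITH (CREDENTIAL target_credential)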