How to set a spark configuration specific to a whole workspace or cluster when the configuration contains secrets

Arathi A 1 Reputation point
2022-09-19T10:40:09.723+00:00

Hi,

I have been trying to set a blob container's access key at the Databricks cluster level, but spark.conf.set('property', 'key') only applies the setting to the current session. Putting the key directly as plain text under Cluster -> Advanced Options -> Spark config is not advisable, and entering the value there as dbutils.secrets.get(scope, key) throws an error too.
Can anyone help me with setting the container's configuration at the Databricks cluster level?

I want to access the blob container from Databricks without mounting the path to the cluster. (I am using an ADF Copy activity to copy data from a Databricks table to a SQL DB table, with Blob Storage as the staging layer, and I can't give a mounted path there.)
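For reference, the session-scoped approach described above can be sketched as below. The storage-account, scope, and key names are placeholders, not values from this thread; the helper only builds the Spark property name that holds a Blob Storage account's access key, and the spark.conf.set call is what limits the setting to the current session:

```python
def blob_account_key_property(storage_account: str) -> str:
    """Return the Spark property that holds a Blob Storage account's access key."""
    return f"fs.azure.account.key.{storage_account}.blob.core.windows.net"

# In a Databricks notebook (session scope only; placeholder names):
# key = dbutils.secrets.get(scope="<scope-name>", key="<secret-name>")
# spark.conf.set(blob_account_key_property("<storage-account-name>"), key)
```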

Thanks in advance,
Arathi


2 answers

  1. Carlos Robles Rodriguez 1 Reputation point
    2022-09-19T21:12:39.923+00:00

    The secrets can be Databricks backed or Azure Key Vault backed.

    Both can be created with the databricks-cli tool, and for Azure Key Vault-backed secret scopes there is also a hidden web UI for your Databricks instance at https://<databricks-instance>#secrets/createScope .

    https://learn.microsoft.com/en-us/azure/databricks/security/secrets/secret-scopes

    Use dbutils.secrets.get to access the secret from the notebook.

    secret = dbutils.secrets.get(scope = "<scope-name>", key = "<secret-name>")
    

  2. Arathi A 1 Reputation point
    2022-09-20T16:48:45.787+00:00

    Hi,

    I tried using dbutils.secrets.get in the Spark configuration, but got the error "The string is not a valid base64-encoded string" (presumably because the key value is retrieved as REDACTED instead of its real value).

    I actually found a solution to my question:
    https://learn.microsoft.com/en-us/azure/databricks/security/secrets/secrets#--use-a-secret-in-a-spark-configuration-property-or-environment-variable

    Unfortunately, only the cluster owner (cluster creator) can add a reference to a secret in the Spark configuration or as an environment variable. It's a general cluster that we use throughout, so an option to update the existing cluster's owner would be great. I want to keep using the same cluster, and cloning it creates one with a different cluster ID. Is it possible to change the cluster owner? Any other option for adding the secrets would also be helpful.
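    For anyone following the linked page, the approach (which only the cluster owner can apply) is to put a secret reference rather than a literal value in the cluster's Spark config. With placeholder scope and key names, the entry under Cluster -> Advanced Options -> Spark config would look like:

    ```
    fs.azure.account.key.<storage-account-name>.blob.core.windows.net {{secrets/<scope-name>/<secret-name>}}
    ```

    Spark config entries are space-separated key/value pairs; the {{secrets/...}} placeholder is resolved when the cluster starts, so the secret value never appears in plain text in the configuration.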

    Thanks

