Decrypt a column of data in Python Notebook

Visakh 211 Reputation points Volunteer Moderator
2023-02-27T15:25:31.05+00:00

Hello Team
I have a use case that requires PII source data held in SQL Server to be encrypted before it is ingested into our data lake (ADLS Gen2). Currently this is done through a SQL stored procedure that uses the EncryptByKey function.

The data in the encrypted form is maintained within Parquet files inside an ADLS account.

The requirement is to allow only a class of privileged users to decrypt this data while accessing through Python code within Synapse spark pools.

The certificate and private key originally used for encryption have been backed up and made available to us.

Any ideas on how we can use this information to decrypt the data through Python notebook code, but only for an identified set of users within Synapse?
Thanks in advance for any advice provided.

Azure Synapse Analytics
Azure Data Factory

1 answer

  1. KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator
    2023-02-28T23:59:13.26+00:00

    Hi @Visakh ,

    Welcome to the Microsoft Q&A forum, and thanks for reaching out.

    As I understand it, you would like to know how a particular set of privileged users can decrypt the PII data while working in Synapse notebooks (interactive authoring) using Python code.

    Since you want only a particular set of privileged users to decrypt the data in the files, you can do the following:

    1. Create an AAD user group for those privileged users.
    2. Create an Azure Key Vault that stores the decryption key and certificate needed to decrypt the data in the Parquet files in ADLS Gen2.
    3. Grant the AAD group access to the Azure Key Vault.
    4. Use the Azure Key Vault Secrets client library for Python to retrieve the secrets that contain your decryption key and certificate values, and then use them to decrypt the secured data from ADLS Gen2:
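       from azure.identity import DefaultAzureCredential
       from azure.keyvault.secrets import SecretClient

       # Authenticate with whatever identity is available in the current environment.
       credential = DefaultAzureCredential()

       # Replace the vault URL and secret name with your own values.
       secret_client = SecretClient(vault_url="https://my-key-vault.vault.azure.net/", credential=credential)
       secret = secret_client.get_secret("secret-name")

       print(secret.name)
       # secret.value holds the key material; avoid printing it in a shared notebook.
       decryption_key = secret.value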
    
    5. By doing so, when those users log in to Azure Synapse Studio with their credentials and run those Python notebooks, Key Vault validates their permissions against their credentials; if a user does not have permission on the key vault, the call fails with an error. Here is a video by one of my teammates that explains how to grant Azure Key Vault permissions for a Synapse notebook in Synapse Analytics: Quickstart: Azure Key Vault secret client library for Python
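
The permission check described in the last step can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's own behavior: `FakeSecret`, `FakeForbidden`, `FakeSecretClient`, and `load_decryption_key` are hypothetical names, and the stub stands in for `SecretClient` so the example runs without Azure access. It assumes a denied request surfaces as an exception carrying `status_code == 403`, which is how `azure.core.exceptions.HttpResponseError` typically reports a forbidden call.

```python
from dataclasses import dataclass


@dataclass
class FakeSecret:
    """Mimics the name/value attributes of a Key Vault secret (hypothetical)."""
    name: str
    value: str


class FakeForbidden(Exception):
    """Stand-in for an HTTP 403 error from Key Vault (hypothetical)."""
    status_code = 403


class FakeSecretClient:
    """Stub with the same get_secret(name) shape as SecretClient (hypothetical)."""

    def __init__(self, secrets, caller_is_authorized):
        self._secrets = secrets
        self._caller_is_authorized = caller_is_authorized

    def get_secret(self, name):
        # Only callers flagged as authorized may read secrets.
        if not self._caller_is_authorized:
            raise FakeForbidden("Caller lacks permission on the vault")
        return FakeSecret(name=name, value=self._secrets[name])


def load_decryption_key(client, secret_name):
    """Return the secret value, or raise PermissionError if access is denied."""
    try:
        return client.get_secret(secret_name).value
    except Exception as exc:
        # Assumption: a denied request carries status_code 403.
        if getattr(exc, "status_code", None) == 403:
            raise PermissionError(
                f"Current user may not read '{secret_name}'"
            ) from exc
        raise


# Placeholder vault contents; the value is not real key material.
vault_contents = {"pii-decryption-key": "not-a-real-key"}
privileged = FakeSecretClient(vault_contents, caller_is_authorized=True)
unprivileged = FakeSecretClient(vault_contents, caller_is_authorized=False)

key = load_decryption_key(privileged, "pii-decryption-key")

try:
    load_decryption_key(unprivileged, "pii-decryption-key")
    denied = False
except PermissionError:
    denied = True
```

In a real notebook, the stub would be replaced by the `SecretClient` instance from step 4, and the decryption itself would run only after `load_decryption_key` returns a value.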

    Hope this info helps.


    Please don't forget to click Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you; this can be beneficial to other community members.
