Cannot get secret from Azure Synapse integrate pipeline

Caio Cavalcanti 1 Reputation point
2022-06-07T09:44:49.417+00:00

I'm trying to create a Synapse pipeline that executes a notebook which depends on a secret that I'm getting from mssparkutils.credentials.getSecret. When I run the notebook manually from Develop I don't get any errors, but when I try to execute the notebook from a pipeline it fails with one of the following errors:

If I pass just the key vault and secret name, as in secret = mssparkutils.credentials.getSecret("keyVaultName", "secretName"), then I get the error:

An error occurred while calling z:mssparkutils.credentials.getSecret.  
 java.lang.Exception: Access token couldn't be obtained  
 BadRequest - LSRServiceException - CannotAcquireMSIForVault: Cannot acquire MSI token for a Vault audience.  

If I pass the key vault, secret and linked service name, as in secret = mssparkutils.credentials.getSecret("keyVaultName", "secretName", "linkedServiceName"), then I get an error like this:

An error occurred while calling z:mssparkutils.credentials.getSecret  
java.lang.Exception: Access token couldn't be obtained.  
BadRequest - LSRServiceException - LSRLinkedServiceFailure: Could not find Linked Service linkedServiceName; the linked service does not exist or is not published  
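
To tell the two failures above apart in a notebook log, the error code can be pulled out of the exception text. This is only a minimal sketch: the helper name `lsr_error_code` is hypothetical, and the pattern assumes the `LSRServiceException - <Code>:` layout seen in the two messages quoted in this question, which may not hold for other errors.

```python
import re
from typing import Optional

# Matches the "LSRServiceException - <Code>:" fragment seen in the errors above.
# Assumption: the code is a single word followed by a colon.
_LSR_CODE = re.compile(r"LSRServiceException\s*-\s*(\w+)\s*:")


def lsr_error_code(message: str) -> Optional[str]:
    """Return the LSR error code embedded in an exception message, or None."""
    match = _LSR_CODE.search(message)
    return match.group(1) if match else None
```

In a notebook this could wrap the `getSecret` call in `try`/`except Exception as exc` and log `lsr_error_code(str(exc))` before re-raising.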

I've confirmed that (1) the linked service is published and the connection is working and (2) my Synapse managed identity does have get/list access to secrets on the target key vault.

I found some similar issues but none seemed to fix my problem.

Azure Synapse Analytics

3 answers

  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-06-08T07:04:12.45+00:00

    Hi @Caio Cavalcanti ,

    Thank you for posting query in Microsoft Q&A Platform.

    This looks strange; I am hitting the same issue you describe. I tried this in the past and recorded a YouTube video on it, and at that time it worked fine. Please check that video for reference.

    As a workaround, you can try the code below to get the secret from Key Vault.

    from pyspark.sql import SparkSession

    # Reuse the notebook's Spark session to reach the JVM-side TokenLibrary.
    sc = SparkSession.builder.getOrCreate()
    token_library = sc._jvm.com.microsoft.azure.synapse.tokenlibrary.TokenLibrary

    # Arguments: Key Vault name, secret name, Key Vault linked service name.
    connection_string = token_library.getSecret("AKV-cshaik", "testSecret", "AzureKeyVault1")
    print(connection_string)
    

    I tried the above code and it works well. Please consider using it in your case too.
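
Since the thread offers two ways to read the secret (`mssparkutils.credentials.getSecret` and the `TokenLibrary` workaround), one option is to try them in order and keep the first that succeeds. The following is a rough sketch only; `first_successful_secret` is a hypothetical helper, not part of any Synapse library, and it simply collects whatever exceptions the getters raise.

```python
from typing import Callable, List, Sequence


def first_successful_secret(getters: Sequence[Callable[[], str]]) -> str:
    """Call each zero-argument getter in order; return the first secret obtained."""
    errors: List[str] = []
    for getter in getters:
        try:
            return getter()
        except Exception as exc:  # JVM-side failures reach Python as exceptions
            errors.append(f"{type(exc).__name__}: {exc}")
    # Every getter failed: report all collected errors together.
    raise RuntimeError("All secret lookups failed:\n" + "\n".join(errors))


# Hypothetical usage inside a Synapse notebook, with the placeholder names
# used in this thread (not executed here):
# secret = first_successful_secret([
#     lambda: mssparkutils.credentials.getSecret("keyVaultName", "secretName", "linkedServiceName"),
#     lambda: token_library.getSecret("keyVaultName", "secretName", "linkedServiceName"),
# ])
```

Passing the lookups as callables keeps the helper independent of the Synapse runtime, so it can be exercised outside a Spark pool with stand-in functions.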

    Please note that I have informed the product group (PG) about the getSecret() function issue and am waiting for their response. I hope the PG team will fix this issue or update the documentation accordingly. I will update this thread once I hear back. Thank you for spotting this.

    Hope this helps. Please let us know if you have any further queries.

    -----------------

    Please consider hitting the Accept Answer button. Accepted answers help the community as well.


  2. Miroslav Muras 25 Reputation points
    2023-01-28T17:43:30.7733333+00:00

    Hi, I am still experiencing the same error when running the Synapse pipeline and accessing Azure Key Vault secrets (multiple secrets) through the Spark notebook. My access policy is set up correctly.

    If I needed access to only a single secret, I could use the approach from online sources (creating Web activities), which would be fine, but the script accesses multiple secrets during the pipeline run.

    @ShaikMaheer-MSFT any update on the ticket?

    Thank you.
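
For the multiple-secrets case described in this answer, a notebook could fetch all required secrets up front through a single lookup function. This is a minimal sketch under assumptions: `fetch_secrets` and the secret names in the usage comment are hypothetical, and `fetch_one` stands for whichever lookup works in your workspace (for example the `TokenLibrary` call from the first answer).

```python
from typing import Callable, Dict, Iterable


def fetch_secrets(names: Iterable[str], fetch_one: Callable[[str], str]) -> Dict[str, str]:
    """Fetch each distinct secret name once and return a name -> value mapping."""
    # dict.fromkeys drops duplicate names while preserving their order.
    return {name: fetch_one(name) for name in dict.fromkeys(names)}


# Hypothetical usage inside a Synapse notebook (not executed here):
# secrets = fetch_secrets(
#     ["dbPassword", "apiKey"],  # placeholder secret names
#     lambda name: token_library.getSecret("AKV-cshaik", name, "AzureKeyVault1"),
# )
```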


  3. Maksim 0 Reputation points
    2023-10-11T16:53:59.6633333+00:00

    I ran into the same problem today.

    I have this configuration in Spark Notebook:

    %%configure -f
    {
        "conf": {
            "spark.kryoserializer.buffer.max": "1024m",
            "spark.synapse.logAnalytics.enabled": true,
            "spark.synapse.logAnalytics.keyVault.name": "KeyVaultName"
        }
    }
    

    and getting this error in logs:

    LogAnalyticsConfigurationLoader-0 ERROR Could not get Log Analytics workspace id from AKV. POST failed with 'Bad Request' (400) and message: {"result":"DependencyError","errorId":"BadRequest","errorMessage":"[Code=CannotAcquireMSIForVault, Target=Vault, Message=Cannot acquire MSI token for a Vault audience.]. TraceId : ****** | client-request-id : *******. Error Component : LSR"}
    

    This only happens if I trigger a pipeline that runs the notebook. Running the notebook manually works.