Databricks connect to Azure synapse dedicated pool

sakuraime 2,341 Reputation points
2021-03-22T13:42:47.147+00:00

I have the following questions about connecting to an Azure Synapse dedicated SQL pool from Databricks:

  1. How do I connect to an Azure Synapse dedicated SQL pool using Private Link?
  2. If I am going to use a service principal from Databricks (I have a Key Vault to store the service principal secret), I have set via spark.conf.set:

spark.databricks.sqldw.jdbc.service.principal.client.id and spark.databricks.sqldw.jdbc.service.principal.client.secret.
What should the url be for df.write.format("com.databricks.spark.sqldw")? Do I also need option("username")?

  3. I also have an issue when writing a dataframe to Azure Synapse:
    [error screenshot attached]

1 answer

  1. PRADEEPCHEEKATLA 90,641 Reputation points Moderator
    2021-03-23T09:48:07.327+00:00

    Hello @sakuraime ,

    Connecting Azure Synapse using OAuth 2.0 with a service principal for authentication is available from Databricks Runtime 8.1 and above.

    Below are the steps to connect Azure Synapse using OAuth 2.0 with a service principal for authentication:

    Step 1: Grant the service principal permissions to Azure Synapse Analytics and the storage account.

    Azure Synapse Analytics: Go to the workspace => Settings => SQL Active Directory admin => Click Set admin => Add the registered application => Click Save.


    Azure Storage account (used for tempDir): Go to the storage account => Access Control (IAM) => Add role assignment => Role: Storage Blob Data Contributor => Select the registered application => Click Save.


    Step 2: Define the service principal credentials for Azure Synapse Analytics and the storage account.

    Python Code:

    # Defining the service principal credentials for the Azure storage account
    spark.conf.set("fs.azure.account.auth.type", "OAuth")
    spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set("fs.azure.account.oauth2.client.id", "<application-id>")
    spark.conf.set("fs.azure.account.oauth2.client.secret", "<service-credential>")
    spark.conf.set("fs.azure.account.oauth2.client.endpoint", "https://login.microsoftonline.com/<directory-id>/oauth2/token")

    # Defining a separate set of service principal credentials for Azure Synapse Analytics
    # (if not defined, the connector will use the Azure storage account credentials)
    spark.conf.set("spark.databricks.sqldw.jdbc.service.principal.client.id", "<application-id>")
    spark.conf.set("spark.databricks.sqldw.jdbc.service.principal.client.secret", "<service-credential>")
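
    Since you mentioned keeping the secret in Key Vault: a minimal sketch of reading it at runtime through a Key Vault-backed Databricks secret scope (the scope name "keyvault-scope" and key name "sp-client-secret" below are placeholders, use your own):

    Python Code:

    # Fetch the client secret from a Key Vault-backed secret scope
    # ("keyvault-scope" and "sp-client-secret" are placeholder names)
    service_credential = dbutils.secrets.get(scope="keyvault-scope", key="sp-client-secret")

    spark.conf.set("fs.azure.account.oauth2.client.secret", service_credential)
    spark.conf.set("spark.databricks.sqldw.jdbc.service.principal.client.secret", service_credential)

    This keeps the secret out of notebook source, and Databricks redacts secret values in notebook output.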


    Step 3: Set the enableServicePrincipalAuth option in the connection configuration.

    Sample URL: "url", "jdbc:sqlserver://<workspacename>.sql.azuresynapse.net:1433;database=<databasename>;encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30"
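
    To your second question: with enableServicePrincipalAuth set to true, you do not pass option("user") or option("password"); the connector authenticates the JDBC connection with the service principal credentials defined in Step 2. A sketch of assembling the url value (workspace and database names are placeholders):

    Python Code:

    # Placeholders: replace <workspacename> and <databasename> with your values
    url = ("jdbc:sqlserver://<workspacename>.sql.azuresynapse.net:1433;"
           "database=<databasename>;encrypt=true;trustServerCertificate=true;"
           "hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30")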

    Load data from an Azure Synapse table.

    Python Code:

    # Read a Synapse table through the connector, authenticating with the service principal
    df = spark.read \
      .format("com.databricks.spark.sqldw") \
      .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
      .option("tempDir", "abfss://<your-container-name>@<your-storage-account-name>.dfs.core.windows.net/<your-directory-name>") \
      .option("enableServicePrincipalAuth", "true") \
      .option("dbTable", "<your-table-name>") \
      .load()
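
    The connector can also push down a query instead of reading a whole table, via the query option; a short sketch (the SELECT statement is just an example):

    Python Code:

    # Load the result of a query rather than an entire table
    df = spark.read \
      .format("com.databricks.spark.sqldw") \
      .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
      .option("tempDir", "abfss://<your-container-name>@<your-storage-account-name>.dfs.core.windows.net/<your-directory-name>") \
      .option("enableServicePrincipalAuth", "true") \
      .option("query", "SELECT TOP 100 * FROM <your-table-name>") \
      .load()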


    Write data to the table in Azure Synapse.

    Python Code:

    # Write the dataframe to a Synapse table, authenticating with the service principal
    df.write \
      .format("com.databricks.spark.sqldw") \
      .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
      .option("enableServicePrincipalAuth", "true") \
      .option("dbTable", "<your-table-name>") \
      .option("tempDir", "abfss://<your-container-name>@<your-storage-account-name>.dfs.core.windows.net/<your-directory-name>") \
      .save()
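
    Note that df.write fails by default if the target table already exists (Spark's errorifexists save mode); pass .mode("append") or .mode("overwrite") to change that. For example:

    Python Code:

    # Append to an existing Synapse table instead of erroring when it exists
    df.write \
      .format("com.databricks.spark.sqldw") \
      .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
      .option("enableServicePrincipalAuth", "true") \
      .option("dbTable", "<your-table-name>") \
      .option("tempDir", "abfss://<your-container-name>@<your-storage-account-name>.dfs.core.windows.net/<your-directory-name>") \
      .mode("append") \
      .save()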


    Hope this helps. Do let us know if you have any further queries.

    ------------

    Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you, this can be beneficial to other community members.

