Apache Spark to access data from serverless on-demand SQL database in Synapse?

Moore, Payton E 101 Reputation points
2021-06-23T14:30:54.003+00:00

I'm looking for a way to access existing external tables in a serverless (on-demand) SQL pool through Apache Spark in Azure Synapse. Any ideas/thoughts?

Azure Data Lake Storage
.NET
Azure Synapse Analytics

Accepted answer
  1. Samara Soucy - MSFT 5,051 Reputation points
    2021-06-24T14:41:03.13+00:00

    Hello!

    You can find instructions in this post: https://learn.microsoft.com/en-us/answers/questions/428775/connect-synapse-spark-to-synapse-serverless-sql-po.html

    Basically, the JDBC driver will allow access to SQL, but the version pre-installed in your Spark pool doesn't have Azure Active Directory (AAD) support. You need to update and add some libraries to your pool to get this to work.
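
As a rough illustration of the JDBC route, here is a minimal PySpark sketch. It assumes the serverless endpoint follows the usual `<workspace>-ondemand.sql.azuresynapse.net` pattern and that a suitable SQL Server JDBC driver is available in the pool. The helper names, workspace, database, table, and credentials are all hypothetical, and the optional `authentication` value (for example `ActiveDirectoryPassword`) only works if the installed driver and libraries support it.

```python
from typing import Dict, Optional


def serverless_jdbc_options(
    workspace: str,
    database: str,
    table: str,
    user: str,
    password: str,
    authentication: Optional[str] = None,
) -> Dict[str, str]:
    """Build Spark JDBC options for a Synapse serverless SQL endpoint (illustrative helper)."""
    # Assumption: the serverless endpoint is <workspace>-ondemand.sql.azuresynapse.net.
    url = (
        f"jdbc:sqlserver://{workspace}-ondemand.sql.azuresynapse.net:1433;"
        f"database={database};encrypt=true;trustServerCertificate=false;"
        "hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30;"
    )
    # Assumption: AAD modes need an updated driver plus extra libraries in the pool.
    if authentication:
        url += f"authentication={authentication};"
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def read_external_table(spark, options: Dict[str, str]):
    """Load the table or view named in options["dbtable"] as a Spark DataFrame."""
    return spark.read.format("jdbc").options(**options).load()
```

In a Synapse notebook, where `spark` is already defined, this could be called as `read_external_table(spark, serverless_jdbc_options("myworkspace", "mydb", "dbo.my_external_table", user, password))`.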

    It's worth noting that it's usually not necessary to connect to the serverless pool from Spark. All the data behind your serverless pool lives in storage, so you can go straight to the storage account; you would need access permissions to storage anyway. That said, I can see a scenario where you have some complex views or something similar that you want to tap into from Spark.
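
Going straight to storage might look like the following sketch. It assumes the external table's files are Parquet in an Azure Data Lake Storage Gen2 account that the Spark pool is permitted to read. The helper names, container, account, and folder are hypothetical placeholders.

```python
def abfss_path(container: str, account: str, relative_path: str) -> str:
    """Build an abfss:// URI for a folder in ADLS Gen2 (illustrative helper)."""
    # Strip a leading slash so the URI has exactly one separator after the host.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{relative_path.lstrip('/')}"


def read_from_lake(spark, path: str):
    """Read the files under `path` as a DataFrame.

    Assumption: the files are Parquet; use spark.read.csv or another
    reader if the external table is defined over a different format.
    """
    return spark.read.parquet(path)
```

For example, `read_from_lake(spark, abfss_path("data", "mylake", "sales/2021"))` would read the same files an external table defined over that folder points to.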


1 additional answer

  1. arvind malav 1 Reputation point
    2021-06-30T03:48:55.237+00:00

    You can find instructions here: https://learn.microsoft.com/en-us/answers/questions/428775/connect-synapse-spark-to-synapse-serverless-sql-po.html

    The JDBC driver will allow access to SQL, but the version pre-installed in your Spark pool doesn't have Azure Active Directory (AAD) support. You need to update and add some libraries to your pool to get this to work. It's worth noting that it's usually not necessary to connect to the serverless pool from Spark. All the data behind your serverless pool lives in storage, so you can go straight to the storage account; you would need access permissions to storage anyway.