Apache Spark to access data from serverless on-demand SQL database in Synapse?

Question

Moore, Payton E 101

I'm looking for a way to access existing serverless on-demand external tables through Apache Spark in Azure Synapse. Any ideas or thoughts?

Accepted answer

Answer 1

Samara Soucy - MSFT 5,141

Hello!

You can find instructions in this post: https://learn.microsoft.com/en-us/answers/questions/428775/connect-synapse-spark-to-synapse-serverless-sql-po.html

Basically, the JDBC driver will allow access to SQL, but the version pre-installed in your Spark pool doesn't have AAD support. You need to update the driver and add some libraries to your pool to get this to work.
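
As a rough illustration of the JDBC route, here is a minimal sketch that builds the options for Spark's generic JDBC reader. It assumes SQL authentication (a user name and password) to keep the example short; AAD authentication needs the driver and library updates described in the linked post. The function names, workspace, database, and table values are hypothetical, and the host follows the `<workspace>-ondemand.sql.azuresynapse.net` pattern used by serverless SQL endpoints.

```python
def serverless_jdbc_options(workspace, database, table, user, password):
    """Build options for Spark's generic JDBC reader (hypothetical helper).

    Assumes SQL authentication; AAD needs the extra libraries mentioned above.
    """
    # Serverless SQL endpoints use the "-ondemand" suffix on the workspace name.
    host = f"{workspace}-ondemand.sql.azuresynapse.net"
    url = (
        f"jdbc:sqlserver://{host}:1433;"
        f"database={database};"
        "encrypt=true;trustServerCertificate=false;"
        "hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30"
    )
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def read_serverless_table(spark, options):
    """Load the external table or view into a DataFrame via JDBC."""
    return spark.read.format("jdbc").options(**options).load()
```

In a Synapse notebook, where `spark` is already defined, you would call something like `read_serverless_table(spark, serverless_jdbc_options("myworkspace", "mydb", "dbo.my_view", user, password))`, with the credentials pulled from a secret store rather than typed into the notebook.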

It's worth noting that it's not usually necessary to connect to the serverless pool from Spark. All the data in your serverless pool is in storage, so you can go straight to the storage account; you would need access permissions to storage anyway. That being said, I can see a scenario where you have some complex views or something similar that you want to tap into from Spark.
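
Going straight to storage might look like the sketch below. It assumes the files behind the external table are Parquet and live in an ADLS Gen2 account; the account, container, and folder names are hypothetical, as are the helper names.

```python
def abfss_path(account, container, relative_path=""):
    """Build an abfss:// URI for an ADLS Gen2 location (hypothetical helper)."""
    base = f"abfss://{container}@{account}.dfs.core.windows.net"
    cleaned = relative_path.strip("/")
    return f"{base}/{cleaned}" if cleaned else f"{base}/"


def read_parquet_folder(spark, account, container, relative_path):
    """Read the Parquet files under one folder into a DataFrame.

    Assumes the data is Parquet; use spark.read.csv or another reader
    if the external table points at a different file format.
    """
    return spark.read.parquet(abfss_path(account, container, relative_path))
```

For example, `read_parquet_folder(spark, "mylake", "raw", "sales/2021")` would read `abfss://raw@mylake.dfs.core.windows.net/sales/2021`, provided the identity running the notebook has permission on that path.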

Moore, Payton E 101 Reputation points

2021-06-24T15:27:15.34+00:00

Thank you for your response @Samara Soucy - MSFT! When accessing the storage account, is there a way that would let us scope access to specific folders/directories or files within ADLS?

Samara Soucy - MSFT 5,141 Reputation points

2021-06-25T17:13:06.357+00:00

Much as you would control access in the serverless pool, you can scope access to ADLS either with Access Control Lists (ACLs) assigned to AAD users or with SAS tokens.

The TokenLibrary library available in your Spark pool will work with either option: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-secure-credentials-with-tokenlibrary?pivots=programming-language-python
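
For the SAS option, a sketch along the lines of the linked TokenLibrary page is shown below. It assumes a linked service that holds the SAS token already exists in the workspace. The configuration keys and the provider class name are reproduced from that documentation as best I recall, so please check them against the page before relying on them. The helper names, account host, and linked service name are hypothetical.

```python
# Provider class named in the TokenLibrary documentation (verify against the linked page).
SAS_PROVIDER = "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedSASProvider"


def sas_conf_for_linked_service(account_host, linked_service_name):
    """Return the Spark settings that route storage auth through a linked service.

    account_host is the full host, e.g. "mylake.dfs.core.windows.net".
    """
    return {
        f"spark.storage.synapse.{account_host}.linkedServiceName": linked_service_name,
        f"fs.azure.account.auth.type.{account_host}": "SAS",
        f"fs.azure.sas.token.provider.type.{account_host}": SAS_PROVIDER,
    }


def apply_conf(spark, settings):
    """Apply each setting to the active Spark session."""
    for key, value in settings.items():
        spark.conf.set(key, value)
```

After `apply_conf(spark, sas_conf_for_linked_service("mylake.dfs.core.windows.net", "MyAdlsLinkedService"))`, reads against that account should only succeed for the folders or files the SAS token covers.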
Samara Soucy - MSFT 5,141 Reputation points

2021-06-30T03:45:36.81+00:00

Hi @Moore, Payton E

I wanted to check in with you to see if you have any follow-up questions for me on this issue.

Moore, Payton E 101 Reputation points

2021-06-30T14:37:26.89+00:00

We decided to access the data lake storage directly, and it worked perfectly with the documentation you provided. Thank you! @Samara Soucy - MSFT

Answer 2

arvind malav 1

You can find instructions here: https://learn.microsoft.com/en-us/answers/questions/428775/connect-synapse-spark-to-synapse-serverless-sql-po.html

The JDBC driver will allow access to SQL, but the version pre-installed in your Spark pool doesn't have AAD support. You need to update the driver and add some libraries to your pool to get this to work. It's worth noting that it's not usually necessary to connect to the serverless pool from Spark. All the data in your serverless pool is in storage, so you can go straight to the storage account; you would need access permissions to storage anyway.

Moore, Payton E 101 Reputation points

2021-06-30T14:38:25.09+00:00

Thank you for your help @arvind malav
