Synapse can't find data source com.microsoft.sqlserver.jdbc.spark

Mike Wong 46 Reputation points
2023-05-19T10:17:41.2633333+00:00

Hello,

I am connecting to an Azure SQL database using Synapse notebooks and the following block of code:


df_config = spark.read.format("com.microsoft.sqlserver.jdbc.spark") \
    .option("url", url) \
    .option("dbtable", f"{SchemaName}.{TableName}") \
    .option("databasename", Database) \
    .option("accessToken", access_token) \
    .option("encrypt", "true") \
    .option("hostNameInCertificate", "*.database.windows.net") \
    .load()

This worked fine for the entire week, but overnight I suddenly started getting this error:

Py4JJavaError: An error occurred while calling o3930.load. : java.lang.ClassNotFoundException: Failed to find data source: com.microsoft.sqlserver.jdbc.spark. Please find packages at https://spark.apache.org/third-party-projects.html

I don't really understand why I'm getting this error. I have allowed session packages on the Apache Spark pool, I am using the latest versions of Apache Spark and Python available to the cluster and, most importantly, I have been using this code all week and nothing has changed. The error above suggests that the data source does not exist, although it's available out of the box in Synapse notebooks.

Does anybody have any answers? Thank you!

Azure Synapse Analytics

Accepted answer
  1. BhargavaGunnam-MSFT 26,496 Reputation points Microsoft Employee
    2023-05-22T21:57:57.87+00:00

    Hello Mike Wong and Bandhit Suksiri,

    This connector JAR is still in beta, and there is no official stable version yet. As a result, you are seeing the error "spark-mssql-connector jar missing from spark 3.3".

    The resolution is to add this JAR, https://repo1.maven.org/maven2/com/microsoft/azure/spark-mssql-connector_2.12/1.3.0-BETA/spark-mssql-connector_2.12-1.3.0-BETA.jar, to your Spark pool using the package management feature:

    https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-manage-pool-packages

    https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-azure-portal-add-libraries
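The JAR URL above follows the standard Maven repository layout, so the artifact coordinates and the Scala version the JAR was built for (the `_2.12` suffix) can be read straight from it. The sketch below is only an illustration of that: the helper names `maven_coordinate` and `scala_suffix` are hypothetical, and it assumes a URL under a `/maven2/` root.

```python
from urllib.parse import urlparse

# JAR URL taken from the answer above.
JAR_URL = (
    "https://repo1.maven.org/maven2/com/microsoft/azure/"
    "spark-mssql-connector_2.12/1.3.0-BETA/"
    "spark-mssql-connector_2.12-1.3.0-BETA.jar"
)


def maven_coordinate(jar_url):
    """Return 'group:artifact:version' for a Maven-layout JAR URL (hypothetical helper)."""
    path = urlparse(jar_url).path
    # Layout: /maven2/<group as directories>/<artifact>/<version>/<file>.jar
    parts = path.split("/maven2/", 1)[1].split("/")
    *group_parts, artifact, version, _filename = parts
    return f"{'.'.join(group_parts)}:{artifact}:{version}"


def scala_suffix(coordinate):
    """Return the Scala version suffix of the artifact name, e.g. '2.12' (hypothetical helper)."""
    artifact = coordinate.split(":")[1]
    return artifact.rsplit("_", 1)[1]


coordinate = maven_coordinate(JAR_URL)
print(coordinate)
print(scala_suffix(coordinate))
```

The Scala suffix is worth comparing against the Scala version of the pool's Spark runtime before uploading the JAR.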

    This issue should be resolved once you add the jar.
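To confirm that the JAR was picked up, one option is to wrap the read from the question so that a missing connector produces a clearer message. This is a minimal sketch, not an official pattern: `read_with_connector` is a hypothetical helper, the JAR file name is the one linked above, and the check relies only on the "Failed to find data source" text shown in the error from the question.

```python
CONNECTOR_FORMAT = "com.microsoft.sqlserver.jdbc.spark"
CONNECTOR_JAR = "spark-mssql-connector_2.12-1.3.0-BETA.jar"


def read_with_connector(spark, url, table, database, access_token):
    """Run the read from the question; raise a clearer error if the connector is missing.

    Hypothetical helper: `spark` is the notebook's SparkSession.
    """
    reader = (
        spark.read.format(CONNECTOR_FORMAT)
        .option("url", url)
        .option("dbtable", table)
        .option("databasename", database)
        .option("accessToken", access_token)
        .option("encrypt", "true")
        .option("hostNameInCertificate", "*.database.windows.net")
    )
    try:
        return reader.load()
    except Exception as exc:
        # The error in the question contains "Failed to find data source"
        # when the connector class is not on the classpath.
        if "Failed to find data source" in str(exc):
            raise RuntimeError(
                f"{CONNECTOR_FORMAT} is not available in this session; "
                f"add {CONNECTOR_JAR} to the Spark pool and start a new session."
            ) from exc
        # Any other failure (authentication, network, SQL) is re-raised unchanged.
        raise
```

In a notebook this would be called as `df_config = read_with_connector(spark, url, f"{SchemaName}.{TableName}", Database, access_token)`, using the variables from the question.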

    Please see this video tutorial for how to add the jar file to the Spark pool.

    I hope this helps. Please let me know if you have any further questions.

    If this answers your question, please consider accepting it by selecting Accept answer and up-voting, as this helps the community find answers to similar questions.

