Spark 3.3 upgrade fails with error: Failure(java.lang.ClassNotFoundException: Failed to find data source: com.microsoft.sqlserver.jdbc.spark. Please find packages at https://spark.apache.org/third-party-projects.html)

Akki, Siva Gopi 20 Reputation points
2024-06-21T06:16:46.26+00:00

Dear Support Team,

As we are aware, Spark 3.2 will be decommissioned from July 8, so we are planning to upgrade our Spark pools from version 3.2 to 3.3.

FYI, we maintain our metadata in Microsoft SQL Server. With Spark 3.2 in an Azure Synapse notebook, we can connect to MSSQL without adding any dependencies. When we test the same code on Spark 3.3, however, we run into a library dependency issue.

The error details are below for your reference:

Failure(java.lang.ClassNotFoundException: Failed to find data source: com.microsoft.sqlserver.jdbc.spark. Please find packages at https://spark.apache.org/third-party-projects.html)

Can't we connect to MSSQL from Spark 3.3 without adding third-party libraries such as the 'mssql-spark-beta connector'?

Azure Synapse Analytics

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 83,886 Reputation points Microsoft Employee
    2024-06-21T09:39:37.0733333+00:00

    @Akki, Siva Gopi - Thanks for the question and for using the MS Q&A platform.

    Based on the release notes for Spark 3.3 and Spark 3.2 in Azure Synapse Analytics:

    With the release of Spark 3.3 in Azure Synapse Analytics, there are some changes to the runtime environment that may affect your existing Spark applications. One thing to check is the Scala version: the Spark 3.3 runtime uses Scala 2.12, so your code and any custom libraries need to be compatible with that version of Scala.

    In addition, some third-party libraries that were previously included in the Spark distribution have been removed. If your Spark applications depend on these libraries, you will need to add them to your Spark environment manually.

    Regarding your specific issue with connecting to Microsoft SQL Server, the release notes for Spark 3.3 do not mention any changes to the SQL Server JDBC driver. However, the error shows that the data source com.microsoft.sqlserver.jdbc.spark cannot be found, so it is possible that the library providing it is missing from your Spark 3.3 environment. In that case, download the required library and add it to your Spark environment.
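
As one way to test connectivity without the com.microsoft.sqlserver.jdbc.spark connector, the sketch below reads a table through Spark's built-in "jdbc" data source. It assumes the Microsoft SQL Server JDBC driver class (com.microsoft.sqlserver.jdbc.SQLServerDriver) is available on the pool's classpath; if it is not, the driver JAR still has to be added to the Spark environment. The function names, server, database, table, and credentials are hypothetical placeholders, not values from this thread.

```python
def build_sqlserver_jdbc_options(server, database, table, user, password, port=1433):
    """Build options for Spark's built-in 'jdbc' data source.

    All argument values are placeholders supplied by the caller.
    """
    url = (
        f"jdbc:sqlserver://{server}:{port};"
        f"databaseName={database};encrypt=true;trustServerCertificate=false"
    )
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        # Assumption: this driver class is present on the Spark pool's classpath.
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def read_sqlserver_table(spark, options):
    """Load the table as a DataFrame using the generic JDBC reader."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical example values; replace with your own.
options = build_sqlserver_jdbc_options(
    server="myserver.database.windows.net",
    database="metadata_db",
    table="dbo.pipeline_metadata",
    user="spark_reader",
    password="<secret>",
)
```

In a Synapse notebook, where the `spark` session is predefined, this could be called as `df = read_sqlserver_table(spark, options)`. The generic JDBC reader is a different data source from the dedicated connector, so connector-specific features would still require adding that library to the pool.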

    As for Spark 3.2, note that this version is scheduled to be decommissioned in Azure Synapse Analytics on July 8, 2024, as you mentioned. If you have not already done so, you should upgrade your Spark pools to version 3.3 to ensure continued support and access to new features and improvements.

    In summary, upgrading to Spark 3.3 in Azure Synapse Analytics may require some changes to your existing Spark applications, such as making sure your code is compatible with the runtime's Scala version and adding any required third-party libraries to your Spark environment. Upgrading also ensures continued support and access to new features and improvements, as Spark 3.2 is being retired.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click "Accept Answer" and select "Yes" for "Was this answer helpful".
