Databricks packages for batch loading to Azure

Ullas Mulbagal Sripathi Rao 46 Reputation points
2021-12-06T12:34:58.377+00:00

Hi, I am looking for an efficient way to process 5 million records and write them to an Azure SQL database using Databricks.
I want to retain the schema of the table, so I truncate the table before every load.
I am currently using JDBC, but the load takes a long time and I have to increase the DTUs every time, which isn't good practice.
I referred to this link, but I can't find the required packages to import in Databricks:
https://stackoverflow.com/questions/55708079/spark-optimise-writing-a-dataframe-to-sql-server/55717234

Currently I am loading it using the command below:
df.write.format("jdbc").option("url", sqlDwUrlSmall).option("forward_spark_azure_storage_credentials","True").option("truncate",true).option("dbTable", schemaName+"."+Tblname).mode("overwrite").save()

Azure SQL Database
Azure Databricks

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 79,381 Reputation points Microsoft Employee
    2021-12-08T08:11:26.41+00:00

    Hello @Ullas Mulbagal Sripathi Rao ,

    Thanks for the question and for using the Microsoft Q&A platform.

    Compared to the built-in JDBC connector, the Apache Spark connector for SQL Server and Azure SQL can bulk insert data into SQL databases, which can be 10x to 20x faster than row-by-row insertion. The connector also supports Azure Active Directory (Azure AD) authentication, so you can connect securely to your Azure SQL databases from Azure Databricks using your Azure AD account. Its interfaces are similar to those of the built-in JDBC connector, so it is easy to migrate existing Spark jobs to it.

    How to find the Azure Spark Connector in Azure Databricks?

    Go to Libraries => Install New => select Maven => Maven Search => search for spark-mssql-connector_2.12

    For more details, refer to SQL Databases using the Apache Spark connector.

    For instructions on using the Spark connector, see Apache Spark connector: SQL Server & Azure SQL.
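
Once the library is installed on the cluster, the write from the question could be migrated roughly as sketched below. This is a minimal sketch under assumptions, not a tuned configuration: the `user` and `password` parameters and the helper names `connectorOptions` and `writeWithConnector` are hypothetical, the batch size is an arbitrary starting value, and the option names (`truncate`, `tableLock`, `batchsize`) are those described in the connector's documentation. Check them against the connector version you install.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper: collects the write options in one place.
// "truncate" keeps the existing table definition on overwrite,
// "tableLock" and "batchsize" are bulk-insert tuning knobs.
def connectorOptions(url: String, table: String,
                     user: String, password: String): Map[String, String] =
  Map(
    "url"       -> url,
    "dbtable"   -> table,
    "user"      -> user,       // assumption: SQL authentication
    "password"  -> password,
    "truncate"  -> "true",
    "tableLock" -> "true",
    "batchsize" -> "100000"    // assumption: starting point, tune for your DTU tier
  )

// Hypothetical helper: same shape as the JDBC write, but with the
// connector's format name so that rows are sent as bulk inserts.
def writeWithConnector(df: DataFrame, options: Map[String, String]): Unit =
  df.write
    .format("com.microsoft.sqlserver.jdbc.spark")
    .mode(SaveMode.Overwrite)
    .options(options)
    .save()
```

With the names from the question, the call would look something like `writeWithConnector(df, connectorOptions(sqlDwUrlSmall, schemaName + "." + Tblname, user, password))`. If the JDBC URL already carries the credentials, the `user` and `password` entries can probably be dropped.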

    Hope this helps. Please let us know if you have any further queries.
