How to write data from notebook to dedicated sql pool

SUMIT B 0 Reputation points
2023-03-01T05:33:47.48+00:00

Hello,
I have read Excel data from ADLS and I want to write this data to a dedicated SQL pool. How can I do this?

Azure Synapse Analytics

1 answer

  1. PRADEEPCHEEKATLA-MSFT 77,081 Reputation points Microsoft Employee
    2023-03-02T11:57:28.93+00:00

    @SUMIT B Thanks for the question and using MS Q&A platform.

    You may follow the steps below to read Excel data from ADLS Gen2 and write it to a Synapse Dedicated SQL Pool.

    Step1: To read an Excel file from ADLS Gen2, you need the com.crealytics.spark.excel package from the Maven repository.
    Pick the version based on the Spark runtime you are using - since I'm using the latest runtime, Apache Spark 3.3, I'm downloading the latest jar file from here: https://mvnrepository.com/artifact/com.crealytics/spark-excel
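    As an alternative to uploading the jar through the workspace (Steps 2-4 below), you can attach the package for just the notebook session with the %%configure magic and the spark.jars.packages setting, run in the first cell before the session starts. This is only a minimal sketch: the Maven coordinate and version shown are illustrative, so pick the one matching your Spark runtime from the repository link above, and note that the Spark pool needs outbound access to the public Maven repository.

    %%configure -f
    {
        "conf": {
            "spark.jars.packages": "com.crealytics:spark-excel_2.12:3.3.1_0.18.5"
        }
    }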


    Step2: Upload the downloaded jar file to the workspace packages in Synapse Studio.
    Manage => Workspace packages => Upload => Upload the downloaded jar file

    Step3: Install the package on the Apache Spark pool you are using to run the notebook.
    Select the Spark pool from the workspace => Packages => Force new settings immediately => Select from workspace packages => Select the Excel package => Save

    Step4: Wait for the package to install - you can monitor the installation here:
    Monitor => Apache Spark applications => SystemReservedJob-LibraryManagement

    Step5: Use the sample code below to read the Excel file from a notebook.

    # Path to the Excel file in ADLS Gen2 (same placeholder style as the write example below)
    filePath = "abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<path_to_excel_file>.xlsx"
    sparkDF = spark.read.format("com.crealytics.spark.excel") \
        .option("header", "true") \
        .option("inferSchema", "true") \
        .load(filePath)
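    Once the file loads, a quick schema and row check confirms the read (a minimal sketch; display is available in Synapse notebooks). If the workbook has several sheets, the spark-excel dataAddress option lets you pick one - the sheet name below is only an example.

    # Verify the Excel data was read correctly
    sparkDF.printSchema()
    display(sparkDF.limit(10))

    # Optional: read a specific sheet/range (sheet name is illustrative)
    sparkDF = spark.read.format("com.crealytics.spark.excel") \
        .option("header", "true") \
        .option("inferSchema", "true") \
        .option("dataAddress", "'Sheet1'!A1") \
        .load(filePath)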
    
    

    Step6: Use the sample code below to write to an internal table using AAD authentication with the Python connector:
    Note: If you want to use any other authentication method, you can use the scripts mentioned in the article: Azure Synapse Dedicated SQL Pool Connector for Apache Spark

    # Write using AAD Auth to internal table
    # Add required imports
    import com.microsoft.spark.sqlanalytics
    from com.microsoft.spark.sqlanalytics.Constants import Constants
    
    # Configure and submit the request to write to Synapse Dedicated SQL Pool
    # Sample below is using AAD-based authentication approach; See further examples to leverage SQL Basic auth.
    (sparkDF.write
     # If `Constants.SERVER` is not provided, the `<database_name>` from the three-part table name argument
     # to `synapsesql` method is used to infer the Synapse Dedicated SQL End Point.
     .option(Constants.SERVER, "<sql-server-name>.sql.azuresynapse.net")
     # Likewise, if `Constants.TEMP_FOLDER` is not provided, the connector will use the runtime staging directory config (see the section on Configuration Options for details).
     .option(Constants.TEMP_FOLDER, "abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<some_base_path_for_temporary_staging_folders>")
     # Choose a save mode that is apt for your use case.
     # Options for save modes are "error" or "errorifexists" (default), "overwrite", "append", "ignore".
     # refer to https://spark.apache.org/docs/latest/sql-data-sources-load-save-functions.html#save-modes
     .mode("overwrite")
     # Required parameter - Three-part table name to which data will be written
     .synapsesql("sqlpool01.dbo.SampleTable"))
    
    

    As per the repro, I'm able to successfully read an Excel file from ADLS Gen2 and write it to a Synapse dedicated SQL pool.


    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for "Was this answer helpful". And, if you have any further queries, do let us know.