Best way to load from Azure SQL DB to dedicated SQL pool

kosmos 246 Reputation points
2022-09-26T14:09:05.57+00:00

I want to move data from an Azure SQL Database to Synapse. I have created a dedicated SQL pool and want to load it under the following conditions:

1) Load all the tables from the database automatically, without having to create a pipeline for each table.

2) Load all column names without having to create the schemas myself, simply copying the data as it is in the original Azure SQL Database. In other words, I do not want to "Enable staging" under "Settings" in the pipeline, and I want to be able to use auto create table.

3) I do not need PolyBase because the amount of data is not large.

I have read the documentation and built the following pipelines:

  • A pipeline from Azure SQL DB to the Data Lake --> which somehow does not copy the column names
  • A pipeline from the Data Lake to the dedicated SQL pool

This approach would work, but I have the following concerns:

  1. The column names are not copied in the first pipeline (Azure SQL DB to Data Lake)
  2. I believe there is a smarter way to do this

What is the best approach for my requirements?

Thanks in advance.

Azure SQL Database
Azure Synapse Analytics

2 answers

  1. kosmos 246 Reputation points
    2022-09-27T08:57:28.05+00:00

    Figured out the issue with the column names in the Data Lake

    Go to your pipeline --> Sink --> Open (this opens the sink dataset) --> enable First row as header
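
    For reference, the same option lives in the JSON of the sink's DelimitedText dataset. A minimal sketch (the dataset, linked service, and path names below are placeholders):

    ```json
    {
        "name": "SinkCsvDataset",
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": "AzureDataLakeStorage1",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": "staging",
                    "folderPath": "sqldb-export"
                },
                "columnDelimiter": ",",
                "firstRowAsHeader": true
            }
        }
    }
    ```

    With `firstRowAsHeader` set to `true`, the copy activity writes the column names as the first row of each file, and reads them back as headers when the same dataset is later used as a source.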

    2 people found this answer helpful.

  2. AnnuKumari-MSFT 31,061 Reputation points Microsoft Employee
    2022-09-27T10:53:41.293+00:00

    Hi @kosmos ,

    Thank you for using the Microsoft Q&A platform and for posting your question here.

    As I understand it, you want to know the best approach for loading data from Azure SQL DB to a dedicated SQL pool. Please let me know if this is not what you meant to ask.

    I can see your issue has been resolved by enabling the 'First row as header' option in the sink settings. However, I would still like to shed some light on the best way to copy data into Synapse SQL.

    • If your source data is in Azure Blob or Azure Data Lake Storage Gen2, and the format is COPY statement compatible, you can use the copy activity to directly invoke the COPY statement and let Azure Synapse Analytics pull the data from the source. For details, see Direct copy by using COPY statement.
    • If your source data store and format aren't natively supported by the COPY statement, use the Staged copy by using COPY statement feature instead. Staged copy also gives you better throughput: it automatically converts the data into a COPY statement compatible format, stores the data in Azure Blob storage, and then calls the COPY statement to load the data into Azure Synapse Analytics.
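
    For a small data volume like the one described, a single copy activity from Azure SQL DB directly into the dedicated SQL pool, using the default bulk insert write behavior (no staging), could look like the sketch below. Activity and dataset names are placeholders; setting "allowCopyCommand": true on the sink would invoke the COPY statement instead, but for a SQL source that requires staging to be enabled.

    ```json
    {
        "name": "CopySqlDbToDedicatedPool",
        "type": "Copy",
        "inputs": [
            { "referenceName": "AzureSqlTableDataset", "type": "DatasetReference" }
        ],
        "outputs": [
            { "referenceName": "DedicatedPoolTableDataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
            "source": { "type": "AzureSqlSource" },
            "sink": {
                "type": "SqlDWSink",
                "tableOption": "autoCreate"
            }
        }
    }
    ```

    The "tableOption": "autoCreate" setting covers the auto create table requirement: the sink table is created from the source schema if it does not already exist.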

    Note: To achieve the best performance, use PolyBase or the COPY statement to load data into Azure Synapse Analytics.
    Additionally, consider enabling parallel copy with data partitioning, especially when you load a large amount of data from your Azure Synapse Analytics.
    Source: Parallel copy from Azure Synapse Analytics
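
    If the source tables are large, the source side of the copy activity can be partitioned. A minimal fragment, using the partition option names exposed by the ADF SQL source:

    ```json
    "source": {
        "type": "AzureSqlSource",
        "partitionOption": "PhysicalPartitionsOfTable"
    }
    ```

    "PhysicalPartitionsOfTable" lets the service split reads across the table's physical partitions; "DynamicRange" is the alternative when the table is not physically partitioned.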

    Hope this helps. Please let us know if you have any further queries.

    2 people found this answer helpful.