Copy Dataverse data into Azure SQL using Synapse Link

Murali Ramasamy 5 Reputation points
2024-10-23T15:49:18.2433333+00:00

I'm using an Azure Synapse pipeline to copy data from Dataverse to an Azure SQL database. For large tables it stops after copying 10 million records and does not insert anything beyond that. The table has about 15 million records.

I don't see any setting on the pipeline that would limit this. Does anyone have a solution?

Thanks

Murali

Azure SQL Database · Azure Data Lake Storage · Azure Synapse Analytics · Azure Data Factory · Microsoft Dataverse Training

1 answer

  1. Amira Bedhiafi 25,946 Reputation points
    2024-10-23T20:54:58.3566667+00:00

    Although you mentioned not finding any settings limitations, it's worth reviewing whether there are limits on the data transfer itself, particularly on the Dataverse side or within Synapse when dealing with large volumes of data.

    Dataverse might have API throttling or paging limitations. Check the Dataverse service protection API limits, which can restrict how many records you retrieve in a single operation; result sets larger than one page are only returned if the caller follows the paging links.
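
    As a quick check outside the pipeline, you can page through the Dataverse Web API yourself and confirm that rows past the 10 million mark are actually returned when the paging links are followed. This is only a minimal sketch of the paging mechanics (not a practical way to pull 15 million rows); the organization URL, table name, and bearer token below are placeholders for your own values.

    ```python
    import requests

    # Placeholders - replace with your own environment, table, and token.
    ORG_URL = "https://yourorg.crm.dynamics.com"
    TABLE = "accounts"                      # plural (entity set) name of the table
    TOKEN = "<bearer token from MSAL / azure-identity>"

    headers = {
        "Authorization": f"Bearer {TOKEN}",
        "Accept": "application/json",
        "OData-MaxVersion": "4.0",
        "OData-Version": "4.0",
        # Ask for up to 5000 rows per page; larger result sets are paged.
        "Prefer": "odata.maxpagesize=5000",
    }

    # Select a single small column just to keep the payload light.
    url = f"{ORG_URL}/api/data/v9.2/{TABLE}?$select=createdon"
    total = 0
    while url:
        resp = requests.get(url, headers=headers, timeout=60)
        resp.raise_for_status()
        payload = resp.json()
        total += len(payload["value"])
        # Dataverse returns @odata.nextLink while more pages remain;
        # a client that ignores it stops early and undercounts.
        url = payload.get("@odata.nextLink")

    print(f"Retrieved {total} rows by following the paging links")
    ```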

    You may also want to verify that the batch size for your copy activity is appropriate. The default write batch size may not be ideal for large datasets; you can adjust it in the sink settings of the Copy activity (the Write batch size property) within the Synapse pipeline.

    For tables with more than 10 million records, consider using an incremental loading strategy. Instead of copying the entire table at once, you could break the data transfer into smaller chunks, either by partitioning the data or by using a date-based incremental load.

    You can achieve this by configuring the Synapse pipeline to copy data in batches based on a specific condition, such as a timestamp or an indexed key column.
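
    If it helps to prototype the idea outside the pipeline first, the sketch below walks the source in date-based slices and bulk-inserts each slice into Azure SQL with pyodbc, advancing a watermark as it goes. The connection strings, the account table, and the modifiedon column are assumptions; substitute whatever your Synapse Link export and target schema actually contain, and persist the watermark somewhere durable between runs.

    ```python
    from datetime import datetime, timedelta
    import pyodbc

    # Assumed connection strings - replace server names, databases, and auth.
    SOURCE_CONN = ("Driver={ODBC Driver 18 for SQL Server};"
                   "Server=<synapse-serverless>.sql.azuresynapse.net;"
                   "Database=dataverse_lake;Authentication=ActiveDirectoryInteractive;")
    TARGET_CONN = ("Driver={ODBC Driver 18 for SQL Server};"
                   "Server=<azuresql>.database.windows.net;"
                   "Database=target_db;Authentication=ActiveDirectoryInteractive;")

    SLICE = timedelta(days=30)            # size of each date-based chunk
    watermark = datetime(2015, 1, 1)      # last modifiedon value already loaded
    now = datetime.utcnow()

    src_cur = pyodbc.connect(SOURCE_CONN).cursor()
    tgt = pyodbc.connect(TARGET_CONN)
    tgt_cur = tgt.cursor()
    tgt_cur.fast_executemany = True       # send the inserts in large batches

    while watermark < now:
        upper = min(watermark + SLICE, now)
        src_cur.execute(
            "SELECT accountid, name, modifiedon FROM account "
            "WHERE modifiedon >= ? AND modifiedon < ?",
            watermark, upper,
        )
        rows = src_cur.fetchall()

        if rows:
            tgt_cur.executemany(
                "INSERT INTO dbo.account (accountid, name, modifiedon) VALUES (?, ?, ?)",
                [tuple(r) for r in rows],
            )
            tgt.commit()

        print(f"{watermark:%Y-%m-%d} -> {upper:%Y-%m-%d}: {len(rows)} rows copied")
        watermark = upper                 # in practice, store this watermark durably
    ```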

    Review and increase the Data Integration Units (DIUs, previously called Data Movement Units) on the copy activity in Azure Synapse. DIUs control the throughput of data movement between services (such as Dataverse to SQL). For larger data volumes, increasing the DIU setting can make transfers faster and more consistent.

    Consider leveraging Data Flows or a staged bulk load where available. PolyBase (or the COPY statement) is the efficient way to load large datasets into a Synapse dedicated SQL pool from Azure Data Lake; for Azure SQL Database, which does not support PolyBase, a bulk load from the lake files over an external data source gives a similar result.
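
    For an Azure SQL Database target specifically, a rough sketch of that staged approach is shown below: point BULK INSERT at the CSV files that Synapse Link already lands in the storage account, through an external data source. The data source, credential, file path, and table names are assumptions for illustration; the one-time CREATE DATABASE SCOPED CREDENTIAL / CREATE EXTERNAL DATA SOURCE setup is only outlined in the comment.

    ```python
    import pyodbc

    # Assumed target connection - replace server, database, and authentication.
    CONN = ("Driver={ODBC Driver 18 for SQL Server};"
            "Server=<azuresql>.database.windows.net;Database=target_db;"
            "Authentication=ActiveDirectoryInteractive;")

    # One-time setup, run separately by an admin (sketched here as a comment):
    #   CREATE DATABASE SCOPED CREDENTIAL LakeCred WITH IDENTITY = 'Managed Identity';
    #   CREATE EXTERNAL DATA SOURCE DataverseLake
    #       WITH (TYPE = BLOB_STORAGE,
    #             LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>',
    #             CREDENTIAL = LakeCred);

    bulk_load = """
    BULK INSERT dbo.account
    FROM 'account/part-0001.csv'           -- path relative to the data source location
    WITH (
        DATA_SOURCE = 'DataverseLake',     -- external data source created above
        FORMAT = 'CSV',
        FIELDQUOTE = '"',
        FIRSTROW = 2,                      -- skip a header row (drop if the files have none)
        TABLOCK                            -- bulk update lock for a faster load
    );
    """

    with pyodbc.connect(CONN, autocommit=True) as cnxn:
        cnxn.cursor().execute(bulk_load)
        print("Bulk load finished")
    ```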

