The sink of my Data Flow took 3 minutes 16 seconds to write 106 rows. How do I reduce the write time to a reasonable level?

Kieran Wood 71 Reputation points
2023-10-26T10:21:02.3566667+00:00

The Delta sink in my Data Flow took 3 minutes 16 seconds to write 106 rows. How can I reduce the write time to a reasonable level?

The data source is of type SAP CDC, holds 20 million rows, and is handling deltas correctly. The source activity executed in a more reasonable 21 seconds.
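To put those figures in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the numbers quoted above; the variable names are illustrative and nothing here is measured independently.

```python
# Figures quoted in the question; this is arithmetic only, not a benchmark.
sink_seconds = 3 * 60 + 16   # sink duration: 3 minutes 16 seconds
source_seconds = 21          # source activity duration
rows_written = 106           # rows written by the sink

# Average sink time spent per written row.
seconds_per_row = sink_seconds / rows_written

# How much longer the sink took than the source.
sink_to_source_ratio = sink_seconds / source_seconds

print(f"Sink duration: {sink_seconds} s")
print(f"Seconds per row written: {seconds_per_row:.2f}")
print(f"Sink time / source time: {sink_to_source_ratio:.1f}x")
```

At almost two seconds per row, the duration seems unlikely to be explained by the volume of data written alone. That would point toward some fixed cost rather than row throughput, though this is an inference from the numbers and not something I have confirmed.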

The integration runtime is configured as follows:

Compute Size: Custom

Compute Type: Basic (General Purpose)

Core Count: 8 Driver Cores

Time to Live: 1 hour

I confirm that I have read the following links:

https://learn.microsoft.com/en-us/azure/data-factory/concepts-data-flow-performance

https://learn.microsoft.com/en-us/azure/data-factory/concepts-data-flow-performance#optimizing-sinks

Azure Data Lake Storage
Azure Synapse Analytics
Azure Data Factory

Accepted answer
  1. Amira Bedhiafi 8,716 Reputation points
    2023-10-26T12:30:22.9633333+00:00

    Even though the source activity executed reasonably quickly, it is worth understanding how much work the CDC process itself involves.

    In many implementations, CDC introduces latency, especially when handling large datasets.

    Sometimes the source involves complex joins, aggregations, or other operations that slow down the sink.

    If your source and sink are in different regions, you can expect added latency, so place them in the same region if possible. Also, check that other processes aren't competing for the same resources.
