Copy activity successfully loads more rows than Write Batch Size in Azure pipeline

Surya Raja 20 Reputation points
2023-10-07T13:58:19.4+00:00

In an Azure pipeline copy activity, I set the Write Batch Size to 1000, but it successfully loaded 1,200 rows from a CSV file in Azure Blob Storage into an Azure SQL database. I expected it to fail since the batch size was smaller than the number of rows. What happened here? Is there an explanation for this behavior?

Azure SQL Database
Azure Data Factory

Accepted answer
  1. Vinodh247 34,661 Reputation points MVP Volunteer Moderator
    2023-10-07T14:16:49.1933333+00:00

    Hi Surya Raja:

    Thanks for reaching out to Microsoft Q&A.

    For reference, I am copying an answer from 'BhargavGunnam' to a similar question that was asked earlier. Your copy activity didn't fail because it simply started a second batch when the writeBatchSize value was lower than the number of rows in your source.

    I hope this answers your question. Let me know if this helped.

    ADF determines the batch size based on the writeBatchSize property and the number of rows in the source data. If the number of rows in the source is less than writeBatchSize, the entire data set is processed in a single batch. If the number of rows is greater than writeBatchSize, the data is processed in multiple batches, each containing up to writeBatchSize rows.

    If you leave this value blank, Azure Data Factory determines the Write Batch Size dynamically based on the row size, as described in the documentation. This is the default behavior of the Copy activity when the Write Batch Size property is not specified. To control the batch size, you can explicitly set the Write Batch Size property to a specific value. However, setting it too high may cause performance issues or timeouts, so it's recommended to test and optimize the batch size for your specific scenario.

    If you still want to control the batch identifier parameter in the insert and update queries, you can try dynamic SQL to generate the table name with the GUID suffix and use it in your stored procedure. Please note: it is not recommended to modify this parameter, as it is an internal parameter used by ADF to manage the interim table.
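
    To make the batching behavior concrete, here is a minimal sketch of what a copy activity definition might look like with the sink batch size set explicitly. The activity name, dataset references, and timeout value below are illustrative placeholders, not taken from your pipeline; writeBatchSize is the property that corresponds to the Write Batch Size setting in the sink.

    ```json
    {
        "name": "CopyCsvToAzureSql",
        "type": "Copy",
        "inputs": [ { "referenceName": "BlobCsvDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "AzureSqlTableDataset", "type": "DatasetReference" } ],
        "typeProperties": {
            "source": { "type": "DelimitedTextSource" },
            "sink": {
                "type": "AzureSqlSink",
                "writeBatchSize": 1000,
                "writeBatchTimeout": "00:30:00"
            }
        }
    }
    ```

    With this setting, a 1,200-row source is simply written in two batches (1,000 rows, then 200 rows), which is why the copy succeeds rather than failing.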


    Please 'Upvote' (thumbs-up) and 'Accept' the answer if the reply was helpful. This will benefit other community members who face the same issue.

    1 person found this answer helpful.
