How do I improve performance for a Blob Storage JSON file merge?

Sam Bell 1 Reputation point
2021-08-17T15:14:47.973+00:00

I am using the Copy Data activity in Data Factory to merge a large number of JSON files. At the moment it takes about 30 seconds to process a minute's worth of files; processing an hour's worth took 21 minutes.
Looking at the transfer times, they are all in the ballpark of what I would expect:
listing source: 5 secs
reading from source: 2:42
writing to sink: 0:00

The write time seems suspicious, but I assume that, because the reads and writes run in parallel, it just means the write didn't take any extra time.
The overall time listed, though, is 20:57. Is there anywhere I can look to explain the large disparity? None of the performance tips seem to apply, so I'm not sure how to get performance to a reasonable level.
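
To put a number on the disparity, here is a minimal Python sketch of the arithmetic using only the durations quoted above. The variable names are illustrative, and it assumes the three stages run one after another; if they overlap, the unaccounted time is larger still.

```python
from datetime import timedelta

# Stage durations as reported in the copy activity details
stages = {
    "listing source": timedelta(seconds=5),
    "reading from source": timedelta(minutes=2, seconds=42),
    "writing to sink": timedelta(0),
}

# Overall duration reported for the same run
overall = timedelta(minutes=20, seconds=57)

# Time covered by the listed stages (assumed sequential)
accounted = sum(stages.values(), timedelta())

# Time not covered by any listed stage
unexplained = overall - accounted

print("accounted:  ", accounted)    # 0:02:47
print("unexplained:", unexplained)  # 0:18:10
```

So roughly 18 of the 21 minutes do not show up in any of the listed transfer stages.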

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2021-08-18T07:43:58.627+00:00

    Hi @Sam Bell ,

    Welcome to Microsoft Q&A Platform. Thank you for posting your query here.

    Have you had a chance to look at the documentation links below, which can help improve Copy activity performance in Azure Data Factory?
    https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance
    https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance-features

    Troubleshoot Copy activity performance:
    https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance-troubleshooting

    Kindly let us know if there is anything specific we can help with after you have checked the links above. Thank you.