Using AzCopy to copy file share data to another storage account and setting AZCOPY_CONCURRENCY_VALUE

Gregor Anton Grinč 171 Reputation points
2023-09-27T19:10:30.63+00:00

Hello,

I am using AzCopy to copy data from one file share to another on a different storage account. The file share is almost 300 GB in size, and I am thinking about how I can optimize the transfer speed of such an operation.

This page (https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-optimize) mentions the following:

You can increase throughput by setting the AZCOPY_CONCURRENCY_VALUE environment variable. This variable specifies the number of concurrent requests that can occur. If your computer has fewer than 5 CPUs, then the value of this variable is set to 32. Otherwise, the default value is equal to 16 multiplied by the number of CPUs. The maximum default value of this variable is 300, but you can manually set this value higher or lower.

But since I am copying between storage accounts, does this apply to me as well? Can't I just use the maximum possible concurrency value? How can I make sure that I transfer this amount of data in the shortest time possible?

It is also mentioned that reducing the size of each job can be a good way to increase performance. To be specific:

To achieve optimal performance, ensure that each job transfers fewer than 10 million files. Jobs that transfer more than 50 million files can perform poorly because the AzCopy job tracking mechanism incurs a significant amount of overhead. To reduce overhead, consider dividing large jobs into smaller ones.

How do I know how many files I am transferring in each job?

Thank you for your answers


1 answer

  1. ekpathak 15 Reputation points Microsoft Employee
    2023-10-03T04:48:39.3633333+00:00

    Hello Gregor Anton Grinč, welcome to Microsoft Q&A, and thank you for posting your query!

    To increase throughput when copying between storage accounts with AzCopy, you can set the AZCOPY_CONCURRENCY_VALUE environment variable. This variable specifies the number of concurrent requests that can occur, and it applies to account-to-account copies as well. By raising the concurrency value above the default, you can potentially achieve higher throughput and transfer the data in less time.

    However, it's important to note that the maximum default value of AZCOPY_CONCURRENCY_VALUE is 300. While you can manually set this value higher, it's recommended to monitor the CPU, memory utilization, and network bandwidth of the machine running AzCopy. If you're hitting resource limits or experiencing high CPU usage, you may need to adjust the concurrency value accordingly.
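    For example, here is a minimal sketch of setting the variable and starting the copy from a Linux shell (the value 300, the account names, share names, and SAS tokens are placeholders you would substitute with your own):

        # Placeholder value; tune it while watching CPU, memory, and network usage on the machine running AzCopy
        export AZCOPY_CONCURRENCY_VALUE=300

        # Copy the whole share to the share in the other storage account (placeholder URLs and SAS tokens)
        azcopy copy \
            "https://<source-account>.file.core.windows.net/<source-share>?<source-sas>" \
            "https://<destination-account>.file.core.windows.net/<destination-share>?<destination-sas>" \
            --recursive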

    Additionally, consider other factors such as the size of the machine running AzCopy and its network latency to the storage accounts; for example, running AzCopy from an Azure VM in the same region as the two accounts tends to give faster and more consistent copy speeds. By following these best practices, you can maximize AzCopy's performance and transfer data between storage accounts efficiently.

    To control how many files you transfer in each job, you can use parameters such as --include-path and --include-pattern to limit a job to a subset of directories or files. By dividing the large job into smaller ones scoped to specific directories or patterns, you can effectively manage the number of files transferred per job.
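    As a rough sketch (the URLs, SAS tokens, and folder names below are placeholders), you can preview how many files a job would transfer with the --dry-run flag, scope a job to specific top-level folders with --include-path, and check the transfer totals of a job that has already been created with azcopy jobs show:

        # List what the job would copy without transferring anything; roughly one output line per file
        azcopy copy \
            "https://<source-account>.file.core.windows.net/<source-share>?<source-sas>" \
            "https://<destination-account>.file.core.windows.net/<destination-share>?<destination-sas>" \
            --recursive --dry-run | wc -l

        # Run a smaller job limited to specific top-level folders (semicolon-separated placeholder paths)
        azcopy copy \
            "https://<source-account>.file.core.windows.net/<source-share>?<source-sas>" \
            "https://<destination-account>.file.core.windows.net/<destination-share>?<destination-sas>" \
            --recursive --include-path "folder1;folder2"

        # Show the file counts and status for a job ID reported by azcopy (see azcopy jobs list)
        azcopy jobs show <job-id>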

    Please let me know if you have any questions.