Can we reduce or avoid the calls to GetBlobProperties?

Thomas, Tony 20 Reputation points
2024-08-07T14:46:02.4+00:00

We recently adopted Azure Blob Storage as our DR strategy and are in the process of retiring our DR virtual machines. However, we hit the roadblock below, which is a showstopper for this modernization. I would appreciate a review and advice from Azure experts on any workarounds or alternative solutions.

We have a process that syncs a folder from the production application (Corpus) to an Azure container. The sync process is a PowerShell script that uses the "azcopy copy" command, and we copy only the delta by passing the parameter "--overwrite=false". It looks like this still scans through the millions of files that were already uploaded in the past, even though it does not recopy them. Below is the code we use to sync the container from the on-premises folder.

C:\Software\azcopy\azcopy copy "\R7-IRB-P-APP\Corpus" "<


Accepted answer
  1. Nehruji R 8,066 Reputation points Microsoft Vendor
    2024-08-08T13:34:15.2+00:00

    Hello Thomas, Tony,

    Greetings! Welcome to Microsoft Q&A Platform.

     

    I understand that you are seeing AzCopy scan files that were already uploaded. If you have a large number of files, use the azcopy copy command and set the --overwrite flag to ifSourceNewer; that is, replace --overwrite=false with --overwrite=ifSourceNewer. AzCopy will then compare files as they are copied, without performing any up-front scans and comparisons, which provides a performance edge when there are a large number of files to compare. With this setting, only files that are newer in the source are copied, which reduces the amount of scanning.

     

    The "azcopy copy" command doesn't delete files from the destination. If you want files deleted at the destination when they no longer exist at the source, use the "azcopy sync" command with the --delete-destination flag set to true or prompt. To determine whether files are new or have changed, AzCopy needs to compare metadata such as timestamps and sizes at both the source and the destination, so it must enumerate both the source and destination file lists before copying only new or modified files. This is necessary to maintain data consistency and avoid redundant transfers.
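
    As a rough illustration of why a sync-style comparison has to see both file lists, here is a toy Python sketch. It is not AzCopy's implementation; the function name plan_sync and the choice to compare only last-modified timestamps are assumptions made for the example.

```python
def plan_sync(source, destination):
    """Decide what a sync-style pass would copy and delete.

    Both arguments map a relative path to a last-modified timestamp.
    Returns (paths_to_copy, paths_to_delete), each sorted.
    """
    # Copy anything missing at the destination or newer at the source.
    to_copy = sorted(
        path
        for path, modified in source.items()
        if path not in destination or modified > destination[path]
    )
    # Delete anything at the destination that no longer exists at the source.
    to_delete = sorted(path for path in destination if path not in source)
    return to_copy, to_delete
```

    Every path on both sides has to be looked at at least once, which is why the cost grows with the total number of files and not with the size of the delta.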

     

    Alternatively, create a list of the files that need to be copied and pass it with the --list-of-files parameter, so that the AzCopy operation is limited to only those files.
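
    One possible way to build such a list is sketched below in Python, assuming that "needs to be copied" can be approximated by "modified since the last successful run". The function name write_recent_file_list, the cutoff logic, and the use of forward slashes in the relative paths are assumptions for illustration, not something taken from the AzCopy documentation.

```python
import os
from pathlib import Path


def write_recent_file_list(source_root, list_path, since_epoch):
    """Write one relative path per line for files modified at or after since_epoch.

    since_epoch is a Unix timestamp, for example the start time of the
    previous successful run. Returns the number of paths written.
    """
    root = Path(source_root)
    count = 0
    with open(list_path, "w", encoding="utf-8", newline="\n") as out:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = Path(dirpath) / name
                # Only the local file system is consulted here.
                if full.stat().st_mtime >= since_epoch:
                    out.write(full.relative_to(root).as_posix() + "\n")
                    count += 1
    return count
```

    The resulting file could then be passed to the existing command, for example azcopy copy "<source>" "<destination>" --list-of-files=<path to list file>, where the angle-bracket values are placeholders. Keep the list file outside the source folder so that it does not list itself.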

     

    Also make sure you are using an appropriate performance tier for your Azure Blob Storage account, such as Premium Blob Storage if you have higher performance needs.

      

    Here are the docs for your reference: https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-blobs-synchronize, https://learn.microsoft.com/en-us/azure/storage/common/storage-ref-azcopy-sync

     

    I hope this answer helps! Please let us know if you have any further queries; I'm happy to assist you further.

    Please "Accept the answer" and "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.

