Incremental / delta sync between two Azure Storage accounts

Taranjeet Malik 571 Reputation points
2024-03-15T04:05:10.0133333+00:00

Hi

We wish to migrate Azure Storage accounts (blobs and tables) from one tenant to another. Some of these storage accounts hold a large number of objects in a single container (or a few containers). For example, one account has 29.6 GB of blob storage in total but 14,649,901 items in it, with multi-level nesting (so the file / folder paths are quite long).

We tried the AzCopy copy command, and it took over 8 hours to migrate the blob data. These are critical storage accounts, and we cannot afford to take them offline for that long. Is there a better way to handle this scenario, such as seeding the target storage account first and then performing an incremental / delta sync? I know the AzCopy sync command performs only a delta sync, but it seems to need a huge amount of memory to first scan the target storage account and then the source one. During the scan, the Windows PowerShell session is killed abruptly. I suspect this is due to memory and CPU utilization on the VM where we run it, as Task Manager shows AzCopy consuming almost 100% of memory and CPU right before PowerShell exits.

Are there any other Microsoft or free tools available to address this scenario?

Thanks

Taranjeet Singh

Azure Storage

1 answer

  1. Nehruji R 8,181 Reputation points Microsoft External Staff Moderator
    2024-03-15T10:44:12.48+00:00

    Hello Taranjeet Malik,

    Greetings! Welcome to Microsoft Q&A Forum.

    I understand that you would like to migrate a large amount of data from one tenant to another. This article provides an overview of some of the common Azure data transfer solutions: https://learn.microsoft.com/en-us/azure/storage/common/storage-choose-data-transfer-solution

    You can try Azure Data Factory (ADF), a managed ETL service that orchestrates data movement and transformation. Create an ADF pipeline to copy data between the storage accounts. ADF supports incremental data loads and can be scheduled to run at specific intervals.
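    To make the idea of an incremental load concrete, here is a minimal sketch of watermark-based delta selection, the general pattern behind copying only what changed since the last run. This is not ADF's implementation. `BlobEntry`, `select_delta` and `next_watermark` are hypothetical names used for illustration, and the entries are assumed to come from a listing that exposes each blob's name and last-modified time.

```python
from datetime import datetime, timezone
from typing import Iterable, List, NamedTuple


class BlobEntry(NamedTuple):
    """Hypothetical stand-in for one listing result: blob name and last-modified time."""
    name: str
    last_modified: datetime


def select_delta(entries: Iterable[BlobEntry], watermark: datetime) -> List[BlobEntry]:
    """Return only the entries modified strictly after the previous run's watermark."""
    return [entry for entry in entries if entry.last_modified > watermark]


def next_watermark(entries: Iterable[BlobEntry], current: datetime) -> datetime:
    """Return the newest last-modified time seen, never moving the watermark backwards."""
    return max([current, *(entry.last_modified for entry in entries)])


if __name__ == "__main__":
    # Illustrative data only; real entries would come from listing the source container.
    previous_run = datetime(2024, 3, 15, tzinfo=timezone.utc)
    listing = [
        BlobEntry("logs/2024/03/14/a.json", datetime(2024, 3, 14, 9, 0, tzinfo=timezone.utc)),
        BlobEntry("logs/2024/03/15/b.json", datetime(2024, 3, 15, 6, 30, tzinfo=timezone.utc)),
    ]
    for entry in select_delta(listing, previous_run):
        print("copy", entry.name)
    print("new watermark", next_watermark(listing, previous_run).isoformat())
```

    Note that this pattern only catches new and modified blobs. Deletions on the source are not detected, and blobs written while a run is in progress may need a small overlap window on the next run.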

    Alternatively, try Azure Storage Explorer, a graphical tool for managing Azure Storage resources. Although it focuses on managing and exploring storage accounts, it also provides features for data transfer:

    Blob Transfer: You can copy blobs between containers within the same storage account or across different storage accounts. These are online transfers (over the network); offline transfers using physical devices are handled by a separate service, Azure Data Box, not by Storage Explorer.

    Table Transfer: Azure Storage Explorer also allows you to interact with Azure Table Storage. You can view, edit, and transfer table data between storage accounts. However, note that Table storage is designed for structured data (key-value pairs), so this route is better suited to modest table transfers than to large-scale data migration.

    For online transfers, ensure that the available network bandwidth is sufficient. High network bandwidth (1 Gbps - 100 Gbps) on the Azure VM can improve data transfer performance.
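    As a rough check on whether bandwidth is the limiting factor, the figures from the question can be turned into throughput numbers. The sketch below assumes decimal gigabytes (29.6 GB = 29.6e9 bytes) and treats the "over 8 hours" copy as exactly 8 hours; `transfer_profile` is a hypothetical helper name. Under those assumptions the copy averaged roughly 8 Mbit/s and about 500 objects per second, with an average object size of about 2 KB, which suggests per-object overhead rather than raw bandwidth dominated that run.

```python
def transfer_profile(total_bytes: float, object_count: int, elapsed_seconds: float) -> dict:
    """Summarise a finished transfer as bandwidth, object rate and average object size."""
    return {
        "megabits_per_second": total_bytes * 8 / elapsed_seconds / 1e6,
        "objects_per_second": object_count / elapsed_seconds,
        "average_object_bytes": total_bytes / object_count,
    }


if __name__ == "__main__":
    # Figures from the question; decimal GB and an 8-hour duration are assumptions.
    profile = transfer_profile(
        total_bytes=29.6e9,
        object_count=14_649_901,
        elapsed_seconds=8 * 3600,
    )
    for key, value in profile.items():
        print(f"{key}: {value:.1f}")
```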

    Similar docs for reference:
    https://learn.microsoft.com/en-us/answers/questions/1319845/large-blob-storage-migration-from-one-azure-ad-ten
    https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory
    https://learn.microsoft.com/en-us/training/modules/copy-blobs-using-azure-cli/
    https://learn.microsoft.com/en-us/azure/architecture/guide/multitenant/service/storage

    Hope this answer helps! Please let us know if you have any further queries. I’m happy to assist you further.

    Please "Accept the answer" and "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.
