Azure Storage Mover scale and performance targets
The performance of a storage migration service is a key aspect of any migration. Azure Storage Mover is a new service, in public preview. In this article, we share performance test results. Your experience will vary.
Azure Storage Mover is tested with 100 million namespace items (files and folders), migrated from a supported source to a supported target in Azure.
How we test
Azure Storage Mover is a hybrid cloud service. Hybrid services have a cloud service component and an infrastructure component the administrator of the service runs in their corporate environment. For Storage Mover, that hybrid component is a migration agent. Agents are virtual machines that run on a host near the source storage.
Only the agent is relevant for performance testing. To avoid privacy and performance concerns, data travels directly from the Storage Mover agent to the target storage in Azure. Only control and telemetry messages are sent to the cloud service.
The following table describes the characteristics of the test environment that produced the performance test results shared later in this article.
|Test namespace||19% files 0 KiB - 1 KiB; 57% files 1 KiB - 16 KiB; 16% files 16 KiB - 1 MiB|
|Test source device||Linux server VM; 16 virtual CPU cores|
|Test source share||NFS v3.0 share; Warm cache: Data set in memory (baseline test). In real-world scenarios, add disk recall times.|
|Network||Dedicated, over-provisioned configuration, negligible latency. No bottleneck between source, agent, and target Azure storage.|
These test results were produced under ideal conditions. They're meant as a baseline for the components the Storage Mover service and agent can directly influence. Differences in source devices, disks, and network connections aren't considered in this test. Real-world performance will vary.
Different agent resource configurations are tested:
4 virtual CPU cores at 2.7 GHz each and 8 GiB of memory (RAM) is the minimum specification for an Azure Storage Mover agent.
|Test||Single file, 1 TiB||~3.3M files, ~200K folders, ~45 GiB||~50M files, ~3M folders, ~1 TiB|
|Elapsed time||16 Min, 42 Sec||15 Min, 18 Sec||5 Hours, 28 Min|
|Items* per Second||-||3548||2860|
|Memory (RAM) usage||400 MiB||1.4 GiB||1.4 GiB|
|Disk usage (for logs)||28 KiB||1.4 GiB||result missing|
*A namespace item is either a file or a folder.
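To relate these baseline numbers to your own migration scope, the following minimal Python sketch recomputes items per second from an item count and an elapsed time, and turns a rate into a rough best-case duration. The names `BaselineRun`, `items_per_second`, and `estimate_hours` are hypothetical helpers, not part of any Storage Mover tooling. The item counts in the table are rounded, so the recomputed rates differ somewhat from the reported values, and real-world rates are expected to be lower than this ideal baseline.

```python
from dataclasses import dataclass


@dataclass
class BaselineRun:
    """One column of the results table (hypothetical helper type)."""
    name: str
    items: int            # files + folders, rounded counts from the table
    elapsed_seconds: int  # elapsed time from the table, in seconds


def items_per_second(run: BaselineRun) -> float:
    """Namespace items (files and folders) processed per second."""
    return run.items / run.elapsed_seconds


def estimate_hours(item_count: int, rate_items_per_second: float) -> float:
    """Best-case duration in hours for item_count items at a given rate."""
    return item_count / rate_items_per_second / 3600


small = BaselineRun("~3.3M files, ~200K folders",
                    3_300_000 + 200_000, 15 * 60 + 18)
large = BaselineRun("~50M files, ~3M folders",
                    50_000_000 + 3_000_000, 5 * 3600 + 28 * 60)

for run in (small, large):
    print(f"{run.name}: about {items_per_second(run):.0f} items per second")

# Rough best case for 100 million items at the reported rate of 2860 items/s.
print(f"100M items: about {estimate_hours(100_000_000, 2860):.1f} hours")
```

Treat the result as a lower bound on duration: the test used a warm cache and an over-provisioned network, which most production environments don't have.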
Review recommended agent resources for your migration scope in the agent deployment article.
Why migration performance varies
Fundamentally, network quality and the ability to process files, folders and their metadata impact your migration velocity.
Across the two core areas of network and compute, several aspects have an impact:
- Migration scenario
Copying into an empty target is faster than copying into a target with content. This is because the migration engine must evaluate not only the source but also the target to make copy decisions.
- Namespace item count
Migrating 1 GiB of small files will take more time than migrating 1 GiB of larger files.
- Namespace shape
A wide folder hierarchy lends itself to more parallel processing than a narrow or deep directory structure. The file-to-folder ratio also plays a role.
- Namespace churn
How many files, folders, and metadata have changed between two copy runs from the same source to the same target.
- Bandwidth and latency between source and migration agent
- Bandwidth and latency between migration agent and the target in Azure
- Migration agent resources
The amount of memory (RAM), number of compute cores, and even the amount of available local disk capacity on the migration agent can have a profound impact on the migration velocity. More compute resources help to optimize the utilization of the available bandwidth, especially when large numbers of small files need to be processed in a migration.
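The effect of namespace item count can be illustrated with a deliberately simple cost model: a fixed overhead per file plus the raw transfer time for the bytes. This is a sketch, not a description of how the Storage Mover agent works internally. The per-item overhead and the throughput below are assumed values chosen only for illustration, and the model ignores the parallel processing a real agent performs.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024

# Assumed values for illustration only; they are not measured results.
PER_ITEM_SECONDS = 0.001       # metadata and bookkeeping cost per file
BYTES_PER_SECOND = 100 * MIB   # sustained transfer throughput


def estimated_seconds(file_count: int, total_bytes: int) -> float:
    """Toy model: per-item overhead plus raw transfer time."""
    return file_count * PER_ITEM_SECONDS + total_bytes / BYTES_PER_SECOND


# The same 1 GiB of data, split into 4 KiB files or into 64 MiB files.
small_file_count = GIB // (4 * KIB)
large_file_count = GIB // (64 * MIB)

small_files_seconds = estimated_seconds(small_file_count, GIB)
large_files_seconds = estimated_seconds(large_file_count, GIB)

print(f"{small_file_count} small files: {small_files_seconds:.1f} s")
print(f"{large_file_count} large files: {large_files_seconds:.1f} s")
```

In this model the transfer time is identical for both cases, so the entire difference comes from the per-item term. That is the reason more compute resources on the agent matter most for namespaces with many small files.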
For example, a traditional migration requires a strategy to minimize downtime of the workload that depends on the storage that is to be migrated. Azure Storage Mover supports such a strategy. It's called convergent, n-pass migration.
In this strategy, you copy from source to target several times. During these copy iterations, the source remains available for read and write to the workload. Just before the final copy iteration, you take the source offline. The final copy is expected to finish much faster than, say, the first copy you made, and to take about as long as the one immediately preceding it. After the final copy, the workload is failed over to use the new target storage in Azure and is available for use again.
During the first copy from source to target, the target is likely empty and all the source content must travel to the target. As a result, the first copy is likely most constrained by the available network resources.
Towards the end of a migration, when you've copied the source to the target several times already, only a few files, folders, and metadata have changed since the last copy. In this last copy iteration, comparing each file in source and target to see if it needs to be updated requires more compute resources and fewer network resources. Copy runs in this late stage of a migration are often more compute-constrained. Proper resourcing of the Storage Mover agent becomes more and more important.
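The convergence described above can be sketched with a small simulation. It assumes a hypothetical model in which every pass pays a fixed compare cost for each item in the namespace plus a copy cost for each item that changed, and in which the workload changes items at a constant rate while a pass is running. The function name, parameters, and all numeric values are illustrative assumptions, not Storage Mover behavior or measurements.

```python
def simulate_passes(total_items: int,
                    compare_seconds: float,
                    copy_seconds: float,
                    churn_per_second: float,
                    passes: int) -> list[float]:
    """Return the estimated duration, in seconds, of each copy pass."""
    durations = []
    pending = total_items  # first pass: the target is empty, copy everything
    for _ in range(passes):
        # Every item is compared (compute), only changed items are copied (network).
        duration = total_items * compare_seconds + pending * copy_seconds
        durations.append(duration)
        # Items that changed while this pass ran must be copied in the next pass.
        pending = min(total_items, int(churn_per_second * duration))
    return durations


durations = simulate_passes(
    total_items=1_000_000,
    compare_seconds=0.0002,  # assumed compare cost per item
    copy_seconds=0.002,      # assumed copy cost per changed item
    churn_per_second=5,      # assumed rate of change on the source
    passes=4,
)

for number, seconds in enumerate(durations, start=1):
    print(f"Pass {number}: about {seconds:.0f} s")
```

With these assumed values, the first pass is dominated by copying, while later passes settle close to the fixed compare cost. That mirrors the shift from network-constrained to compute-constrained copy runs, and shows why the final pass takes about as long as the one before it.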
The following articles can help with a successful Azure Storage Mover deployment.