Azure Storage Mover scale and performance targets

Performance is a key aspect of any storage migration service. Azure Storage Mover is a new service, currently in public preview. In this article, we share performance test results; your experience will vary.

Scale targets

Azure Storage Mover is tested with 100 million namespace items (files and folders), migrated from a supported source to a supported target in Azure.

How we test

Azure Storage Mover is a hybrid cloud service. Hybrid services have a cloud service component and an infrastructure component that the administrator of the service runs in their corporate environment. For Storage Mover, that hybrid component is a migration agent. Agents are virtual machines, run on a host near the source storage.

Only the agent is a relevant part of the service for performance testing. To avoid privacy and performance concerns, data travels directly from the Storage Mover agent to the target storage in Azure. Only control and telemetry messages are sent to the cloud service.

[Diagram of a migration's path, showing two arrows: one for data traveling directly from the source/agent to the target storage account, and a second for management/control information only, going to the storage mover resource/service.]

The following table describes the characteristics of the test environment that produced the performance test results shared later in this article.

Test environment       Details
Test namespace         19% files 0 KiB - 1 KiB
                       57% files 1 KiB - 16 KiB
                       16% files 16 KiB - 1 MiB
                       6% folders
Test source device     Linux server VM, 16 virtual CPU cores, 64 GiB RAM
Test source share      NFS v3.0 share
                       Warm cache: data set in memory (baseline test). In real-world scenarios, add disk recall times.
Network                Dedicated, over-provisioned configuration with negligible latency. No bottleneck between source, agent, and target Azure storage.

Performance baselines

These test results were produced under ideal conditions. They're meant as a baseline for the components the Storage Mover service and agent can directly influence. Differences in source devices, disks, and network connections aren't considered in this test. Real-world performance will vary.

Different agent resource configurations are tested:

4 virtual CPU cores at 2.7 GHz each and 8 GiB of memory (RAM) is the minimum specification for an Azure Storage Mover agent.

Test                     Single file, 1 TiB    ~3.3M files, ~200K folders, ~45 GiB    ~50M files, ~3M folders, ~1 TiB
Elapsed time             16 min 42 sec         15 min 18 sec                          5 hr 28 min
Items* per second        -                     3,548                                  2,860
Memory (RAM) usage       400 MiB               1.4 GiB                                1.4 GiB
Disk usage (for logs)    28 KiB                1.4 GiB                                result missing

*A namespace item is either a file or a folder.
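As a rough sanity check on the results above, items per second is simply the total namespace item count divided by the elapsed time. A minimal sketch of that arithmetic (the function name is illustrative, not part of any Storage Mover API; the published item counts are approximate, so computed rates only roughly match the table):

```python
def items_per_second(files: int, folders: int, elapsed_seconds: float) -> float:
    """Throughput in namespace items (files + folders) per second."""
    return (files + folders) / elapsed_seconds

# Roughly 3.5M items processed in 15 min 18 sec (918 seconds):
rate = items_per_second(3_300_000, 200_000, 15 * 60 + 18)
print(f"{rate:.0f} items/s")  # in the same ballpark as the table's 3,548 items/s
```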

Review recommended agent resources for your migration scope in the agent deployment article.

Why migration performance varies

Fundamentally, network quality and the ability to process files, folders, and their metadata determine your migration velocity.

Across the two core areas of network and compute, several aspects have an impact:

  • Migration scenario
    Copying into an empty target is faster than copying into a target that already contains content. With a populated target, the migration engine must evaluate not only the source but also the target to make copy decisions.
  • Namespace item count
    Migrating 1 GiB of small files will take more time than migrating 1 GiB of larger files.
  • Namespace shape
    A wide folder hierarchy lends itself to more parallel processing than a narrow or deep directory structure. The file-to-folder ratio also plays a role.
  • Namespace churn
    How many files, folders, and metadata have changed between two copy runs from the same source to the same target.
  • Network
    • bandwidth and latency between source and migration agent
    • bandwidth and latency between migration agent and the target in Azure
  • Migration agent resources
    The amount of memory (RAM), number of compute cores, and even the amount of available, local disk capacity on the migration agent can have a profound impact on the migration velocity. More compute resources help to optimize the utilization of the available bandwidth, especially when large amounts of smaller files need to be processed in a migration.
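To illustrate the first point above, why a populated target costs extra work, consider a simplified per-item copy decision: against an empty target, every lookup short-circuits to "copy", while a populated target forces a size and timestamp comparison for each item. This is an illustrative sketch, not Storage Mover's actual engine:

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A namespace item's minimal metadata for a copy decision."""
    path: str
    size: int
    mtime: float  # last-modified timestamp


def needs_copy(src: Item, target_index: dict[str, Item]) -> bool:
    """Decide whether a source item must be copied to the target."""
    dst = target_index.get(src.path)
    if dst is None:
        return True  # item missing from target: copy it
    # Item exists in the target: copy only if it appears changed.
    return src.size != dst.size or src.mtime != dst.mtime
```

With an empty `target_index`, every item is copied without further comparison; each item already present in the target adds a metadata comparison, which is compute the engine must spend before any data moves.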

For example, a traditional migration requires a strategy to minimize downtime of the workload that depends on the storage that is to be migrated. Azure Storage Mover supports such a strategy. It's called convergent, n-pass migration.

In this strategy, you copy from source to target several times. During these copy iterations, the source remains available to the workload for reads and writes. Just before the final copy iteration, you take the source offline. The final copy is expected to finish far faster than the first one and to take about as long as the iteration immediately preceding it. After the final copy, the workload is failed over to the new target storage in Azure and becomes available for use again.

During the first copy from source to target, the target is likely empty and all the source content must travel to the target. As a result, the first copy is likely most constrained by the available network resources.

Towards the end of a migration, when you've copied the source to the target several times already, only a few files, folders, and pieces of metadata have changed since the last copy. In this late iteration, comparing each file in source and target to see whether it needs updating requires relatively more compute resources and less network bandwidth. Copy runs in this late stage of a migration are often compute-constrained, so proper resourcing of the Storage Mover agent becomes more and more important.
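The convergent, n-pass flow described in the preceding paragraphs can be sketched as a loop: keep copying while the workload stays online, and once the per-pass change set is small, take the source offline for one final, fast pass. The helper names (`scan_changes`, `copy_items`, `take_source_offline`) are hypothetical placeholders for whatever tooling orchestrates your migration:

```python
def n_pass_migration(scan_changes, copy_items, take_source_offline,
                     small_enough: int = 1000, max_passes: int = 10) -> int:
    """Convergent n-pass copy: iterate while the workload stays online,
    then perform one final pass with the source offline."""
    passes = 0
    while passes < max_passes:
        changed = scan_changes()  # items differing between source and target
        copy_items(changed)
        passes += 1
        if len(changed) <= small_enough:  # churn has converged
            break
    take_source_offline()        # start of the brief downtime window
    copy_items(scan_changes())   # final, fast pass over remaining churn
    return passes
```

Because the workload keeps writing between passes, each pass's change set is normally much smaller than the last; the downtime window covers only the final pass.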

Next steps

The following articles can help with a successful Azure Storage Mover deployment.