HBv2-series virtual machine sizes

Applies to: ✔️ Linux VMs ✔️ Windows VMs ✔️ Flexible scale sets ✔️ Uniform scale sets

Several performance tests have been run on HBv2-series size VMs. The following are some of the results of this performance testing.

Workload HBv2
STREAM Triad 350 GB/s (21-23 GB/s per CCX)
High-Performance Linpack (HPL) 4 TeraFLOPS (Rpeak, FP64), 8 TeraFLOPS (Rmax, FP32)
RDMA latency & bandwidth 1.2 microseconds, 190 Gb/s
FIO on local NVMe SSD 2.7 GB/s reads, 1.1 GB/s writes; 102k IOPS reads, 115 IOPS writes
IOR on 8 * Azure Premium SSD (P40 Managed Disks, RAID0)** 1.3 GB/s reads, 2.5 GB/writes; 101k IOPS reads, 105k IOPS writes

MPI latency

MPI latency test from the OSU microbenchmark suite is run. Sample scripts are on GitHub.

./bin/mpirun_rsh -np 2 -hostfile ~/hostfile MV2_CPU_MAPPING=[INSERT CORE #] ./osu_latency

Screenshot of MPI latency.

MPI bandwidth

MPI bandwidth test from the OSU microbenchmark suite is run. Sample scripts are on GitHub.

./mvapich2-2.3.install/bin/mpirun_rsh -np 2 -hostfile ~/hostfile MV2_CPU_MAPPING=[INSERT CORE #] ./mvapich2-2.3/osu_benchmarks/mpi/pt2pt/osu_bw

Screenshot of MPI bandwidth.

Mellanox Perftest

The Mellanox Perftest package has many InfiniBand tests such as latency (ib_send_lat) and bandwidth (ib_send_bw). An example command is below.

numactl --physcpubind=[INSERT CORE #]  ib_send_lat -a

Next steps