ND-H200-v5 size series

2024-10-18

The ND H200 v5 series virtual machine (VM) is designed to deliver exceptional performance for AI and high-performance computing (HPC) workloads. These VMs use NVIDIA H200 Tensor Core GPUs, which offer 76% more High Bandwidth Memory than H100 GPUs, to deliver higher performance on state-of-the-art generative AI models. With 141 GB of high-speed memory and 4.8 TB/s of memory bandwidth, the H200 GPU can handle larger datasets and more complex models, making it well suited to generative AI and scientific computing.

The ND H200 v5 series starts with a single VM and eight NVIDIA H200 Tensor Core GPUs, interconnected with 900 GB/s NVLink. ND H200 v5-based deployments can scale up to thousands of GPUs with 3.2 Tb/s of interconnect bandwidth per VM. Each GPU within the VM has its own dedicated, topology-agnostic 400 Gb/s NVIDIA Quantum-2 CX7 InfiniBand connection. These connections are automatically configured between VMs in the same virtual machine scale set, and they support GPUDirect RDMA.
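The headline figures above are related by simple arithmetic. The following is a minimal sketch of that arithmetic, not an official calculation. The GPU count, per-GPU InfiniBand rate, and H200 memory size come from this article; the 80 GB H100 baseline is an assumption that this article does not state, and the function names are illustrative.

```python
# Figures quoted in this article.
GPUS_PER_VM = 8
INFINIBAND_GBPS_PER_GPU = 400  # Gb/s, one dedicated link per GPU
H200_MEMORY_GB = 141

# Assumption: 80 GB of HBM on the H100 baseline (not stated in this article).
H100_MEMORY_GB = 80


def interconnect_tbps_per_vm(gpus: int = GPUS_PER_VM,
                             gbps_per_gpu: int = INFINIBAND_GBPS_PER_GPU) -> float:
    """Aggregate InfiniBand bandwidth per VM, in Tb/s."""
    return gpus * gbps_per_gpu / 1000


def memory_increase_percent(new_gb: float = H200_MEMORY_GB,
                            old_gb: float = H100_MEMORY_GB) -> float:
    """Relative increase in GPU memory capacity, in percent."""
    return (new_gb - old_gb) / old_gb * 100


print(interconnect_tbps_per_vm())        # 3.2
print(round(memory_increase_percent()))  # 76
```

Under these assumptions, eight 400 Gb/s links account for the 3.2 Tb/s per-VM figure, and 141 GB against an 80 GB baseline accounts for the 76% figure.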

These instances provide excellent performance for many AI, ML, and analytics tools that support GPU acceleration out of the box, such as TensorFlow, PyTorch, Caffe, RAPIDS, and other frameworks. Additionally, the scale-out InfiniBand interconnect is supported by a large set of existing AI and HPC tools that are built on NVIDIA's NCCL communication libraries for seamless clustering of GPUs.

Host specifications

Part	Quantity (Count, Units)	Specs (SKU ID, Performance Units, etc.)
Processor	96 vCPUs	Intel Xeon (Sapphire Rapids) [x86-64]
Memory	1850 GiB
Local Storage	1 Disk	28000 GiB
Remote Storage	16 Disks
Network	8 NICs
Accelerators	8 GPUs	NVIDIA H200 GPU (141 GB)

Feature support

Premium Storage: Supported
Premium Storage caching: Supported
Live Migration: Not Supported
Memory Preserving Updates: Not Supported
Generation 2 VMs: Supported
Generation 1 VMs: Not Supported
Accelerated Networking: Supported
Ephemeral OS Disk: Supported
Nested Virtualization: Not Supported

Sizes in series

vCPUs (Qty.) and Memory for each size

Size Name	vCPUs (Qty.)	Memory (GB)
Standard_ND96isr_H200_v5	96	1850

VM Basics resources

Check vCPU quotas

Local (temp) storage info for each size

Size Name	Max Temp Storage Disks (Qty.)	Temp Disk Size (GiB)
Standard_ND96isr_H200_v5	1	28000

Storage resources

Table definitions

¹Temp disk speed often differs between RR (Random Read) and RW (Random Write) operations. RR operations are typically faster than RW operations. The RW speed is usually slower than the RR speed on series where only the RR speed value is listed.
Storage capacity is shown in units of GiB or 1024^3 bytes. When you compare disks measured in GB (1000^3 bytes) to disks measured in GiB (1024^3), remember that capacity numbers given in GiB may appear smaller. For example, 1023 GiB = 1098.4 GB.
Disk throughput is measured in input/output operations per second (IOPS) and MBps where MBps = 10^6 bytes/sec.
To learn how to get the best storage performance for your VMs, see Virtual machine and disk performance.
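The GiB and GB definitions above can be checked with a short conversion. This is a minimal sketch; the helper names `gib_to_gb` and `gb_to_gib` are illustrative and not part of any Azure tool.

```python
GIB = 1024 ** 3  # bytes in one GiB
GB = 1000 ** 3   # bytes in one GB


def gib_to_gb(gib: float) -> float:
    """Convert a capacity in GiB to GB."""
    return gib * GIB / GB


def gb_to_gib(gb: float) -> float:
    """Convert a capacity in GB to GiB."""
    return gb * GB / GIB


# The example from the table definitions.
print(f"{gib_to_gb(1023):.1f} GB")   # 1098.4 GB

# The 28000 GiB temp disk of Standard_ND96isr_H200_v5, expressed in GB.
print(f"{gib_to_gb(28000):.1f} GB")  # 30064.8 GB
```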

Remote (uncached) storage info for each size

Size Name	Max Remote Storage Disks (Qty.)	Uncached Disk IOPS	Uncached Disk Speed (MBps)
Standard_ND96isr_H200_v5	16	40800	612

Storage resources

Table definitions

¹Some sizes support bursting to temporarily increase disk performance. Burst speeds can be maintained for up to 30 minutes at a time.
²Special Storage refers to either Ultra Disk or Premium SSD v2 storage.
Storage capacity is shown in units of GiB or 1024^3 bytes. When you compare disks measured in GB (1000^3 bytes) to disks measured in GiB (1024^3), remember that capacity numbers given in GiB may appear smaller. For example, 1023 GiB = 1098.4 GB.
Disk throughput is measured in input/output operations per second (IOPS) and MBps where MBps = 10^6 bytes/sec.
Data disks can operate in cached or uncached modes. For cached data disk operation, the host cache mode is set to ReadOnly or ReadWrite. For uncached data disk operation, the host cache mode is set to None.
To learn how to get the best storage performance for your VMs, see Virtual machine and disk performance.
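The uncached IOPS and throughput limits apply together, so which one binds depends on the I/O size. The sketch below is a simplified model that assumes a fixed I/O size and ignores bursting, caching, and per-disk limits. The two limits are taken from the table above; the function names are illustrative.

```python
# VM-level uncached limits for Standard_ND96isr_H200_v5, from the table above.
UNCACHED_IOPS_LIMIT = 40_800
UNCACHED_MBPS_LIMIT = 612  # MBps, where 1 MBps = 10^6 bytes/sec


def achievable_iops(io_size_bytes: int) -> float:
    """Upper bound on IOPS at a fixed I/O size under both limits."""
    throughput_bound = UNCACHED_MBPS_LIMIT * 10**6 / io_size_bytes
    return min(UNCACHED_IOPS_LIMIT, throughput_bound)


def crossover_io_size_bytes() -> float:
    """I/O size at which the IOPS and throughput limits are reached together."""
    return UNCACHED_MBPS_LIMIT * 10**6 / UNCACHED_IOPS_LIMIT


print(crossover_io_size_bytes())  # 15000.0
print(achievable_iops(4096))      # 40800 (IOPS limit binds)
print(achievable_iops(65536))     # about 9338 (throughput limit binds)
```

In this model, I/O smaller than about 15,000 bytes is limited by IOPS, and larger I/O is limited by throughput.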

Network interface information for each size

Size Name	Max NICs (Qty.)	Max Bandwidth (Mbps)
Standard_ND96isr_H200_v5	8	80000

Networking resources

Table definitions

Expected network bandwidth is the maximum aggregated bandwidth allocated per VM type across all NICs, for all destinations. For more information, see Virtual machine network bandwidth.
Upper limits aren't guaranteed. Limits offer guidance for selecting the right VM type for the intended application. Actual network performance depends on several factors including network congestion, application loads, and network settings. For information on optimizing network throughput, see Optimize network throughput for Azure virtual machines.
To achieve the expected network performance on Linux or Windows, you may need to select a specific version or optimize your VM. For more information, see Bandwidth/Throughput testing (NTTTCP).
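As a rough illustration of what the maximum bandwidth figure implies, the sketch below computes a lower bound on transfer time. It assumes that 1 Mbps means 10^6 bits per second and that the full 80000 Mbps is sustained with no protocol overhead, which real transfers don't achieve. The function name is illustrative.

```python
# Max bandwidth for Standard_ND96isr_H200_v5, from the table above.
MAX_BANDWIDTH_MBPS = 80_000


def min_transfer_seconds(size_gb: float,
                         bandwidth_mbps: float = MAX_BANDWIDTH_MBPS) -> float:
    """Best-case time to move size_gb (10^9-byte units) at the given rate."""
    bits = size_gb * 10**9 * 8
    return bits / (bandwidth_mbps * 10**6)


# A 1000 GB dataset needs at least 100 seconds at the maximum rate.
print(min_transfer_seconds(1000))  # 100.0
```

Because upper limits aren't guaranteed, treat the result as a floor rather than an estimate.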

Accelerator information for each size

Size Name	Accelerators (Qty.)	Accelerator-Memory (GB)
Standard_ND96isr_H200_v5	8	1128
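The accelerator memory total follows from the per-GPU figure: 8 GPUs at 141 GB each. The sketch below reproduces that total and adds a rough weights-only check for whether a model fits in one VM. The check is an assumption-laden estimate: it ignores activations, optimizer state, and framework overhead, and the parameter counts and bytes-per-parameter values are hypothetical inputs, not figures from this article.

```python
# Figures from this article.
GPUS_PER_VM = 8
GPU_MEMORY_GB = 141


def total_accelerator_memory_gb(gpus: int = GPUS_PER_VM,
                                per_gpu_gb: int = GPU_MEMORY_GB) -> int:
    """Total GPU memory per VM, in GB."""
    return gpus * per_gpu_gb


def weights_footprint_gb(parameters_billions: float,
                         bytes_per_parameter: int) -> float:
    """Weights-only footprint: 1e9 parameters * bytes each / 1e9 bytes per GB."""
    return parameters_billions * bytes_per_parameter


def weights_fit(parameters_billions: float, bytes_per_parameter: int) -> bool:
    """True if the weights alone fit in the VM's total GPU memory."""
    footprint = weights_footprint_gb(parameters_billions, bytes_per_parameter)
    return footprint <= total_accelerator_memory_gb()


print(total_accelerator_memory_gb())  # 1128
print(weights_fit(400, 2))            # True: 800 GB of 16-bit weights
print(weights_fit(400, 4))            # False: 1600 GB of 32-bit weights
```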

Other size information

List of all available sizes: Sizes

Pricing Calculator: Pricing Calculator

Information on Disk Types: Disk Types

Next steps

Take advantage of the latest performance and features available for your workloads by changing the size of a virtual machine.

Use Microsoft's in-house-designed Arm processors with Azure Cobalt VMs.

Learn how to Monitor Azure virtual machines.
