NDm_A100_v4 sizes series

The NDm A100 v4 series virtual machine (VM) is a new flagship addition to the Azure GPU family. These sizes are designed for high-end deep learning training and tightly coupled scale-up and scale-out HPC workloads.

The NDm A100 v4 series starts with a single VM and eight NVIDIA Ampere A100 80GB Tensor Core GPUs. NDm A100 v4-based deployments can scale up to thousands of GPUs with 1.6 Tb/s of interconnect bandwidth per VM. Each GPU within the VM is provided with its own dedicated, topology-agnostic 200 Gb/s NVIDIA Mellanox HDR InfiniBand connection. These connections are automatically configured between VMs occupying the same Azure Virtual Machine Scale Set, and support GPUDirect RDMA.
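The per-VM interconnect figure follows from the per-GPU links. As a minimal sketch, assuming each HDR InfiniBand link runs at 200 gigabits per second (the standard HDR rate), the arithmetic below derives the aggregate bandwidth and the number of VMs needed for a given GPU count. The function names are hypothetical and used only for illustration.

```python
# Illustrative arithmetic only; constants restate figures from the text above.
GPUS_PER_VM = 8          # one dedicated InfiniBand link per GPU
LINK_GBIT_PER_S = 200    # assumed HDR InfiniBand rate per link, in gigabits/s


def aggregate_bandwidth_gbit(gpus: int = GPUS_PER_VM,
                             link_gbit: int = LINK_GBIT_PER_S) -> int:
    """Return the total InfiniBand bandwidth per VM in gigabits per second."""
    return gpus * link_gbit


def vms_needed(total_gpus: int, gpus_per_vm: int = GPUS_PER_VM) -> int:
    """Return the number of VMs required to provide at least total_gpus GPUs."""
    return -(-total_gpus // gpus_per_vm)  # ceiling division


total_gbit = aggregate_bandwidth_gbit()
print(f"{total_gbit} Gb/s = {total_gbit / 1000} Tb/s = {total_gbit / 8} GB/s per VM")
print(f"VMs for 1024 GPUs: {vms_needed(1024)}")
```

Eight links at 200 Gb/s give 1,600 Gb/s, which is 1.6 terabits per second, or 200 gigabytes per second, per VM.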

Each GPU features NVLink 3.0 connectivity for communication within the VM, and the GPUs are backed by 96 physical 2nd-generation AMD EPYC™ 7V12 (Rome) CPU cores.

These instances provide excellent performance for many AI, ML, and analytics tools that support GPU acceleration out of the box, such as TensorFlow, PyTorch, Caffe, RAPIDS, and other frameworks. Additionally, the scale-out InfiniBand interconnect supports a large set of existing AI and HPC tools that are built on NVIDIA's NCCL2 communication libraries for seamless clustering of GPUs.

Host specifications

| Part | Quantity (Count, Units) | Specs (SKU ID, Performance Units, etc.) |
|---|---|---|
| Processor | 96 vCPUs | AMD EPYC 7V12 (Rome) [x86-64] |
| Memory | 1900 GiB | |
| Local Storage | 1 Disk | 6400 GiB |
| Remote Storage | 32 Disks | 80000 IOPS, 800 MBps |
| Network | 8 NICs | 24000 Mbps |
| Accelerators | 8 GPUs | NVIDIA PCIe A100 GPU (80GB) |
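When planning one worker process per GPU, it can help to know how much of the host each GPU can draw on. The sketch below divides the figures from the host specifications above evenly across the eight GPUs. The even split and the helper name are assumptions for illustration, not something the platform enforces.

```python
# Figures restated from the host specifications above.
VCPUS = 96
MEMORY_GIB = 1900
GPUS = 8
GPU_MEMORY_GB = 80


def per_gpu_share(total: float, gpus: int = GPUS) -> float:
    """Return an even per-GPU share of a host resource (hypothetical helper)."""
    return total / gpus


vcpus_per_gpu = per_gpu_share(VCPUS)            # CPU cores available per GPU worker
host_mem_per_gpu = per_gpu_share(MEMORY_GIB)    # host memory (GiB) per GPU worker
total_gpu_memory_gb = GPUS * GPU_MEMORY_GB      # combined GPU memory across the VM

print(f"vCPUs per GPU: {vcpus_per_gpu}")
print(f"Host memory per GPU: {host_mem_per_gpu} GiB")
print(f"Total GPU memory: {total_gpu_memory_gb} GB")
```

Under this even split, each GPU worker has 12 vCPUs and 237.5 GiB of host memory, and the VM offers 640 GB of GPU memory in total.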

Feature support

Premium Storage: Supported
Premium Storage caching: Supported
Live Migration: Not Supported
Memory Preserving Updates: Not Supported
Generation 2 VMs: Supported
Generation 1 VMs: Not Supported
Accelerated Networking: Supported
Ephemeral OS Disk: Supported
Nested Virtualization: Not Supported

Sizes in series

vCPUs (Qty.) and Memory for each size

| Size Name | vCPUs (Qty.) | Memory (GB) |
|---|---|---|
| Standard_ND96amsr_A100_v4 | 96 | 1900 |

VM Basics resources

Other size information

List of all available sizes: Sizes

Pricing Calculator: Pricing Calculator

Information on Disk Types: Disk Types

Next steps

Learn more about how Azure compute units (ACU) can help you compare compute performance across Azure SKUs.

Check out Azure Dedicated Hosts for physical servers able to host one or more virtual machines assigned to one Azure subscription.

Learn how to Monitor Azure virtual machines.