Caution
This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and plan accordingly. For more information, see the CentOS End Of Life guidance.
This article briefly describes the steps for installing and running LAMMPS on a virtual machine (VM) that's deployed on Azure. It also presents the performance results of running LAMMPS on single-node and multi-node VM configurations.
LAMMPS is a classical molecular dynamics simulator that's used for materials modeling. It can model solid-state materials, soft matter, and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.
LAMMPS is designed to run well on parallel machines, but it also runs on single-processor desktop machines. It's composed of modular code, and most of its functionality is in optional packages.
Typical LAMMPS simulations include all-atom models of liquids, solids, and explicit solvents.
Why deploy LAMMPS on Azure?
- Modern and diverse compute options to meet your workload's needs.
- The flexibility of virtualization without the need to buy and maintain physical hardware.
- Rapid provisioning.
- Support for Message Passing Interface (MPI).
- Ability to run large and time-consuming jobs.
Architecture
This diagram shows a multi-node configuration:
Download a Visio file of this architecture.
This diagram shows a single-node configuration:
Download a Visio file of this architecture.
Components
- Azure Virtual Machines is used to create Linux VMs.
- For information about deploying VMs, see Linux VMs on Azure.
- Azure Virtual Network is used to create a private network infrastructure in the cloud.
- Network security groups restrict access to VMs.
- A public IP address connects the internet to VMs.
- Azure CycleCloud is used to create the cluster in the multi-node configuration.
- A physical SSD provides storage.
Compute sizing
Performance tests of LAMMPS on Azure used HBv3 AMD EPYC 7V73X (Milan-X) VMs running Linux CentOS. The following table provides details about HBv3-series VMs.
VM size | vCPU | Memory (GiB) | Memory bandwidth (GBps) | Base CPU frequency (GHz) | All-cores frequency (GHz, peak) | Single-core frequency (GHz, peak) | RDMA performance (Gbps) | Maximum data disks |
---|---|---|---|---|---|---|---|---|
Standard_HB120rs_v3 | 120 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
Standard_HB120-96rs_v3 | 96 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
Standard_HB120-64rs_v3 | 64 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
Standard_HB120-32rs_v3 | 32 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
Standard_HB120-16rs_v3 | 16 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
Install LAMMPS on a VM or HPC cluster
You can download the software from the LAMMPS website. You just need to untar or unzip the LAMMPS binary distribution file, and you can run LAMMPS directly in the resulting directory. For a guide to building from source code, see Build LAMMPS.
Before you install LAMMPS, you need to deploy and connect to a VM or HPC cluster.
For information about deploying a VM, see Run a Linux VM on Azure.
For information about deploying the Azure CycleCloud and HPC cluster, see these resources:
Install LAMMPS
Complete the following steps to install LAMMPS on single-node and cluster VMs.
Run the following commands:
export PATH=$PATH:/opt/openmpi-4.1.0/bin/
export LD_LIBRARY_PATH=/opt/openmpi-4.1.0/lib
export CC=gcc
export CXX=g++
export FC=gfortran
export FCFLAGS=-m64
export F77=gfortran
export F90=ifort
export CPPFLAGS=-DpggFortran
Download the source code from LAMMPS.
Extract the downloaded archive:
tar xvf *
Locate the LAMMPS folder:
cd lammps-<version>
cd src
To build LAMMPS, run these commands in the src folder:
make yes-rigid
make serial
make mpi
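After the build finishes, it can help to confirm that the executables exist before you try to run anything. The following is a minimal sketch, not part of the official LAMMPS tooling. It assumes that `make serial` and `make mpi` leave `lmp_serial` and `lmp_mpi` in the src folder. The `check_build` function name is hypothetical.

```shell
#!/bin/bash
# Report whether the expected LAMMPS executables exist in a src directory.
# Assumption: "make serial" and "make mpi" leave lmp_serial and lmp_mpi there.
check_build() {
    local src_dir=$1 status=0 exe
    for exe in lmp_serial lmp_mpi; do
        if [ -x "$src_dir/$exe" ]; then
            echo "found: $exe"
        else
            echo "missing: $exe"
            status=1
        fi
    done
    return "$status"
}

# Check the current directory by default; pass the src path to check elsewhere.
check_build "${1:-.}" || echo "Build incomplete; rerun the make targets."
```

Run the script from the src folder, or pass the path to the src folder as the first argument.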
Run LAMMPS
To run LAMMPS on a standalone VM, use these commands:
export PATH=$PATH:/opt/openmpi-4.1.0/bin/
export LD_LIBRARY_PATH=/opt/openmpi-4.1.0/lib
export LMP_MPI=/path/LAMMPS/lammps-<version>/src/lmp_mpi
mpirun -np 16 /path/LAMMPS/lammps-<version>/src/lmp_mpi -in in.lj
To run LAMMPS on a multi-node cluster, use this script:
1 #!/bin/bash
2 #SBATCH --job-name=LAMMPS
3 #SBATCH --partition=hpc
4 #SBATCH --nodes=2
5 #SBATCH --ntasks-per-node=64
6 #SBATCH --ntasks=128
7 export PATH=$PATH:/opt/openmpi-4.1.0/bin/
8 export LD_LIBRARY_PATH=/opt/openmpi-4.1.0/lib
9 export LMP_MPI=/path/LAMMPS/lammps-<version>/src/lmp_mpi
10 mpirun -np 128 /path/LAMMPS/lammps-<version>/src/lmp_mpi -in benchmark.in
Note
In the preceding script, ntasks on line 6 is the number of nodes multiplied by the number of cores per VM configuration. The number of nodes is 2, as specified on line 4. The number of cores per VM configuration is 64, as specified on line 5. So ntasks is 128.
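The arithmetic in the note can be sketched as a small helper that you adjust when you change the node count. The `total_tasks` function name is hypothetical and is used here only for illustration. Inside a Slurm job, the `SLURM_NTASKS` environment variable typically carries the same value when `--ntasks` is set.

```shell
#!/bin/bash
# Hypothetical helper: total MPI tasks = number of nodes x tasks per node.
total_tasks() {
    local nodes=$1 tasks_per_node=$2
    echo $(( nodes * tasks_per_node ))
}

# Values from the batch script: 2 nodes with 64 tasks per node.
echo "ntasks for 2 nodes: $(total_tasks 2 64)"

# The same calculation for the largest Lennard-Jones run in this article (8 nodes).
echo "ntasks for 8 nodes: $(total_tasks 8 64)"
```

The value that the helper prints is the one to use for both `--ntasks` and the `-np` argument of `mpirun`, so that every allocated core runs one MPI rank.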
LAMMPS performance results on Azure VMs
Two models were used to test the performance of LAMMPS version 23 and LAMMPS version 17 on Azure.
Lennard-Jones model
Lennard-Jones (in.lj) is a simple molecular dynamics simulation of a binary fluid in the NVT ensemble. The fluid consists of neutral particles, and a Langevin thermostat controls the temperature.
The following table provides details about the Lennard-Jones model:
Number of atoms | Timestep | Thermo step | Run steps |
---|---|---|---|
1.0e+9 | 0.1 | 10 | 200 |
HECBioSim model
HECBioSim is a benchmark suite that consists of a set of simple benchmarks for a number of popular molecular dynamics engines, each of which is set at a different atom count.
The following table provides details about the HECBioSim model:
Number of atoms | Timestep | Thermo step | Run steps |
---|---|---|---|
1,403,180 | 2.0 | 5,000 | 10,000 |
LAMMPS performance results on single-node VMs
The following sections provide the performance results of running LAMMPS version 23 on single-node Azure HBv3 AMD EPYC 7V73X (Milan-X) VMs. The Lennard-Jones model is used in these tests.
This table shows the total wall clock times recorded for various numbers of CPUs on the Standard HBv3-series VM:
Number of cores | Wall clock time (seconds) | Relative speed increase |
---|---|---|
16 | 7,634 | 1 |
32 | 4,412 | 1.73 |
64 | 2,102 | 3.63 |
96 | 1,648 | 4.63 |
120 | 1,445 | 5.28 |
The following graph shows the relative speed increases as the number of CPUs increases:
Notes about the single-node tests
For the single-node tests, the Standard_HB120-16rs_v3 VM (16 cores) is used as a baseline to calculate relative speed increases as the number of cores increases. The results show that parallel performance improves as the number of cores increases from 16 to 120. A speed increase of 5.3x is achieved with 120 cores.
LAMMPS performance results on multi-node clusters
The single-node tests show that parallel efficiency is highest with 64 cores on HBv3 VMs: the 64-core run achieves a 3.63x speed increase on four times the baseline core count, whereas the 120-core run achieves 5.28x on 7.5 times the baseline core count. Based on those results, 64-core configurations on Standard_HB120-64rs_v3 VMs are used to evaluate the performance of LAMMPS on multi-node clusters. The Lennard-Jones and HECBioSim models are used for the multi-node tests.
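One way to compare the single-node configurations is parallel efficiency: the relative speed increase divided by the growth in core count over the 16-core baseline. The following sketch recomputes both values from the single-node wall clock times. Treating efficiency as the selection criterion is an assumption, because the article doesn't name the metric, and the `speedup` and `efficiency` function names are hypothetical.

```shell
#!/bin/bash
# Relative speed increase: baseline wall clock time divided by measured time.
speedup() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", base / t }'
}

# Parallel efficiency (an assumed metric, not named in the article):
# speed increase divided by the growth in core count relative to the baseline.
efficiency() {
    awk -v base="$1" -v t="$2" -v base_cores="$3" -v cores="$4" \
        'BEGIN { printf "%.2f\n", (base / t) / (cores / base_cores) }'
}

# Single-node wall clock times from the table: cores, seconds.
while read -r cores seconds; do
    echo "$cores cores: speedup $(speedup 7634 "$seconds")," \
         "efficiency $(efficiency 7634 "$seconds" 16 "$cores")"
done <<'EOF'
16 7634
32 4412
64 2102
96 1648
120 1445
EOF
```

By this measure, the 64-core run has the highest efficiency of the configurations above the baseline, and efficiency drops as the core count rises to 96 and 120.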
Lennard-Jones model
This table shows the total wall clock times recorded for various numbers of nodes:
Number of nodes | Number of cores | Wall clock time (seconds) | Relative speed increase |
---|---|---|---|
1 | 64 | 2,612 | N/A |
2 | 128 | 1,573 | 1.66 |
4 | 256 | 1,035 | 2.52 |
8 | 512 | 793 | 3.29 |
The following graph shows the relative speed increases as the number of nodes increases:
HECBioSim model
This table shows the total wall clock times recorded for various numbers of nodes:
Number of nodes | Number of cores | Wall clock time (seconds) | Relative speed increase |
---|---|---|---|
1 | 64 | 3,103 | N/A |
2 | 128 | 1,601 | 1.94 |
4 | 256 | 840 | 3.69 |
8 | 512 | 442 | 7.02 |
16 | 1,024 | 241 | 12.88 |
The following graph shows the relative speed increases as the number of nodes increases:
Notes about the multi-node tests
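- The multi-node results show that wall clock time decreases for both models as you increase the number of nodes. The HECBioSim model scales nearly linearly up to 8 nodes (7.02x) and reaches 12.88x on 16 nodes. The Lennard-Jones model scales less efficiently, reaching 3.29x on 8 nodes.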
- The Lennard-Jones model was tested with LAMMPS version 23. The HECBioSim model was tested with LAMMPS version 17.
Azure cost
The following tables provide wall clock times that you can use to calculate Azure costs. To compute the cost, multiply the wall clock time by the number of nodes and the Azure VM hourly rate. For the hourly rates for Linux, see Linux Virtual Machines Pricing. Azure VM hourly rates are subject to change.
Only simulation running time is considered for the cost calculations. Installation time, simulation setup time, and software costs aren't included.
You can use the Azure pricing calculator to estimate VM costs for your configurations.
Running times for the Lennard-Jones model
Number of nodes | Wall clock time (hours) |
---|---|
1 | 0.73 |
2 | 0.44 |
4 | 0.29 |
8 | 0.22 |
Running times for the HECBioSim model
Number of nodes | Wall clock time (hours) |
---|---|
1 | 0.86 |
2 | 0.44 |
4 | 0.23 |
8 | 0.12 |
16 | 0.07 |
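The cost formula described in this section can be sketched as follows. This is an illustration only: `HOURLY_RATE` is a placeholder and not a real Azure price, and the `job_cost` function name is hypothetical. Substitute the current hourly rate for your VM size and region.

```shell
#!/bin/bash
# Cost = wall clock time (hours) x number of nodes x VM hourly rate.
job_cost() {
    awk -v hours="$1" -v nodes="$2" -v rate="$3" \
        'BEGIN { printf "%.2f\n", hours * nodes * rate }'
}

# PLACEHOLDER rate, not a real Azure price. With 1.00, the result equals node-hours.
HOURLY_RATE=1.00

# HECBioSim running times from the table: nodes, wall clock hours.
while read -r nodes hours; do
    echo "$nodes node(s): $(job_cost "$hours" "$nodes" "$HOURLY_RATE")"
done <<'EOF'
1 0.86
2 0.44
4 0.23
8 0.12
16 0.07
EOF
```

With the placeholder rate of 1.00, the output is the number of node-hours consumed. For the HECBioSim model, that number grows from 0.86 on one node to 1.12 on 16 nodes, so the shorter wall clock times come at a somewhat higher total cost.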
Summary
- LAMMPS was successfully tested on HBv3 standalone VMs and Azure CycleCloud multi-node configurations with as many as 16 nodes.
- In multi-node configurations, tests indicate speed increases, relative to a single node, of about 3.29x on 8 nodes for the Lennard-Jones model and about 12.88x on 16 nodes for the HECBioSim model.
- For small simulations, we recommend that you use fewer CPUs to improve performance.
Contributors
This article is maintained by Microsoft. It was originally written by the following contributors.
Principal authors:
- Hari Bagudu | Senior Manager
- Gauhar Junnarkar | Principal Program Manager
- Amol Rane | HPC Performance Engineer
Other contributors:
- Mick Alberts | Technical Writer
- Guy Bursell | Director, Business Strategy
- Sachin Rastogi | Manager
Next steps
- GPU-optimized virtual machine sizes
- Virtual machines on Azure
- Virtual networks and virtual machines on Azure
- Learning path: Run HPC applications on Azure