Deploy LAMMPS on an Azure virtual machine

Azure Virtual Machines
Azure Virtual Network

Caution

This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and plan accordingly. For more information, see the CentOS End Of Life guidance.

This article briefly describes the steps for installing and running LAMMPS on a virtual machine (VM) that's deployed on Azure. It also presents the performance results of running LAMMPS on single-node and multi-node VM configurations.

LAMMPS is a classical molecular dynamics simulator that's used for materials modeling. It can model solid-state materials, soft matter, and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.

LAMMPS is designed to run well on parallel machines, but it also runs on single-processor desktop machines. It's composed of modular code, and most of its functionality is in optional packages.

Typical LAMMPS simulations include all-atom models of liquids, solids, and explicit solvents.

Why deploy LAMMPS on Azure?

  • Modern and diverse compute options to meet your workload's needs.
  • The flexibility of virtualization without the need to buy and maintain physical hardware.
  • Rapid provisioning.
  • Support for Message Passing Interface (MPI).
  • Ability to run large and time-consuming jobs.

Architecture

This diagram shows a multi-node configuration:

Diagram that shows a multi-node architecture for deploying LAMMPS.

Download a Visio file of this architecture.

This diagram shows a single-node configuration:

Diagram that shows a single-node architecture for deploying LAMMPS.

Download a Visio file of this architecture.

Components

Compute sizing

Performance tests of LAMMPS on Azure used HBv3 AMD EPYC 7V73X (Milan-X) VMs running Linux CentOS. The following table provides details about HBv3-series VMs.

| VM size | vCPU | Memory (GiB) | Memory bandwidth (GBps) | Base CPU frequency (GHz) | All-cores frequency (GHz, peak) | Single-core frequency (GHz, peak) | RDMA performance (Gbps) | Maximum data disks |
|---|---|---|---|---|---|---|---|---|
| Standard_HB120rs_v3 | 120 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
| Standard_HB120-96rs_v3 | 96 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
| Standard_HB120-64rs_v3 | 64 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
| Standard_HB120-32rs_v3 | 32 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |
| Standard_HB120-16rs_v3 | 16 | 448 | 350 | 1.9 | 3.0 | 3.5 | 200 | 32 |

Install LAMMPS on a VM or HPC cluster

You can download the software from the LAMMPS website. You just need to untar or unzip the LAMMPS binary distribution file, and you can run LAMMPS directly in the resulting directory. For a guide to building from source code, see Build LAMMPS.

Before you install LAMMPS, you need to deploy and connect to a VM or HPC cluster.

For information about deploying a VM, see Run a Linux VM on Azure.

For information about deploying the Azure CycleCloud and HPC cluster, see these resources:

Install LAMMPS

Complete the following steps to install LAMMPS on single-node and cluster VMs.

  1. Run the following commands:

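    # Make the Open MPI 4.1.0 binaries and libraries available
    export PATH=$PATH:/opt/openmpi-4.1.0/bin/
    export LD_LIBRARY_PATH=/opt/openmpi-4.1.0/lib

    # Select the C, C++, and Fortran compilers for the build
    export CC=gcc
    export CXX=g++
    export FC=gfortran
    export FCFLAGS=-m64
    export F77=gfortran
    # ifort is the Intel Fortran compiler; it must be installed for this setting to take effect
    export F90=ifort
    export CPPFLAGS=-DpggFortran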
    
  2. Download the source code from LAMMPS.

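  3. Extract the downloaded archive (replace the file name with the name of the file that you downloaded):

    tar xvf lammps-<version>.tar.gz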
    
  4. Go to the src folder in the extracted LAMMPS folder:

    cd lammps-<version>/src
    
  5. To build LAMMPS, run these commands in the src folder:

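    # Enable the optional RIGID package
    make yes-rigid
    # Build the serial executable (lmp_serial)
    make serial
    # Build the MPI executable (lmp_mpi)
    make mpi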
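
After the build finishes, the src folder should contain the lmp_serial and lmp_mpi executables. The following is a minimal sketch for confirming that. The helper name check_lammps_build is hypothetical and isn't part of LAMMPS.

```shell
#!/bin/bash
# Hypothetical helper (not part of LAMMPS): report whether the expected
# executables exist and are executable in the given folder.
# Defaults to the current folder when no argument is passed.
check_lammps_build() {
    local dir="${1:-.}"
    local missing=0
    for exe in lmp_serial lmp_mpi; do
        if [ -x "$dir/$exe" ]; then
            echo "found: $exe"
        else
            echo "missing: $exe"
            missing=1
        fi
    done
    # Return 0 only when both executables were found
    return "$missing"
}
```

Run check_lammps_build from the src folder, or pass the path to the src folder as the first argument.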
    

Run LAMMPS

  1. To run LAMMPS on a standalone VM, use these commands:

    # Make Open MPI available in the current shell
    export PATH=$PATH:/opt/openmpi-4.1.0/bin/
    export LD_LIBRARY_PATH=/opt/openmpi-4.1.0/lib
    export LMP_MPI=/path/LAMMPS/lammps-<version>/src/lmp_mpi
    # Run the Lennard-Jones input on 16 MPI processes
    mpirun -np 16 "$LMP_MPI" -in in.lj
    
  2. To run LAMMPS on a multi-node cluster, use this script:

    #!/bin/bash
    #SBATCH --job-name=LAMMPS
    #SBATCH --partition=hpc
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=64
    #SBATCH --ntasks=128

    # Make Open MPI available to the job
    export PATH=$PATH:/opt/openmpi-4.1.0/bin/
    export LD_LIBRARY_PATH=/opt/openmpi-4.1.0/lib
    export LMP_MPI=/path/LAMMPS/lammps-<version>/src/lmp_mpi

    # Start one MPI process per requested task (2 nodes x 64 tasks = 128)
    mpirun -np 128 "$LMP_MPI" -in benchmark.in

    Note

    In the preceding script, the --ntasks value is the number of nodes multiplied by the number of cores used per VM. The number of nodes is 2, as specified by --nodes. The number of cores used per VM is 64, as specified by --ntasks-per-node. So --ntasks is 128, and mpirun starts 128 processes.
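
The task-count arithmetic in the note can be sketched in a few lines of shell. The variable names here are illustrative, and the values are the ones used in the preceding script.

```shell
#!/bin/bash
# Values assumed from the job script: 2 nodes, 64 MPI tasks per node.
nodes=2
tasks_per_node=64

# Total number of MPI tasks to request from the scheduler
ntasks=$((nodes * tasks_per_node))

echo "--nodes=$nodes --ntasks-per-node=$tasks_per_node --ntasks=$ntasks"
# prints: --nodes=2 --ntasks-per-node=64 --ntasks=128
```

If you change the node count or the number of tasks per node, recompute the total so that the three values stay consistent.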

LAMMPS performance results on Azure VMs

Two models were used to test the performance of LAMMPS version 23 and LAMMPS version 17 on Azure.

Lennard-Jones model

Lennard-Jones (in.lj) is a simple molecular dynamics simulation of a binary fluid in the NVT ensemble. The fluid consists of neutral point particles, and the temperature is controlled by a Langevin thermostat.

The following table provides details about the Lennard-Jones model:

| Number of atoms | Timestep | Thermo step | Run steps |
|---|---|---|---|
| 1.0e+9 | 0.1 | 10 | 200 |

HECBioSim model

HECBioSim is a benchmark suite that consists of a set of simple benchmarks for a number of popular molecular dynamics engines, each of which is set at a different atom count.

The following table provides details about the HECBioSim model:

| Number of atoms | Timestep | Thermo step | Run steps |
|---|---|---|---|
| 1,403,180 | 2.0 | 5,000 | 10,000 |

LAMMPS performance results on single-node VMs

The following sections provide the performance results of running LAMMPS version 23 on single-node Azure HBv3 AMD EPYC 7V73X (Milan-X) VMs. The Lennard-Jones model is used in these tests.

This table shows the total wall clock times recorded for various numbers of CPUs on the Standard HBv3-series VM:

| Number of cores | Wall clock time (seconds) | Relative speed increase |
|---|---|---|
| 16 | 7,634 | 1 |
| 32 | 4,412 | 1.73 |
| 64 | 2,102 | 3.63 |
| 96 | 1,648 | 4.63 |
| 120 | 1,445 | 5.28 |

The following graph shows the relative speed increases as the number of CPUs increases:

Graph that shows the relative speed increases in a single-node configuration.

Notes about the single-node tests

For the single-node tests, the Standard_HB120-16rs_v3 VM (16 cores) is used as a baseline to calculate relative speed increases as the number of cores increases. The results show that parallel performance improves as the number of cores increases from 16 to 120. A speed increase of 5.3x is achieved with 120 cores.
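
The relative speed increase is the baseline wall clock time divided by the measured wall clock time. As a minimal sketch, the following shell function reproduces the values in the single-node table. The function name speedup is hypothetical, and awk is used only because the shell has no built-in floating-point division.

```shell
#!/bin/bash
# Relative speed increase = baseline wall clock time / measured wall clock time.
# speedup is a hypothetical helper name.
speedup() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", base / t }'
}

# Wall clock times (seconds) from the single-node table.
# The 16-core run (7,634 seconds) is the baseline.
for t in 7634 4412 2102 1648 1445; do
    speedup 7634 "$t"
done
# prints 1.00, 1.73, 3.63, 4.63, and 5.28, one value per line
```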

LAMMPS performance results on multi-node clusters

The single-node tests show that parallel efficiency relative to the 16-core baseline is highest with 64 cores on HBv3 VMs: a 3.63x speed increase on four times as many cores. Based on those results, 64-core configurations on Standard_HB120-64rs_v3 VMs are used to evaluate the performance of LAMMPS on multi-node clusters. The Lennard-Jones and HECBioSim models are used for the multi-node tests.

Lennard-Jones model

This table shows the total wall clock times recorded for various numbers of nodes:

| Number of nodes | Number of cores | Wall clock time (seconds) | Relative speed increase |
|---|---|---|---|
| 1 | 64 | 2,612 | N/A |
| 2 | 128 | 1,573 | 1.66 |
| 4 | 256 | 1,035 | 2.52 |
| 8 | 512 | 793 | 3.29 |

The following graph shows the relative speed increases as the number of nodes increases:

Graph that shows the relative speed increases for the Lennard-Jones model in a multi-node configuration.

HECBioSim model

This table shows the total wall clock times recorded for various numbers of nodes:

| Number of nodes | Number of cores | Wall clock time (seconds) | Relative speed increase |
|---|---|---|---|
| 1 | 64 | 3,103 | N/A |
| 2 | 128 | 1,601 | 1.94 |
| 4 | 256 | 840 | 3.69 |
| 8 | 512 | 442 | 7.02 |
| 16 | 1,024 | 241 | 12.88 |

The following graph shows the relative speed increases as the number of nodes increases:

Graph that shows the relative speed increases for the HECBioSim model in a multi-node configuration.

Notes about the multi-node tests

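  • The multi-node results show that both models run faster as the number of nodes increases. The HECBioSim model scales nearly linearly (7.02x on 8 nodes and 12.88x on 16 nodes). The Lennard-Jones model shows a smaller gain (3.29x on 8 nodes).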
  • The Lennard-Jones model was tested with LAMMPS version 23. The HECBioSim model was tested with LAMMPS version 17.
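
One way to compare how the two models scale is parallel efficiency: the relative speed increase divided by the number of nodes, expressed as a percentage. The following is a minimal sketch that uses the wall clock times from the multi-node tables. The function name efficiency is hypothetical, and this metric is an illustration rather than part of the reported results.

```shell
#!/bin/bash
# Parallel efficiency (%) = 100 * single-node time / (multi-node time * nodes).
# efficiency is a hypothetical helper name.
#   $1 = single-node wall clock time (seconds)
#   $2 = multi-node wall clock time (seconds)
#   $3 = number of nodes
efficiency() {
    awk -v base="$1" -v t="$2" -v n="$3" \
        'BEGIN { printf "%.0f\n", 100 * base / (t * n) }'
}

efficiency 2612 793 8     # Lennard-Jones, 8 nodes
efficiency 3103 442 8     # HECBioSim, 8 nodes
efficiency 3103 241 16    # HECBioSim, 16 nodes
```

A value near 100 means that doubling the node count roughly halves the running time. Lower values mean that each added node contributes less.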

Azure cost

The following tables provide wall clock times that you can use to calculate Azure costs. To compute the cost, multiply the wall clock time by the number of nodes and the Azure VM hourly rate. For the hourly rates for Linux, see Linux Virtual Machines Pricing. Azure VM hourly rates are subject to change.

Only simulation running time is considered for the cost calculations. Installation time, simulation setup time, and software costs aren't included.

You can use the Azure pricing calculator to estimate VM costs for your configurations.
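
The cost formula can be sketched as a small shell function. The function name estimate_cost is hypothetical, and the hourly rate in the example is a placeholder, not an actual Azure price. Substitute the current rate for your VM size and region.

```shell
#!/bin/bash
# Cost = wall clock time (hours) x number of nodes x hourly rate per VM.
# estimate_cost is a hypothetical helper name.
#   $1 = wall clock time in hours
#   $2 = number of nodes
#   $3 = hourly rate per VM (placeholder value, not a real Azure price)
estimate_cost() {
    awk -v hours="$1" -v nodes="$2" -v rate="$3" \
        'BEGIN { printf "%.2f\n", hours * nodes * rate }'
}

# Example: a 0.22-hour run on 8 nodes at a placeholder rate of 2.00 per hour
estimate_cost 0.22 8 2.00
# prints: 3.52
```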

Running times for the Lennard-Jones model

| Number of nodes | Wall clock time (hours) |
|---|---|
| 1 | 0.73 |
| 2 | 0.44 |
| 4 | 0.29 |
| 8 | 0.22 |

Running times for the HECBioSim model

| Number of nodes | Wall clock time (hours) |
|---|---|
| 1 | 0.86 |
| 2 | 0.44 |
| 4 | 0.23 |
| 8 | 0.12 |
| 16 | 0.07 |

Summary

  • LAMMPS was successfully tested on HBv3 standalone VMs and Azure CycleCloud multi-node configurations with as many as 16 nodes.
  • In multi-node configurations, tests indicate speed increases, relative to a single node, of about 3.29x on 8 nodes for the Lennard-Jones model and about 12.88x on 16 nodes for the HECBioSim model.
  • For small simulations, we recommend that you use fewer CPUs to improve performance.

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.


Next steps