Introducing MPI support for Linux on Azure Batch
We are happy to announce the release of MPI support for Linux on Azure Batch.
We previously released MPI support for Windows and recently introduced Linux support on Azure Batch; we are now extending Batch to enable MPI on Linux as well. Linux support on Azure Batch is currently in preview and will be generally available soon. We also provide password-less SSH for a specific user on communication-enabled Linux pools.
By creating a pool of A8 or A9 compute nodes, Batch MPI tasks can fully leverage the high-speed, low-latency RDMA network for those Azure VMs.
To run multi-instance (MPI) tasks, your Batch pool must be communication-enabled (enableInterNodeCommunication = true) and have maxTasksPerNode set to 1. Additionally, every node in the pool must have an MPI implementation installed (OpenMPI, Intel MPI, or any other MPI distribution). You can use a start task to install MPI on the nodes when the pool is created; we provide step-by-step instructions later in this blog post.
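Expressed as a Batch pool request body, the required settings look roughly like this (a minimal sketch in Python; the pool ID, VM size, and start-task command line are placeholders, not values from a real deployment):

```python
# Minimal sketch of a Batch pool request body for MPI workloads.
# The pool ID, VM size, and start-task command are placeholders.
def mpi_pool_body(pool_id, vm_size, target_dedicated, start_task_cmd):
    return {
        "id": pool_id,
        "vmSize": vm_size,
        "targetDedicated": target_dedicated,
        # Both settings below are required for multi-instance (MPI) tasks:
        "enableInterNodeCommunication": True,  # nodes can talk to each other
        "maxTasksPerNode": 1,                  # one task per node at a time
        "startTask": {
            "commandLine": start_task_cmd,     # e.g. installs MPI on each node
            "runElevated": True,
            "waitForSuccess": True,            # don't schedule tasks until MPI is ready
        },
    }

pool = mpi_pool_body("mpi-pool", "Standard_A9", 3, "nodeprep-cmd")
```

The same shape is what the portal builds for you when you fill in the pool settings tables later in this post.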
Below, we present an end-to-end example of running an OpenFOAM MPI solution on Azure Batch. In it, we use IntelMPI on CentOS-HPC. Later we also cover Ubuntu and CentOS with OpenMPI using a HelloWorld MPI solution.
OpenFOAM ("Open source Field Operation And Manipulation", wiki) is a C++ toolbox for the development of customized numerical solvers, and pre-/post-processing utilities for the solution of continuum mechanics problems, including computational fluid dynamics (CFD). It is a highly compute intensive application suitable for MPI.
Step-by-step: OpenFOAM MPI toolbox with CentOS-HPC on Batch using IntelMPI
The CentOS-HPC image from the Azure Virtual Machines Marketplace comes with Intel MPI pre-installed. We use this image to create a Batch pool of Standard_A9 VMs, and use the Batch start task facility to prepare the pool's VMs with the prerequisites. We then add an MPI task to run OpenFOAM. The multi-instance task is configured to run a coordination command on all participating nodes; the coordination command sets up an NFS share with the input data required to run OpenFOAM. After the coordination command completes, the Batch service runs the application command on the head (or "primary") node. The application command executes mpirun on OpenFOAM. The head node then compresses the result, which can be uploaded to Azure Storage. Finally, we render a video from the results using the ParaView software.
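The execution order described above can be modeled in a few lines (a simplified illustration only; the real Batch scheduler runs the coordination command on the nodes in parallel, and `execute` here is a stand-in that just records what ran where):

```python
# Simplified model of multi-instance task execution order.
calls = []

def execute(node, command):
    # Stand-in for remote execution: record what ran on which node.
    calls.append((node, command))

def run_multi_instance(nodes, coordination_cmd, application_cmd):
    # Step 1: the coordination command runs on every participating node
    # (e.g. to set up the NFS share with the input data).
    for node in nodes:
        execute(node, coordination_cmd)
    # Step 2: the application command runs on the head (primary) node only,
    # which is where mpirun is launched.
    primary = nodes[0]
    execute(primary, application_cmd)

run_multi_instance(["node0", "node1", "node2"],
                   "coordination-cmd", "application-cmd")
```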
Prerequisites
For this example we use the Azure portal, making it a no-code walkthrough. The same steps can also be performed with the REST API or the client libraries, as shown in the sample code linked at the end of this post. To try this case study yourself, you will need the following prerequisites:
- Azure account: If you don't already have an Azure subscription, create a free Azure account.
- Batch account: Once you have an Azure subscription, create an Azure Batch account.
- Storage account: See Create a storage account in About Azure storage accounts.
Also, create a blob container in your Storage account. You will upload the input data to this container (explained in the next section).
OpenFOAM input data preparation
Follow the OpenFOAM Build Guide to build the input files for OpenFOAM. Two input files will be generated: OpenFOAM_CentOS7_HPC.tgz and OpenFOAM_CentOS7_HPC_libs.tgz.
Follow the OpenFOAM Motorbike Tutorial to get the geometry file used in this case study, motorBike_3M.tgz.
Upload these three input data files to the Storage container mentioned in the previous section.
Prepare the coordination command script
Copy the contents of the coordination-cmd script to a file on your local workstation:
In this file, replace “<storage-account>” and “<container>” with your Storage account name and the name of the container to which you uploaded the prepared input data, and uncomment the wget commands.
Upload the modified coordination-cmd to the container in your Storage account, and use its blob URL for Task.multiInstanceSettings.commonResourceFiles.blobSource in a later section.
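The blob URL you need follows the standard Azure Blob storage endpoint format, which can be sketched as (the account, container, and blob names below are placeholders; if your container is not publicly readable you would also need to append a SAS token, omitted here):

```python
def blob_url(storage_account, container, blob_name):
    # Standard public Azure Blob storage endpoint format.
    # Append a SAS token if the container is not publicly readable.
    return f"https://{storage_account}.blob.core.windows.net/{container}/{blob_name}"

url = blob_url("mystorageacct", "mpi-input", "coordination-cmd")
```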
Create CentOS-HPC pool (and prepare nodes for OpenFOAM)
Go to the Azure portal and select the Batch account you created. Click the Pools tile, followed by Add.
Use the following options:
| Setting | Value |
| --- | --- |
| Pool.enableInterNodeCommunication | True |
| Pool.maxTasksPerNode | 1 |
| Pool.targetDedicated | 3 |
| Pool.startTask.commandLine | nodeprep-cmd |
| Pool.startTask.resourceFiles.blobSource | https://raw.githubusercontent.com/Azure/azure-batch-samples/master/Python/Batch/article_samples/mpi/data/linux/openfoam/nodeprep-cmd |
| Pool.startTask.resourceFiles.filePath | nodeprep-cmd |
| Pool.startTask.runElevated | True |
| Pool.startTask.waitForSuccess | True |
Create a Batch job
Similar to the pool above, now add a new job.
Go to Azure portal –> Batch account –> Jobs –> Add Job
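The job itself needs very little configuration; as a request body it boils down to an ID and a reference to the pool (a minimal sketch; both IDs below are placeholders):

```python
# Minimal sketch of a Batch job request body; IDs are placeholders.
job = {
    "id": "openfoam-job",
    "poolInfo": {"poolId": "mpi-pool"},  # the pool created above
}
```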
Submit MPI task for OpenFOAM
Go to Azure portal –> Batch account –> Jobs –> Job –> Add Task. Use the following options:
| Setting | Value |
| --- | --- |
| Task.commandLine | application-cmd 3 16 500 $AZ_BATCH_TASK_SHARED_DIR $AZ_BATCH_HOST_LIST |
| Task.resourceFiles.blobSource | https://raw.githubusercontent.com/Azure/azure-batch-samples/master/Python/Batch/article_samples/mpi/data/linux/openfoam/application-cmd |
| Task.resourceFiles.filePath | application-cmd |
| Task.userIdentity | scope=pool, elevationLevel=admin |
| Task.multiInstanceSettings.numberOfInstances | 3 |
| Task.multiInstanceSettings.coordinationCommandLine | $AZ_BATCH_TASK_SHARED_DIR/coordination-cmd |
| Task.multiInstanceSettings.commonResourceFiles.blobSource | https://raw.githubusercontent.com/Azure/azure-batch-samples/master/Python/Batch/article_samples/mpi/data/linux/openfoam/coordination-cmd |
| Task.multiInstanceSettings.commonResourceFiles.filePath | coordination-cmd |
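Assembled as a task request body, the settings in the table above look roughly like this (a sketch; the task ID is a placeholder, while the command lines and blob URLs are the ones from the table):

```python
# Sketch of the multi-instance task request body from the table above.
# The task ID is a placeholder; URLs point at the sample repo files.
base = ("https://raw.githubusercontent.com/Azure/azure-batch-samples/"
        "master/Python/Batch/article_samples/mpi/data/linux/openfoam/")

task = {
    "id": "openfoam-task",
    "commandLine": ("application-cmd 3 16 500 "
                    "$AZ_BATCH_TASK_SHARED_DIR $AZ_BATCH_HOST_LIST"),
    "resourceFiles": [
        {"blobSource": base + "application-cmd",
         "filePath": "application-cmd"},
    ],
    "multiInstanceSettings": {
        # Runs on 3 nodes; coordination-cmd prepares the NFS share on each,
        # then commandLine launches mpirun on the primary node.
        "numberOfInstances": 3,
        "coordinationCommandLine": "$AZ_BATCH_TASK_SHARED_DIR/coordination-cmd",
        "commonResourceFiles": [
            {"blobSource": base + "coordination-cmd",
             "filePath": "coordination-cmd"},
        ],
    },
}
```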
Download the output data
Once the above task is submitted, Batch runs OpenFOAM on the input data. When the task completes, you can download the output file, VTK.tgz, by navigating to the task in the Azure portal:
Azure portal –> Batch account –> Jobs –> yourJobId –> Tasks -> yourTaskId –> Task Files -> Files on node
Download the VTK.tgz file and decompress it into a directory on your local workstation. You can then use the ParaView software to render the video by running:
"<Path-to-ParaView-Installation>\bin\pvpython.exe" "<path-to-VTK-dir>\genimages.py"
OpenFOAM solution end-to-end sample code
The sample code for running an end-to-end OpenFOAM MPI solution on Batch is available here.
Run HelloWorld MPI application with Ubuntu on Batch using OpenMPI
Create Ubuntu pool (and install OpenMPI)
| Setting | Value |
| --- | --- |
| Pool.enableInterNodeCommunication | True |
| Pool.maxTasksPerNode | 1 |
| Pool.targetDedicated | 3 |
| Pool.startTask.commandLine | /bin/sh -c "apt-get update; apt-get -y install openmpi-bin libopenmpi-dev" |
| Pool.startTask.runElevated | True |
| Pool.startTask.waitForSuccess | True |
Submit MPI task on Ubuntu pool
| Setting | Value |
| --- | --- |
| Task.commandLine | mpirun -np 6 --host $AZ_BATCH_HOST_LIST -wdir $AZ_BATCH_TASK_SHARED_DIR $AZ_BATCH_TASK_SHARED_DIR/hello-world.exe |
| Task.multiInstanceSettings.numberOfInstances | 3 |
| Task.multiInstanceSettings.coordinationCommandLine | chmod +x $AZ_BATCH_TASK_SHARED_DIR/hello-world.exe |
| Task.multiInstanceSettings.commonResourceFiles.blobSource | <storage-url-hosting-hello-world-exe> |
| Task.multiInstanceSettings.commonResourceFiles.filePath | hello-world.exe |
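Note how the -np value relates to the pool: with numberOfInstances set to 3 and two MPI ranks per node, mpirun launches 6 processes in total, spread across the hosts that Batch exposes in $AZ_BATCH_HOST_LIST (a comma-separated list of the participating nodes' addresses). A small sketch of that arithmetic (the helper function and binary name are illustrative, not part of the Batch API):

```python
def mpirun_command(nodes, procs_per_node, workdir, binary):
    # Total MPI ranks = nodes * processes per node. Batch expands
    # $AZ_BATCH_HOST_LIST to a comma-separated list of node addresses
    # at task runtime, so we leave the variable unexpanded here.
    np = nodes * procs_per_node
    return (f"mpirun -np {np} --host $AZ_BATCH_HOST_LIST "
            f"-wdir {workdir} {workdir}/{binary}")

cmd = mpirun_command(3, 2, "$AZ_BATCH_TASK_SHARED_DIR", "hello-world.exe")
```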
Run HelloWorld MPI application with CentOS on Batch using OpenMPI
Create CentOS pool (and install OpenMPI)
| Setting | Value |
| --- | --- |
| Pool.enableInterNodeCommunication | True |
| Pool.maxTasksPerNode | 1 |
| Pool.targetDedicated | 3 |
| Pool.startTask.commandLine | /bin/sh -c "yum install -y kernel-headers --disableexcludes=all; yum -y install make gcc gcc-c++ gcc-gfortran cmake zlib-devel openmpi openmpi-devel fftw fftw-devel gsl gsl-devel gmp environment-modules; source /etc/profile.d/modules.sh; module add mpi/openmpi-$(uname -i); module load mpi/openmpi-$(uname -i)" |
| Pool.startTask.runElevated | True |
| Pool.startTask.waitForSuccess | True |
Submit MPI Task on CentOS pool
| Setting | Value |
| --- | --- |
| Task.commandLine | /usr/lib64/openmpi/bin/mpirun -np 6 --host $AZ_BATCH_HOST_LIST -wdir $AZ_BATCH_TASK_SHARED_DIR $AZ_BATCH_TASK_SHARED_DIR/hello-world.exe |
| Task.multiInstanceSettings.numberOfInstances | 3 |
| Task.multiInstanceSettings.coordinationCommandLine | chmod +x $AZ_BATCH_TASK_SHARED_DIR/hello-world.exe |
| Task.multiInstanceSettings.commonResourceFiles.blobSource | <storage-url-hosting-hello-world-exe> |
| Task.multiInstanceSettings.commonResourceFiles.filePath | hello-world.exe |
We will cover other feature enhancements in a future series of posts, so stay tuned.
If you have questions about MPI on Linux in Azure Batch, or about the examples in this blog post, please visit the Azure Batch forum and post your questions.