Bare Metal - MPI library on Azure Linux nodes
This post was published to Ninad Kanthi's WebLog at 21:33:08 10/07/2015
Step-By-Step guide
This article shows how an MPI application can be set up using “bare metal” Linux nodes on Azure.
It must be emphasized that this article shows how easy it is to set up and configure the Azure environment to execute an MPI application, but the configuration shown here is not recommended for a production environment. It is assumed that the reader has basic knowledge of Azure and intermediate to advanced knowledge of Linux, especially Ubuntu distributions.
The techniques shown here have been extended from this Ubuntu article.
1. Creating a Cluster
As the first step, we will configure a Linux cluster on Azure. It will consist of four nodes, one of which will be the master node. The cluster will be created inside a virtual network (VNET). To keep things very simple, we won’t create a DNS server and instead modify the /etc/hosts file on each node.
The script for creating the Linux cluster is shown at the bottom of this article. A couple of things to note in the script:
- We create the Linux nodes from the Ubuntu-14_04_2_LTS image. This image is available from the Azure gallery.
- We will use NFS to create shared storage between the Linux nodes. For NFS to work, the required ports, 2049 (nfsd) and 111 (portmapper), are configured during the provisioning of the nodes.
After successful execution of the creation script, you should see the Linux nodes configured under the VNET in your subscription, as shown below.
Figure 1: Linux nodes under the Azure VNET
Once the Linux nodes have been provisioned and are up and running, we use PuTTY to establish an SSH connection and log in to the node.
NOTE: The process of accessing and logging on to the provisioned Linux nodes is described in detail here.
We will use the Linux node, linuxdistro-001, as the master node.
After logging onto the node, we edit the /etc/hosts file and add each node's IP address and host name to it. This step is replicated across all nodes.
linuxuser@linuxdistro-001:~$ sudo vi /etc/hosts
After editing, the content of the /etc/hosts file should look like the following:
linuxuser@linuxdistro-001:~$ cat /etc/hosts
127.0.0.1 localhost
10.0.0.4 ub0
10.0.0.5 ub1
10.0.0.6 ub2
10.0.0.7 ub3
After repeating the above step on each node, the nodes can address each other by name without DNS being provisioned, but please bear in mind that DNS is always the better option. Rather than editing every node by hand, the same entries can also be pushed from the master node, as sketched below.
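A minimal sketch, run from the master node (it assumes the linuxuser admin account on every node and prompts for the SSH and sudo passwords interactively):
for ip in 10.0.0.5 10.0.0.6 10.0.0.7; do
  ssh -t linuxuser@$ip "printf '10.0.0.4 ub0\n10.0.0.5 ub1\n10.0.0.6 ub2\n10.0.0.7 ub3\n' | sudo tee -a /etc/hosts"
done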
As we need to execute the same commands across the Linux nodes, we will install the ssh and pssh utilities on the master node.
Install pssh on the master node
linuxuser@linuxdistro-001:~$ sudo apt-get install ssh
linuxuser@linuxdistro-001:~$ sudo apt-get install pssh
Test that pssh is working correctly. Note that on Ubuntu the pssh package installs the command as parallel-ssh:
linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" echo hi
Note: To suppress SSH host-key warnings, we add the options -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" to every pssh command. Since the options recur on every command, they can also be kept in a shell variable, as shown below.
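For example (a small convenience beyond the original commands):
PSSH_OPTS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null"
parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -x "$PSSH_OPTS" echo hi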
2. Provisioning shared Storage
We will create a /mirror folder on the master node, and this folder will be shared across the nodes via NFS.
Install nfs-client on all nodes except the master node
linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -H 10.0.0.7 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo apt-get install -y nfs-client
Create the same folder structure on all nodes. The host_file used below contains the IP addresses of all four nodes; its content is shown later in this article.
linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -h host_file -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo mkdir /mirror
Install the nfs-server on the master node
linuxuser@linuxdistro-001:~$ sudo apt-get install nfs-server
Ensure that the /mirror folder is exported via /etc/exports (the * allows any client to mount it; rw,sync grants read-write access with synchronous writes)
linuxuser@linuxdistro-001:~$ echo "/mirror *(rw,sync)" | sudo tee -a /etc/exports
Restart the NFS server on the master node
linuxuser@linuxdistro-001:~$ sudo service nfs-kernel-server restart
Mount the share across the client nodes
linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -H 10.0.0.7 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo mount ub0:/mirror /mirror
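Note that this mount does not survive a reboot. If the nodes are restarted, either re-run the mount command or, as a possible extra step beyond this walkthrough, add an entry to /etc/fstab on each client node:
echo "ub0:/mirror /mirror nfs defaults 0 0" | sudo tee -a /etc/fstab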
Test the share from the client nodes: after copying a file into /mirror on the master node, every client node should list it.
linuxuser@linuxdistro-001:~$ sudo cp /etc/hosts /mirror
linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -H 10.0.0.7 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo ls /mirror
3. Setting up mutual trust for non-admin user
We will create a non-admin user, mpiu. This user needs to be created because Ubuntu does not support root trust. We also assign our shared folder, /mirror, as the home folder for this user. The newusers command is used to create this user across all nodes.
Note: We use the newusers command to create the user across the various nodes because it can be executed in non-interactive mode. The parameters for the user mpiu are specified in a file, /mirror/userfile, in the standard passwd-style format (username:password:UID:GID:comment:home:shell).
Creating non-admin user
linuxuser@linuxdistro-001:~$ cd /mirror
linuxuser@linuxdistro-001:/mirror$ vi userfile
linuxuser@linuxdistro-001:/mirror$ cat userfile
mpiu:<password removed>:1002:1000:MPI user:/mirror:/bin/bash
linuxuser@linuxdistro-001:/mirror$ parallel-ssh -i -A -h host_file -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo newusers /mirror/userfile
Change the owner of the shared folder to the newly created user.
linuxuser@linuxdistro-001:/mirror$ sudo chown mpiu /mirror
Configuring password-less SSH communication across the nodes
1. Install ssh components on the master node
linuxuser@linuxdistro-001:~$ sudo apt-get install ssh
2. Next, log in as our newly created user
linuxuser@linuxdistro-001:/mirror$ su - mpiu
3. Generate an RSA key pair for user mpiu
mpiu@linuxdistro-001:~$ ssh-keygen -t rsa
4. Add this key to the authorized keys file
mpiu@linuxdistro-001:~$ cd .ssh
mpiu@linuxdistro-001:~/.ssh$ cat id_rsa.pub >> authorized_keys
5. As the home directory (/mirror) is common across all nodes, there is no need to run these commands on every node.
Test the SSH connection; it should not ask you for a password when connecting to another node.
mpiu@linuxdistro-001:~$ ssh ub1 hostname
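If SSH still prompts for a password, the usual culprit is file permissions: sshd ignores the authorized_keys file when it or the .ssh folder is writable by others. Tightening them (a troubleshooting step beyond the original walkthrough) usually resolves it:
mpiu@linuxdistro-001:~$ chmod 700 ~/.ssh
mpiu@linuxdistro-001:~$ chmod 600 ~/.ssh/authorized_keys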
4. MPI (HPC Application) installation and configuration
1. Log in as the admin user
We log out from our existing mpiu session, which returns us to our admin (linuxuser) session.
mpiu@linuxdistro-001:~$ logout
linuxuser@linuxdistro-001:~$
2. Install GCC
We need a compiler to build the code. This needs to happen only on the master node, since the compiled binary will live in the shared /mirror folder.
linuxuser@linuxdistro-001:~$ sudo apt-get install build-essential
3. Install MPICH2
The MPICH2 folder structure needs to be the same across all nodes, therefore we execute the following command across all nodes.
linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -h host_file -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo apt-get install -y mpich2
Note: the content of host_file is
linuxuser@linuxdistro-001:~$ cat ./host_file
10.0.0.4
10.0.0.5
10.0.0.6
10.0.0.7
4. Test installation
linuxuser@linuxdistro-001:/mirror$ su - mpiu
mpiu@linuxdistro-001:~$ which mpiexec
mpiu@linuxdistro-001:~$ which mpirun
Both commands should print the path of an installed binary (typically /usr/bin/mpiexec and /usr/bin/mpirun on Ubuntu); no output means the installation did not succeed.
5. Setting up machine file
Create a configuration file, nodefile, in mpiu’s home folder. Within the file, specify the node names, each optionally followed by a colon and the number of processes to spawn on that node; a name with no count defaults to one process. We have two cores per node, so we specify one or two processes per node.
mpiu@linuxdistro-001:~$ cat nodefile
ub0
ub1:2
ub2:2
ub3
mpiu@linuxdistro-001:~$
6. Create a test MPICH2 C program and compile it.
A minimal mpi_hello.c program:
mpiu@linuxdistro-001:~$ cat mpi_hello.c
#include <stdio.h>
#include <mpi.h>
int main(int argc, char** argv) {
    int myrank, nprocs;
    MPI_Init(&argc, &argv);                  /* initialize the MPI environment */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* total number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* rank of this process */
    printf("Hello from processor %d of %d\n", myrank, nprocs);
    MPI_Finalize();                          /* shut down the MPI environment */
    return 0;
}
Compile the source code into an executable
mpiu@linuxdistro-001:~$ mpicc mpi_hello.c -o mpi_hello
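Before involving the whole cluster, it can be worth a quick smoke test (an extra step, not in the original walkthrough) that runs two ranks on the master node alone:
mpiu@linuxdistro-001:~$ mpiexec -n 2 ./mpi_hello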
5. Job Execution
We launch six processes, matching the total process count in the nodefile (1 + 2 + 2 + 1):
mpiu@linuxdistro-001:~$ mpiexec -n 6 -f nodefile ./mpi_hello
6. Results capture and analysis
If everything is successful, you should see output similar to the following; the order of the lines may vary between runs, since the ranks print independently.
mpiu@linuxdistro-001:~$ mpiexec -n 6 -f nodefile ./mpi_hello
Hello from processor 5 of 6
Hello from processor 0 of 6
Hello from processor 3 of 6
Hello from processor 1 of 6
Hello from processor 4 of 6
Hello from processor 2 of 6
mpiu@linuxdistro-001:~$
7. Clean the environment
One of the biggest advantages of cloud computing is its pay-as-you-go model: once the experiment and exercise are over, the environment can be completely torn down, after which it costs the end user nothing.
Execute the script shown in the appendix to clean up the Azure environment.
Appendix
Script - Provisioning the Linux cluster
#CreateLinuxVirtualMachines - In NON AVAILABILITY GROUPS
#azure config mode asm
# The Subscription name that we want to create our environment in
$SubscriptionNameId = "<enter your subscription id here>"
# Storage account name. This should BE CREATED BEFORE EXECUTING THE SCRIPT
$StorageAccountname = "ubuntuimages"
# AffinityGroup name. This should BE CREATED BEFORE EXECUTING THE SCRIPT
$AffinityGroupName = "linuxdistros"
# Network name. This should BE CREATED BEFORE EXECUTING THE SCRIPT
$VnetName = "nktr21"
# Subnetname. This should BE CREATED BEFORE EXECUTING THE SCRIPT
$SubnetName = "linuxcluster"
# Availability Set
$AvailabilitySetName = "linuxdistro"
# Cloud Service name. This service will be created
$CloudServiceName = "linuxcloudsrv"
# Instance size
$InstanceSize = "Medium"
# Linux Admin Password
$password = "<yourpassword>"
# Linux Admin Username
$username = "linuxuser"
# Name prefix of the VM machine name. A numeric counter is appended to this to create the final name
$LinuxImageNamePrefix = "linuxdistro-00"
# Load the publish settings so we can log in to the subscription
$LoadSettings = Import-AzurePublishSettingsFile "NinadKNew.publishsettings"
# It is important to specify CurrentStorageAccountName; otherwise you might get a 'storage not accessible' error when creating the Linux machines.
Set-AzureSubscription -SubscriptionId $SubscriptionNameId -CurrentStorageAccountName $StorageAccountname -ErrorAction Stop
Select-AzureSubscription -SubscriptionId $SubscriptionNameId -ErrorAction Stop
# Get the image that we want to use from the Azure repository. It's the Ubuntu 14_04_LTS variant. We can speed up the creation script by caching the name.
# The name thus obtained is 'b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-14_04_2_LTS-amd64-server-20150309-en-us-30GB'
#$UbuntuImage = Get-AzureVMImage | where {($_.ImageName -match 'Ubuntu-14_04') -and ($_.ImageName -match '2015') -and ($_.ImageName -match 'LTS') -and ($_.Label -eq 'Ubuntu Server 14.04.2.LTS')} | Select ImageName
#Write-Host $UbuntuImage[0].ImageName
# Full (GUID-prefixed) name of the image that we want to use
$ImageNameGuid = "b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-14_04_2_LTS-amd64-server-20150309-en-us-30GB"
# If the service does not exist, we'll create one.
$GetCloudService = Get-AzureService -ServiceName $CloudServiceName -Verbose -ErrorAction SilentlyContinue
if (!$GetCloudService )
{
# Service does not exist; note that New-AzureService requires either a location or an affinity group.
$CreateCloudService = New-AzureService -ServiceName $CloudServiceName -AffinityGroup $AffinityGroupName -Label "Created from Windows PowerShell" -Description '16 June 2015'
Write-Host ("Return value from creating the cloud service = {0}" -f $CreateCloudService.OperationStatus.ToString())
}
$NfsPort = 2049      # nfsd
$PortmapPort = 111   # portmapper
$Portcounter = 0
$Counter = 1
# Loop to create the Linux machines.
do
{
# Compose the VM name by appending the counter to the prefix
$LinuxImagename = $LinuxImageNamePrefix + $Counter.ToString()
# Configure the VM by specifying the VM name, instance size, and image name
$VMNew = New-AzureVMConfig -Name $LinuxImagename -InstanceSize $InstanceSize -ImageName $ImageNameGuid
# Add the username and password to the VM configuration and place the VM on the subnet
$VMNew | Add-AzureProvisioningConfig -Linux -LinuxUser $username -Password $password | Set-AzureSubnet $SubnetName
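# A sketch of the NFS port configuration mentioned earlier (2049 for nfsd, 111 for the portmapper).
# Public ports are offset by $Portcounter so they remain unique within the shared cloud service;
# traffic between nodes inside the VNET does not pass through these public endpoints.
Add-AzureEndpoint -Name "NFS" -Protocol tcp -LocalPort $NfsPort -PublicPort ($NfsPort + $Portcounter) -VM $VMNew | Out-Null
Add-AzureEndpoint -Name "Portmap" -Protocol tcp -LocalPort $PortmapPort -PublicPort ($PortmapPort + $Portcounter) -VM $VMNew | Out-Null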
# Create and start the VM. Remember, it won't be fully provisioned when the call returns.
$Result = $VMNew | New-AzureVM -ServiceName $CloudServiceName -AffinityGroup $AffinityGroupName -VNetName $VnetName
Write-Host ("Created VM Image {0}, return value = {1}" -f $LinuxImagename, $Result.OperationStatus )
$Counter++
$Portcounter++;
}
while ($Counter -le 4)
Script - Removing the Linux cluster
# CleanLinuxVirtualMachines
# The subscription that contains the environment we want to remove
$SubscriptionNameId = "<your subscription id>"
# Cloud Service name. The VMs under this service will be removed
$CloudServiceName = "linuxcloudsrv"
# Name prefix of the VM machine name. A numeric counter is appended to this to create the final name
$LinuxImageNamePrefix = "linuxdistro-00"
# Load the publish settings so we can log in to the subscription
Import-AzurePublishSettingsFile "NinadKNew.publishsettings"
Set-AzureSubscription -SubscriptionId $SubscriptionNameId -ErrorAction Stop
Select-AzureSubscription -SubscriptionId $SubscriptionNameId -ErrorAction Stop
$Counter = 1
$AllVmsStopped = 1
$OkToRemoveVM = 0
do
{
$OkToRemoveVM = 0
# Compose the VM name by appending the counter to the prefix
$LinuxImagename = $LinuxImageNamePrefix + $Counter.ToString()
Write-Host ("VM Image name = {0}" -f $LinuxImagename)
$VMPresent = Get-AzureVM -ServiceName $CloudServiceName -Name $LinuxImageName
if ($VMPresent)
{
if ($VMPresent.Status -eq "StoppedVM" -or $VMPresent.Status -eq "StoppedDeallocated" )
{
Write-Host ("{0} VM is either stopped or StoppedDeallocated" -f $LinuxImagename )
$OkToRemoveVM =1
}
else
{
Write-Host ("[Stopping] VM {0}" -f $VMPresent.Name)
$StopVM = Stop-AzureVM -Name $VMPresent.Name -ServiceName $VMPresent.ServiceName -Force
if ($StopVM.OperationStatus.Equals('Succeeded'))
{
$OkToRemoveVM =1
}
else
{
Write-Host ("Not able to stop virtual machine {0}, cloud service will not be removed" -f $VMPresent.Name)
$AllVmsStopped = 0
}
}
if ( $OkToRemoveVM -eq 1)
{
Write-Host ("[Removing] virtual machine {0}" -f $VMPresent.Name)
Remove-AzureVM -Name $VMPresent.Name -ServiceName $VMPresent.ServiceName -ErrorAction Continue
}
}
else
{
Write-Warning ("No VM found {0}" -f $LinuxImageName)
}
$Counter++
}
while ($Counter -le 4)
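# A sketch to complete the cleanup: once every VM has been stopped and removed, the now-empty
# cloud service can be deleted as well (this assumes no other deployments live in the service).
if ($AllVmsStopped -eq 1)
{
    Remove-AzureService -ServiceName $CloudServiceName -Force
}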