How to run a virtual MPI cluster on a Windows machine?

kratia 0 Reputation points
2023-03-30T16:02:09.3033333+00:00

Question

I am a total newbie at this topic and was following this video, in which the presenter shows how to set up an MPI cluster on two VMs running Linux. My question is: can't we do something similar on a Windows machine?

I asked around a bit (including ChatGPT) and got a rough idea of how to make this possible. These are the steps I was given -

    • Create a virtual network switch: In Hyper-V Manager, click on "Virtual Switch Manager" in the right pane, and then click "New virtual network switch" in the left pane. Choose the "Internal" network type and give it a name, such as "MPI Network". Click "OK" to create the virtual switch.
    • Create virtual machines: Create virtual machines for each node in your MPI cluster. Make sure to use the MPI network switch created in step 1 for the network adapter of each virtual machine. You can use any operating system you like, as long as you install an MPI implementation that supports it.
    • Install MPI software: Install the same MPI implementation on each virtual machine. For example, you can download and install Microsoft MPI on each machine.
    • Configure the network: In each virtual machine, configure the network settings to use a static IP address on the MPI network. For example, you can set the IP address to "192.168.0.1", "192.168.0.3", etc., with a subnet mask of "255.255.255.0". Make sure to use a unique IP address for each virtual machine.
    • Enable remote desktop: Enable Remote Desktop on each virtual machine, so that you can connect to them from the host machine or from other virtual machines.
    • Configure the firewall: If the Windows Firewall is enabled on any of the virtual machines, make sure to allow incoming connections on the MPI network.
    • Test the network connection: From each virtual machine, ping the other virtual machines on the MPI network to make sure that the network connection is working correctly.
    • Create an MPI hostfile: On one of the virtual machines, create a text file called "hostfile" that lists the IP addresses of all the virtual machines in your cluster, one per line.
    • Run the MPI program: On the virtual machine where you created the hostfile, open a command prompt and navigate to the directory where your MPI program is located. Then use the mpiexec command to run the program with the hostfile.

Results

I followed these steps and at the end my setup looks something like this -

  • A host machine running Windows 11 Pro and a VM running Windows 10.
  • An internal virtual switch attached to my VM.
  • MS-MPI installed on both machines, with the same version, and configured properly (I ran my program on each machine independently to check that everything was set up correctly).
  • I disabled the firewall on both machines and confirmed that they were connected by pinging each other's IP addresses.
  • I created a shared folder on my host machine containing my hostfile (with the IP address of my guest machine) and my MPI executable, and I mounted this shared folder on my guest VM.

Execution

  • On my guest machine I executed the command -
smpd -d 3

and got the following output -

Output-1

  • On my host machine, from the shared folder directory, I executed the following command -
mpiexec -d 3 -machinefile hostfile.txt test_mpi.exe

The hostfile on my host machine contains only the IP address of my guest VM, which is -

192.168.10.2

On running this command I get the following error -

Output-2

  • All the while I am able to reach the guest VM on the internal network by pinging 192.168.10.2.
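A successful ping only shows that ICMP traffic gets through; mpiexec talks to smpd over TCP, so it can be worth testing that separately. The following is a minimal sketch under stated assumptions: the helper name `tcp_reachable` is hypothetical, and port 8677 is assumed to be the port smpd listens on (believed to be the MS-MPI default; substitute the port your smpd instance actually reports).

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    guest_ip = "192.168.10.2"  # guest VM address from the question
    smpd_port = 8677           # ASSUMPTION: smpd's listening port
    print(f"{guest_ip}:{smpd_port} reachable:", tcp_reachable(guest_ip, smpd_port))
```

Run it on the host while `smpd -d 3` is running on the guest. A `False` result despite working pings would point toward a firewall rule or a port mismatch rather than a basic connectivity problem.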

If anybody could point out any missing steps in my setup, or confirm whether something like this is possible at all, it would be a huge help.


1 answer

  1. Limitless Technology 45,126 Reputation points
    2023-03-31T10:20:37.82+00:00

    Hello there,

    Based on the error message, please check that you have the necessary permissions to execute these commands.

    Make sure all the machines you are trying to run the executable on have the same version of MPI. MPICH2 is recommended.

    The hosts file on the manager should contain the local network IP address entries of the manager and all of the worker nodes. On each worker, the hosts file needs the IP address entries of the manager and of that worker node.

    You will need to communicate between the computers, and you don’t want to type in the IP addresses every time. Instead, you can give a name to each node in the network that you wish to communicate with. The hosts file is used by the operating system to map hostnames to IP addresses.

    $ cat /etc/hosts
    127.0.0.1 localhost
    172.50.88.34 worker

    The worker here is the machine you’d like to do your computation on. Likewise, add an entry for the manager in the hosts file on the worker.
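To illustrate how hosts-file entries like the ones above map names to addresses, here is a minimal sketch with a hypothetical `parse_hosts` helper. It parses hosts-format text supplied as a string rather than reading the system file; the format is the same on Windows, where the file normally lives at C:\Windows\System32\drivers\etc\hosts instead of /etc/hosts. The first-entry-wins behavior is a simplification chosen for the example.

```python
def parse_hosts(text):
    """Map each hostname in hosts-format text to its IP address.

    Hypothetical helper for illustration: comments (#...) and blank lines
    are skipped, and the first entry for a name wins.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        ip, *names = line.split()  # first field is the IP, the rest are names
        for name in names:
            mapping.setdefault(name, ip)
    return mapping


if __name__ == "__main__":
    sample = """
    127.0.0.1 localhost
    172.50.88.34 worker
    """
    for name, ip in parse_hosts(sample).items():
        print(f"{name} -> {ip}")
```

With an entry like `172.50.88.34 worker` in place, the name `worker` can be used in the hostfile passed to mpiexec instead of the raw IP address.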

    Hope this resolves your query!

    --If the reply is helpful, please Upvote and Accept it as an answer--
