Problem with MS MPI

Paul Eq 10 Reputation points
2023-04-08T21:19:10.3466667+00:00

Hi everyone, I need help configuring my MPI cluster and executing Python code on its nodes. Could you help me, please?

What I'd like to do:

  • I have 2 computers running Windows 10 (node 1 and node 2).
  • I'd like to create an MPI cluster with these 2 nodes and execute Python code on both of them (computer 1 and computer 2).
  • Node 1 would be the master and node 2 a client.

What I've already done:

  • I've installed Microsoft MPI v10.1.2 on both node 1 and node 2 (msmpisetup.exe and msmpisdk.msi).
  • I disabled the firewall on both nodes.
  • I created a shared folder on my network (hosted on node 1) that contains the Python code, and both nodes have access to it.
  • I use the same Microsoft account on both nodes (the Windows sessions were created from an Outlook email address and have Admin rights).
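
The post does not include the script stored in the shared folder. As a minimal sketch of what such a test script might look like, assuming the mpi4py package is installed on every node (the post does not name the library it uses), each process could report its rank and the total process count. The function name hello_message is hypothetical and only used for illustration:

```python
def hello_message(rank, size):
    """Build the greeting printed by each MPI process."""
    return f"Hello from process {rank} of {size}"


def main():
    try:
        # Assumption: mpi4py is the MPI binding in use; the post does not say.
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed; cannot query the MPI rank")
        return
    comm = MPI.COMM_WORLD
    # Get_rank() is this process's index, Get_size() the total process count.
    print(hello_message(comm.Get_rank(), comm.Get_size()))


if __name__ == "__main__":
    main()
```

Keeping the message formatting in its own function means it can be checked without an MPI runtime.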

What I can do (and that works):

  • I can ping node 2 from node 1 and vice versa.
    Node 1 IP: 192.168.1.21
    Node 2 IP: 192.168.1.58
  • I can use the telnet command to connect from node 1 to node 2 on the MPI ports (8676-8677) and vice versa.
  • On each node, I can locally run a Python script using mpiexec -n 1 python \\STUDENT-LAPTOP\share_mpi\test_mpi.py, which prints Hello from process 0 of 1.
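
The telnet check above can also be scripted. The following is a rough sketch using only Python's standard socket module. The helper name port_is_open is hypothetical, and the address and ports are simply the ones listed above. Note that it only confirms that a TCP connection can be opened, not that authentication with smpd will succeed:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Values taken from the post: node 2's address and the MPI ports.
    for port in (8676, 8677):
        state = "open" if port_is_open("192.168.1.58", port) else "closed or unreachable"
        print(f"192.168.1.58:{port} is {state}")
```

Running it from node 1 against node 2 (and the reverse, with the other IP address) should mirror what telnet reports.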

What the problem is:

I start the smpd service on node 2 (everything is done with Admin rights):

smpd -d 3
[-1:1072] Launching SMPD service.
[-1:1072] smpd listening on port 8677

When I ask node 1 to execute code on node 2 using:

mpiexec -host 192.168.1.58 1 python \\STUDENT-LAPTOP\share_mpi\test_mpi.py

or with:

mpiexec -host 192.168.1.58 1 -p 8677 python \\STUDENT-LAPTOP\share_mpi\test_mpi.py

I get this error:

ERROR: Failed RpcCliCreateContext error 5
Aborting: mpiexec on STUDENT-LAPTOP is unable to connect to the smpd service on 192.168.1.58:8677
Other MPI error, error stack:
connect failed - Access is denied. (errno 5)

The RPC server is running on both nodes (I used sc query RpcSs to check it).
For more details, I ran:

mpiexec -d 3 -host 192.168.1.58 1 -p 8677 python \\STUDENT-LAPTOP\share_mpi\test_mpi.py

And got this error:

[00:9348] host tree:
[00:9348] host: 192.168.1.58, parent: 0, id: 1
[00:9348] mpiexec started smpd manager listening on port 50874
[00:9348] using spn RestrictedKrbHost/192.168.1.58 to contact server
[00:9348] Previous attempt failed with error 5, trying to authenticate without Kerberos
[00:9348] ERROR: Failed RpcCliCreateContext error 5
Aborting: mpiexec on STUDENT-LAPTOP is unable to connect to the smpd service on 192.168.1.58:8677
Other MPI error, error stack:
connect failed - Access is denied. (errno 5)
[00:9348] smpd manager successfully stopped listening.

What can I do? I would really appreciate your help. Thank you in advance.
