How to dynamically create MPI processes based on available cores?

Ben Yu 1 Reputation point
2021-06-21T11:36:49.943+00:00

Hi,

I am developing an MPI program that runs on a Microsoft HPC cluster. I want the program to create processes based on the available cores, with exactly one process per core. For example, I first submit the job with 10 cores (which starts 10 processes):

job submit /numcores:10 mpiexec.exe -n 10 MyMpiApp.exe

While the job is running, other jobs finish and more cores become available, say another 20. How can MyMpiApp.exe learn about this and spawn 20 more processes on the newly available cores? The result should be the same as if I had submitted the job with the following command:

job submit /numcores:30 mpiexec.exe -n 30 MyMpiApp.exe

The reasons I would like to run MPI jobs this way are:

  1. I want the MPI job to start as soon as possible: once enough cores are available (say, more than 10), it should start immediately instead of waiting for 30 cores.
  2. The MPI job consumes a lot of memory and CPU, so I want each process to run on exactly one core.
  3. The MPI job is large and needs a lot of computing resources, so I want to use any additional cores that become available to finish it sooner.

As far as I know, MPI_Comm_spawn can create processes on the fly, but there are problems:

  1. How does the MPI master process know that new cores are available, and how many?
  2. When the master process spawns new child processes, how can it effectively manage and distribute work to all child processes (both earlier and newly created)? Processes created by separate MPI_Comm_spawn calls have different communicators. Sending and receiving messages requires a communicator, so with many new processes and communicators, managing them would be a big challenge.

Can anyone please help with this? Thank you very much.

Ben

Azure Virtual Machines

1 answer

  1. Amira Bedhiafi 33,071 Reputation points Volunteer Moderator
    2025-03-20T11:56:34.5433333+00:00

    Hello Ben!

    Thank you for posting in Microsoft Learn.

    In your case, I think you need a mechanism to detect available cores and spawn additional processes accordingly. This is a challenging problem because job schedulers (for example Microsoft HPC Job Scheduler, SLURM, OpenPBS) allocate resources when the job is scheduled, and the size of MPI_COMM_WORLD is fixed once mpiexec starts. Growing a running MPI job is therefore generally not supported out of the box.

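    Your MPI master process needs to periodically check how many cores are available. You can do this by:

    • Querying the HPC Job Scheduler (for example, job list /all /format:list on Windows)
    • Using system commands such as wmic cpu get NumberOfCores
    • Running a Python script inside the job to check resources (the psutil module)

    Note that the last two report the number of cores installed on the local machine, not the number that is currently free, so only a scheduler query can reflect cores released by other jobs.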

    For example, in Python:

    import psutil

    def get_available_cores():
        # Total physical cores on the local machine (not the number currently idle)
        return psutil.cpu_count(logical=False)

    or in PowerShell:

    Get-WMIObject Win32_Processor | Select-Object -ExpandProperty NumberOfCores
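
Whichever probe is used, the master still has to turn the reported core count into a number of processes to spawn. Below is a minimal sketch of that arithmetic, assuming the goal from the question of one process per core. The function name `workers_to_spawn` and the optional `max_total` cap are illustrative and not part of any MPI or HPC Pack API.

```python
def workers_to_spawn(free_cores, running, max_total=None):
    """Return how many extra processes to start, one per free core.

    free_cores: cores reported as idle by whatever probe is in use (assumed input)
    running:    MPI processes already started by this job
    max_total:  optional cap on the total process count (hypothetical setting)
    """
    if free_cores < 0 or running < 0:
        raise ValueError("counts must be non-negative")
    extra = free_cores
    if max_total is not None:
        # Never exceed the cap, and never return a negative number
        extra = min(extra, max(max_total - running, 0))
    return extra


if __name__ == "__main__":
    # Scenario from the question: 10 processes running, 20 cores freed, cap of 30
    print(workers_to_spawn(free_cores=20, running=10, max_total=30))  # prints 20
```

The returned value would be passed as the maxprocs argument (the third parameter) of MPI_Comm_spawn.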
    

    Once the master detects additional available cores, it can spawn new child processes dynamically using MPI_Comm_spawn. Example in C:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);

        int world_rank, world_size;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);

        if (world_rank == 0) {
            // Master process: check available cores and spawn new workers
            int additional_cores = 10; // Placeholder: fetch this value dynamically
            MPI_Comm intercomm;        // Intercommunicator linking the master to the new workers
            char *worker_program = "worker.exe";

            // MPI_COMM_SELF means only rank 0 takes part in the spawn call
            MPI_Comm_spawn(worker_program, MPI_ARGV_NULL, additional_cores,
                           MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm,
                           MPI_ERRCODES_IGNORE);
        }

        printf("Process %d out of %d\n", world_rank, world_size);

        MPI_Finalize();
        return 0;
    }

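    One issue with MPI_Comm_spawn is that newly spawned processes are not part of the parent's MPI_COMM_WORLD. The call returns an intercommunicator that links the parent group to the children. To handle this:

    • Use MPI_Intercomm_merge to combine both sides of the intercommunicator into a new intracommunicator that contains the original and the spawned processes. MPI_COMM_WORLD itself cannot be extended.
    • Use the MPI client/server routines (MPI_Open_port, MPI_Comm_accept, MPI_Comm_connect) to connect independently started MPI jobs.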

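    Example (parent side; the merge is collective, so the spawned workers must make the matching call on the communicator returned by MPI_Comm_get_parent, typically with high = 1):

    MPI_Comm merged_comm;
    // high = 0 orders the parent group first in the merged communicator
    MPI_Intercomm_merge(intercomm, 0, &merged_comm);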
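
If the groups are kept separate instead of merged, the master needs some bookkeeping to treat workers from different spawn calls as one pool. Below is a minimal sketch of that bookkeeping in plain Python with no MPI calls. `WorkerPool`, `add_group`, and the (group, rank) addressing are hypothetical names for illustration. In a real program, each group id would map to the intercommunicator returned by one MPI_Comm_spawn call.

```python
from collections import deque


class WorkerPool:
    """Tracks workers from several spawn calls as one flat pool."""

    def __init__(self):
        self.idle = deque()   # (group_id, rank) pairs ready for work
        self.busy = {}        # (group_id, rank) -> task currently assigned
        self._next_group = 0

    def add_group(self, size):
        """Register `size` workers from one spawn call; returns the group id."""
        group_id = self._next_group
        self._next_group += 1
        for rank in range(size):
            self.idle.append((group_id, rank))
        return group_id

    def assign(self, tasks):
        """Hand queued tasks to idle workers; returns the (worker, task) pairs."""
        sent = []
        while tasks and self.idle:
            worker = self.idle.popleft()
            task = tasks.popleft()
            self.busy[worker] = task
            sent.append((worker, task))
        return sent

    def complete(self, worker):
        """Mark a worker's task as finished and make it idle again."""
        task = self.busy.pop(worker)
        self.idle.append(worker)
        return task


if __name__ == "__main__":
    pool = WorkerPool()
    pool.add_group(2)              # first spawn: 2 workers
    tasks = deque(range(5))        # five units of work
    print(pool.assign(tasks))      # two tasks go out, three stay queued
    pool.add_group(3)              # more cores appear: 3 more workers
    print(pool.assign(tasks))      # the remaining three tasks go out
```

Each (worker, task) pair returned by `assign` would correspond to one send on that group's communicator, and `complete` would be called when the matching result is received.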
    
    
