Turing:  Execution of a hybrid MPI/OpenMP parallel code in batch

Hybrid MPI/OpenMP and MPI/Pthreads applications may run on the Blue Gene/Q.  The choice of the maximum number of processes per compute node also determines the maximum number of threads per MPI process.
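The relationship between the two limits can be sketched as a small shell calculation. This is only an illustration: it assumes the 64 hardware threads per compute node stated later on this page, the power-of-two process counts from 1 to 64 per node, and a hypothetical helper name (max_threads_per_rank) that is not part of any Blue Gene/Q tool.

```shell
#!/bin/bash
# Illustrative sketch only: max_threads_per_rank is a hypothetical helper.
# Assumption: 64 hardware threads are available per compute node.
hw_threads_per_node=64

# Print the largest thread count per MPI process for a given
# number of MPI processes per compute node (integer division).
max_threads_per_rank() {
    local ranks_per_node=$1
    echo $(( hw_threads_per_node / ranks_per_node ))
}

# Assumed set of process counts per node: powers of two from 1 to 64.
for ranks_per_node in 1 2 4 8 16 32 64; do
    echo "--ranks-per-node $ranks_per_node : at most $(max_threads_per_rank "$ranks_per_node") threads per MPI process"
done
```

For instance, with 8 MPI processes per compute node, each process can use at most 8 threads.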

The following example job script runs a code on 1024 cores (64 compute nodes) with 512 MPI processes and 4 OpenMP threads per MPI process. This gives a total of 2048 threads, with 2 threads per physical core and 8 MPI processes per compute node.  Assuming the script is called job.ll, it is submitted with the command:

    llsubmit job.ll

The submission file contains the following lines:

    # @ job_name = job_hybrid
    # @ job_type = BLUEGENE
    # Job standard output file
    # @ output =  $(job_name).$(jobid)
    # Job standard error file
    # @ error = $(output)
    # Maximum elapsed time request
    # @ wall_clock_limit = 1:00:00
    # Execution block size
    # @ bg_size = 64
    # @ queue

    runjob --ranks-per-node 8 --envs "OMP_NUM_THREADS=4" --np 512 : ./my_code_mpi_omp my_arg1 my_arg2

The environment variable OMP_NUM_THREADS sets the number of OpenMP threads for each MPI process.  The total number of threads on a compute node (threads per MPI process multiplied by MPI processes per node) cannot exceed the number of available hardware threads, which is 64 per compute node.  Setting this variable is not mandatory: if it is omitted, the value is set to the maximum number of threads possible.
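As a rough check of the figures in the example above, the layout can be recomputed from the three values in the job script (bg_size, --ranks-per-node and OMP_NUM_THREADS). The variable names below are illustrative only. The value of 16 cores per compute node is an assumption inferred from the 1024 cores on 64 nodes quoted for this example.

```shell
#!/bin/bash
# Sketch: recompute the layout of the example job and check the per-node limit.
# Variable names are illustrative; they are not read by LoadLeveler or runjob.
bg_size=64            # compute nodes in the execution block (# @ bg_size)
ranks_per_node=8      # value passed to --ranks-per-node
omp_num_threads=4     # value of OMP_NUM_THREADS

cores_per_node=16         # assumption: 1024 cores / 64 nodes in the example
hw_threads_per_node=64    # hardware threads per compute node

np=$(( bg_size * ranks_per_node ))                         # MPI processes
total_cores=$(( bg_size * cores_per_node ))                # physical cores
total_threads=$(( np * omp_num_threads ))                  # OpenMP threads
threads_per_node=$(( ranks_per_node * omp_num_threads ))
threads_per_core=$(( threads_per_node / cores_per_node ))  # integer division

# Refuse layouts that ask for more threads than a node provides.
if [ "$threads_per_node" -gt "$hw_threads_per_node" ]; then
    echo "error: $threads_per_node threads per node exceeds $hw_threads_per_node" >&2
    exit 1
fi

echo "np=$np cores=$total_cores threads=$total_threads threads_per_core=$threads_per_core"
```

With the values of the example, this reproduces the 512 MPI processes, 1024 cores, 2048 threads and 2 threads per core given above.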

As a reminder, it is strongly advised to create multistep jobs to process the sequential parts of your job (pre- or post-processing and file transfers).

Attention: By default, the stack memory of each OpenMP thread is limited to 4 MiB. To change this limit, use the OMP_STACKSIZE environment variable (for example: runjob --envs OMP_STACKSIZE=16M … to reserve 16 MiB per thread).

You can find more information about how to submit batch jobs in the following sections: execution of a parallel MPI code in batch and principal parameters of the runjob command.