
Ada: Execution of an OpenMP/multi-thread parallel code in batch
Jobs on all nodes are managed by the LoadLeveler software. They are assigned to classes
mainly according to the elapsed time, the number of cores, and the memory requested.
You can consult the structure of classes on Ada.
To submit a multi-thread job in batch, it is necessary to do the following:
- Create a submission script. Here is an example saved in the file openmp.ll:
$ more openmp.ll
# Arbitrary name of the LoadLeveler job
# @ job_name = OpenMP
# Standard output file of the job
# @ output = $(job_name).$(jobid)
# Output error file of the job
# @ error = $(job_name).$(jobid)
# Type of job
# @ job_type = serial
# Number of threads requested (here 4)
# @ parallel_threads = 4
# Maximum Elapsed time in hh:mm:ss (here 1h30mn)
# @ wall_clock_limit = 1:30:00
# @ queue
# To have the command echoes
set -x
# Temporary job directory
cd $TMPDIR
# The variable LOADL_STEP_INITDIR is automatically set by
# LoadLeveler to the directory in which you type the command llsubmit
cp $LOADL_STEP_INITDIR/a.out .
# The maximum STACK memory (4MB default) used (16 MB here) by
# the private variables of each thread.
export KMP_STACKSIZE=16m
# It is also possible to use OMP_STACKSIZE
#export OMP_STACKSIZE=16M
# Execution
./a.out
- Submit this script (only from Ada) with the command llsubmit:
$ llsubmit openmp.ll
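Before submitting, it can be useful to check that the files used by the example are present in the submission directory. The following is only a minimal sketch and is not part of LoadLeveler: the function name ready_to_submit is hypothetical, and the file names openmp.ll and a.out are those of the example above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a LoadLeveler command):
# succeeds only if the job script is readable and the executable
# a.out, which the job script copies, exists in the given directory.
ready_to_submit() {
    dir=$1
    script=$2
    [ -r "$dir/$script" ] && [ -x "$dir/a.out" ]
}

# Submit from the current directory only when the check passes and the
# llsubmit command can be found (submission is done from Ada).
if ready_to_submit . openmp.ll && command -v llsubmit >/dev/null 2>&1; then
    llsubmit openmp.ll
else
    echo "not submitting: check openmp.ll, a.out and llsubmit" >&2
fi
```

On a machine without llsubmit, the sketch only prints the message and submits nothing.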
Remarks:
- In this example, we assume that the executable file a.out is located in the submission directory, that is, the directory from which the llsubmit command is entered (the variable LOADL_STEP_INITDIR is set automatically by LoadLeveler).
- The job output file OpenMP.job_number is also written to the submission directory. It is created at the start of job execution; editing or modifying this file while the job is running can disrupt it.
- The keyword parallel_threads indicates the number of reserved cores: one thread per core. The name parallel_threads takes on its full meaning in hybrid programs (MPI+OpenMP/Pthreads), for which both the number of MPI processes and the number of threads per MPI process must be specified.
- The default memory reserved per core (or per thread) is 3.5 GB; it can be raised to a maximum of 7 GB. The keyword # @ as_limit specifies a per-process limit. If the process spawns 4 threads and needs 7 GB per thread, the limit must be set to 4 * 7 GB = 28 GB:
# @ as_limit = 28gb
- Private OpenMP variables are stored in the STACK memory associated with each thread, which is limited to 4 MB per thread by default. To exceed this limit, for example to allow 16 MB per thread, use the environment variable KMP_STACKSIZE (=16m) or OMP_STACKSIZE (=16M). Note that OMP_STACKSIZE is automatically set to the same value as KMP_STACKSIZE when the latter has already been set.
- The keyword # @ environment sets environment variables for LoadLeveler. However, it should not be used to set certain OpenMP or multi-threading variables (such as OMP_NUM_THREADS), because LoadLeveler determines and sets these automatically at the start of job execution.
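The as_limit arithmetic from the remarks (a per-process limit equal to the number of threads times the memory per thread) can be sketched in shell. This is an illustration only: as_limit_directive is a hypothetical helper, not a LoadLeveler command, and it assumes whole numbers of GB, so it does not cover the 3.5 GB default.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration): prints the
# as_limit directive for a process running $1 threads that each need
# $2 GB of memory (whole GB only).
as_limit_directive() {
    threads=$1
    gb_per_thread=$2
    # The remarks give 7 GB as the maximum per core; reject larger values.
    if [ "$gb_per_thread" -gt 7 ]; then
        echo "error: at most 7 GB per thread" >&2
        return 1
    fi
    # as_limit is a per-process limit, so multiply by the thread count.
    echo "# @ as_limit = $((threads * gb_per_thread))gb"
}

# The case from the remarks: 4 threads at 7 GB each.
as_limit_directive 4 7
```

For 4 threads and 7 GB per thread this prints the directive given in the remarks, # @ as_limit = 28gb, which can then be copied into the job script header.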