Adapp: Running an OpenMP/multithreaded parallel code in batch

Jobs on all of the nodes are managed by the LoadLeveler software. They are distributed among the classes mainly according to the elapsed time, the number of cores and the memory requested. You can consult the overall class structure on Ada and Adapp.

To submit a multithreaded job in batch from Adapp, you must:

  • Create a submission script. Here is an example stored in the file openmp.ll:

    openmp.ll
    # Arbitrary name for the LoadLeveler job
    # @ job_name = OpenMP
    # Standard output file for the job
    # @ output   = $(job_name).$(jobid)
    # Error output file for the job
    # @ error    = $(job_name).$(jobid)
    # Job type
    # @ job_type = serial
    # Number of threads requested (here 4)
    # @ parallel_threads = 4
    # Specific to Adapp
    # @ requirements = (Feature == "prepost")
    # Max. elapsed time, hh:mm:ss (here 30 min)
    # @ wall_clock_limit = 00:30:00
    # @ queue
     
    # To print an echo of each command in the output file
    set -x
    # Temporary directory for execution
    cd $TMPDIR
    # The LOADL_STEP_INITDIR variable is automatically set by
    # LoadLeveler, its value is the directory where the llsubmit command was typed
    cp $LOADL_STEP_INITDIR/a.out .
    # Max. STACK memory (default 4MB) requested (here 16 MB) for the private
    # variables of each thread
    export KMP_STACKSIZE=16m
    # OMP_STACKSIZE is also suitable for the same purposes
    #export OMP_STACKSIZE=16m
    # Execution
    ./a.out
  • Submit this script with the llsubmit command:

    $ llsubmit openmp.ll
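
Because the requirements line is what makes the script specific to Adapp, it can be worth checking for it before submitting. The following is only a minimal sketch: the function name targets_adapp is hypothetical (it is not a LoadLeveler command), and the check simply looks for a "# @ requirements" directive mentioning the "prepost" feature.

```shell
#!/bin/bash
# Hypothetical helper, not part of LoadLeveler.
# Returns success (0) if the submission script given as $1 contains a
# "# @ requirements" directive that selects the "prepost" feature.
targets_adapp() {
    grep -Eq '^[[:space:]]*#[[:space:]]*@[[:space:]]*requirements[[:space:]]*=.*Feature[[:space:]]*==[[:space:]]*"prepost"' "$1"
}

# Possible usage (assumption: run from the directory containing openmp.ll):
#   if targets_adapp openmp.ll; then
#       llsubmit openmp.ll
#   else
#       echo "openmp.ll does not request the prepost feature" >&2
#   fi
```

The pattern tolerates variations in spacing around "#", "@" and "==", but it does not try to parse more complex requirements expressions.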

Comments:

  • Do not forget the keyword # @ requirements = (Feature == "prepost"); otherwise, your computation will run on the "normal" Ada compute nodes.
  • This example assumes that the executable file a.out is located in the submission directory, that is, the directory from which the llsubmit command is entered (LoadLeveler automatically sets the variable LOADL_STEP_INITDIR to this directory).
  • The output file of the computation, OpenMP.job_number (from the output keyword $(job_name).$(jobid)), is also found in the submission directory.
  • It is created as soon as the job execution begins; modifying it while the job is running can disrupt the execution.
  • The keyword parallel_threads indicates the number of reserved cores: one thread per core.
  • By default, 3.5 GB of memory is reserved per core (that is, per thread). This value can be raised to a maximum of 30 GB. The keyword # @ as_limit specifies a per-process limit: if the process spawns 4 threads and needs 30 GB per thread, the memory limit must be set to 4 * 30.0gb, that is, # @ as_limit = 120.0gb.
  • Private OpenMP variables are stored in the STACK zone associated with each thread. Each STACK zone is limited to 4 MB by default. To raise this limit, for example to 16 MB per thread, set the environment variable KMP_STACKSIZE (=16m) or OMP_STACKSIZE (=16m). Note that the value of OMP_STACKSIZE is automatically adjusted to the value of KMP_STACKSIZE when the latter has been set.
  • The keyword # @ environment lets you define environment variables for the LoadLeveler job. Do not use this keyword for certain variables specific to OpenMP or multithreading (such as OMP_NUM_THREADS), because LoadLeveler defines and sets these automatically at the beginning of the job execution.
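
The as_limit arithmetic described in the comments (threads multiplied by memory per thread) can be sketched as a small shell helper. This is only an illustration: the function name as_limit_directive is hypothetical, and it assumes whole numbers of GB.

```shell
#!/bin/bash
# Hypothetical helper: print the "# @ as_limit" directive for one process
# that runs <threads> threads, each needing <gb_per_thread> GB.
# Assumption: both arguments are whole numbers.
as_limit_directive() {
    local threads=$1
    local gb_per_thread=$2
    echo "# @ as_limit = $(( threads * gb_per_thread )).0gb"
}

# Case from the text: 4 threads at 30 GB each.
as_limit_directive 4 30   # prints: # @ as_limit = 120.0gb
```

The printed line would then be placed in the submission script with the other # @ keywords, before # @ queue.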