Ada: Class structure - interactive and batch limits

*****************************************************************************

Structure of batch classes on Ada                               July 10, 2015

*****************************************************************************

1) Interactive limits
   ==================

     - Uniprocess - Sequential or multithread (OpenMP/Pthreads):
           Total memory for the entire job: < 3.5 GB
           Duration: 30 minutes (CPU time)
           Maximum number of threads: 4


     - Multiprocess (MPI):
           Memory per MPI process: 3.5 GB
           Duration: 30 minutes (Elapsed time)
           Maximum number of processes: 32 (set via the MP_PROCS variable)

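     As an illustration, an interactive MPI execution might look like the
     following sketch (a.out is a placeholder for your MPI binary; poe is
     the IBM Parallel Environment launcher):

           export MP_PROCS=32     # number of MPI processes (32 maximum)
           poe ./a.out            # launch the MPI binary interactively
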

2) Structure of the batch processing classes
   =========================================

   Access to the uniprocess (sequential) classes:
   ----------------------------------------------
   T(h) ^ ELAPSED
        |
   100h +---------------+-----------------|
        |       t4      |      t4L        |
    20h +---------------+-----------------|
        |       t3      |      t3L        |
    10h +---------------+-----------------|
        |       t2      |      t2L        |
     1h +---------------+-----------------| 
        |       t1      |      t1L        |
      0 +---------------+-----------------+--> Memory
                       3.5GB            20.0GB

Access to these classes is obtained via the keyword:
                         # @ job_type = serial
The default memory is 3.5 GB. You may request up to 20 GB via the
keyword (here, 5 GB):
                         # @ as_limit = 5.0gb
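
As an illustration, a complete sequential job might look like the
following sketch (the job name, output files and my_exe binary are
placeholders):

    # @ job_name  = seq_job
    # @ output    = $(job_name).$(jobid).out
    # @ error     = $(job_name).$(jobid).err
    # @ job_type  = serial
    # @ wall_clock_limit = 00:50:00   # under 1h Elapsed: t1 row above
    # @ as_limit  = 5.0gb             # optional: above the 3.5 GB default
    # @ queue
    ./my_exe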


   Access to the OpenMP or Pthreads (multithread) classes:
   -------------------------------------------------------

   T(h) ^ ELAPSED
        |
   100h +-------------+--------------+--------------|
        |   mt8t4/L   |   mt16t4/L   |   mt32t4/L   |
    20h +-------------+--------------+--------------|
        |   mt8t3/L   |   mt16t3/L   |   mt32t3/L   |
    10h +-------------+--------------+--------------|
        |   mt8t2/L   |   mt16t2/L   |   mt32t2/L   |
     1h +-------------+--------------+--------------|
        |   mt8t1/L   |   mt16t1/L   |   mt32t1/L   |
      0 +-------------+--------------+--------------+--> Number of
                      8             16             32    cores

The number of cores in a job (5 in this example) is specified by
the keywords:            # @ job_type = serial
                         # @ parallel_threads = 5
The OMP_NUM_THREADS variable is automatically set to the
parallel_threads value.

By default, the memory reserved per core is 3.5 GB. The maximum which you
can request is 7.0 GB per core ("Large" memory classes).
You can reduce the memory requested (which can decrease the queuing times of
your jobs) or increase the memory requested if necessary (without exceeding
7.0 GB per core, depending on the case) via the keyword:
                         # @ as_limit
Attention: as_limit indicates a value PER PROCESS. However, an
OpenMP/pthreads job consists of a single process whose memory usage
covers ALL of its threads.
Therefore, if your binary generates 5 threads and you need 7.0 GB per
thread, you must set as_limit to the number of threads multiplied by
7.0 GB:
                         # @ as_limit = 35.0gb
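
Putting this together, a multithreaded job might look like the
following sketch (my_omp_exe is a placeholder binary):

    # @ job_name  = omp_job
    # @ output    = $(job_name).$(jobid).out
    # @ error     = $(job_name).$(jobid).err
    # @ job_type  = serial
    # @ parallel_threads = 5        # OMP_NUM_THREADS is set automatically
    # @ as_limit  = 35.0gb          # 5 threads x 7.0 GB per thread
    # @ wall_clock_limit = 05:00:00
    # @ queue
    ./my_omp_exe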


   Access to the MPI or hybrid classes (MPI+threads):
   --------------------------------------------------

   T(h) ^ ELAPSED
        |
   100h +--------+---------+---------+
        | c8t4/L | c16t4/L | c32t4/L | 
    20h +--------+---------+---------+---------+-- ......... --+---------+
        | c8t3/L | c16t3/L | c32t3/L | c64t3/L |   c.......t3  | c2048t3 |
    10h +--------+---------+---------+---------+-- ......... --+---------|
        | c8t2/L | c16t2/L | c32t2/L | c64t2/L |   c.......t2  | c2048t2 |
     1h +--------+---------+---------+---------+-- ......... --+---------|
        | c8t1/L | c16t1/L | c32t1/L | c64t1/L |   c.......t1  | c2048t1 |
      0 +--------+---------+---------+---------+-- ......... --+---------+-->
                 8        16        32        64,128,256,512,1024      2048
                                                       Number of cores

The number of MPI processes in a job (8 here) is specified by
the keywords:            # @ job_type = parallel
                         # @ total_tasks = 8
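
For example, a "pure" MPI job might look like the following sketch
(my_mpi_exe is a placeholder binary; poe is the IBM Parallel
Environment launcher):

    # @ job_name  = mpi_job
    # @ output    = $(job_name).$(jobid).out
    # @ error     = $(job_name).$(jobid).err
    # @ job_type  = parallel
    # @ total_tasks = 8             # 8 MPI processes
    # @ wall_clock_limit = 00:30:00
    # @ queue
    poe ./my_mpi_exe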

For a hybrid job, the number of OpenMP/pthreads threads per MPI
process cannot exceed 32. It is set (here to 4) by the keyword:
                         # @ parallel_threads = 4
The number of cores requested is, therefore, total_tasks *
parallel_threads: with the above values, 8 x 4 = 32 cores are reserved.

ATTENTION: Jobs that require more than 32 cores
(total_tasks * parallel_threads > 32) run on DEDICATED nodes. In this
case, the number of cores reserved and invoiced is rounded up to a
multiple of 32. For example, if you request 65 cores, you will be
invoiced for 96 cores (3 x 32).

The default memory is 3.5 GB per core. This is also the maximum
available for a job with more than 64 cores. For 64 or fewer cores,
the maximum is 7.0 GB per core (the "Large" memory classes).
You can reduce the memory requested (which can decrease the queuing times of
your jobs) or increase the memory requested if necessary (without exceeding
3.5 GB or 7.0 GB per core, depending on the case) via the keyword:
                         # @ as_limit
Attention: as_limit indicates a value PER MPI PROCESS.
For "pure" MPI jobs, the memory per MPI process is therefore equal to
the memory per core. For example, to request 3.0 GB per MPI process,
you should specify:      # @ as_limit = 3.0gb
For hybrid jobs (MPI+threads), however, the memory per MPI process is
the memory per thread multiplied by the requested number of threads.
For example, if your binary generates 5 threads per MPI process and
you need 7.0 GB per thread, you should specify:
                         # @ as_limit = 35.0gb
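
A complete hybrid job might therefore look like the following sketch
(my_hybrid_exe is a placeholder binary):

    # @ job_name  = hybrid_job
    # @ output    = $(job_name).$(jobid).out
    # @ error     = $(job_name).$(jobid).err
    # @ job_type  = parallel
    # @ total_tasks = 8             # 8 MPI processes
    # @ parallel_threads = 4        # 4 threads each: 8 x 4 = 32 cores
    # @ as_limit  = 14.0gb          # per MPI process: 4 x 3.5 GB
    # @ wall_clock_limit = 15:00:00
    # @ queue
    poe ./my_hybrid_exe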

There are only 28 nodes on which "Large" jobs (more than 3.5 GB of
memory per core) can run, compared to the 304 nodes available for
regular jobs. Longer waiting times are therefore to be expected for
the "Large" batch classes.


3) General Comments
   ================

All the class time limits are expressed in Elapsed time (wall-clock
time); the keyword is:    # @ wall_clock_limit = hh:mm:ss
The time you are invoiced for is the Elapsed time consumed multiplied
by the number of reserved cores. This Elapsed time can vary with the
load on the nodes (I/O or message passing); consequently, you should
request more time than you expect to need, as a safety margin.
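
For example, for a run expected to last about 8 hours, a sketch of a
reasonable request would be:
                         # @ wall_clock_limit = 10:00:00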

INTERACTIVE parallel executions compete with batch parallel jobs for
the same resources. When the requested resources are not available,
the request is rejected with an error message.

Compiling certain codes can exceed the interactive Elapsed time
limit. A dedicated class (compil) exists for these compilations, with
a maximum Elapsed time of 20 hours. The keyword is:
                         # @ class = compil
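
A compilation job might look like the following sketch (make is a
placeholder for your build command):

    # @ job_name  = build
    # @ job_type  = serial
    # @ class     = compil          # dedicated compilation class (20h max)
    # @ wall_clock_limit = 02:00:00
    # @ queue
    make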

To debug your codes, you can reduce the waiting time of your jobs by using
the keyword:             # @ class = debug
Attention: With this keyword, you cannot request more than 900 s
(15 minutes) or more than 64 cores.
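
For example, a sketch of a debugging run (my_exe is a placeholder
binary):

    # @ job_type  = parallel
    # @ total_tasks = 32            # no more than 64 cores in debug
    # @ class     = debug
    # @ wall_clock_limit = 00:15:00 # no more than 900 s
    # @ queue
    poe ./my_exe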

To use the pre/post-processing nodes, the keyword is:
                         # @ requirements = (Feature == "prepost")
The maximum Elapsed time is 20h. You cannot reserve more than 32
cores (at most 1 compute node). You can request up to 100 GB of
memory for a sequential execution, or 30 GB per reserved core in
parallel (MPI, OpenMP or hybrid).
Attention: The pre/post-processing nodes must be used exclusively for
pre/post-processing tasks, for example:
- jobs performing large amounts of I/O (file recombination, ...)
- jobs requiring large amounts of memory (mesh generation/partitioning, ...)
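
A pre/post-processing job might look like the following sketch
(my_post_exe is a placeholder binary):

    # @ job_name  = postproc
    # @ job_type  = serial
    # @ requirements = (Feature == "prepost")
    # @ as_limit  = 100.0gb         # up to 100 GB for a sequential run
    # @ wall_clock_limit = 05:00:00
    # @ queue
    ./my_post_exe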

Jobs lasting between 20 and 100 hours (t4 classes) are not refunded
if a problem occurs on the compute nodes: we cannot guarantee the
stability of the hardware over a job duration of nearly 5 days. Jobs
this long should implement periodic restart (checkpoint) points.


4) Bonus jobs
   ==========

It is possible to run so-called "bonus jobs" by adding the following
keyword (before # @ queue):
                         # @ account_no = bonus
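
For example, a sketch of a bonus job header (my_mpi_exe is a
placeholder binary):

    # @ job_type  = parallel
    # @ total_tasks = 64
    # @ wall_clock_limit = 10:00:00
    # @ account_no = bonus          # must appear before # @ queue
    # @ queue
    poe ./my_mpi_exe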

Bonus jobs are not counted in the DARI hours allocation and
accounting. However, they are only executed during periods of low
machine load, via specific classes for parallel bonus jobs. Bonus
jobs are limited to a maximum of 512 cores and a maximum Elapsed time
of 20 hours.
For more information about bonus jobs, please see our website:
www.idris.fr/eng/ > Bonus jobs.


To see the French version, type: news class
****************************************************************************