Jean Zay: GPU Slurm partitions

The partitions available

All DARI and Dynamic Access projects with GPU hours have Slurm partitions defined on Jean Zay available to them.

Since Dec. 8, 2020, projects with V100 GPU hours have access by default to a new partition which permits using all types of four-GPU accelerated nodes with 160 GB of memory (which corresponds to combining the old gpu_p1 and gpu_p3 partitions). The execution time by default is 10 minutes and it cannot exceed 100 hours (--time=HH:MM:SS ≤ 100:00:00; see below).

This new partition includes both the Nvidia V100 GPUs with 16 GB of memory and the Nvidia V100 GPUs with 32 GB of memory. If you wish to be limited to only one type of GPU, you must specify this by adding one of the following SLURM directives to your scripts:

  #SBATCH -C v100-16g   # to select nodes having V100 GPUs with 16 GB of memory
  #SBATCH -C v100-32g   # to select nodes having V100 GPUs with 32 GB of memory

If you previously specified the gpu_p1 or gpu_p3 partition explicitly in your submission scripts, you must replace the corresponding SLURM directive #SBATCH --partition=... with one of the two directives above.

Important note: If your job can run on both types of GPUs, we recommend not specifying any constraint (neither -C v100-16g nor -C v100-32g), as this reduces the waiting time of your jobs before the resources become available for execution.
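
For illustration, a single-node submission script restricted to the 32 GB V100 nodes could begin as in the following sketch; the job name, output file, resource counts and executable name are only examples to adapt to your own code:

  #!/bin/bash
  #SBATCH --job-name=v100_job        # illustrative job name
  #SBATCH --output=v100_job_%j.out   # illustrative output file
  #SBATCH --nodes=1                  # one four-GPU node
  #SBATCH --ntasks-per-node=4        # here, one task per GPU
  #SBATCH --gres=gpu:4               # request the 4 GPUs of the node
  #SBATCH --cpus-per-task=10         # 40 cores / 4 tasks
  #SBATCH --time=01:00:00            # adjust to your needs (maximum depends on the QoS, see below)
  #SBATCH -C v100-32g                # restrict to the 32 GB V100 nodes; omit to accept either type

  srun ./my_gpu_executable           # illustrative executable name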

Other partitions are available:

  • The gpu_p2 partition is accessible to all researchers. This partition allows launching jobs on the eight-GPU accelerated nodes of Jean Zay. These nodes are equipped with Nvidia V100 GPUs with 32 GB of memory. The execution time by default is 10 minutes and it cannot exceed 100 hours (--time=HH:MM:SS ≤ 100:00:00; see below).
    • The gpu_p2s subpartition gives access to the eight-GPU nodes with 360 GB of memory.
    • The gpu_p2l subpartition gives access to the eight-GPU nodes with 720 GB of memory.
  • The gpu_p5 partition is accessible only to researchers who have requested A100 GPU hours via Dynamic Access (AD) or Regular Access (DARI projects). It allows calculations to be launched on the 52 octo-GPU accelerated nodes of Jean Zay, which are equipped with Nvidia A100 GPUs interconnected via SXM4 and having 80 GB of memory per GPU. The execution time by default is 10 minutes and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below; note that this means the qos_gpu-t4 QoS cannot be used with this partition). To use this partition, you must specify the SLURM directive #SBATCH -C a100 in your scripts (see the example script after this list).
    Warning: These nodes include AMD EPYC 7543 Milan processors (64 cores per node), unlike the other nodes which feature Intel processors. You must therefore first load the cpuarch/amd module (module load cpuarch/amd) to have access to the modules compatible with this partition, and recompile your codes for it.
  • The prepost partition allows launching a job on one of the Jean Zay pre-/post-processing nodes, jean-zay-pp: These calculations are not deducted from your allocation. The execution time by default is 2 hours and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below).
  • The visu partition allows launching a job on one of the Jean Zay visualization nodes, jean-zay-visu: These calculations are not deducted from your allocation. The execution time by default is 10 minutes and it cannot exceed 4 hours (--time=HH:MM:SS ≤ 4:00:00; see below).
  • The archive partition is dedicated to data management (copying or moving files, creating archive files): Corresponding hours are not deducted from your allocation. The execution time by default is 2 hours and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00, see below).
  • The compil partition is dedicated to library and binary compilations which cannot be done on the front end because they require too much CPU time: Corresponding hours are not deducted from your allocation. The execution time by default is 2 hours and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below).
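
As a sketch of the gpu_p5 case described above, an A100 submission script could look like the following; the job name, resource counts and executable name are illustrative, and the modules to load after cpuarch/amd depend on your own environment:

  #!/bin/bash
  #SBATCH --job-name=a100_job        # illustrative job name
  #SBATCH -C a100                    # target the A100 octo-GPU nodes (gpu_p5)
  #SBATCH --nodes=1                  # one eight-GPU node
  #SBATCH --ntasks-per-node=8        # here, one task per GPU
  #SBATCH --gres=gpu:8               # request the 8 GPUs of the node
  #SBATCH --cpus-per-task=8          # 64 cores / 8 tasks
  #SBATCH --time=02:00:00            # adjust to your needs (cannot exceed 20h on this partition)

  module purge
  module load cpuarch/amd            # required to access the modules compatible with gpu_p5
  # module load ...                  # then load the modules needed by your code

  srun ./my_gpu_executable           # illustrative executable name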

Summary table about accessing GPU compute partitions
Node type desired                                                   Corresponding Slurm option
CPU                                  GPU
40 CPUs + usable RAM 160 GB          4 V100 GPUs + RAM 16 or 32 GB  default (no option)
40 CPUs + usable RAM 160 GB          4 V100 GPUs + RAM 16 GB        -C v100-16g
40 CPUs + usable RAM 160 GB          4 V100 GPUs + RAM 32 GB        -C v100-32g
24 CPUs + usable RAM 360 or 720 GB   8 V100 GPUs + RAM 32 GB        --partition=gpu_p2
24 CPUs + usable RAM 360 GB          8 V100 GPUs + RAM 32 GB        --partition=gpu_p2s
24 CPUs + usable RAM 720 GB          8 V100 GPUs + RAM 32 GB        --partition=gpu_p2l
64 CPUs + usable RAM 468 GB          8 A100 GPUs + RAM 80 GB        -C a100

Important: Be careful about the default time limits of the partitions, which are intentionally low. For a long execution, you should specify a time limit, which must remain below the maximum time authorized for the partition and the Quality of Service (QoS) used. To specify the time limit, you must use either:

  • The Slurm directive #SBATCH --time=HH:MM:SS in your job, or
  • The option --time=HH:MM:SS of the sbatch, salloc or srun commands.

Jobs requiring GPUs use the default GPU partition without having to request it. The other partitions, however, must be specified explicitly in order to be used. For example, to specify the prepost partition, you can use either:

  • The Slurm directive #SBATCH --partition=prepost in your job, or
  • The option --partition=prepost of the sbatch, salloc or srun commands.
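
For example, a minimal sketch of a pre-/post-processing job combining an explicit partition and an explicit time limit (the job name, time limit and command are purely illustrative) is:

  #!/bin/bash
  #SBATCH --job-name=postproc        # illustrative job name
  #SBATCH --partition=prepost        # run on a pre-/post-processing node
  #SBATCH --time=05:00:00            # explicit limit, below the 20h maximum of prepost

  ./my_postprocessing_script.sh      # illustrative command

The same options can equivalently be passed on the command line, for example: sbatch --partition=prepost --time=05:00:00 my_job.slurm.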

Warning: Since October 11, 2019, any job requiring more than one node runs in exclusive mode: the nodes are not shared. As a result, the reserved nodes are invoiced in their entirety, even if some of them are only partially used by the computation.
For example, reserving 41 CPU cores (i.e. 1 node + 1 core) on the cpu_p1 partition results in the invoicing of 80 CPU cores (i.e. 2 nodes). Similarly, reserving 5 GPUs (i.e. 1 four-GPU node + 1 GPU) on the default GPU partition results in the invoicing of 8 GPUs (i.e. 2 four-GPU nodes). However, the total memory of the reserved nodes is available in both cases (on the order of 160 GB of usable memory per node).

Available QoS

For each job submitted on a compute partition (other than archive, compil, prepost and visu), you may specify a Quality of Service (QoS). The QoS determines the time and resource limits of your job, as well as its priority.

  • The default QoS for all the GPU jobs: qos_gpu-t3
    • Maximum duration: 20h00 of elapsed time
    • 512 GPU maximum per job
    • 512 GPU maximum per user (all projects combined)
    • 512 GPU maximum per project (all users combined)
  • A QoS for longer executions, available only on the V100 partitions, which must be explicitly requested (see below): qos_gpu-t4
    • Maximum duration: 100h00 of elapsed time
    • 16 GPU maximum per job
    • 96 GPU maximum per user (all projects combined)
    • 96 GPU maximum per project (all users combined)
    • 256 GPU maximum for the totality of jobs requesting this QoS.
  • A QoS reserved for short executions carried out within the framework of code development or execution tests, which must be explicitly requested (see below): qos_gpu-dev
    • A maximum of 10 jobs (running or pending) simultaneously per user
    • Maximum duration: 2h00 of elapsed time
    • 32 GPU maximum per job
    • 32 GPU maximum per user (all projects combined)
    • 32 GPU maximum per project (all users combined)
    • 512 GPU maximum for the totality of jobs requesting this QoS.

To specify a QoS which is different from the default one, you can either:

  • Use the Slurm directive #SBATCH --qos=qos_gpu-dev (for example) in your job, or
  • Specify the --qos=qos_gpu-dev option of the sbatch, salloc or srun commands.
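
For example, a long V100 execution could combine the qos_gpu-t4 QoS with an extended time limit, as in the following sketch (the job name, time limit, resource counts and executable name are illustrative):

  #!/bin/bash
  #SBATCH --job-name=long_v100_job   # illustrative job name
  #SBATCH --qos=qos_gpu-t4           # QoS allowing up to 100h on the V100 partitions
  #SBATCH --time=72:00:00            # must remain below the 100h limit of qos_gpu-t4
  #SBATCH --nodes=1                  # one four-GPU node of the default partition
  #SBATCH --gres=gpu:4               # request its 4 GPUs

  srun ./my_gpu_executable           # illustrative executable name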

Summary table about GPU QoS limits
QoS                    Elapsed time limit   Resource limits
                                            per job    per user (all            per project (all      per QoS
                                                       projects combined)       users combined)
qos_gpu-t3 (default)   20h                  512 GPU    512 GPU                  512 GPU
qos_gpu-t4 (V100)      100h                 16 GPU     96 GPU                   96 GPU                256 GPU
qos_gpu-dev            2h                   32 GPU     32 GPU, and a max of     32 GPU                512 GPU
                                                       10 jobs (running or
                                                       pending) simultaneously