
Jean Zay: GPU Slurm partitions
The partitions available
All DARI or Dynamic Access projects with GPU hours have Slurm partitions defined on Jean Zay at their disposal.
Since Dec. 8, 2020, projects with V100 GPU hours have had access by default to a partition which permits using all types of four-GPU accelerated nodes with 160 GB of usable memory (which corresponds to combining the former gpu_p1 and gpu_p3 partitions). The default execution time is 10 minutes and it cannot exceed 100 hours (--time=HH:MM:SS ≤ 100:00:00; see below).
As this partition includes both Nvidia V100 GPUs with 16 GB of memory and Nvidia V100 GPUs with 32 GB of memory, if you wish to be limited to only one type of GPU, you must specify it by adding one of the following Slurm directives to your scripts:
#SBATCH -C v100-16g   # to select nodes having GPUs with 16 GB of memory (i.e. the former gpu_p3 partition)
#SBATCH -C v100-32g   # to select nodes having GPUs with 32 GB of memory (i.e. the former gpu_p1 partition)
If you previously specified one of the gpu_p1 or gpu_p3 partitions explicitly in your submission scripts, you must replace the corresponding Slurm directive #SBATCH --partition=... with one of the two directives above.
Important note: If your job can run on both types of GPUs, we recommend not specifying any constraint (neither -C v100-16g nor -C v100-32g), as this will reduce the waiting time of your jobs before resources become available for execution.
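For reference, here is a minimal sketch of a submission script targeting the default V100 partition with the 32 GB constraint; the job name, Python script and resource amounts are illustrative placeholders to adapt to your own case:

```bash
#!/bin/bash
#SBATCH --job-name=v100_job          # hypothetical job name
#SBATCH -C v100-32g                  # restrict to the 32 GB V100 nodes (omit to accept both GPU types)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:1                 # reserve 1 of the 4 GPUs of the node
#SBATCH --cpus-per-task=10           # 10 of the 40 cores, i.e. 1/4 of the node
#SBATCH --hint=nomultithread
#SBATCH --time=01:00:00              # explicit limit (the default is only 10 minutes)

srun python my_script.py             # my_script.py is a placeholder for your own code
```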
The other partitions are still available (and will remain unchanged unless otherwise advised):
- The gpu_p2 partition is currently only accessible to Artificial Intelligence researchers who have requested V100 GPU hours via Dynamic Access (AD projects). This partition allows launching jobs on the eight-GPU accelerated nodes of Jean Zay, which are equipped with Nvidia V100 GPUs with 32 GB of memory. The default execution time is 10 minutes and it cannot exceed 100 hours (--time=HH:MM:SS ≤ 100:00:00; see below).
- The gpu_p4 partition is accessible to all researchers who have requested V100 GPU hours via Dynamic Access (AD) or Regular Access (DARI projects). It allows launching calculations on the 3 octo-GPU accelerated nodes of Jean Zay which are equipped with Nvidia A100 GPUs connected by a PCI-Express interconnect and having 40 GB of memory per GPU. The default execution time is 2 hours and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below). WARNING: On this gpu_p4 partition, you can only use the PyTorch and TensorFlow environments from versions pytorch-1.8.1 and tensorflow-2.5.0 onwards, because these are compatible with both the Nvidia V100 and Nvidia A100 GPU architectures. Furthermore, as this partition has only 3 nodes (48 physical cores and 8 GPUs per node), it should at first be considered as a test and validation partition for the A100 technology. Please do not use it for heavy production runs.
- The gpu_p5 partition is accessible only to researchers who have requested A100 GPU hours via Dynamic Access (AD) or Regular Access (DARI projects). It allows launching calculations on the 52 octo-GPU accelerated nodes of Jean Zay which are equipped with Nvidia A100 GPUs connected by an SXM4 interconnect and having 80 GB of memory per GPU. The default execution time is 10 minutes and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below; note that this means the qos_gpu-t4 QoS cannot be used with this partition). To use this partition, you must specify the Slurm directive #SBATCH -C a100 in your scripts. Warning: These nodes include AMD EPYC 7543 Milan processors (64 cores per node), unlike the other nodes which feature Intel processors. You must therefore first load the cpuarch/amd module (module load cpuarch/amd) to have access to the modules compatible with this partition and to recompile your codes. A minimal submission sketch is given after this list.
- The prepost partition allows launching a job on one of the Jean Zay pre-/post-processing nodes (jean-zay-pp): these calculations are not deducted from your allocation. The default execution time is 2 hours and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below).
- The visu partition allows launching a job on one of the Jean Zay visualization nodes (jean-zay-visu): these calculations are not deducted from your allocation. The default execution time is 10 minutes and it cannot exceed 4 hours (--time=HH:MM:SS ≤ 4:00:00; see below).
- The archive partition is dedicated to data management (copying or moving files, creating archive files): the corresponding hours are not deducted from your allocation. The default execution time is 2 hours and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below).
- The compil partition is dedicated to library and binary compilations which cannot be done on the front ends because they require too much CPU time: the corresponding hours are not deducted from your allocation. The default execution time is 2 hours and it cannot exceed 20 hours (--time=HH:MM:SS ≤ 20:00:00; see below).
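As referenced in the gpu_p5 item above, here is a minimal sketch of a submission script for the A100 partition, assuming a single full node; the job name, module version and Python script are hypothetical placeholders to adapt to your case:

```bash
#!/bin/bash
#SBATCH --job-name=a100_job            # hypothetical job name
#SBATCH -C a100                        # target the gpu_p5 octo-GPU A100 nodes
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --gres=gpu:8                   # reserve the 8 A100 GPUs of the node
#SBATCH --cpus-per-task=8              # 64 AMD cores / 8 GPUs
#SBATCH --hint=nomultithread
#SBATCH --time=10:00:00                # must stay <= 20:00:00 on this partition

module purge
module load cpuarch/amd                # required before loading modules compatible with the AMD nodes
module load pytorch-gpu/py3/1.11.0     # hypothetical module/version; adjust to an installed one

srun python my_training.py             # placeholder for your own code
```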
Summary table about accessing GPU compute partitions:

| Node type desired (CPU) | Node type desired (GPU) | Corresponding Slurm option |
|---|---|---|
| 40 CPUs + usable RAM 160 GB | 4 V100 GPUs + RAM 16 or 32 GB | default (no option) |
| 40 CPUs + usable RAM 160 GB | 4 V100 GPUs + RAM 16 GB | -C v100-16g |
| 40 CPUs + usable RAM 160 GB | 4 V100 GPUs + RAM 32 GB | -C v100-32g |
| 24 CPUs + usable RAM 360 or 720 GB | 8 V100 GPUs + RAM 32 GB | --partition=gpu_p2 |
| 24 CPUs + usable RAM 360 GB | 8 V100 GPUs + RAM 32 GB | --partition=gpu_p2s |
| 24 CPUs + usable RAM 720 GB | 8 V100 GPUs + RAM 32 GB | --partition=gpu_p2l |
| 48 CPUs + usable RAM 720 GB | 8 A100 GPUs + RAM 40 GB | --partition=gpu_p4 |
| 64 CPUs + usable RAM 468 GB | 8 A100 GPUs + RAM 80 GB | -C a100 |
Important: Be careful, the default time limits of the partitions are intentionally low. For a long execution, you should specify a time limit, which must remain below the maximum time authorised for the partition and the Quality of Service (QoS) used. To specify the time limit, use either:
- The Slurm directive #SBATCH --time=HH:MM:SS in your job, or
- The --time=HH:MM:SS option of the sbatch, salloc or srun commands.
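For example, a sketch of both forms (the script name my_job.slurm is a placeholder):

```bash
# In the submission script:
#SBATCH --time=10:00:00              # 10 h, below the 20 h limit of the default qos_gpu-t3 QoS

# Or directly on the command line:
sbatch --time=10:00:00 my_job.slurm  # my_job.slurm is a placeholder name
```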
The default GPU partition does not need to be specified: it is used by all jobs requiring GPUs. The other partitions, however, must be explicitly specified to be used. For example, to specify the prepost partition, you can use either:
- The Slurm directive #SBATCH --partition=prepost in your job, or
- The --partition=prepost option of the sbatch, salloc or srun commands.
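For instance, a sketch of both forms for the pre-/post-processing nodes:

```bash
# In a submission script:
#SBATCH --partition=prepost

# Or for an interactive session on a pre-/post-processing node:
salloc --partition=prepost --time=02:00:00
```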
Warning: Since October 11, 2019, any job requiring more than one node runs in exclusive mode: The nodes are not shared. As a result, the full nodes are invoiced, even if some of them are only partially used for the computation.
For example, the reservation of 41 CPU cores (or 1 node + 1 core) on the cpu_p1 partition results in the invoicing of 80 CPU cores (or 2 nodes). In the same way, reserving 5 GPUs (or 1 four-GPU node + 1 GPU) on the default GPU partition results in the invoicing of 8 GPUs (or 2 four-GPU nodes). However, the total memory of the reserved nodes is available in both cases (on the order of 160 usable GB per node).
Available QoS
For each job submitted on a compute partition (other than archive, compil, prepost and visu), you may specify a Quality of Service (QoS). The QoS determines the time and resource limits and the priority of your job.
- The default QoS for all GPU jobs: qos_gpu-t3
  - Maximum duration: 20h00 of elapsed time
  - 512 GPU maximum per job
  - 1024 GPU maximum per user (all projects combined)
  - 1024 GPU maximum per project (all users combined)
- A QoS for longer executions, only available on the V100 partitions, which must be specified to be used (see below): qos_gpu-t4
  - Maximum duration: 100h00 of elapsed time
  - 16 GPU maximum per job
  - 128 GPU maximum per user (all projects combined)
  - 128 GPU maximum per project (all users combined)
  - 512 GPU maximum for the totality of jobs requesting this QoS
- A QoS reserved for short executions carried out within the framework of code development or execution tests, which must be specified to be used (see below): qos_gpu-dev
  - A maximum of 10 jobs (running or pending) simultaneously per user
  - Maximum duration: 2h00 of elapsed time
  - 32 GPU maximum per user (all projects combined)
  - 32 GPU maximum per project (all users combined)
  - 512 GPU maximum for the totality of jobs requesting this QoS
To specify a QoS which is different from the default one, you can either:
- The Slurm directive #SBATCH --qos=qos_gpu-dev (for example) in your job, or
- The --qos=qos_gpu-dev option of the sbatch, salloc or srun commands.
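For instance, a minimal sketch requesting the development QoS (the resource amounts and script name are illustrative):

```bash
#!/bin/bash
#SBATCH --qos=qos_gpu-dev            # short test runs, 2 h of elapsed time maximum
#SBATCH --gres=gpu:1
#SBATCH --time=00:30:00              # must stay <= 2:00:00 with this QoS

srun python my_test.py               # placeholder for your own code
```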
Summary table about GPU QoS limits:

| QoS | Elapsed time limit | Resource limit per job | Resource limit per user (all projects combined) | Resource limit per project (all users combined) | Resource limit per QoS |
|---|---|---|---|---|---|
| qos_gpu-t3 (default) | 20h | 512 GPU | 1024 GPU | 1024 GPU | |
| qos_gpu-t4 (V100) | 100h | 16 GPU | 128 GPU | 128 GPU | 512 GPU |
| qos_gpu-dev | 2h | 32 GPU | 32 GPU, max of 10 jobs (running or pending) simultaneously | 32 GPU | 512 GPU |