
Jean Zay: HPE SGI 8600 supercomputer (photo credit: Photothèque CNRS/Cyril Frésillon)
Jean Zay is an HPE SGI 8600 computer composed of two partitions: one containing scalar nodes, and one containing accelerated nodes, which are hybrid nodes equipped with both CPUs and GPUs. All the compute nodes are interconnected by an Intel Omni-Path (OPA) network and access a parallel file system with very high bandwidth.
Since the last extension in 2022, the cumulative peak performance of Jean Zay is 36.85 Pflop/s.
For more information, please refer to our documentation concerning the usage of Jean Zay resources.
Hardware description
Access to the various hardware partitions of the machine depends on the type of submitted job (CPU or GPU) and the requested Slurm partition for its execution (see the details of Slurm CPU partitions and Slurm GPU partitions).
Scalar partition (or CPU partition)
Without specifying a CPU partition or with the cpu_p1 partition, you will have access to the following resources:
- 1528 scalar compute nodes with:
- 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
- 192 GB of memory per node
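A job targeting these scalar nodes can be sketched as the following Slurm submission script; the job name, node count, and executable are placeholders, not values from this documentation:

```shell
#!/bin/bash
#SBATCH --job-name=cpu_job        # placeholder job name
#SBATCH --partition=cpu_p1        # scalar (CPU) partition
#SBATCH --nodes=2                 # example: 2 scalar nodes
#SBATCH --ntasks-per-node=40      # one MPI task per core (40 cores per node)
#SBATCH --time=01:00:00

srun ./my_mpi_program             # placeholder executable
```

The script would be submitted with `sbatch job.slurm`.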
Accelerated partition (or GPU partition)
Without indicating a GPU partition or with the v100-16g or v100-32g constraint, you will have access to the following resources:
- 612 four-GPU accelerated compute nodes with:
- 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
- 192 GB of memory per node
- 351 nodes with 4 Nvidia Tesla V100 SXM2 16GB GPUs (with v100-16g)
- 261 nodes with 4 Nvidia Tesla V100 SXM2 32GB GPUs (with v100-32g)
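To pin a job to one of the two V100 memory sizes, the constraint can be passed to Slurm. A minimal sketch of a full-node request (job name and executable are placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=gpu_job        # placeholder job name
#SBATCH --constraint=v100-32g     # request the 32 GB V100 nodes (or v100-16g)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4       # one task per GPU
#SBATCH --gres=gpu:4              # all 4 GPUs of the node
#SBATCH --cpus-per-task=10        # 40 cores / 4 GPUs
#SBATCH --time=01:00:00

srun ./my_gpu_program             # placeholder executable
```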
With the gpu_p2, gpu_p2s or gpu_p2l partition (partitions dedicated to the AI community), you will have access to the following resources:
- 31 eight-GPU accelerated compute nodes with:
- 2 Intel Cascade Lake 6226 processors (12 cores at 2.7 GHz), namely 24 cores per node
- 20 nodes with 384 GB of memory (with gpu_p2 or gpu_p2s)
- 11 nodes with 768 GB of memory (with gpu_p2 or gpu_p2l)
- 8 Nvidia Tesla V100 SXM2 32 GB GPUs
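Since these nodes carry 8 GPUs but only 24 cores, the cores-per-GPU ratio differs from the default four-GPU nodes. One possible full-node request (job name and executable are placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=gpu_p2_job     # placeholder job name
#SBATCH --partition=gpu_p2        # or gpu_p2s (384 GB) / gpu_p2l (768 GB)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8       # one task per GPU
#SBATCH --gres=gpu:8              # all 8 GPUs of the node
#SBATCH --cpus-per-task=3         # 24 cores / 8 GPUs
#SBATCH --time=01:00:00

srun ./my_gpu_program             # placeholder executable
```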
With the gpu_p4 partition (extension in the summer of 2021), you will have access to the following resources:
- 3 eight-GPU accelerated compute nodes with:
- 2 Intel Cascade Lake 6240R processors (24 cores at 2.4 GHz), namely 48 cores per node
- 768 GB of memory per node
- 8 Nvidia A100 PCIe 40 GB GPUs
With the gpu_p5 partition (extension in June 2022, accessible only with A100 GPU hours), you will have access to the following resources:
- 52 eight-GPU accelerated compute nodes with:
- 2 AMD Milan EPYC 7543 processors (32 cores at 2.80 GHz), namely 64 cores per node
- 512 GB of memory per node
- 8 Nvidia A100 SXM4 80 GB GPUs
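A full-node request on this partition can be sketched as below; the 64 AMD cores give 8 cores per GPU. The job name and executable are placeholders, and the project must hold A100 GPU hours:

```shell
#!/bin/bash
#SBATCH --job-name=a100_job       # placeholder job name
#SBATCH --partition=gpu_p5        # A100 80 GB nodes; requires A100 GPU hours
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8       # one task per GPU
#SBATCH --gres=gpu:8              # all 8 A100 GPUs of the node
#SBATCH --cpus-per-task=8         # 64 cores / 8 GPUs
#SBATCH --time=01:00:00

srun ./my_gpu_program             # placeholder executable
```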
Pre- and post-processing
With the prepost partition, you will have access to the following resources:
- 4 pre- and post-processing large memory nodes with:
- 4 Intel Skylake 6132 processors (12 cores at 3.2 GHz), namely 48 cores per node
- 3 TB of memory per node
- 1 Nvidia Tesla V100 GPU
- A 1.5 TB internal NVMe disk
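These large-memory nodes are typically used interactively; a sketch using standard Slurm options (the resource values are illustrative, not site defaults):

```shell
# Open an interactive shell on a pre/post-processing node
# (4 cores and a 1-hour limit shown as an example)
srun --partition=prepost --ntasks=1 --cpus-per-task=4 \
     --time=01:00:00 --pty bash
```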
Visualization
With the visu partition, you will have access to the following resources:
- 5 scalar-type visualization nodes with:
- 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
- 192 GB of memory per node
- 1 Nvidia Quadro P6000 GPU
Compilation
With the compil partition, you will have access to the following resources:
- 4 pre- and post-processing large memory nodes (see above)
- 3 compilation nodes with:
- 1 Intel(R) Xeon(R) Silver 4114 processor (10 cores at 2.20 GHz), namely 10 cores per node
- 96 GB of memory per node
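The compil partition is intended for building codes rather than running them; a hedged example of launching a build there with standard Slurm options (the build command is a placeholder):

```shell
# Run a build on a compilation node instead of a login node
srun --partition=compil --ntasks=1 --cpus-per-task=4 --time=00:30:00 \
     make -j 4        # placeholder build command
```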
Archiving
With the archive partition, you will have access to the following resources:
- 4 pre- and post-processing large memory nodes (see above)
Additional characteristics
- Cumulative peak performance of 36.85 Pflop/s since the 2022 extension
- 100 Gb/s Omni-Path interconnection network: 1 link per scalar node and 4 links per converged node
- IBM Spectrum Scale parallel file system (formerly GPFS)
- Parallel storage device with a capacity of 2.5 PB of SSD disks (GridScaler GS18K SSD), added with the summer 2020 extension
- Parallel storage device with a capacity greater than 30 PB
- 5 front-end (login) nodes with:
- 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
- 192 GB of memory per node
Basic Software description
Operating environment
- Red Hat version 8.6 (since 22/11/2023)
- Slurm version 23.02.6 (since 24/10/2023)