Jean Zay: HPE SGI 8600 supercomputer

[Image: jean-zay-annonce-01.jpg, copyright Photothèque CNRS/Cyril Frésillon]

Jean Zay is an HPE SGI 8600 computer composed of two partitions: a scalar partition of CPU-only nodes, and an accelerated partition of hybrid nodes equipped with both CPUs and GPUs. All the compute nodes are interconnected by an Intel Omni-Path (OPA) network and access a parallel file system with very high bandwidth.

For more information, please refer to our documentation concerning the usage of Jean Zay resources.

Hardware description

Scalar partition (or CPU partition)

  • 1528 scalar compute nodes with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
    • 192 GB of memory per node
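
As a rough illustration of what these figures imply, the short sketch below derives the total core count of the scalar partition and the memory available per core. The constant names are hypothetical, and the conversion to terabytes assumes decimal units (1 TB = 1000 GB), which the source does not specify.

```python
# Figures copied from the scalar partition description above.
SCALAR_NODES = 1528
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 20
MEMORY_PER_NODE_GB = 192

cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
total_cores = SCALAR_NODES * cores_per_node

# Average share of node memory per core if all cores are used.
memory_per_core_gb = MEMORY_PER_NODE_GB / cores_per_node

# Assumption: decimal terabytes (1 TB = 1000 GB).
total_memory_tb = SCALAR_NODES * MEMORY_PER_NODE_GB / 1000

print(f"cores per node : {cores_per_node}")
print(f"total cores    : {total_cores}")
print(f"memory per core: {memory_per_core_gb:.1f} GB")
print(f"total memory   : {total_memory_tb:.1f} TB")
```

The memory-per-core value is only an average over a fully occupied node; a job that uses fewer cores per node can use more memory per core.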

Accelerated partition (or GPU partition)

  • 261 four-GPU accelerated compute nodes with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
    • 192 GB of memory per node
    • 4 Nvidia Tesla V100 SXM2 GPUs (32 GB)
  • 31 eight-GPU accelerated compute nodes, currently dedicated to the AI community with:
    • 2 Intel Cascade Lake 6226 processors (12 cores at 2.7 GHz), namely 24 cores per node
    • 20 nodes with 384 GB of memory and 11 nodes with 768 GB of memory
    • 8 Nvidia Tesla V100 SXM2 GPUs (32 GB)
  • 351 four-GPU accelerated compute nodes, added in the summer 2020 extension, with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
    • 192 GB of memory per node
    • 4 Nvidia Tesla V100 SXM2 GPUs (16 GB)
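
The node counts in this list can be cross-checked against the total of 2696 V100 GPUs quoted under "Additional characteristics". The sketch below is one way to do that tally; the `GpuNodeGroup` class and the function names are hypothetical and serve only this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuNodeGroup:
    """One homogeneous group of accelerated nodes (hypothetical helper)."""
    label: str
    nodes: int
    gpus_per_node: int
    gpu_memory_gb: int


# Figures copied from the accelerated partition description above.
GROUPS = [
    GpuNodeGroup("four-GPU nodes, 32 GB V100", 261, 4, 32),
    GpuNodeGroup("eight-GPU nodes, 32 GB V100", 31, 8, 32),
    GpuNodeGroup("summer 2020 extension, 16 GB V100", 351, 4, 16),
]


def total_gpus(groups):
    """Sum the GPU count over all node groups."""
    return sum(g.nodes * g.gpus_per_node for g in groups)


def total_gpu_memory_gb(groups):
    """Sum the on-board GPU memory over all node groups, in GB."""
    return sum(g.nodes * g.gpus_per_node * g.gpu_memory_gb for g in groups)


for g in GROUPS:
    print(f"{g.label}: {g.nodes * g.gpus_per_node} GPUs")
print(f"total GPUs      : {total_gpus(GROUPS)}")
print(f"total GPU memory: {total_gpu_memory_gb(GROUPS)} GB")
```

The tally covers only the accelerated partition; the single GPUs in the pre-/post-processing and visualization nodes are not part of it.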

Pre-/post-processing

  • 4 pre- and post-processing large memory nodes with:
    • 4 Intel Skylake 6132 processors (12 cores at 3.2 GHz), namely 48 cores per node
    • 3 TB of memory per node
    • 1 Nvidia Tesla V100 GPU
    • A 1.5 TB internal NVMe disk

Visualization

  • 5 scalar-type visualization nodes with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
    • 192 GB of memory per node
    • 1 Nvidia Quadro P6000 GPU

Additional characteristics

  • Cumulative peak performance of 28 Pflop/s since the summer 2020 extension, with a total of 2696 Nvidia V100 GPUs
  • Intel Omni-Path interconnection network at 100 Gb/s: 1 link per scalar node and 4 links per converged node
  • IBM Spectrum Scale parallel file system (formerly GPFS)
  • Parallel SSD storage with a capacity of 2.2 PB (GridScaler GS18K SSD) since the summer 2020 extension
  • Parallel storage device with a capacity greater than 30 PB
  • 5 front-end nodes with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), namely 40 cores per node
    • 192 GB of memory per node
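
To put the Omni-Path link counts listed above into more familiar units, the sketch below converts the nominal 100 Gb/s link rate into gigabytes per second per node. It assumes 8 bits per byte and ignores encoding and protocol overhead, so the results are upper bounds rather than measured throughput; the function name is hypothetical.

```python
# Nominal Omni-Path link rate quoted above, in gigabits per second.
LINK_RATE_GBIT_S = 100


def node_bandwidth_gbyte_s(links):
    """Nominal aggregate bandwidth of a node in GB/s (no overhead modeled)."""
    return links * LINK_RATE_GBIT_S / 8


scalar_node = node_bandwidth_gbyte_s(1)     # 1 link per scalar node
converged_node = node_bandwidth_gbyte_s(4)  # 4 links per converged node

print(f"scalar node   : {scalar_node} GB/s")
print(f"converged node: {converged_node} GB/s")
```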

Basic software description

Operating environment

  • Red Hat Enterprise Linux version 8.1 (since 06/09/2020)
  • Slurm version 20.02.6 (since 04/27/2021)

Compilers

  • Intel compilers ifort and icc, with the Intel(R) Math Kernel Library
  • PGI compilers pgfortran and pgcc