Changes and impacts related to the Jean Zay H100 extension

July 1, 2024

IDRIS

Computing center

⚠ INFORMATION
This page was translated by an AI (LLM) with a cursory human check and is awaiting full review.

This post provides an overview of the ongoing operations for the commissioning of the Jean Zay H100 extension. The information provided here will evolve over time, and we invite you to check back regularly.

The extension includes new login nodes and compute nodes, each with:

2 Intel Xeon Platinum 8468 processors (48 cores at 2.10 GHz), i.e. 96 CPU cores
4 Nvidia H100 SXM5 80 GB GPUs
512 GB of memory

This extension also includes new, larger and faster disk spaces using a Lustre file system. Since mid-August, they have replaced the old disk spaces that used an IBM Spectrum Scale file system. Data stored on the old file system (Spectrum Scale) has been copied/moved to the new one (Lustre) by IDRIS, for the HOME, WORK, ALL_CCFRWORK, STORE and ALL_CCFRSTORE disk spaces. However, copying the temporary SCRATCH and ALL_CCFRSCRATCH spaces was the responsibility of each user.

Note that the storage volume (in bytes) of the WORK spaces has been increased on this occasion.

Important changes since 1 October 2024

Change of QoS names for the A100 partition

To better manage resource sharing on the machine, specific QoS have been defined for the A100 partition. If you were explicitly using the QoS “qos_gpu-t3” or “qos_gpu-dev” in your job submissions targeting this partition, you will need to use “qos_gpu_a100-t3” or “qos_gpu_a100-dev” instead. The QoS “qos_gpu_a100-t3” is used by default and can be omitted.
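
If several submission scripts still carry the old names, the renaming can be scripted. The following is only a minimal sketch: the function name `rename_a100_qos` is hypothetical (it is not an IDRIS tool), and it assumes that the scripts passed to it target the A100 partition only.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the old QoS names into the A100-specific ones
# in the given job scripts. Only run it on scripts that target the A100
# partition: the CPU and V100 partitions are not affected by the change.
rename_a100_qos() {
    # A .bak copy of each file is kept next to the original.
    sed -i.bak -E 's/qos_gpu-(t3|dev)/qos_gpu_a100-\1/g' "$@"
}

# Usage (hypothetical file name):
#   rename_a100_qos my_a100_job.slurm
```

The pattern does not match names that have already been converted, so running the helper twice on the same file is harmless.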

The CPU and V100 partitions are not affected by this change.

The documentation has been updated accordingly: http://www.idris.fr/jean-zay/gpu/jean-zay-gpu-exec_partition_slurm.html#les_qos_disponibles.

Usage of QoS via JupyterHub

If you wish to specify a QoS when using the Slurm launcher on JupyterHub, you will now need to specify it manually in the “Extra #SBATCH directives” field.

Change of JupyterHub IP address

The IP address of our JupyterHub instance has been modified. It is now 130.84.132.56. This change may affect you if your organisation applies IP address filtering to outgoing connections. If you encounter difficulties connecting to JupyterHub, we suggest contacting your IT service and informing them of this change.

As a reminder, the range of IP addresses used for IDRIS machines and services is as follows: 130.84.132.0/23. We recommend allowing the entire range rather than specific IP addresses to avoid being affected by future internal changes to our infrastructure.
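
For reference, the /23 range covers the addresses 130.84.132.0 to 130.84.133.255. The sketch below shows one way to check whether a given address falls inside a CIDR block using bash arithmetic only. The function names are hypothetical, and the code assumes well-formed dotted IPv4 input without leading zeros.

```shell
#!/bin/bash
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if address $1 belongs to the CIDR block $2 (e.g. 130.84.132.0/23).
ip_in_cidr() {
    local ip net bits mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Network mask with the first $bits bits set
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# The JupyterHub address given above, checked against the IDRIS range
ip_in_cidr 130.84.132.56 130.84.132.0/23 && echo "address is inside the range"
```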

Opening of the H100 partition

Users who have already obtained H100 hours can now use them. You can refer to the example below:

#!/bin/bash
#SBATCH --job-name=mon_travail       # job name
#SBATCH -A xyz@h100                  # accounting to use, where xyz is your project's three-letter code
#SBATCH -C h100                      # target the H100 nodes
# Here, reservation of 3x24=72 CPUs (for 3 tasks) and 3 GPUs (1 GPU per task) on a single node:
#SBATCH --nodes=1                    # number of nodes
#SBATCH --ntasks-per-node=3          # number of MPI tasks per node (here, = number of GPUs per node)
#SBATCH --gres=gpu:3                 # number of GPUs per node (max 4 on H100 nodes)
# Since only one GPU is reserved per task here (i.e. 1/4 of the GPUs),
# ideally reserve 1/4 of the node's CPUs for each task:
#SBATCH --cpus-per-task=24           # number of CPUs per task (1/4 of the CPUs here)
# /!\ Caution: "multithread" refers to hyperthreading in Slurm terminology
#SBATCH --hint=nomultithread         # hyperthreading disabled

To use the modules compatible with this H100 partition

module purge
module load arch/h100
...

Note that the default modules are not compatible with the H100 partition. To find the software environment specific to this partition, you must load the “arch/h100” module: http://www.idris.fr/jean-zay/cpu/jean-zay-cpu-doc_module.html#modules_compatibles_avec_la_partition_gpu_p6. This must be done in your submission scripts as well as in your terminal if you need to compile codes.

If you do not yet have H100 hours, the project manager can make a request on the eDARI portal if necessary.

Modification of STORE access terms

Since Monday 22 July 2024, the terms of access to the STORE have been modified. Thus, read and write access to the STORE is no longer possible from the compute nodes but remains possible from the login nodes and the nodes of the “prepost”, “visu”, “compil” and “archive” partitions. We invite you to modify your jobs if you access the STORE space directly from the compute nodes. To guide you, examples have been added to the end of our documentation on multi-step jobs.
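
As an illustration of the kind of chaining involved, a compute job can write its results to the WORK, and a second job running on the "archive" partition can then copy them to the STORE. The sketch below is not taken from the IDRIS documentation: the function name and the script names are hypothetical placeholders, and only the standard `sbatch` options `--parsable`, `--dependency` and `--partition` are used.

```shell
#!/bin/bash
# Hypothetical two-step chain: the first script computes and writes to $WORK,
# the second one copies the results to $STORE from the "archive" partition.
submit_chain() {
    local compute_script=$1 archive_script=$2 job_id
    # --parsable makes sbatch print only the job ID
    # (followed by ";cluster" on multi-cluster setups)
    job_id=$(sbatch --parsable "$compute_script") || return 1
    # The archiving step starts only if the compute job ended successfully
    sbatch --dependency=afterok:"${job_id%%;*}" --partition=archive "$archive_script"
}

# Usage (placeholder script names):
#   submit_chain compute.slurm archive.slurm
```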

This change is due to the fact that the volume of the STORE cache (on rotating disks) will be reduced in favour of the volume of the WORK space. We will no longer be able to guarantee the redundancy of STORE data on both magnetic tapes and rotating disks, as was previously the case. The presence of data on rotating disks allows relatively fast read/write access. With the reduction in cache volume, in some cases, data may only be stored on magnetic tape (with two copies on different tapes to ensure data redundancy), which would significantly degrade data access times and consequently the performance of your computations in case of direct access to the STORE.

As a reminder, the STORE is a space dedicated to the long-term storage of archived data.

HOME, WORK and STORE copies

IDRIS has taken care of copying the data stored in the HOME, WORK, ALL_CCFRWORK, STORE and ALL_CCFRSTORE spaces.

ATTENTION: we have made plain copies only, which may cause malfunctions in your executions, especially if you use symbolic links (such as those we recommend for personal Python environments). Such links are no longer valid because the paths of the new directories differ from the old ones.

Special case of HOME

The migration of HOME spaces was completed during the maintenance on 30 July 2024. We invite you to check your scripts to correct any hard-coded paths. Any path of the form /gpfs7kw/linkhome/... should become /linkhome/... or, if possible, be replaced by the use of the environment variable $HOME. If you use symbolic links such as those we recommend for personal Python environments, please recreate them to take into account the new access paths to the new directories.
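
When $HOME cannot be used, the prefix substitution described above can be applied mechanically. This is a minimal sketch under the assumption that a plain textual replacement is appropriate for the files concerned; the function name `fix_home_paths` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: replace the old hard-coded HOME prefix
# /gpfs7kw/linkhome/ with /linkhome/ in the given files.
fix_home_paths() {
    # "#" is used as the sed delimiter because the patterns contain slashes.
    # A .bak copy of each file is kept next to the original.
    sed -i.bak 's#/gpfs7kw/linkhome/#/linkhome/#g' "$@"
}

# Usage (hypothetical file name):
#   fix_home_paths my_job.slurm
```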

Special case of WORK

The migration of WORK spaces was completed on 13 August 2024. The QoS “qos_cpu-t4” and “qos_gpu-t4” allowing the execution of jobs longer than 20 hours are now functional again.

Remark: The absolute path of the WORK spaces has changed with the migration (see the new value of the variable $WORK) but to simplify the transition, links have been put in place so that the old absolute paths remain functional at least for the time being. Now that the migration is complete, we invite you to modify any paths of the form /gpfswork/... or /gpfsdswork/projects/... that may appear in your scripts (if possible by replacing them with the use of the environment variable $WORK) or in your symbolic links to no longer use the old directories.
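
Because the new WORK paths are not a simple prefix substitution of the old ones, it is safer to locate the occurrences first and edit them by hand. A possible sketch, with a hypothetical function name:

```shell
#!/bin/bash
# Hypothetical helper: print file name, line number and line content for
# every occurrence of an old WORK prefix under a directory (default: .).
find_old_work_paths() {
    grep -rnE '/gpfswork/|/gpfsdswork/projects/' "${1:-.}"
}

# Usage (hypothetical directory):
#   find_old_work_paths "$HOME/scripts"
```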

Special case of STORE

The migration of the STORE was finalised on 25 July 2024. The usual variable $STORE now references the new Lustre file system. A variable $OLDSTORE has been created to reference the space on the old Spectrum Scale file system.

# references the STORE on the Lustre file system
$ echo $STORE
/lustre/fsstor/projects/rech/...

# references the old STORE (read-only) on the Spectrum Scale file system
$ echo $OLDSTORE
/gpfsstore/rech/...

ATTENTION: read-only access to the old STORE space via the environment variable $OLDSTORE will be removed at the end of November 2024. The variable $OLDSTORE will then no longer be defined. Note that read and write access to the STORE is not possible from the compute nodes, but only from the login nodes and the nodes of the “prepost”, “visu”, “compil” and “archive” partitions. We invite you to modify your jobs if you access the OLDSTORE space directly from the compute nodes. To guide you, examples of job chains have been added to the end of our documentation on multi-step jobs.

SCRATCH and ALL_CCFRSCRATCH

Since Tuesday 3 September 2024, the environment variable SCRATCH (and its variants such as ALL_CCFRSCRATCH) points to the new Lustre SCRATCH space. The old Spectrum Scale SCRATCH space is no longer accessible.

The environment variable NEWSCRATCH (and its variants such as ALL_CCFRNEWSCRATCH) will eventually disappear, and we therefore encourage you to replace it with SCRATCH as soon as possible.

# references the new SCRATCH on the Lustre file system
$ echo $SCRATCH
/lustre/fsn1/projects/rech/...

Change of symbolic links

Caution: if entries in your HOME directory, such as $HOME/.local or $HOME/.conda, are symbolic links pointing to the old WORK (paths of the form /gpfswork/...), you need to recreate these links so that they point to the new WORK (paths of the form /lustre/fswork/...).

$ cd $HOME
# Here .local is a link to the old WORK
$ ls -al .local
 ...  .local -> /gpfswork/...
# Remove the link
$ unlink .local
# Recreate the link using the variable $WORK, which references the new WORK
$ ln -s $WORK/.local $HOME
# The link now points to the new WORK
$ ls -al .local
 ...  .local -> /lustre/fswork/...
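
To find out which links are concerned before recreating them, the entries located directly in the HOME can be scanned. This is only a sketch: the function name is hypothetical, and it assumes that the `find` and `readlink` commands are available.

```shell
#!/bin/bash
# Hypothetical helper: list the symbolic links located directly in a
# directory (default: $HOME) whose target still starts with /gpfswork/.
list_stale_work_links() {
    local dir=${1:-$HOME} link target
    find "$dir" -maxdepth 1 -type l | while IFS= read -r link; do
        target=$(readlink "$link")
        case $target in
            /gpfswork/*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}

# Usage:
#   list_stale_work_links
```

Each link reported this way can then be removed and recreated with the variable $WORK as shown above.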
