IDRIS: GENERAL INTRODUCTION



1. Introduction to IDRIS

The Missions and Objectives of IDRIS

IDRIS (The Institute for Development and Resources in Intensive Scientific Computing), founded in 1993, is the national centre of the CNRS for very high performance computing (HPC) and for Artificial Intelligence (AI). It serves the scientific research communities which rely on extreme computing, both public and private (on condition of open research with publication of the results).

Both a computing resource centre and a centre of expertise in HPC and AI, IDRIS (www.idris.fr) is a research support unit of the CNRS under the auspices of the CNRS Open Research Data Department (DDOR). It is administratively attached to the Institute for Computing Sciences (INS2I) but has a multidisciplinary vocation within the CNRS. Its operational modalities are similar to those of very large scientific equipment.

IDRIS is currently directed by Pierre-François Lavallée.

Scientific management of Resources

The allocation of computing hours for the three national computing centres (CINES, IDRIS and TGCC) is organized under the coordination of GENCI (Grand Équipement National de Calcul Intensif).

Requests for resources are made through the eDARI portal, the web site common to the three computing centres.

Requests may be submitted for computing hours for new projects, to renew an existing project or to request supplementary hours. These hours are valid for one year.

Depending on the number of requested hours, the file is treated either as a regular access file (accès régulier) or a dynamic access file (accès dynamique):

  • Regular access: requests for computing resources may be submitted at any time during the year but the evaluation of the files is semi-annual, in May and November. The proposals are examined from a scientific perspective by the members of the Thematic Committees, who draw on the technical expertise of the centres' application assistance teams as needed. Subsequently, an evaluation committee meets to decide upon the resource requests and make approval recommendations to the Attribution Committee which, under the authority of GENCI, distributes the computing hours of the three national centres.
  • Dynamic access: requests are validated by the IDRIS director, who evaluates the scientific and technical quality of the proposal and may ask for the advice of a scientific thematic committee expert.

In both cases, IDRIS management studies the requests for supplementary resources as needed (“demandes au fil de l'eau”) and attributes limited hours in order to avoid blocking on-going projects.

For more information, you can consult the web page Requesting resource hours on the IDRIS machine.

You also have access to a short video about resource allocation and account opening on Jean Zay on our YouTube channel "Un œil sur l'IDRIS" (it is in French but the automatic subtitles work quite well).

The IDRIS User Committee

The role of the User Committee is to be a liaison for dialogue with IDRIS so that all the projects which received allocations of computing resources can be conducted in the best possible conditions. The committee transmits the observations of the users regarding the functioning of the centre, and the issues are discussed with IDRIS in order to determine the appropriate changes to be made.

The User Committee consists of 2 members elected from each scientific discipline (24 members in 2014) who can be contacted at the following address: .

The User Committee pages are available to IDRIS users by connecting to the IDRIS Extranet, section: Comité des utilisateurs.

In this space are found the reports on IDRIS machine exploitation as well as the latest meeting minutes.

IDRIS personnel

IDRIS organisation chart (organigramme)

Return to Table of Contents


2. The IDRIS machine

Jean Zay: HPE SGI 8600 supercomputer

Jean Zay is an HPE SGI 8600 computer composed of two partitions: a partition containing scalar nodes, and a partition containing accelerated nodes, which are hybrid nodes equipped with both CPUs and GPUs. All the compute nodes are interconnected by an Intel Omni-Path network (OPA) and access a parallel file system with very high bandwidth.

Following two successive extensions, the cumulated peak performance of Jean Zay reached 36.85 Pflop/s starting in June 2022.

Access to the various hardware partitions of the machine depends on the type of job submitted (CPU or GPU) and the Slurm partition requested for its execution (see the details of the Slurm CPU partitions and the Slurm GPU partitions).
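For illustration, the partition (or, for the default GPU partition, a node-type constraint) is selected through Slurm directives in the job submission script. The sketch below is only indicative: the partition and constraint names are those described in the following subsections, while the other directives are placeholders to adapt.

#!/bin/bash
#SBATCH --job-name=my_job            # hypothetical job name
#SBATCH --partition=gpu_p2           # target the eight-GPU V100 partition described below
##SBATCH -C v100-32g                 # alternative: stay on the default GPU partition and request 32 GB V100 nodes
#SBATCH --ntasks=1                   # illustrative resource request
#SBATCH --gres=gpu:1                 # illustrative GPU request
srun ./my_executable                 # hypothetical executable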

Scalar partition (or CPU partition)

Without specifying a CPU partition, or with the cpu_p1 partition, you will have access to the following resources:

  • 720 scalar compute nodes with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), i.e. 40 cores per node
    • 192 GB of memory per node

Note: Following the decommissioning of 808 CPU nodes on 5 February 2024, this partition went from 1528 nodes to 720 nodes.

Accelerated partition (or GPU partition)

Without indicating a GPU partition, or with the v100-16g or v100-32g constraint, you will have access to the following resources:

  • 396 four-GPU accelerated compute nodes with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), i.e. 40 cores per node
    • 192 GB of memory per node
    • 126 nodes with 4 Nvidia Tesla V100 SXM2 16GB GPUs (with v100-16g)
    • 270 nodes with 4 Nvidia Tesla V100 SXM2 32GB GPUs (with v100-32g)

Note: Following the decommissioning of 220 4-GPU V100 16 GB nodes (v100-16g) on 5 February 2024, this partition went from 616 nodes to 396 nodes.

With the gpu_p2, gpu_p2s or gpu_p2l partitions, you will have access to the following resources:

  • 31 eight-GPU accelerated compute nodes with:
    • 2 Intel Cascade Lake 6226 processors (12 cores at 2.7 GHz), i.e. 24 cores per node
    • 20 nodes with 384 GB of memory (with gpu_p2 or gpu_p2s)
    • 11 nodes with 768 GB of memory (with gpu_p2 or gpu_p2l)
    • 8 Nvidia Tesla V100 SXM2 32 GB GPUs

With the gpu_p5 partition (extension of June 2022 and accessible only with A100 GPU hours), you will have access to the following resources:

  • 52 eight-GPU accelerated compute nodes with:
    • 2 AMD Milan EPYC 7543 processors (32 cores at 2.80 GHz), i.e. 64 cores per node
    • 512 GB of memory per node
    • 8 Nvidia A100 SXM4 80 GB GPUs

Pre- and post-processing

With the prepost partition, you will have access to the following resources:

  • 4 pre- and post-processing large memory nodes with:
    • 4 Intel Skylake 6132 processors (12 cores at 3.2 GHz), i.e. 48 cores per node
    • 3 TB of memory per node
    • 1 Nvidia Tesla V100 GPU
    • An internal NVMe 1.5 TB disk

Visualization

With the visu partition, you will have access to the following resources:

  • 5 scalar-type visualization nodes with:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), i.e. 40 cores per node
    • 192 GB of memory per node
    • 1 Nvidia Quadro P6000 GPU

Compilation

With the compil partition, you will have access to the following resources:

  • 4 pre- and post-processing large memory nodes (see above)
  • 3 compilation nodes with:
    • 1 Intel(R) Xeon(R) Silver 4114 processor (10 cores at 2.20 GHz)
    • 96 GB of memory per node

Archiving

With the archive partition, you will have access to the following resources:

  • 4 pre- and post-processing nodes (see above)

Additional characteristics

  • Cumulated peak performance of 36.85 Pflop/s (until 5 February 2024)
  • Omni-Path interconnection network at 100 Gb/s: 1 link per scalar node and 4 links per converged node
  • IBM Spectrum Scale parallel file system (formerly GPFS)
  • Parallel storage with a capacity of 2.5 PB of SSD disks (GridScaler GS18K SSD) since the summer 2020 extension
  • Parallel storage with disks having a capacity of more than 30 PB
  • 5 front-end nodes:
    • 2 Intel Cascade Lake 6248 processors (20 cores at 2.5 GHz), i.e. 40 cores per node
    • 192 GB of memory per node

Return to Table of Contents


3. Requesting allocations of hours on the IDRIS machine

Requesting computing hours at IDRIS

Requesting resource hours on Jean Zay is done via the eDARI portal www.edari.fr, common to the three national computing centres: CINES, IDRIS and TGCC.

Before requesting any hours, we recommend that you consult the GENCI document (in French) detailing the conditions and eligibility criteria for obtaining computing hours.

For both Artificial Intelligence and HPC usage, you can request computing resources at any time during the year via a single form on the eDARI portal. Your file will be Dynamic Access or Regular Access, depending on the number of hours you request. If the number of hours is less than or equal to 50 000 normalized GPU hours (1 A100 hour = 2 V100 hours = 2 normalized GPU hours) or 500 000 CPU hours, it will be Dynamic Access (AD). If the amount is larger than these values, your file will be Regular Access (AR). Important: your request for resources is cumulative for the three national centres.
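Purely as an illustration of the rule above, a request of 20 000 A100 hours counts as 20 000 × 2 = 40 000 normalized GPU hours, which is below the 50 000-hour threshold and therefore falls under Dynamic Access, whereas a request of 30 000 A100 hours (60 000 normalized GPU hours) would fall under Regular Access.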

From your personal space (eDARI account) on the eDARI portal, you can:

  • Create a Dynamic or Regular access file.
  • Renew a Dynamic or Regular Access file.
  • Request the opening of a compute account, necessary to access the computing resources on Jean Zay. Consult the IDRIS document on Account management.

You have access to a short video about resource allocation and account opening on Jean Zay on our YouTube channel "Un œil sur l'IDRIS" (it is in French but the automatic subtitles work quite well).

Dynamic Access (AD)

Requests for resources for Dynamic Access files may be made throughout the year and are renewable. The requests receive an expert assessment and are validated by the IDRIS director. The hours allocation is valid for one year beginning at the opening of the computing account on Jean Zay.

Regular Access (AR)

Two project calls for Regular Access are launched each year:

  • In January-February for an hours allocation from 1 May of the same year until 30 April of the following year.
  • In June-July for an hours allocation from 1 November of the current year until 31 October of the following year.

New Regular access files can be submitted throughout the year. They will receive an expert assessment during the biannual project call campaign whose closing date most immediately follows validation of the file by the project manager.

Renewal requests for Regular Access files, however, can only be submitted during a campaign, before its closing date, and will receive an expert assessment at that time. For information, the closing date for the A14 call and the A13 supplementary requests is 14 February 2023.

Requests for supplementary hours ("au fil de l'eau")

Throughout the entire year, you may request additional resources (demandes au fil de l'eau) on the eDARI portal www.edari.fr for all existing projects (Dynamic Access or HPC Regular Access) which have used up their initial hours quotas during the year. The accumulated hours requested for Dynamic Access files must remain below the thresholds of 50 000 normalized GPU hours or 500 000 CPU hours.

Documentation

Two documentation resources are available:

  • IDRIS documentation to assist you in completing each of these formalities via the eDARI portal.
  • The GENCI document (in French, detailing the procedures for access to the national resources).

Return to Table of Contents


4. How to obtain an account at IDRIS

Account management: Account opening and closure

User accounts

Each user has a unique account which can be associated with all the projects in which the user participates.

For more information, you may consult our web page regarding multi-project management.

Managing your account is done by completing an IDRIS form which must be sent to .

Of particular note, the FGC form is used to make modifications for an existing account: Add/delete machine access, change a postal address, telephone number, employer, etc.

An account can only exist as “open” or “closed”:

  • Open. In this case, it is possible to:
    • Submit jobs on the compute machine if the project's current hours allocation has not been exhausted (cf. idracct command output).
    • Submit pre- and post-processing jobs.
  • Closed. In this case, the user can no longer connect to the machine. An e-mail notification is sent to the project manager and the user at the time of account closure.

Opening a user account

For a new project

There is no automatic or implicit account opening. Each user of a project must request one of the following:

  • If the user does not yet have an IDRIS account: the opening of an account, following the access procedures (GENCI note) for both regular access and dynamic access, via the eDARI portal, after the concerned project has obtained computing hours.
  • If the user already has an account opened at IDRIS: the linking of this account to the concerned project via the eDARI portal. This request must be signed by both the user and the manager of the added project.

IMPORTANT INFORMATION: By decision of the IDRIS director or the CNRS Defence and Security Officer (FSD), the creation of a new account may be subject to ministerial authorisation in application of the French regulations for the Protection du Potentiel Scientifique et Technique de la Nation (PPST). In this event, a personal communication will be transmitted to the user so that implementation of the required procedure may be started, knowing that this authorisation procedure may take up to two months. To anticipate this, you can send an email directly to the concerned centre (see the contacts on the eDARI portal) to initiate the procedure in advance (for yourself or for someone you wish to integrate soon into your group) even before the account opening request is sent for validation to the director of the research structure and to the security manager of the research structure.

Comment: The opening of a new account on the machine will not be effective until (1) the access request (regular, dynamic, or preparatory) is validated (with ministerial authorization if requested) and (2) the concerned project has obtained computing hours.

You have access to a short video about resource allocation and account opening on Jean Zay on our YouTube channel "Un œil sur l'IDRIS" (it is in French but the automatic subtitles work quite well).

For a project renewal

Existing accounts are automatically carried over from one project call to another if the eligibility conditions of the project members have not changed (cf. GENCI explanatory document for project calls Modalités d'accès). If your account is open and already associated with the project which has obtained hours for the following call, no action on your part is necessary.

Closure of a user account

Account closure of an unrenewed project

When a GENCI project is not renewed, the following procedure is applied:

  • On the date of computing hours accounting expiry:
    • Unused DARI hours are no longer available and the project users can no longer submit jobs on the compute machine for this project (no more hours accounting for the project).
    • The project accounts remain open and linked to the project so that data can be accessed for a period of six months.
  • Six months after the date of hours accounting expiry:
    • Project accounts are detached from the project (no more access to project data).
    • All the project data (SCRATCH, STORE, WORK, ALL_CCFRSCRATCH, ALL_CCFRSTORE and ALL_CCFRWORK) will be deleted at the initiative of IDRIS within an undefined time period.
    • All the project accounts which are still linked to another project will remain open but, if the non-renewed project was their default project, they must change it via the idrproj command (otherwise the SCRATCH, STORE, WORK, ALL_CCFRSCRATCH, ALL_CCFRSTORE and ALL_CCFRWORK variables will not be defined).
    • All the project accounts which are no longer linked to any project can be closed at any time.

File recovery is the responsibility of each user during the six months following the end of an unrenewed project, by transferring the files to the user's local laboratory machine or, for multi-project accounts, to the Jean Zay disk spaces of another DARI project.

With this six-month delay period, we avoid the premature closing of project accounts: this is the case, for example, for a project with allocation Ai which was not renewed for the following year (no computing hours requested for allocation Ai+2) but was renewed for allocation Ai+3 (which begins six months after allocation Ai+2).

Account closure after expiry of ministerial authorisation for accessing IDRIS computer resources

Ministerial authorisation is only valid for a certain period of time. When the ministerial authorisation reaches its expiry date, we are obligated to close your account.

In this situation, the procedure is as follows:

  • A first email notification is sent to the user 90 days before the expiry date.
  • A second email notification is sent to the user 70 days before the expiry date.
  • The account is closed on the expiry date if the authorisation has not been renewed.

Important: To avoid this account closure, upon receipt of the first e-mail, the user is invited to submit a new request for an account opening via the eDARI portal so that IDRIS can begin examining the renewal request. Indeed, processing the request can take up to two months.

Account closure for security reasons

An account may be closed at any moment and without notice by decision of the IDRIS management.

Account closure following a detaching request made by the project manager

The project manager can request the detaching of an account attached to his/her project by completing and sending us the FGC form.

During this request, the project manager may ask that the data of the detached account contained in the SCRATCH, STORE, WORK, ALL_CCFRSCRATCH, ALL_CCFRSTORE and ALL_CCFRWORK directories be immediately deleted or copied into the directories of another user attached to the same project.

Following the detachment, if the detached account is no longer attached to any project, it can be closed at any time.

Declaring the machines from which a user connects to IDRIS

Each machine from which a user wishes to access an IDRIS computer must be registered at IDRIS.

The user must provide, for each of his/her accounts, a list of the machines which will be used to connect to the IDRIS computers (the machine's name and IP address). This is done at the creation of each account via the eDARI portal.

The user must update the list of machines associated with a login account (adding/deleting) by using the FGC form (account administration form). After completing this form, it must be signed by both the user and the security manager of the laboratory.

Important note: Personal IP addresses are not authorised for connection to IDRIS machines.

Security manager of the laboratory

The laboratory security manager is the network/security intermediary for IDRIS. This person must guarantee that the machine from which the user connects to IDRIS conforms to the most recent rules and practices concerning information security and must be able to immediately close the user access to IDRIS in case of a security alert.

The security manager's name and contact information are transmitted to IDRIS by the laboratory director on the FGC (account administration) form. This form is also used for informing IDRIS of any change in the security manager.

How to access IDRIS while teleworking or on mission

For security reasons, we cannot authorise access to IDRIS machines from non-institutional IP addresses. For example, you cannot have direct access from your personal connection.

Using a VPN

The recommended solution for accessing IDRIS resources when you are away from your registered address (teleworking, on mission, etc.) is to use the VPN (Virtual Private Network) of your laboratory/institute/university. A VPN allows you to access distant resources as if you were directly connected to the local network of your laboratory. Nevertheless, you still need to register the VPN-attributed IP address of your machine at IDRIS by following the procedure described above. This solution has the advantage of allowing the usage of IDRIS services which are accessible via a web browser (for example, the Extranet or products such as Jupyter Notebook, JupyterLab and TensorBoard).

Using a proxy machine

If using a VPN is impossible, it is always possible to connect via SSH to a proxy machine of your laboratory from which Jean Zay is accessible (which implies having registered the IP address of this proxy machine).

you@portable_computer:~$ ssh proxy_login@proxy_machine
proxy_login@proxy_machine:~$ ssh idris_login@idris_machine

Note that it is possible to automate the proxy via the SSH options ProxyJump or ProxyCommand to be able to connect by using only one command (for example, ssh -J proxy_login@proxy_machine idris_login@idris_machine).
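For convenience, the jump can also be made permanent in the SSH client configuration of your local machine. The following is only an illustrative excerpt, reusing the placeholder names from the example above:

# Excerpt of a possible ~/.ssh/config on your local machine
Host jean-zay
    HostName jean-zay.idris.fr
    User idris_login
    ProxyJump proxy_login@proxy_machine

With such an entry, the connection is reduced to the single command ssh jean-zay.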

Obtaining temporary access to IDRIS machines from a foreign country

The user on mission must request machine authorisation by completing the corresponding box on page 3 of the FGC form. Temporary SSH access to all the IDRIS machines is then granted.

Return to Table of Contents


5. How to connect to an IDRIS machine

How do I access Jean Zay?

You can only connect to Jean Zay from a machine whose IP address is registered in our filters. If this is not the case, consult the procedure for declaring machines which is available on our Web site. Interactive access to Jean Zay is only possible on the front-end nodes of the machine via the ssh protocol.
For more detailed information, you may consult the description of the hardware and software of the cluster. Each IDRIS user holds a unique login for all the projects in which he/she participates. This login is associated with a password which is subject to certain security rules. Before connecting, we advise you to consult the page on password management and problems.

Jean Zay: Access and shells

Access to the machines

Jean Zay:

Connection to the Jean Zay front end is done via ssh from a machine registered at IDRIS:

$ ssh my_idris_login@jean-zay.idris.fr

Then, enter your password if you have not configured an SSH key.

Jean Zay pre- and post-processing:

Interactive connection to the pre-/post-processing front end is done by ssh from a machine registered at IDRIS:

$ ssh my_idris_login@jean-zay-pp.idris.fr

Then, enter your password if you have not configured an SSH key.

SSH key authentication

Connections by SSH keys (private key / public key) are authorised at IDRIS.
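For reference, the following is a minimal sketch using standard OpenSSH tools to create a key pair on your registered machine and copy the public key to Jean Zay; adapt the key type and login to your situation and protect the private key with a passphrase:

$ ssh-keygen -t ed25519                              # creates ~/.ssh/id_ed25519 (private key) and ~/.ssh/id_ed25519.pub (public key)
$ ssh-copy-id my_idris_login@jean-zay.idris.fr       # appends the public key to ~/.ssh/authorized_keys on Jean Zay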

ATTENTION: We plan to strengthen our security policy regarding access to the Jean Zay machine. Therefore, we ask you to test, as of now, the use of certificates for your SSH connections instead of the usual SSH key pairs (private key / public key) by following the detailed procedures here.

SSH connection with certificate

With the objective of reinforcing security when accessing Jean Zay, we ask you to test the use of certificates for your SSH connections instead of the usual public/private SSH key pairs. The creation and use of certificates are done by following the detailed procedures here.

During the test phase, connections via classic SSH keys will remain possible. Please let us know of any problems you may encounter with the use of certificates.

Managing the environment

Your $HOME space is common to all the Jean Zay front ends. Consequently, every modification of your personal environment files is automatically applied on all the machines.

What shells are available on the IDRIS machines?

The Bourne Again shell (bash) is the only login shell available on the IDRIS machines: IDRIS does not guarantee that the default user environment will be correctly defined with other shells. Bash is a major evolution of the Bourne shell (sh) with advanced functionalities. However, other shells (ksh, tcsh, csh) are also installed on the machines to allow the execution of scripts which use them.

Which environment files are invoked during the launching of a login session in bash?

The .bash_profile file, if it exists in your HOME, is executed once at login. Otherwise, the .profile file is executed, if it exists. Environment variables and programs to be launched at connection are placed in one of these files. Aliases, personal functions and the loading of modules are to be placed in the .bashrc file which, in contrast, is run at the launching of each sub-shell.

It is preferable to use only one environment file: .bash_profile or .profile.
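As an illustration only (adapt it to your own needs), a minimal arrangement consistent with these rules is a short .bash_profile which sources the .bashrc, with aliases and module loads kept in the .bashrc:

# ~/.bash_profile : read once at login
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# ~/.bashrc : read by every sub-shell
alias ll='ls -l'                  # personal aliases and functions
# module load my_compiler         # hypothetical module load, to adapt to your needs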

Passwords

Connecting to Jean Zay is done with the user login and the associated password.

During the first connection, the user must enter the “initial password” and then immediately change it to an “actual password”.

The initial password

What is the initial password?

The initial password is the result of the concatenation of two passwords, respecting the following order:

  1. The first part consists of a randomly generated IDRIS password which is sent to you by e-mail during the account opening and during a reinitialisation of your password. It remains valid for 20 days.
  2. The second part consists of the user-chosen password (8 characters) which you provided during your first account opening request using the eDARI portal (if you are a new user) or when requesting a change in your initial password (using the FGC form).
    Note: For a user with a previously opened login account created in 2014 or before, the password indicated in the last postal letter from IDRIS should be used.

The initial password must be changed within 20 days following transmission of the randomly generated password (see the section "Using an initial password at the first connection" below).
If this first connection is not done within the 20-day timeframe, the initial password is invalidated and an e-mail is sent to inform you. In this case, you just have to send an e-mail to to request a new randomly generated password which is then sent to you by e-mail.

An initial password is generated (or re-generated) in the following cases:

  • Account opening (or reopening): an initial password is formed at the creation of each account and also for the reopening of a closed account.
  • Loss of the actual password:
    • If you have lost your actual password, you must contact to request the re-generation of a randomly generated password, which is then sent to you by e-mail. You will also need the user-chosen part of the password which you previously provided in the FGC form.
    • If you have also lost the user-chosen part of the password which you previously provided in the FGC form (or was contained in the postal letter from IDRIS in the former procedure of 2014 or before), you must complete the “Request to change the user part of initial password” section of the FGC form, print and sign it, then scan and e-mail it to or send it to IDRIS by postal mail. You will then receive an e-mail containing a new randomly generated password.

Using an initial password at the first connection

Below is an example of a first connection (without using SSH keys) for which the “initial password” will be required for the login_idris account on an IDRIS machine.

Important: At the first connection, the initial password is requested twice. A first time to establish the connection on the machine and a second time by the password change procedure which is then automatically executed.

Recommendation: As you have to change the initial password the first time you log in, carefully prepare another password to enter before beginning the procedure (see the creation rules for "actual passwords" in the section below).

$ ssh login_idris@machine_idris.idris.fr
login_idris@machine_idris password:  ## Enter INITIAL PASSWORD first time ##
Last login: Fri Nov 28 10:20:22 2014 from machine_idris.idris.fr
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for user login_idris.
Enter login(    ) password:          ## Enter INITIAL PASSWORD second time ##
Enter new password:                      ## Enter new chosen password   ##
Retype new password:                     ## Confirm new chosen password ##
     password information changed for login_idris
passwd: all authentication tokens updated successfully.
Connection to machine_idris closed.

Remark: You will be immediately disconnected after entering a new correct chosen password (“all authentication tokens updated successfully”).

Now, you may re-connect using your new actual password that you have just registered.

The actual password

Once your actual password has been created and entered correctly, it will remain valid for one year (365 days).

How to change your actual password

You can change your password at any time by using the UNIX command passwd directly on a front end. The change is taken into account immediately on all the machines. This new actual password will remain valid for one year (365 days) following its creation.
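For illustration, a password change session looks similar to the following (the exact prompts may differ):

$ passwd
Changing password for user my_idris_login.
Current password:            ## enter your current password ##
New password:                ## enter the new password ##
Retype new password:         ## confirm the new password ##
passwd: all authentication tokens updated successfully.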

Creation rules for "actual passwords"

  • It must contain a minimum of 12 characters.
  • The characters must belong to at least 3 of the 4 following groups:
    • Uppercase letters
    • Lowercase letters
    • Numbers
    • Special characters
  • The same character may not be repeated more than 2 times consecutively.
  • A password must not be composed of words from dictionaries or from trivial combinations (1234, azerty, …).

Notes:

  • Your actual password cannot be modified on the day of its creation or during the 5 days following its creation. Nevertheless, if necessary, you may contact the User Support Team to request a new randomly generated password for the re-creation of an initial password.
  • A record is kept of the last 6 passwords used: reusing any of them will be rejected.

Forgotten or expired password

If you have forgotten your password or, despite the warning e-mails sent to you, you have not changed your actual password before its expiry date (i.e. one year after its last creation), your password will be invalidated.

In this case, you must contact to request the re-generation of the randomly generated password which is then sent to you by e-mail.

Note: You will also need the user-chosen part of the initial password which you initially provided in order to connect to the host after this re-generation. In fact, you will have to follow the same procedure as for using an initial password at the first connection.

Account blockage following 15 unsuccessful connection attempts

If your account has been blocked as a result of 15 unsuccessful connection attempts, you must contact the IDRIS User Support Team.

Account security reminder

You must never write out your password in an e-mail, even one sent to IDRIS (User Support, Gestutil, etc.), no matter what the reason: we would be obligated to immediately generate a new initial password, in order to invalidate the actual password which you disclosed and to ensure that you define a new one during your next connection.

Each account is strictly personal. Discovery of account access by an unauthorised person will cause immediate protective measures to be taken by IDRIS including the eventual blockage of the account.
The user must take certain basic common sense precautions:

  • Inform IDRIS immediately of any attempted trespassing on your account.
  • Respect the recommendations for using SSH keys.
  • Protect your files by limiting UNIX access rights.
  • Do not use a password which is too simple.
  • Protect your personal work station.

Return to Table of Contents


6. Management of your account and your environment variables

How do I modify my personal data?

Modification of your personal data is done via the Web interface Extranet.

  • For those who do not have a password for Extranet or who have lost it, the access modalities are described on this page.
  • For those who have a password, click on Extranet, connect with your identifiers, then ⇒ Your account ⇒ Your data ⇒ Your contact details.

The only data modifiable on line are:

  • e-mail address
  • telephone number
  • fax number

Modification of your postal address is done by completing the section « Modification of the user's postal address » on the Administration Form for Login Accounts (FGC) and sending it to from an institutional address. Note that this procedure requires the signatures of the user and of your laboratory director.

What disk spaces are available on Jean Zay?

For each project, there are 5 distinct disk spaces available on Jean Zay: HOME, WORK, SCRATCH/JOBSCRATCH, STORE and DSDIR.

You will find the explanations concerning these spaces on the disk spaces page of our Web site.
Important: HOME, WORK and STORE are subject to quotas!

If your login is attached to more than one project, the IDRIS command idrenv will display all the environment variables referencing all the disk spaces of your projects. These variables allow you to access the data of your different projects from any of your other projects.

Choose your storage space according to your needs (permanent, semi-permanent or temporary data, large or small files, etc.).

How do I request an extension of a disk space or inodes?

If your use of the disk space is in accordance with its usage recommendations and if you cannot delete or move the data contents from this space, your project manager can make a justified request for a quota increase (space and/or inodes) via the extranet.

How can I know the number of computing hours consumed per project?

You simply need to use the IDRIS command idracct to know the hours consumed by each collaborator of the project as well as the total number of hours consumed and the percentage of the allocation.

Note that the information returned by this command is updated once per day (the date and time of the update are indicated in the first line of the command output).

If you have more than one project at IDRIS, this command will display the CPU and/or GPU consumptions of all the projects that your login is attached to.

What should I do when I will soon have no computing hours remaining?

There are two possible ways to request supplementary hours:

  • Either by requesting more resources as needed («au fil de l'eau»); this request may be made as soon as your AD or AR project has an hours allocation in progress.
  • Or by requesting a complement of hours for a period of six months; this can be requested midway through your Regular Access allocation.

These requests must be justified and should be made via the eDARI portal as indicated on our Web page requesting resource hours.

How can I know when the machine is unavailable?

The machine can be unavailable because of a planned maintenance event or due to a technical problem which occurred unexpectedly. In both cases, the information is available on the homepage of the IDRIS Web site via the drop-down menu entitled, « For users », then the heading "Machine availability".

IDRIS users may also subscribe to the “info-machines” mailing list through the Extranet.

How do I recover files which I unintentionally deleted?

You can only recover files which were deleted from your HOME and from the WORK spaces of your different projects. In fact, only the HOME and WORK spaces are backed up via « snapshots », as explained on the disk spaces page of our Web site.

Because their sizes are too large, neither the SCRATCH (semi-permanent space) nor the STORE (archiving space) is backed up.

Can I ask IDRIS to transfer files from one account to another account?

IDRIS considers data to be linked to a project. Consequently, for the transfer to be possible, the following are necessary:

  • Both accounts (the owner and the recipient) must be attached to the same project.
  • The project manager makes the request by signed fax or by e-mail to the support team () specifying:
    • The concerned machine.
    • The source account and the recipient account.
    • The list of files and/or directories to transfer.

Can I recover files on an external medium?

It is no longer possible to request the transfer of files to an external medium.

Return to Table of Contents


7. The disk spaces

Jean Zay: The disk spaces

There are four distinct disk spaces accessible for each project: HOME, WORK, SCRATCH/JOBSCRATCH and the STORE.

Each space has specific characteristics suitable to its usage which are described below. The paths to access these spaces are stored in five shell environment variables: $HOME, $WORK, $SCRATCH, $JOBSCRATCH and $STORE.

You can check the occupation of the different disk spaces by using the IDRIS “idr_quota” commands or the Unix du (disk usage) command. The idr_quota commands return immediately but the information is not in real time (the idr_quota_user and idr_quota_project commands are updated once a day and the idrquota command is updated every 30 minutes). The du command returns information in real time but its execution can be long, depending on the size of the concerned directory.

The HOME

$HOME : This is the home directory during an interactive connection. This space is intended for frequently-used small-sized files such as the shell environment files, the tools, and potentially the sources and libraries if they have a reasonable size. The size of this space is limited (in space and in number of files).
The HOME characteristics are:

  • It is a permanent space.
  • It is saved via snapshots: See the section entitled The snapshots below.
  • Intended to receive small-sized files.
  • In the case of a multi-project login, the HOME is unique.
  • Subject to quotas per user which are intentionally rather low (3 GiB by default).
  • Accessible in interactive or in a batch job via the $HOME variable:
    $ cd $HOME
  • It is the home directory during an interactive connection.

Note: The HOME space is also referenced via the CCFRHOME environment variable to respect a common nomenclature with the other national computing centres (CINES, TGCC).

$ cd $CCFRHOME

The WORK

$WORK: This is a permanent work and storage space which is usable in batch. In this space, we generally store large-sized files for use during batch executions: very large source files, libraries, data files, executable files, result files and submission scripts.
The characteristics of WORK are:

  • It is a permanent space.
  • It is saved via snapshots: See the section entitled The snapshots below.
  • Intended to receive large-sized files.
  • In the case of a multi-project login, a WORK is created for each project.
  • Subject to quotas per project.
  • It is accessible in interactive or in a batch job.
  • It is composed of 2 sections:
    • A section in which each user has an individual part, accessed by the command:
      $ cd $WORK
    • A section common to the project to which the user belongs and into which files to be shared can be placed, accessed by the command:
      $ cd $ALL_CCFRWORK
  • The WORK is a GPFS disk space with a bandwidth of about 100 GB/s in read and in write. This bandwidth can be temporarily saturated in case of exceptionally intensive usage.

Note: The WORK space is also referenced via the CCFRWORK environment variable to respect a common nomenclature with the other national computing centres (CINES, TGCC).

$ cd $CCFRWORK

Usage recommendations

  • Batch jobs can run in the WORK. Nevertheless, because several of your jobs can run at the same time, you must ensure that your execution directories or file names are unique (see the sketch below).
  • Moreover, this disk space is subject to quotas (per project) which can suddenly stop your execution if the quotas are reached. Therefore, in the WORK, you must be aware not only of your own activity but also of that of your project colleagues. For these reasons, you may prefer using the SCRATCH or the JOBSCRATCH for the execution of batch jobs.
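A minimal sketch of this precaution in a submission script, assuming the Slurm job identifier is used to make the directory name unique (the input file and executable names are placeholders):

# Excerpt of a submission script: run each job in its own subdirectory of the WORK
RUNDIR=$WORK/run_${SLURM_JOB_ID}
mkdir -p $RUNDIR
cp $WORK/my_input.nml $RUNDIR/     # hypothetical input file
cd $RUNDIR
srun ./my_executable               # hypothetical executable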

The SCRATCH/JOBSCRATCH

$SCRATCH : This is a semi-permanent work and storage space which is usable in batch; the lifespan of the files is limited to 30 days. The large-sized files used during batch executions are generally stored here: the data files, result files or the computation restarts. Once the post-processing has been done to reduce the data volume, you must remember to copy the significant files into the WORK so that they are not lost after 30 days, or into the STORE for long-term archiving.
The characteristics of the SCRATCH are:

  • The SCRATCH is a semi-permanent space with a 30-day file lifespan.
  • It is not backed up.
  • It is intended to receive large-sized files.
  • It is subject to very large security quotas:
    • disk quotas per project, about 1/10th of the total disk space for each group
    • and inode quotas per project, about 150 million files and directories.
  • It is accessible in interactive or in a batch job.
  • It is composed of 2 sections:
    • A section in which each user has an individual part; accessed by the command:
      $ cd $SCRATCH
    • A section common to the project to which the user belongs into which files to be shared can be placed. It is accessed by the command:
      $ cd $ALL_CCFRSCRATCH
  • In the case of a multi-project login, a SCRATCH is created for each project.
  • The SCRATCH is a GPFS disk space with a bandwidth of about 500 GB/s in write and in read.

Note: The SCRATCH space is also referenced via the CCFRSCRATCH environment variable to respect a common nomenclature with the other national computing centres (CINES, TGCC).

$ cd $CCFRSCRATCH

$JOBSCRATCH: This is the temporary execution directory specific to batch jobs.
Its characteristics are:

  • It is a temporary directory with file lifespan equivalent to the batch job lifespan.
  • It is not backed up.
  • It is intended to receive large-sized files.
  • It is subject to very large security quotas:
    • disk quotas per project, about 1/10th of the total disk space for each group
    • and inode quotas per project, about 150 million files and directories.
  • It is created automatically when a batch job starts and, therefore, is unique to each job.
  • It is destroyed automatically at the end of the job; it is therefore necessary to manually copy the important files onto another disk space (the WORK or the SCRATCH) before the end of the job (see the sketch at the end of this section).
  • The JOBSCRATCH is a GPFS disk space with a bandwidth of about 500 GB/s in write and in read.
  • During the execution of a batch job, the corresponding JOBSCRATCH is accessible from the Jean Zay front end via its JOBID job number (see the output of the squeue command) and the following command:
    $ cd /gpfsssd/jobscratch/JOBID

Usage recommendations:

  • The JOBSCRATCH can be seen as the former TMPDIR.
  • The SCRATCH can be seen as a semi-permanent WORK which offers the maximum input/output performance available at IDRIS but limited by a 30-day file lifespan.
  • The semi-permanent characteristics of the SCRATCH allow storing large volumes of data there between two or more jobs which run successively, one right after another, but within a limited period of a few weeks: This disk space is not purged after each job.
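As a minimal sketch (the input file, result file and executable names are placeholders), a job can use the JOBSCRATCH as its execution directory and copy the important results back to the WORK before the job ends:

# Excerpt of a submission script using the JOBSCRATCH as execution directory
cd $JOBSCRATCH
cp $WORK/my_input.nml .            # hypothetical input file
srun ./my_executable               # hypothetical executable
cp results.dat $WORK/              # copy the important results back before the job ends: the JOBSCRATCH is destroyed at that point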

The STORE

$STORE: This is the IDRIS archiving space for long-term storage. Very large files are generally stored there, typically tar archives of compute result file trees created after post-processing. This space is not meant to be accessed or modified on a daily basis but rather to preserve very large volumes of data over time, with only occasional consultation.
Its characteristics are:

  • The STORE is a permanent space.
  • It is not backed up.
  • We advise against systematically writing to it during a batch job.
  • It is intended to receive very large files: the maximum size is 10 TiB per file and the minimum recommended size is 250 MiB (ratio of disk size to number of inodes).
  • In the case of a multi-project login, a STORE is created per project.
  • It is subject to quotas per project with a small number of inodes but a very large space.
  • It is composed of 2 sections:
    • A section in which each user has an individual part, accessed by the command:
      $ cd $STORE
    • A section common to the project to which the user belongs and into which files to be shared can be placed. It is accessed by the command:
      $ cd $ALL_CCFRSTORE

Note: The STORE space is also referenced via the CCFRSTORE environment variable to respect a common nomenclature with the other national computing centres (CINES, TGCC).

$ cd $CCFRSTORE

Usage recommendations:

  • The STORE can be seen as replacing the former Ergon archive server.
  • However, there is no longer a limitation on file lifespan.
  • As this is an archive space, it is not intended for frequent access.

The DSDIR

$DSDIR: This is a storage space dedicated to voluminous public databases (in size or number of files) which are needed for using AI tools. These datasets are visible to all Jean Zay users.

If you use large public databases which are not found in the $DSDIR space, IDRIS will download and install them in this disk space at your request.

The list of currently accessible databases is found on this page: Jean Zay: Datasets and models available in the $DSDIR storage space.

If your database is personal or under a license which is too restrictive, you must manage it yourself in the disk spaces of your project, as described on the Database Management page.

Summary of the main disk spaces

  • $HOME (default capacity: 3 GB and 150k inodes per user)
    • Features: home directory at connection; backed up space
    • Usage: storage of configuration files and small files
  • $WORK (default capacity: 5 TB (*) and 500k inodes per project)
    • Features: storage on rotating disks (100 GB/s read/write operations); backed up space
    • Usage: storage of source codes and input/output data; execution in batch or interactive
  • $SCRATCH (very large security quotas; 2.5 PB shared by all users)
    • Features: SSD storage (500 GB/s read/write operations); lifespan of unused files (i.e. not read or modified): 30 days; space not backed up
    • Usage: storage of voluminous input/output data; execution in batch or interactive; optimal performance for read/write operations
  • $STORE (default capacity: 50 TB (*) and 100k inodes (*) per project)
    • Features: space not backed up
    • Usage: long-term archive storage (for the lifespan of the project)

(*) Quotas per project can be increased at the request of the project manager or deputy manager via the Extranet interface, or by request to the user support team.

The snapshots

The $HOME and $WORK are backed up regularly via a snapshot mechanism: these are snapshots of the tree hierarchies which allow you to recover a file or a directory that you have corrupted or deleted by error.

All the available snapshots, named SNAP_YYYYMMDD where YYYYMMDD corresponds to the backup date, are visible from all the directories of your HOME and your WORK via the following command:

$ ls .snapshots
SNAP_20191022  SNAP_20191220  SNAP_20200303  SNAP_20200511
SNAP_20191112  SNAP_20200127  SNAP_20200406  SNAP_20200609 

Comment: In this example, you can see 8 backups. To recover a file from 9 June 2020, you simply need to select the directory SNAP_20200609.

Important: The .snapshots directory is not visible with the ls -a command so don't be surprised when you don't see it. Only its contents can be consulted.

For example, if you wish to recover a file which was in the $WORK/MY_DIR subdirectory, you just need to follow the procedure below:

  1. Go into the directory of the initial file:
    $ cd $WORK/MY_DIR
  2. You will find the backup which interests you via the ls command:
    $ ls .snapshots
    SNAP_20191022  SNAP_20191220  SNAP_20200303  SNAP_20200511
    SNAP_20191112  SNAP_20200127  SNAP_20200406  SNAP_20200609 
  3. You can then see the contents of your $WORK/MY_DIR directory as it was on 9 June 2020, for example, with the command:
    $ ls -al .snapshots/SNAP_20200609 
    total 2
    drwx--S--- 2 login  prj  4096 oct.  24  2019 .
    dr-xr-xr-x 2 root  root 16384 janv.  1  1970 ..
    -rw------- 1 login  prj 20480 oct.  24  2019 my_file 
  4. Finally, you can recover the file as it was on the date of 9 June 2020 by using the cp command:
    1. By overwriting the initial file, $WORK/MY_DIR/my_file (note the “.” at the end of the command):
      $ cp .snapshots/SNAP_20200609/my_file . 
    2. Or, by renaming the copy as $WORK/MY_DIR/my_file_20200609 in order to not overwrite the initial file $WORK/MY_DIR/my_file:
      $ cp .snapshots/SNAP_20200609/my_file  my_file_20200609 

Comments:

  • The ls -l .snapshots/SNAP_YYYYMMDD command always lists the contents of the directory where you currently are, but as it was on the given date YYYY/MM/DD.
  • You can add the -p option to the cp command in order to keep the date and the Unix access rights of the recovered file:
    $ cp -p .snapshots/SNAP_20200609/my_file . 
    $ cp -p .snapshots/SNAP_20200609/my_file  my_file_20200609 
  • Files are recovered from your HOME by using the same procedure.

Jean Zay: Disk quotas and commands for the visualisation of occupation rates

Introduction

Quotas guarantee equitable access to disk resources. They prevent the situation where one group of users consumes all the space and prevents other groups from working. At IDRIS, quotas limit both the quantity of disk space and the number of files (inodes). These limits are applied per user for the HOME (one HOME per user even if your login is attached to more than one project) and per project for the WORK and the STORE (as many WORK and STORE spaces as there are projects for the same user).

You can consult the disk quotas of your project by using the two commands presented in this document: idr_quota_user and idr_quota_project.

You also still have access to the idrquota command. This is the first quota visualisation command which was deployed on Jean Zay. The idr_quota_user and idr_quota_project commands are an evolution of this.

Exceeding the quotas

When a group has exceeded a quota, no warning e-mail is sent. Nevertheless, you are informed by error messages such as « Disk quota exceeded » when you manipulate files in the concerned disk space.

When one of the quotas is reached, you can no longer create files in the concerned disk space, which can disturb other jobs being executed at that time if they were launched from this space.

Warning: Editing a file after you have reached your disk quota limit can cause the file size to return to zero, thereby deleting its contents.

When you are blocked or in the process of being blocked:

  • Try cleaning out the concerned disk space by deleting the files which are no longer useful.
  • Archive the directories which you no longer, or only rarely, access (see the example after this list).
  • Move your files into another space according to their usage (see the page on disk spaces).
  • The project manager or his/her deputy may request a quota increase for the STORE space via the Extranet interface.
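For example, a directory which is only rarely accessed can be packed into a single tar archive in the STORE (few inodes but a very large capacity) and then removed from the WORK; the directory name below is a placeholder:

$ tar cf $STORE/old_results.tar -C $WORK old_results    # archive the directory into a single file in the STORE
$ rm -r $WORK/old_results                               # then free the space and inodes in the WORK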

Comments:

  1. Remember to verify the common disk spaces, $ALL_CCFRWORK and $ALL_CCFRSTORE.
  2. A recurrent cause of exceeding quotas is the use of personal Anaconda environments. Please refer to the Python personal environment page to understand the best practices on Jean Zay.

The idr_quota_user command

By default, the idr_quota_user command returns your personal occupation as a user for all of the disk spaces of your active project. For example, if your active project is abc, you will see an output similar to the following:

$ idr_quota_user
 HOME
INODE:   |██-------------------------------| U: 9419/150000  6.28%                                     
STORAGE: |████████████████████████████████-| U: 2.98 GiB/3.00 GiB  99.31%                              
 
ABC STORE
INODE:   |---------------------------------| U: 1/100000  0.00%           G: 12/100000  0.01%          
STORAGE: |---------------------------------| U: 4.00 KiB/50.00 TiB  0.00% G: 48.00 KiB/50.00 TiB  0.00%
 
ABC WORK
INODE:   |███▒▒▒---------------------------| U: 50000/500000 10.00%       G: 100000/500000  20.00%        
STORAGE: |██████████▒▒▒▒▒▒▒▒▒▒-------------| U: 1.25 TiB/5.00 TiB  25.00% G: 2.5 TiB/5.00 TiB 50.00%
 
The quotas are refreshed daily. All the information is not in real time and may not reflect your real
storage occupation.

In this output example, your personal occupation is represented by the black bar and quantified on the right after the letter U (for User). Your personal occupation is also compared to the global occupation of the project which is represented by the grey bar (in this output example) and quantified after the letter G (for Group).

Note that the colours can be different depending on the parameters and/or type of your terminal.

You can refine the information returned by the idr_quota_user command by adding one or more of the following arguments:

  • --project def to display the occupation of a different project than your active project ( def in this example)
  • --all-projects to display the occupation of all the projects to which you are attached
  • --space home work to display the occupation of one or more particular disk spaces (the HOME and the WORK in this example)

Complete help for the idr_quota_user command is accessible by launching:

$ idr_quota_user -h

The idr_quota_project command

By default, the idr_quota_project command returns the disk occupation of each member of your active project for all of the disk spaces associated with the project. For example, if your active project is abc, you will see an output similar to the following:

$ idr_quota_project
PROJECT: abc SPACE: WORK
PROJECT USED INODE: 34373/500000 6.87%
PROJECT USED STORAGE: 1.42 GiB/5.00 TiB 0.03%
┌─────────────────┬─────────────────┬─────────────────┬─────────────────┬──────────────────────┐
│      LOGIN      │     INODE ▽     │     INODE %     │     STORAGE     │      STORAGE %       │
├─────────────────┼─────────────────┼─────────────────┼─────────────────┼──────────────────────┤
│      abc001     │            29852│            5.97%│       698.45 MiB│                 0.01%│
│      abc002     │             4508│            0.90%│       747.03 MiB│                 0.01%│
│      abc003     │                8│            0.00%│         6.19 MiB│                 0.00%│
│      abc004     │                1│            0.00%│           0.00 B│                 0.00%│
│      abc005     │                1│            0.00%│           0.00 B│                 0.00%│
└─────────────────┴─────────────────┴─────────────────┴─────────────────┴──────────────────────┘
PROJECT: abc SPACE: STORE
PROJECT USED INODE: 13/100000 0.01%
PROJECT USED STORAGE: 52.00 KiB/50.00 TiB 0.00%
┌─────────────────┬─────────────────┬─────────────────┬─────────────────┬──────────────────────┐
│      LOGIN      │     INODE ▽     │     INODE %     │     STORAGE     │      STORAGE %       │
├─────────────────┼─────────────────┼─────────────────┼─────────────────┼──────────────────────┤
│      abc001     │                2│            0.00%│         8.00 KiB│                 0.00%│
│      abc002     │                2│            0.00%│         8.00 KiB│                 0.00%│
│      abc003     │                2│            0.00%│         8.00 KiB│                 0.00%│
│      abc004     │                2│            0.00%│         8.00 KiB│                 0.00%│
│      abc005     │                1│            0.00%│         4.00 KiB│                 0.00%│
└─────────────────┴─────────────────┴─────────────────┴─────────────────┴──────────────────────┘
The quotas are refreshed daily. All the information is not in real time and may not reflect your real
storage occupation.

A summary of the global occupation is displayed for each disk space, followed by a table with the occupation details of each member of the project.

You can refine the information returned by the idr_quota_project command by adding one or more of the following arguments:

  • --project def to display the occupation of a different project than your active project (def in this example)
  • --space work to display the occupation of one (or more) particular disk space(s) (the WORK in this example)
  • --order storage to display the values of a given column in decreasing order (the STORAGE column in this example)

Complete help for the idr_quota_project command is accessible by launching:

$ idr_quota_project -h

The idrquota command

The idrquota command provides an overall view of the occupation rates of the different disk spaces.

  • The -m option displays the information for the HOME (quotas per user).
  • The -s option displays the information for the STORE (quotas per project).
  • The -w option displays the information for the WORK (quotas per project).
  • The -p <PROJET> option allows specifying the desired project if your login is attached to multiple projects. It must be combined with the -w or -s option but not with the -m option.
  • The -h option displays the command help.

The following are two examples of using idrquota to display the HOME and WORK quotas of the active project (the default choice):

$ idrquota -m
HOME: 2 / 3 Gio (58.24%)
HOME: 23981 / 150000 inodes (15.99%)
$ idrquota -w
WORK: 1293 / 5120 Gio (25.26%)
WORK: 431228 / 500000 inodes (86.25%)

The following are two examples of using idrquota to display the STORE and WORK quotas of the abc project:

$ idrquota -p abc -s
STORE: 7976 / 58368 Gio (13.67%)
STORE: 21900 / 110000 inodes (19.91%)
$ idrquota -p abc -w
WORK: 2530 / 5000 Gio (50.60%)
WORK: 454248 / 550000 inodes (82.59%)

General comments

  • The projects to which you are attached correspond to the UNIX groups listed by the idrproj command.
  • The quotas are not monitored in real time and may not represent the actual occupation of your disk spaces. The idrquota command is updated every 30 minutes while the idr_quota_user and idr_quota_project commands are updated each morning.
  • To know the volume in bytes and in inodes of a given directory (example: my_directory), you can execute the commands du -hd0 my_directory and du -hd0 --inodes my_directory, respectively (see the example after this list). Unlike the “idr_quota” commands, the du commands can take a long time to execute, depending on the size of the directory.
  • For the WORK and the STORE, the displayed occupation rates include both the personal space ($WORK or $STORE) and the occupation of the common space ($ALL_CCFRWORK or $ALL_CCFRSTORE).
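
As an illustration of the du commands mentioned above, for a hypothetical directory my_directory:

$ du -hd0 my_directory            # total size (human-readable) of my_directory
$ du -hd0 --inodes my_directory   # number of inodes (files and directories) in my_directory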

Return to Table of Contents


8. Commands for file transfers

File transfers using the bbftp command

To transfer large files from IDRIS to your laboratory, we advise you to use BBFTP, a software tool optimised for file transfers.

All the information for using the bbftp command is found on our website.
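
As a purely illustrative sketch (not an IDRIS recommendation; the exact host to target and the recommended options are given on that page), retrieving a file from Jean Zay onto your local machine with the standard bbftp options -u (remote login) and -e (control command) could look like:

$ bbftp -u my_idris_login -e 'get /path/on/jean-zay/results.tar ./results.tar' jean-zay.idris.fr   # placeholder login, paths and host name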

File transfers via CCFR network

How do I transfer data between the 3 national centres (CCFR network)?

Introduction

The CCFR (Centres de Calcul Français) network is a dedicated very high speed network which interconnects the three French national computing centres: CINES, IDRIS and TGCC. It is made available to users to facilitate data transfers between the national centres. The machines currently connected to this network are Joliot-Curie at TGCC, Jean Zay at IDRIS, and Adastra at CINES.

Using this network requires that you have logins (different for each centre) in at least two of the three centres and that these logins are authorised to access the CCFR network in the centres concerned.

Comments:

  • For your IDRIS login, the request for access to the CCFR network can be made:
    • when you request the creation of your account on the eDARI portal,
    • or by filling in the section entitled “Access the CCFR network” of the Administration Form for Login Accounts (FGC) and returning it to us from an institutional address. Note that this procedure requires the signatures of the user and of the security manager of your laboratory.
  • Moreover, not all of the Jean Zay nodes are connected to this network. To use it from IDRIS, you can use the front-end nodes jean-zay.idris.fr and jean-zay-pp.idris.fr.

For more information, please contact the User Support Team ().

Data transfers via CCFR network

Data transfers between the machines of the centres via the CCFR network constitute the principal service of this network. A command wrapper, ccfr_cp, accessible via a modulefile, is provided to simplify its use:

$ module load ccfr

The ccfr_cp command automatically retrieves the connection information of the specified machine (domain name, port number) and detects the available authentication methods. By default, the command opts for basic authentication, using the usual methods in force on the targeted machine.
The ccfr_cp command is based on the rsync tool and is configured to use the SSH protocol for the transfers. The copy is recursive and preserves symbolic links, access rights and file modification dates.
The command details and the list of the machines accessible on the CCFR network are available by specifying the -h option of the ccfr_cp command.

For transfers from Jean Zay to the CINES and TGCC machines, you can use commands similar to these:

$ module load ccfr
$ ccfr_cp /path/to/data/on/jean-zay login_cines@adastra:/path/to/directory/on/adastra
$ ccfr_cp /path/to/data/on/jean-zay login_tgcc@irene:/path/to/directory/on/irene

For transfers from Adastra, the procedure is similar except that you must use the machine adastra-ccfr.cines.fr (accessible from adastra.cines.fr), as shown in the CINES documentation.
For transfers from Irene, the procedure is also similar and can be carried out directly from the front-end nodes irene-fr.ccc.cea.fr. After connecting to the machine, the machine.info command gives you all the useful information.

A ccfr_sync command, a variant of ccfr_cp, enables a strict synchronisation between the source and the destination: compared to the ccfr_cp command, it additionally deletes the destination files which are no longer present in the source. The -h option is also available for this command.
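
For example, assuming ccfr_sync accepts the same source and destination arguments as ccfr_cp, mirroring a Jean Zay directory onto Adastra (with deletion of the obsolete destination files) could be written:

$ module load ccfr
$ ccfr_sync /path/to/data/on/jean-zay login_cines@adastra:/path/to/directory/on/adastra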

Remark: These commands use basic authentication with a password, in compliance with the terms and conditions in force at the remote centre (CINES or TGCC). You will therefore most likely be required to provide a password for each transfer. To avoid this, you can use the IDRIS transfer-only certificates (valid for 7 days), whose instructions for use are given on the IDRIS website. Using such certificates requires initiating the transfers from the remote machine (adastra-ccfr.cines.fr, accessible from adastra.cines.fr, for CINES; irene-fr.ccc.cea.fr for TGCC) after having copied the transfer-only certificate onto the remote machine, and building the rsync transfer commands yourself (i.e. without using the ccfr_cp and ccfr_sync wrappers). You can draw inspiration from the following examples to make your transfers:

# Simple copy from jean-zay to remote machine (initiated on remote machine)
# using the transfer-only certificate registered in ~/.ssh/id_ecc_rsync on the remote machine
$ rsync --human-readable --recursive --links --perms --times --omit-dir-times -v \
  -e 'ssh -i ~/.ssh/id_ecc_rsync' \
  login_idris@jean-zay-ccfr.idris.fr:/path/on/jean-zay /path/on/adastra/or/irene
 
# Strong synchronization (--delete option) from jean-zay to remote machine (initiated on remote machine)
# using the transfer-only certificate registered in ~/.ssh/id_ecc_rsync on the remote machine
$ rsync --human-readable --recursive --links --perms --times --omit-dir-times -v --delete \
  -e 'ssh -i ~/.ssh/id_ecc_rsync' \
  login_idris@jean-zay-ccfr.idris.fr:/path/on/jean-zay /path/on/adastra/or/irene

Attention: On adastra-ccfr.cines.fr, the id_ecc_rsync certificate must be visible in your /home/login_cines/.ssh directory so that the ssh command can use it (no environment variable is defined for this disk space). You must therefore take care to unarchive the certificate into this directory with a command like:

login_cines@adastra-ccfr.cines.fr:~$ unzip ~/transfert_certif.zip -d /home/login_cines/.ssh
Archive: /lus/home/.../transfert_certif.zip
inflating: /home/login_cines/.ssh/id_ecc_rsync
inflating: /home/login_cines/.ssh/id_ecc_rsync.pub

Return to Table of Contents


9. The module command

For more information, consult our web page with instructions for using the module command on Jean Zay.
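
As a quick reminder, the most commonly used subcommands of the module command (standard Environment Modules usage, illustrated here with a compiler version appearing elsewhere in this document) are:

$ module avail                         # list the available products
$ module load intel-compilers/19.0.4   # load a product into your environment
$ module list                          # display the currently loaded modules
$ module purge                         # unload all the loaded modules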

Return to Table of Contents


10. Compilation

Jean Zay: The Fortran and C/C++ compilation system (Intel)

$ module avail intel-compilers
----------------------- /gpfslocalsup/pub/module-rh/modulefiles  --------------------------
intel-compilers/16.0.4 intel-compilers/18.0.5 intel-compilers/19.0.2 intel-compilers/19.0.4
 
$ module load intel-compilers/19.0.4
 
$ module list
Currently Loaded Modulefiles:
 1) intel-compilers/19.0.4

$ ifort prog.f90 -o prog
 
$ icc  prog.c -o prog
 
$ icpc prog.C -o prog

Jean Zay: Compilation of an MPI parallel code in Fortran, C/C++

  • Intel MPI:
$ module avail intel-mpi
-------------------------------------------------------------------------- /gpfslocalsup/pub/module-rh/modulefiles --------------------------------------------------------------------------
intel-mpi/5.1.3(16.0.4)   intel-mpi/2018.5(18.0.5)  intel-mpi/2019.4(19.0.4)  intel-mpi/2019.6  intel-mpi/2019.8  
intel-mpi/2018.1(18.0.1)  intel-mpi/2019.2(19.0.2)  intel-mpi/2019.5(19.0.5)  intel-mpi/2019.7  intel-mpi/2019.9
 
$ module load intel-compilers/19.0.4 intel-mpi/19.0.4
  • Open MPI (if you do not need CUDA-aware MPI, choose one of the modules without the -cuda extension):
$ module avail openmpi
-------------------------------------------------------- /gpfslocalsup/pub/modules-idris-env4/modulefiles/linux-rhel8-skylake_avx512 --------------------------------------------------------
openmpi/3.1.4       openmpi/3.1.5  openmpi/3.1.6-cuda  openmpi/4.0.2       openmpi/4.0.4       openmpi/4.0.5       openmpi/4.1.0       openmpi/4.1.1       
openmpi/3.1.4-cuda  openmpi/3.1.6  openmpi/4.0.1-cuda  openmpi/4.0.2-cuda  openmpi/4.0.4-cuda  openmpi/4.0.5-cuda  openmpi/4.1.0-cuda  openmpi/4.1.1-cuda     
 
$ module load pgi/20.4 openmpi/4.0.4

  • Intel MPI:
$ mpiifort source.f90
 
$ mpiicc source.c
 
$ mpiicpc source.C
  • Open MPI:
$ mpifort source.f90
 
$ mpicc source.c
 
$ mpic++ source.C

Jean Zay: Compilation of an OpenMP parallel code in Fortran, C/C++

$ ifort -qopenmp source.f90
 
$ icc -qopenmp source.c
 
$ icpc -qopenmp source.C
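
When the sources are compiled separately, the -qopenmp option must be specified both for each source file containing OpenMP directives and at the link step, as in the following example mixing Fortran and C sources: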

$ ifort -c -qopenmp source1.f
$ ifort -c source2.f
$ icc -c source3.c
$ ifort -qopenmp source1.o source2.o source3.o

Jean Zay: Using the PGI compilation system for C/C++ and Fortran

$ module avail pgi 
---------------- /gpfslocalsup/pub/module-rh/modulefiles ----------------
pgi/19.10  pgi/20.1  pgi/20.4
 
$ module load pgi/19.10
 
$ module list
Currently Loaded Modulefiles:
  1) pgi/19.10

$ pgcc prog.c -o prog
 
$ pgc++ prog.cpp -o prog
 
$ pgfortran prog.f90 -o prog

Jean Zay: Compiling an OpenACC code

The PGI compiling options for activating OpenACC are the following:

  • -acc: This option activates the OpenACC support. You can specify some suboptions:
    • [no]autopar: Activate automatic parallelization for the ACC PARALLEL directive. The default is to activate it.
    • [no]routineseq: Compile all the routines for the accelerator. The default is to not compile every routine for the accelerator.
    • strict: Display warning messages if using non-OpenACC directives for the accelerator.
    • verystrict: Stops the compilation if using any non-OpenACC directives for the accelerator.
    • sync: Ignore async clauses.
    • [no]wait: Wait for the completion of each compute kernel on the accelerator. By default, kernel launches are blocking unless async is used.
    • Example:
      $ pgfortran -acc=noautopar,sync -o prog_ACC prog_ACC.f90
  • -ta: This option activates offloading on the accelerator. It automatically activates the -acc option.
    • In particular, it is used to choose the target accelerator architecture when compiling the code.
    • To use the V100 GPUs of Jean Zay, it is necessary to use the tesla suboption of -ta and the cc70 compute capability. For example:
      $ pgfortran -ta=tesla:cc70 -o prog_gpu prog_gpu.f90
    • Some useful tesla suboptions:
      • managed: Creates a shared view of the GPU and CPU memory.
      • pinned: Activates CPU memory pinning. This can improve data transfer performance.
      • autocompare: Activates comparison of CPU/GPU results.
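
For instance, the tesla suboptions can be combined on the command line (an illustrative example, to adapt to your code):

$ pgfortran -ta=tesla:cc70,managed -o prog_gpu prog_gpu.f90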

Jean Zay: CUDA-aware MPI and GPUDirect

For optimal performance, CUDA-aware OpenMPI libraries supporting GPUDirect are available on Jean Zay.

$ module avail openmpi/*-cuda
-------------- /gpfslocalsup/pub/modules-idris-env4/modulefiles/linux-rhel8-skylake_avx512 --------------
openmpi/3.1.4-cuda openmpi/3.1.6-cuda openmpi/4.0.2-cuda openmpi/4.0.4-cuda
 
$ module load openmpi/4.0.4-cuda

$ mpifort source.f90
 
$ mpicc source.c
 
$ mpic++ source.C

No particular compilation option is necessary here; for more information on compiling code which uses the GPUs, you may refer to the GPU compilation section of the index.

Adaptation of the code

Using the CUDA-aware MPI and GPUDirect functionality on Jean Zay requires respecting a precise initialisation order of CUDA or OpenACC and MPI in the code:

  1. Initialisation of CUDA or OpenACC
  2. Choice of the GPU which each MPI process should use (binding step)
  3. Initialisation of MPI.

Caution: if this initialisation order is not respected, your code execution might crash with the following error:

CUDA failure: cuCtxGetDevice()

Return to Table of Contents


11. Code execution

Interactive and batch

There are two possible job modes: Interactive and batch.

In both cases, you must respect the maximum limits in elapsed (or clock) time, memory, and number of processors and/or number of GPUs which are set by IDRIS with the goal of better managing the computing resources. You will find more complete information concerning these limits by consulting the following pages on our Web server: CPU Slurm partitions, GPU Slurm partitions and the pages detailing how to reserve memory for CPU jobs or for GPU jobs.

Interactive jobs

From the machines declared in the IDRIS filters, you have SSH access to the front ends from which you can:

Comment: Code requiring GPUs cannot be executed on the front ends as these are not equipped with any.

Batch jobs

There are several reasons to work in batch mode:

  • Having the possibility of closing the interactive session after submitting a batch job.
  • Having the possibility of going beyond the limits set for interactive use in elapsed (or clock) time, memory, and number of processors or GPUs.
  • Doing the computations with dedicated resources (these resources are reserved for you alone).
  • Allowing better resource management for all users, by distributing the jobs on the machine according to the requested resources.
  • Launching your pre-/post-processing jobs on the nodes dedicated to large memory (jean-zay-pp).

At IDRIS, we use Slurm software for batch job management on the compute nodes, the pre-/post-processing nodes (jean-zay-pp) and the visualisation nodes (jean-zay-visu).

This batch manager controls the scheduling of jobs according to resources requested (memory, elapsed (or clock) time, number of CPUs, number of GPUs, …), the number of active jobs at a given moment (number in total and number per user) and the number of hours consumed per project.

There are 2 essential steps in order to work in batch: job creation and job submission.

Job creation

This step consists of writing all the commands that you want executed into a file and then adding, at the beginning of the file, the Slurm submission directives for defining certain parameters such as:

  • Job name (directive #SBATCH --job-name=...)
  • Elapsed time limit for the entire job (directive #SBATCH --time=HH:MM:SS)
  • Number of compute nodes (directive #SBATCH --nodes=...)
  • Number of (MPI) processes per compute node (directive #SBATCH --ntasks-per-node=...)
  • Total number of (MPI) processes (directive #SBATCH --ntasks=...)
  • Number of OpenMP threads per process (directive #SBATCH --cpus-per-task=...)
  • Number of GPUs for jobs using GPUs (directive #SBATCH --gres=gpu:...)

Once the submission directives have been defined, it is recommended to enter the commands in the following order:

  • Go into the execution directory under WORK, SCRATCH or JOBSCRATCH (for more information, see our documentation about the disk spaces).
  • Copy the input files needed for the execution into this directory.
  • Launch the execution (via the srun command for the MPI, hybrid or multi-GPU codes).
  • If you have used the SCRATCH or the JOBSCRATCH, you should copy the result files which you wish to save.
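
Putting these elements together, here is a minimal sketch of an MPI submission script (the job name, the requested resources, the directories, the input file and the executable name my_exe are placeholders to adapt to your own case):

#!/bin/bash
#SBATCH --job-name=my_mpi_job        # job name
#SBATCH --nodes=1                    # number of compute nodes
#SBATCH --ntasks-per-node=40         # number of MPI processes per node (illustrative value)
#SBATCH --time=01:00:00              # elapsed time limit (HH:MM:SS)

# go into an execution directory under the WORK
cd $WORK/my_run_directory

# copy the input files needed for the execution (hypothetical file)
cp $WORK/inputs/my_input.nml .

# load the same environment as the one used for the compilation
module purge
module load intel-compilers/19.0.4 intel-mpi/19.0.4

# launch the execution
srun ./my_exe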

Comments:

  • With the Slurm directive #SBATCH --cpus-per-task=..., which sets the number of threads per process, you also set the quantity of memory available per process. For more information, please consult our documentation on the memory allocation of a CPU job and/or of a GPU job.
  • Detailed examples of jobs are available on our Web site in the sections entitled “Execution/commands of a CPU code” and “Execution/commands of a GPU code”.

Job submission

To submit a batch job (here, Slurm script mon_job), you must use the following command:

$ sbatch mon_job

Your job will be placed in a partition according to the values requested in the Slurm directives. We advise you to set the parameters concerning the number of CPUs/GPUs and the elapsed time as accurately as possible, in order to obtain the shortest possible job turnaround time.
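
Once the job is submitted, it can be followed and, if necessary, cancelled with the standard Slurm commands, for example (123456 being a hypothetical job identifier returned by sbatch):

$ squeue -u $USER    # list your pending and running jobs
$ scancel 123456     # cancel the job with identifier 123456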

Comments:

  • For monitoring and managing your batch jobs, you should use the Slurm commands.
  • In batch mode, the user cannot intervene during the execution of commands except to stop/kill the job. Consequently, file transfers must be done without using/requiring a password.
  • The compute nodes have no access to the Internet which prevents all downloading (Git repositories, Python/Conda installation, …) from these nodes. If needed, these downloads can be done from the front ends or from the pre-/post-processing nodes, either before the code execution or via the submission of cascade jobs.
  • If you want to execute a job from the pre-/post-processing machine (jean-zay-pp), you must use the Slurm directive shown below in the submission script:
    #SBATCH --partition=prepost

    If this submission directive is absent, the job will execute in the default partition, thus on the compute nodes.

For any problem, please contact the IDRIS User Support Team.

Return to Table of Contents


12. Training courses offered at IDRIS

IDRIS training courses

IDRIS provides training courses for its own users as well as for others who use scientific computing. Most of these courses are included in the CNRS continuing education catalogue CNRS Formation Entreprises, which makes them accessible to all users of scientific computing in both the academic and industrial sectors.

These courses are principally oriented towards the methods of parallel programming: MPI, OpenMP and hybrid MPI/OpenMP, the keystones for using the supercomputers of today. Since 2021, courses concerning Artificial Intelligence have also been offered.

A catalogue of scheduled IDRIS training courses, regularly updated, is available here: IDRIS trainings catalogue

These courses are free of charge if you are employed by either the CNRS (France's National Centre for Scientific Research) or a French university. In these cases, enrollment is done directly on the web site IDRIS courses. All other users should enroll through CNRS Formation Entreprises.

IDRIS training course materials are available on line here.

Return to Table of Contents


13. IDRIS documentation

  • The Web site: IDRIS maintains a regularly updated website, www.idris.fr/eng/, grouping together all of our documentation (IDRIS news, machine functioning, etc.).
  • Manufacturer documentation: Access to complete manufacturer documentation concerning the compilers (f90, C and C++), the scientific libraries, the message-passing libraries (MPI), etc.
  • The manuals: Access to all the Unix manuals with your user login on any of the IDRIS machines by using the man command.

Return to Table of Contents


14. User support

Contacting the User Support Team

Please contact the User Support Team for any questions, problems or other information regarding the usage of the IDRIS machines. This assistance is jointly provided by the members of the HPC support team and AI support team.

The User Support Team may be contacted directly:

  • By telephone at +33 (0)1 69 35 85 55 or by e-mail at
  • Monday through Thursday, 9:00 a.m. - 6:00 p.m. and on Friday, 9:00 a.m. - 5:30 p.m., without interruption.

Note: During certain holiday periods (e.g. Christmas and summer breaks), the support team staffing hours may be reduced as follows:
Monday through Friday, 9:00 a.m. - 12:00 (noon) and 1:30 p.m. - 5:30 p.m.
Outside of regularly scheduled hours, the answering machine of the User Support Team phone will indicate the currently relevant staffing hours (normal or holiday).

Administrative management for IDRIS users

For any problems regarding passwords, account opening or access authorisation, or for sending us the forms for account opening or management, please send an e-mail to: