IDRIS: GENERAL INTRODUCTION



1. Introduction to IDRIS

The Missions and Objectives of IDRIS

The Institute for Development and Resources in Intensive Scientific Computing (IDRIS), founded in 1993, is a centre of excellence in intensive numerical calculations which serves the research branches of extreme computing. This concerns the application aspects (large-scale simulations) as much as the research inherent to high performance computation (calculation infrastructures, resolution methods and associated algorithms, processing of large data volumes, etc.).

IDRIS is the main centre for very high performance intensive numerical computing of the French National Centre for Scientific Research (CNRS). Together with the two other national centres, CINES (the computing centre for the French Ministry of Higher Education and Research) and the Very Large Computing Center (TGCC) of the French Alternative Energies and Atomic Energy Commission (CEA), and coordinated by GENCI (Grand Équipement National de Calcul Intensif), IDRIS participates in providing national computing resources for government-funded research which requires extreme computing means.

Management of Scientific Resources

Coordinated by GENCI (Grand Équipement National de Calcul Intensif), a call for proposals is launched during the fourth quarter of each year for the allocation of computing resources for the following year at the three national computing centres (CINES, IDRIS and TGCC).

Requests for resources are made with the DARI (Demande d'Attribution de Ressources Informatiques) form through a common web site for the group of computing centres (see Requesting resource hours on IDRIS machines).

The IDRIS User Committee

The role of the User Committee is to be a liaison for dialogue with IDRIS so that all the projects which received allocations of computer resources can be conducted successfully in the best possible conditions. The committee transmits the observations of all the users regarding the functioning of the centre, and the issues are discussed with IDRIS in order to determine the appropriate changes to be made.

For more information, consult the corresponding web page.

IDRIS personnel

Organisation

IDRIS is organised into teams coordinated by the director in the following manner:

IDRIS Organisation chart



2. The IDRIS machines

HPC resources

CNRS/IDRIS computing infrastructure

Computer                 Number of cores  Memory  Peak performance
Turing: IBM Blue Gene/Q  98,304           96 TiB  1.258 Pflop/s
Ada: IBM x3750M4         10,624           46 TiB  230 Tflop/s

Archive server                           Local disk space  Available space on tapes
Ergon: IBM solution + StorageTek SL8500  2 PiB             10 PiB

Turing: Introduction

Turing is the massively parallel architecture of IDRIS. It is an IBM Blue Gene/Q machine which is particularly well balanced (network, I/O, computing power and memory bandwidth) and remarkably energy efficient (2.17 Gflop/s per watt).

Hardware characteristics of Turing:

  • 6 racks, each containing 1,024 compute nodes with 16 cores per node, for a total of 98,304 cores
  • Cumulated peak performance of 1.258 Pflop/s (42nd in the TOP500 worldwide ranking of November 2014)
  • 96 TiB of total memory (1 GiB per execution core)
  • GPFS parallel file system (WORKDIR) shared by Ada, Adapp and Turing, with 50 GiB/s bandwidth in write and in read.

The codes which run on Turing must have a sufficiently high degree of parallelism (ideally, at least 512 execution cores) and must not exceed the memory limit of 1 GiB per core. In production, up to 4 racks (65,536 execution cores) can be accessed. Specific classes (of 1 to 64 compute nodes) are available for the development or code porting phases. You may consult the batch limits of the class structures for Turing.
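The figures quoted for Turing follow directly from the rack, node and core counts. The short sketch below simply redoes that arithmetic in bash; the variable names are illustrative and not part of any IDRIS tool.

```shell
#!/bin/bash
# Hardware figures taken from the list above.
racks=6
nodes_per_rack=1024
cores_per_node=16
mem_per_core_gib=1

# Whole machine: cores and memory.
total_cores=$(( racks * nodes_per_rack * cores_per_node ))
total_mem_tib=$(( total_cores * mem_per_core_gib / 1024 ))

# Largest production configuration (4 racks) and largest development class (64 nodes).
production_cores=$(( 4 * nodes_per_rack * cores_per_node ))
dev_class_max_cores=$(( 64 * cores_per_node ))

echo "total cores:           $total_cores"
echo "total memory (TiB):    $total_mem_tib"
echo "production cores max:  $production_cores"
echo "development cores max: $dev_class_max_cores"
```

The same arithmetic gives a quick feasibility check for a code: a job needing more than 1 GiB per core, or fewer than a few hundred cores, is probably better suited to Ada.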

For more information concerning the available software or libraries, the architecture, and the usage of the Turing machine resources, see here.

Ada: Introduction

Ada is the IDRIS computer with the most wide-ranging usage. It is composed of large memory SMP nodes (IBM x3750-M4) interconnected by a high-speed InfiniBand network.

Hardware characteristics of Ada:

  • 332 x3750-M4 compute nodes; quad-socket nodes consisting of 4 Intel Sandy Bridge E5-4650 8-core processors at 2.7 GHz, corresponding to 32 cores per node
    • 304 nodes with 128 GiB of memory (4 GiB/core)
    • 28 nodes with 256 GiB of memory (8 GiB/core)
  • Cumulated peak performance of 230 Tflop/s
  • InfiniBand FDR10 Mellanox network (2 links per node)
  • GPFS parallel file system (WORKDIR) shared by Ada, Adapp and Turing, with 50 GiB/s bandwidth in write and in read

The Ada machine is suited to a wide variety of codes which may require a large amount of memory, from sequential applications to multithreaded codes (OpenMP) or codes (MPI or hybrid) with a moderate degree of parallelism (from hundreds to a few thousand execution cores). In production use, the accessible resources can reach up to 2,048 execution cores. You may consult the interactive and batch class limits for Ada at class structures.
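As a cross-check of the Ada figures, the sketch below recomputes the node and core counts and the memory available per core from the hardware list above. It is plain bash arithmetic; the variable names are illustrative.

```shell
#!/bin/bash
# Hardware figures taken from the list above.
nodes_128gib=304
nodes_256gib=28
cores_per_node=32

total_nodes=$(( nodes_128gib + nodes_256gib ))
total_cores=$(( total_nodes * cores_per_node ))

# Memory per core on the two node types.
mem_per_core_standard=$(( 128 / cores_per_node ))
mem_per_core_large=$(( 256 / cores_per_node ))

# Number of full nodes corresponding to the 2,048-core production limit.
max_job_nodes=$(( 2048 / cores_per_node ))

echo "nodes: $total_nodes, cores: $total_cores"
echo "GiB per core: $mem_per_core_standard (standard), $mem_per_core_large (large memory)"
echo "largest production job: $max_job_nodes full nodes"
```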

For more details concerning the available software or libraries, the architecture or the usage of this machine's resources, see here.

Adapp: Introduction

Adapp is the architecture at IDRIS dedicated to intensive pre- and post-processing and to the handling of very large volumes of data. Adapp must only be used for jobs with the following characteristics:

  • Performing many I/Os (e.g. re-combining files)
  • Requiring a large amount of memory (e.g. mesh generating and partitioning)

This pre-/post-processing Adapp architecture contains only 4 compute nodes, each one having:

  • Four 8-core Intel Westmere processors at 2.67 GHz (or 32 cores)
  • 1 TB of memory
  • 8 internal disks of 600 GB

Consequently, the use of these nodes is subject to the following restrictions:

  • Maximum Elapsed time of 20h for both sequential and parallel jobs
  • Maximum reservation of 32 cores (or 1 compute node)
  • Maximum of 100 GB of memory for a sequential execution or 30 GB per reserved core in parallel classes (MPI, OpenMP or hybrid)

Interactive and batch limits may be consulted at class structures for Ada/Adapp.

Please note that you only have to add the keyword
# @ requirements = (Feature == "PREPOST")
for your job to be submitted on Adapp. However, to be able to run, it must respect the limits indicated above (elapsed time, number of cores and memory); otherwise it will remain queued.
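To show where the keyword sits in practice, here is a minimal sketch of a sequential job script for Adapp. Only the requirements line comes from this page. The other directives are standard LoadLeveler keywords chosen as assumptions, and the job name and file names are hypothetical. The wall-clock value is set to the 20-hour maximum stated above.

```shell
#!/bin/bash
# LoadLeveler directives are read by the batch system; bash sees them as comments.
# Every keyword except "requirements" is an assumption, as are the job and file names.
# @ job_name = prepost_demo
# @ job_type = serial
# @ output = $(job_name).$(jobid).out
# @ error = $(job_name).$(jobid).err
# @ wall_clock_limit = 20:00:00
# @ requirements = (Feature == "PREPOST")
# @ queue

# Placeholder for the actual pre- or post-processing commands.
job_message="pre/post-processing step running in ${PWD}"
echo "$job_message"
```

Assuming the usual LoadLeveler tooling, such a script would be submitted with llsubmit. Check the class structures page for the limits that apply to your job.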

From Adapp (interactive or batch), the Ergon archive server HOME is directly accessible in read/write via the $ARCHIVE environment variable (thanks to a GPFS setup).

For more information concerning the available software or libraries, the architecture or the usage of the Adapp resources, see here.

Ergon: Introduction

Ergon is the archive server at IDRIS. It is a hierarchical storage system: file storage on disk (2 PB) followed by automatic migration to tapes (10 PB), archiving data for the medium and long term throughout the duration of a project.

Ergon is an IBM solution composed of 13 servers including:

  • 3 front-ends, each with 2 processors (Intel Xeon E5-2650) ⇒ 16 cores @2.6 GHz and 128 GB of memory.
  • 6 GPFS file servers (4 data servers and 2 metadata servers) associated with 2 GSS26 disk enclosures with a bandwidth of 12 GB/s and a total available capacity of 2 PB.
  • 2 TSM HSM servers to manage the robotics and migration of files to tapes.

Storage: The StorageTek SL8500 robot is equipped with 16 tape drives and contains 6,300 tapes, offering a capacity of 10 PB. The maximum capacity of the robot is 10,000 tapes.

For more information concerning the architecture and utilisation of the Ergon resources, see here.



3. Requesting allocations of hours on IDRIS machines

Requesting resource hours at IDRIS

Requesting resource hours for IDRIS machines is done via the DARI site which is common to the three national computing centres, CINES, IDRIS and TGCC: www.edari.fr.

The campaign for the submission of requests for the A5 allocation has been closed since Wednesday, 22 August 2018.

This A5 call for projects was for the following allocations:

  • The A5 hours allocation: Accessible to projects having submitted a request for the A3 allocation (or before) and to new projects. These hours are usable for one year, from the beginning of November 2018 to the end of October 2019.
  • The A4 complementary hours allocation: Accessible to projects having obtained hours in the A4 allocation. These hours are usable for 6 months, from the beginning of November 2018 to the end of April 2019.

For more detailed information, see the GENCI explanatory document: Attribution d'heures de calcul.

In addition, throughout the year, you have the possibility of requesting the following resources (using the DARI site www.edari.fr):

  • Supplementary resources as needed (“demandes au fil de l'eau”) for existing projects which have used up their quota of hours during the year (outside of an allocation campaign).
  • Preparatory access (“accès préparatoire”) for those who do not have an existing project at IDRIS or who have a project on only one of the computing machines (Turing or Ada).
    After a request for preparatory access is approved, the resources granted are 50,000 hours on Turing and/or 15,000 hours on Ada.
    Access is granted for a period of 6 months beginning when the account(s) are opened.

Complete information for submitting requests for hours allocations is found on the DARI help page.



4. How to obtain an account at IDRIS

Account management: opening and closure of accounts

Opening a user account

After a project has been approved and obtained hours on one of the IDRIS computers, it is given an identifying “project number”.
Each user in the project must individually request the opening of his/her account: There is no automatic or implicit account opening.

For a new project

Each person wishing to open a project account at IDRIS must complete the GENCI Account Creation Request form, whether or not he/she already has an account at IDRIS (choose the English language version on the GENCI page).

Attention: In application of the regulatory measures for the protection of the national scientific and technical potential (PPST), the creation of a new account may require a French ministerial (MESR) authorization. The decision about whether to pursue this authorization is made by the IDRIS Director or the CNRS Defence and Security Officer. Considering that the authorization processing period at the ministry may take up to two months, a personal communication will be transmitted to the project manager and the concerned user(s) in order to begin implementation of the required procedure.

For a project renewal

Accounts already existing on the previously accessible compute machines are automatically carried over from one allocation session (project call) to another, if the eligibility conditions of the project members have not changed (cf. GENCI explanatory document "Attribution d'heures de calcul", section 3. “Conditions d’éligibilité”).

  • If your account is open and the project is granted hours in the following project call on the same compute machines as in the preceding project call, no action on your part is necessary.
  • If the renewed project is granted hours on a compute machine to which the project was NOT given access during the preceding allocation, ALL the users of this project must request the opening of an account on this new machine by using the FGC form (section: “Extension of a user account to a supplementary machine”).
  • You may make necessary modifications on an existing account (addition/removal of IP addresses or changes in postal address, phone number, employer, etc.) by using the FGC form.

How to transmit forms to IDRIS

All forms must be transmitted to IDRIS by e-mail to the following address: .

Closure of a user account

An account can only exist as “open” or “closed”:

  • Open. In this case, it is possible to:
    • Submit pre- and post-processing jobs on Adapp.
    • Submit bonus jobs on the compute machines if the project still has bonus hours remaining (cf. the last lines of the 'cpt' command output).
    • Submit jobs on the compute machines if the project's current hours allocation has not been exhausted (cf. 'cpt' command output).
  • Closed. In this case, the user can no longer connect. An e-mail notification is sent to the project manager and to the user at the time of account closure.

Attention: After an account is closed, the files can be deleted at any time by IDRIS.

Account closure per request of the project manager

A project manager may use the FGC form to request an account closure.
It is the responsibility of the project manager to close the account of a project member who is no longer working on the subject matter of the project or who, having changed employer, is no longer paid by a French research organisation (cf. GENCI Attribution d'heures de calcul, section 3: “Conditions d’éligibilité”).

ATTENTION:

  • The request for an account closure concerns all the machines for which this account was opened.
  • Account closure will result in the destruction of all the account files, at the initiative of IDRIS, after an unspecified delay.
  • Only the project manager may request the copying of these files to another account (to be requested via the FGC form). It is indispensable, in this case, to verify that the available space is sufficient for the copied files (project disk quotas on the servers).
  • Whether the files are copied to another account or not, only the project manager may request that the account files be immediately destroyed during the account closure in order to, for example, liberate the disk quotas (request “immediate purge” on the FGC form).

Account closure by IDRIS

Account closure of an unrenewed project

When a GENCI project is not renewed, the following procedure is applied (regardless of the project's lifespan):

  • On the date of project expiry, DARI and bonus hours are no longer available and the project accounts can no longer submit jobs on the compute machines (except for Adapp). However, the project accounts remain open for a period of six months.
  • After this six-month delay period, all the project accounts on all the servers are closed.

File recovery, by transferring files to a local laboratory machine, is the responsibility of each user during the six months following the end of an unrenewed project.

This six-month period prevents the premature closure of project accounts. This is the case, for example, of a project with an allocation Ai which was not renewed for the following year (allocation Ai+2) because it did not need more computing hours at the end of period Ai, but which is renewed for the allocation Ai+3 (which begins at the end of the six-month period).

Account closure following expiry of the ministerial access authorisation for IDRIS computer resources

The first notification of impending expiry is sent out 90 days before the expiry date and a second, 70 days before the expiry; the account is then closed on the expiry date. To avoid this account closure, the user is advised to submit to IDRIS a new GENCI Account Creation Request form (choose the English version on the GENCI page) as soon as the first notification is received, so that IDRIS may begin processing a request to extend the access.

Account closure for security reasons

An account may be closed at any moment by decision of the IDRIS management.

Declaring the machines from which a user connects to IDRIS

Each machine from which a user wishes to access an IDRIS computer must be registered at IDRIS.

The user must provide, for each of his/her accounts, a list of machines which will be used to connect to the IDRIS computers (the machine's name and IP address). This is done at the creation of each account via the GENCI account creation request form available at this portal.

The user must update the list of machines associated with a login account (adding/deleting) by using the FGC form (account administration form). After completing this form, it must be signed by both the user and the security manager of the laboratory.

Important note: Personal IP addresses are not authorised for connection to IDRIS machines.

Security manager of the laboratory

The laboratory security manager is the network/security intermediary for IDRIS. This person must guarantee that the machine from which the user connects to IDRIS conforms to the most recent rules and practices concerning information security and must be able to immediately close the user access to IDRIS in case of a security alert.

The security manager's name and contact information are transmitted to IDRIS by the laboratory director on the FGC (account administration) form. This form is also used for informing IDRIS of any change in the security manager.

Obtaining temporary access to IDRIS machines from a foreign country

A user travelling abroad must request machine authorisation by completing the corresponding box on page 3 of the FGC form. Temporary ssh access to all the IDRIS machines is then granted.



5. How to connect to an IDRIS machine

How to connect to IDRIS: Overview

In order to connect to IDRIS machines, the remote machine must be registered in the IDRIS network filters. If this is not the case, consult the machine registration procedure here.

The Ada, Adapp, Turing and Ergon machines are accessed using the same unique password per user.

Interactive access to IDRIS machines is only possible via the ssh protocol.

Before connecting, we advise you to consult the page about passwords here.

Connecting to IDRIS with SSH and password

Connecting to any of the IDRIS machines from a local machine

You can connect to any of the IDRIS machines using SSH. The corresponding command is:

$ ssh machine-idris.idris.fr -l rlabxxx 

When connecting to the front ends (Adapp, Ada and Turing), you can enable X11 forwarding with the -X option:

$ ssh -X adapp.idris.fr -l rlabxxx

$ ssh -X ada.idris.fr -l rlabxxx
  
$ ssh -X turing.idris.fr -l rlabxxx
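The three commands above differ only in the machine name, so they can be generated by one small helper. The function below is a hypothetical convenience, not an IDRIS tool. It only prints the command for one of the three front ends named on this page and rejects any other name.

```shell
#!/bin/bash
# Hypothetical helper: print the ssh command for one of the front ends listed above.
idris_ssh_command() {
    local machine=$1 login=$2
    case "$machine" in
        ada|adapp|turing)
            printf 'ssh -X %s.idris.fr -l %s\n' "$machine" "$login"
            ;;
        *)
            echo "unknown front end: $machine" >&2
            return 1
            ;;
    esac
}

# rlabxxx is the placeholder login used on this page.
idris_ssh_command adapp rlabxxx
```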

Passwords

Connecting to an IDRIS machine is done with a user login and an associated password. Ada, Adapp, Turing and Ergon all use one and the same password per login account.

During the first connection, the user must indicate the “initial password” and then immediately change it to an “actual password”.

The initial password

What is the initial password?

The initial password is the result of the concatenation of two passwords (respecting the order):

  1. The first part consists of a randomly generated password from IDRIS which is sent to you by e-mail during the account opening or during a reinitialisation of your password. It remains valid for 20 days.
  2. The second part consists of the user-chosen password (8 alphanumeric characters) which you provided on the “Account creation request form (GENCI)” during your first account opening request (if you are a new user) or when requesting a change in your initial password (using the FGC form).
    Note: For a user with a previously opened login account created in 2014 or before, the password indicated in the last postal letter from IDRIS should be used.

The initial password must be changed within 20 days following transmission of the randomly generated password.
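The order of concatenation is the point most easily missed. The sketch below illustrates it with dummy values only. Both strings are hypothetical stand-ins, and real passwords should never be stored in a script or typed on a command line.

```shell
#!/bin/bash
# Illustration only, with dummy values: never store real passwords in scripts or shell history.
random_part="Xy7kQ2pL"   # hypothetical stand-in for the part e-mailed by IDRIS
user_part="abc12345"     # hypothetical stand-in for the 8-character part chosen on the form

# Order matters: the IDRIS-generated part comes first, the user-chosen part second.
initial_password="${random_part}${user_part}"
echo "initial password length: ${#initial_password}"
```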

Changing the initial password to an “actual password” (which is entirely created by you) is triggered automatically during your first connection to an IDRIS machine (see below: "Example of changing an initial password to an actual password during the first connection"). If this first connection is not made within the 20-day timeframe, the initial password is invalidated and an e-mail is sent to inform you. In this case, you simply need to request, by e-mail, a new randomly generated password, which is then sent to you by e-mail.

An initial password is generated (or re-generated) in the following cases:

Account opening (or reopening)

An initial password is formed at the creation of each account and also for the reopening of a closed account.

Loss of the actual password

  • If you have lost your actual password, you must contact IDRIS to request the re-generation of a randomly generated password, which is then sent to you by e-mail. You will also need the user-chosen part of the password which you previously provided in the FGC form.
  • If you have also lost the user-chosen part of the password which you previously provided in the FGC form (or which was contained in the postal letter from IDRIS in the former procedure of 2014 or before), you must complete the “Request to change the user part of initial password” section of the FGC form, print and sign it, then scan and e-mail it, or send it by postal mail, to IDRIS. You will then receive an e-mail containing a new randomly generated password.

Example of changing an initial password to an actual password during the first connection

You would like to use your initial password (and you do not have the ssh key to connect) and then change it to an “actual password”. Below is an example of the first connection and creating the “actual password” for the login1 account on Adapp.

Recommendation: Before beginning the procedure, carefully prepare the new password which you will enter (see Creation rules for "actual passwords" in section below).
Important note: The initial password will be requested two times.

$ ssh login1@adapp                                                         
login1@adapp's password:                ##Enter your INITIAL PASSWORD##
Last login: Fri Nov 28 10:20:22 2014 from machine.idris.fr
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for user login1.
Enter login(    ) password:              ##Re-enter your INITIAL PASSWORD##    
Enter new password:                      ##Enter your new password##
Retype new password:                     ##Re-enter your new password##
     password information changed for login1
passwd: all authentication tokens updated successfully.
Connection to adapp closed.
$ 

You will be immediately disconnected after entering a correct actual password (“all authentication tokens updated successfully”). You may now re-connect using your new actual password.

The actual password

Once your actual password has been created and entered correctly, it will remain valid for one year (365 days).

How to change your actual password

As each login account has its own unique password, you can change your password at any time by using the UNIX command passwd directly on any of the computers. The change is taken into account immediately for all the machines. This new actual password will remain valid for one year following its creation.

Creation rules for "actual passwords"

  • It must contain a minimum of 12 characters.
  • The characters must belong to at least 3 of the 4 following groups:
    • Uppercase letters
    • Lowercase letters
    • Numbers
    • Special characters
  • The same character may not be repeated more than 2 times consecutively.
  • A password must not be composed of words from dictionaries or from trivial combinations (1234, azerty, …).
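The first three rules are mechanical and can be checked before you connect. The function below is a minimal sketch of such a check and is not the validation actually performed by the IDRIS machines. Its name is hypothetical, "special characters" is assumed to mean anything that is neither a letter nor a digit, and the dictionary rule is not covered.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the candidate satisfies the first three rules above.
check_actual_password() {
    local pw=$1 groups=0 run=1 i

    # Rule 1: at least 12 characters.
    if (( ${#pw} < 12 )); then return 1; fi

    # Rule 2: characters from at least 3 of the 4 groups.
    if [[ $pw =~ [[:upper:]] ]]; then groups=$(( groups + 1 )); fi
    if [[ $pw =~ [[:lower:]] ]]; then groups=$(( groups + 1 )); fi
    if [[ $pw =~ [[:digit:]] ]]; then groups=$(( groups + 1 )); fi
    if [[ $pw =~ [^[:alnum:]] ]]; then groups=$(( groups + 1 )); fi   # assumed meaning of "special"
    if (( groups < 3 )); then return 1; fi

    # Rule 3: the same character at most 2 times in a row.
    for (( i = 1; i < ${#pw}; i++ )); do
        if [[ ${pw:i:1} == "${pw:i-1:1}" ]]; then
            run=$(( run + 1 ))
            if (( run > 2 )); then return 1; fi
        else
            run=1
        fi
    done
    return 0
}

# Dummy candidate, for illustration only.
if check_actual_password 'xK7#mq2Lw9!z'; then
    echo "candidate satisfies the mechanical rules"
fi
```

A candidate which passes this check can still be rejected by the machine, for example because it is based on a dictionary word or matches a recently used password.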

Note:

  • Your actual password is not modifiable on the same day as its creation or for the 5 days following its creation. Nevertheless, if necessary, you may contact the User Support Team to request a new randomly generated password for the re-creation of an initial password.
  • A record is kept of the last 6 passwords used. Reusing one of the last 6 passwords will be rejected.

Password expiry

If, despite the warning e-mails sent to you, you have not changed your actual password before its expiry date (i.e. one year after its last creation), your password will be invalidated. As in the case of losing an actual password, you must contact IDRIS to request the re-generation of a randomly generated password, which is then sent to you by e-mail. You will also need the user-chosen part of the password which you previously provided in the FGC form.

Account blockage following 15 unsuccessful connection attempts

If your account has been blocked as a result of 15 unsuccessful connection attempts, you must contact the IDRIS User Support Team.

Account security reminder

You must never write out your password in an e-mail sent to IDRIS (User Support, Gestutil, etc.) no matter what the reason: We would be obligated to immediately generate a new initial password, the objective being to inhibit the actual password which you published and to ensure that you define a new one during your next connection.

Each account is strictly personal. Discovery of account access by an unauthorised person will cause immediate protective measures to be taken by IDRIS, including the possible blocking of the account.
The user must take certain basic common sense precautions:

  • Inform IDRIS immediately of any attempted trespassing on your account.
  • Respect the recommendations for using SSH keys.
  • Protect your files by limiting UNIX access rights.
  • Do not use a password which is too simple.
  • Protect your personal work station.
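For the point about UNIX access rights, the sketch below restricts a directory and a file to their owner and then reads the modes back. The directory and file names are hypothetical and created in a temporary location, and stat -c is the GNU coreutils form, which is an assumption about the system.

```shell
#!/bin/bash
# Hypothetical directory and file, created in a temporary location for the demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir/results.dat"

chmod 700 "$demo_dir"               # owner only: read, write, enter
chmod 600 "$demo_dir/results.dat"   # owner only: read, write

# Read the modes back (GNU stat syntax, an assumption about the system).
dir_mode=$(stat -c '%a' "$demo_dir")
file_mode=$(stat -c '%a' "$demo_dir/results.dat")
echo "directory mode: $dir_mode, file mode: $file_mode"

rm -rf "$demo_dir"
```

Setting umask 077 in your environment file has the same effect for everything you create afterwards. Files which must be shared within the project group would need group permissions instead.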



6. Management of your account and your environment variables

Managing your user account

How do I modify my personal contact information?

Modification of your personal contact information is done on the IDRIS extranet site.

  • If you do not have an Extranet password or have lost it, the access procedures to Extranet are explained on this page.
  • If you have your Extranet password, connect with your identifiers, click on Extranet, then: Votre compte (Your account) ⇒ Vos données (Your information) ⇒ Coordonnées (Contact information).

The only information which can be modified on Extranet is your:

  • e-mail address
  • telephone number
  • fax number

To modify your postal address, you must send us a completed FGC form.

How do I consult my consumption of hours?

We advise you to use the Extranet interface. Complete information can be found on this page.

What should I do when all my computing hours will soon be consumed?

To allow the continuation of projects which have consumed the volume of hours allocated to them, it is possible throughout the year, to make a request for “supplementary resources”. For more information, consult this page.

Is it possible to know machine availability?

You may obtain this information by connecting on Adapp. The scheduled times for the machines to be stopped and their availability in real time are displayed at the top of the screen just after the connection.

If there is an important incident, a message will also be displayed on an IDRIS Web site page which is available without a password.

How do I recover files which I accidentally deleted?

In order to recover deleted files, consult the file recovery procedure which uses the TINA software. This procedure (illustrated for Adapp) is the same on Adapp, Turing and Ada. Therefore, you simply follow the file recovery procedure on the machine where you lost or deleted your files.

Is it possible to ask IDRIS to transfer files from one user account to another user account?

Only the project manager may make this request. The request must be made by a signed fax or postal mail and must indicate:

  • The name(s) of the concerned machine(s).
  • The two login accounts (the source login and the destination login).
  • The list of files and/or directories to be transferred.

Following the transfer, the project manager may request closure of the source login account. See Account management: opening and closure of accounts for more information about account closure.

Recovering files onto external media

It is no longer possible to recover files onto external media (e.g. portable disk drives).

Turing, Ada, Adapp: Access and shells

Management of the shell environment

What shells are available on the IDRIS machines?

The Bourne-Again Shell (bash) and the TC Shell (tcsh) are the two command interpreters installed on the IDRIS machines. The bash shell is the default setting. The Bourne-Again Shell (bash) is an important evolution of the former Bourne shell (sh) and provides advanced functionalities. Therefore, we highly recommend using bash on the IDRIS machines.

What environment files are executed during the launching of a login session in bash?

The preferred environment file is .bash_profile and it must be found in your HOME. If not, the .profile file may be used if it is in your HOME. This file is automatically executed at the login and only one time per session. The user must define the environment variables and personal programs in this file. Any aliases and user-defined shell functions should be put in the .bashrc file which is run at the launching of each non-login sub-shell.

It is preferable to use only one environment file, either .bash_profile or .profile.

Important note: Overwriting the PATH variable inevitably creates major problems. For this reason, you are always advised to keep the PATH provided by the machine. If you wish to add a search directory for the execution of local commands during all your future sessions, proceed in the following manner in your .bash_profile or .profile file, replacing directory_to_add with the directory in question:

 export PATH=$PATH:directory_to_add
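A slightly more defensive variant appends the directory only if it is not already present, so that sourcing the environment file more than once does not lengthen PATH. The function name and the directory are hypothetical. The sketch only ever appends to the PATH provided by the machine.

```shell
#!/bin/bash
# Hypothetical helper for .bash_profile: append a directory to PATH at most once.
path_append() {
    local dir=$1
    case ":$PATH:" in
        *":$dir:"*) ;;                    # already present: leave PATH untouched
        *) export PATH="$PATH:$dir" ;;    # append, never overwrite
    esac
}

path_append "$HOME/bin"   # hypothetical directory for local commands
```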

How to define a user-friendly environment in bash

The Bash shell offers two command-line editing modes, to be chosen according to your preference and selected with the set command:

set -o vi # to be in vi mode
set -o emacs # to be in emacs mode

This command should be placed in your environment file, .bash_profile or .profile.

You can then move around the command line as if you were using your favourite editor. To move to the first character of the command, type Ctrl-a in emacs mode (or Esc-0 in vi mode).

Recalling the last command which you launched works like going back up a line in your editor: Ctrl-p in emacs mode, Esc-k in vi mode, and so on. (See man bash for more information about the possibilities of these two modes.)

Keep in mind that you may use the filename completion function to avoid having to type the entire filename. If the file is present in the directory, you just need to type the first letters of the name, then Esc-Esc if in emacs mode or Esc- if in vi mode.

emacs : Esc-Esc,
vi : Esc-



7. The disk spaces

Disk Quotas

Introduction

Group quotas guarantee equitable access to disk resources. This prevents the situation where one group of users consumes all of the disk space and prevents other groups from working. At IDRIS, the group quotas limit both the quantity of disk space and the number of files (in inodes) for each project. There is no limit imposed for each individual user of a group.

When one of the group quotas is reached, no more files can be created. This can disrupt the jobs being run by your group. Important: Editing a file when the quantity of disk space has reached its limit can bring the file size back to zero, thereby deleting the file contents.

How to know your current disk space consumption and quotas

The quota_u command displays the quotas and current consumption for the $HOME space. With the -w option, it displays the same information for the $WORKDIR. The $TMPDIR space is not subject to quotas.

  • The quota_u command displays the consumption for each user of the group. This consumption includes all the files belonging to a user even if they are not located in his/her personal spaces: for example, files created in the group shared spaces ($COMMONDIR) or located in personal spaces of other users. This consumption information is given to all the group members, and only to them, in order to allow them to inform a colleague of an apparent problem and also to permit the project manager to monitor the group.
  • Information from the quota_u command is not updated in real time: The date and hour of the update are displayed in the output header of the quota_u command.
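Because quota_u is not updated in real time, a rough local estimate of what a directory tree currently holds can be useful between two updates. The sketch below relies only on standard Unix tools; the function name usage_summary is hypothetical (it is not an IDRIS command), and its figures will not match quota_u exactly, since quota_u counts files by owner wherever they are located.

```shell
#!/bin/bash
# usage_summary DIR: print the volume (in KiB) and the number of entries
# (files and directories, DIR itself included) found under DIR.
# Illustrative helper only; defaults to $HOME when no argument is given.
usage_summary() {
    local dir=${1:-$HOME}
    local kib entries
    kib=$(du -sk "$dir" | cut -f1)                 # volume in KiB
    entries=$(find "$dir" | wc -l | tr -d ' ')     # one line per entry
    echo "$dir: ${kib} KiB, ${entries} entries"
}
```

For example, usage_summary "$WORKDIR" gives an indication of the volume and number of files stored in your personal part of the WORKDIR.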

When quotas are exceeded

No e-mail warning is sent to the group when it has exceeded the quota limit. However, you will be informed by an error message when you manipulate files in the concerned disk space (“disk quota exceeded”).

If you are blocked, or about to be blocked:

  • Try to eliminate some files (you or the other members of your group).
  • Transfer your files into another space such as the Ada/Adapp or Turing $WORKDIR or into the HOME of the Ergon archive server.
  • The project manager (or designated replacement) may also make a request for a quota increase (with justifications) via the IDRIS Extranet.

Ergon: The HOME disk space

The HOME

The Ergon HOME space is intended for storing data files. It consists of a hard disk cache with progressive migration of files onto tapes. These files have a lifespan of one year after their creation date or their last access date.

  • This is a permanent space; the maximum size of a file is 500 GB.
  • It is directly accessible via ssh. It is also accessible from the compute machines by using the mfget and mfput commands, and from the pre/post-processing machine Adapp via the $ARCHIVE environment variable.
  • It is subject to group quotas (limiting both the disk space and the number of files, per group). These quotas may be consulted by using the IDRIS command quota_u.
  • The du (disk usage) Unix command on Ergon needs to be used with the --apparent-size option in order to take into account the size of the files migrated to tapes.
  • The Ergon HOME space is not backed up. However, it is possible to use the mfdupli command to duplicate files with the assurance that the copies will be migrated to distinct tapes. The copies are then contained in the DUPLI subdirectory of your Ergon HOME.
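The effect of the --apparent-size option can be illustrated on any Linux system with a sparse file which, somewhat like a file migrated to tape, occupies far fewer disk blocks than its nominal size. This is only a sketch: it assumes GNU coreutils (du and truncate), and the file name is arbitrary.

```shell
#!/bin/bash
# Create a 10 MiB sparse file: its nominal size is 10 MiB, but almost no
# disk blocks are allocated to it.
demo_dir=$(mktemp -d)
truncate -s 10M "$demo_dir/sparse.dat"

# Space actually allocated on disk (close to 0 for a sparse file).
on_disk_kib=$(du -k "$demo_dir/sparse.dat" | cut -f1)

# Nominal size of the file, which is what --apparent-size reports.
apparent_kib=$(du -k --apparent-size "$demo_dir/sparse.dat" | cut -f1)

echo "on disk: ${on_disk_kib} KiB, apparent: ${apparent_kib} KiB"
rm -rf "$demo_dir"
```

On Ergon, the corresponding call for the whole HOME would be du -sh --apparent-size $HOME.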

Ada, Adapp, Turing: Disk spaces

Three distinct disk spaces (HOME, WORKDIR and TMPDIR) are accessible to users on the Ada, Adapp and Turing computers. A fourth disk space, the ARCHIVE space, is only accessible on Adapp. Each space has specific characteristics adapted to its usage which are described on this page. The paths to these spaces are stored in 4 shell environment variables: $HOME, $WORKDIR, $TMPDIR and $ARCHIVE.

The HOME

$HOME : This is the home directory during an interactive connection. This space is intended for frequently-used small-sized files such as the shell environment files, the tools, and potentially the sources and libraries of limited size (in space and in number of files). The characteristics of the HOME are:

  • A permanent space.
  • Backed up daily by the TiNa software application.
  • Shared by the Adapp and Ada machines.
  • Accessible in interactive or batch jobs.
  • It is the user's home directory when beginning an interactive connection. It can also be accessed through the $HOME variable:

    $ cd $HOME
  • Subject to group quotas, which are intentionally rather low: 4GB by default. The IDRIS command quota_u shows your actual disk usage and that of each member of your group.

  • Intended to receive small-sized files, the block size is 256 KiB (262 KB) (command stat -f $HOME).

The WORKDIR

$WORKDIR : This is a permanent work and storage space which is usable in batch. In this space, we generally store large-sized files which are used to run batch jobs: data files, executable files, result or restart files, submission scripts and very large source files. The characteristics of WORKDIR are:

  • A permanent space.
  • Not backed up.
  • Common to the 3 machines: Adapp, Ada and Turing.
  • Accessible in interactive or in batch jobs.
  • Composed of 2 sections:
    • A section in which each user has an individual part; it is accessed with the command:

      $ cd $WORKDIR
    • A section common to the UNIX group to which the user belongs. The files to be shared by all the group members can be placed here; it is accessed with the command:

      $ cd $COMMONDIR
  • Subject to group quotas: 1 TiB (1.1 TB) by default. The IDRIS command quota_u -w shows your actual disk usage and that of each member of your group.
  • The total WORKDIR size is 932 TiB (1024 TB).
  • Intended to receive large-sized files, the block size is 4 MiB (4.2 MB) (command stat -f $WORKDIR).
  • The WORKDIR is a GPFS disk space for which the bandwidth (about 50 GiB/s in read/write) is shared between the Adapp, Ada and Turing machines. It can occasionally be saturated because of exceptionally intensive usage.

Usage recommendations:

  • Because the WORKDIR is not backed up, the files are not protected from the risk of accidental manual destruction (rm) or a disk failure. Therefore, it is necessary to regularly save the sensitive or important files in the Ergon archive server.

Attention :

  • Since batch jobs can run in the WORKDIR, the files are directly accessible in read/write (permanent space) and do not need to be explicitly copied. However, because several of your jobs can run at the same time, you must create a unique execution directory for each of your jobs. In addition, the disk space is subject to group quotas, and your job execution can suddenly stop if the quotas are reached. Therefore, you must be aware of both your own activity in this disk space and that of your colleagues. For these reasons, you may prefer running your batch jobs in the TMPDIR.
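One possible way of creating a unique execution directory for each job in the WORKDIR is the standard mktemp command, as sketched below. The function name make_run_dir and the run.XXXXXX naming pattern are illustrative choices, not an IDRIS convention.

```shell
#!/bin/bash
# make_run_dir BASE: create a new, uniquely named directory under BASE
# (by default $WORKDIR) and print its path. mktemp replaces XXXXXX with
# random characters and fails if the directory cannot be created.
make_run_dir() {
    local base=${1:-$WORKDIR}
    mktemp -d "$base/run.XXXXXX"
}

# Typical use at the top of a job script:
#   run_dir=$(make_run_dir "$WORKDIR") || exit 1
#   cd "$run_dir"
```

Two jobs started at the same time then work in two different directories and cannot overwrite each other's files.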

The TMPDIR

$TMPDIR : An execution directory for batch jobs. The following are the characteristics of the TMPDIR:

  • TMPDIR is a temporary directory.
  • It is only accessible in batch jobs by using the $TMPDIR variable.
  • It is automatically created when a batch job begins and is, therefore, unique to each batch job.
  • It is automatically destroyed at the end of this job: You must, therefore, copy the important files on a permanent disk space (WORKDIR, for example) before the end of the job.
  • Unlike HOME and WORKDIR, TMPDIR is not subject to group quotas. However, some safety quotas are in place to prevent a user from accidentally filling up all of the disk space.
  • The total TMPDIR size is 466 TiB (512.3 TB).
  • Intended to receive large-sized files, the block size is identical to that of the WORKDIR: 4 MiB (4.2 MB) (the command stat -f $TMPDIR only works in batch).

Usage recommendations:

Examples of batch jobs using the TMPDIR can be found in the “Code execution/control” documentation (Ada here and Turing here). General advice for using the TMPDIR:

  • For each execution, we assume that the input files necessary for the execution (restart or executable) have previously been stored on a permanent file system (HOME or WORKDIR). If this is not the case, the first step is to use the archive class and the principle of multi-step jobs (example on Ada here and on Turing here) to copy the necessary files into the WORKDIR by using the command mfget.
  • At the beginning of each batch job execution, we advise you to be in the TMPDIR.
  • Copy the necessary files from WORKDIR into TMPDIR using the command cp.
  • Launch the execution in the TMPDIR.
  • Before the batch job finishes, as a last step, you must:
    • Save the significant files (those used regularly or likely to be post-processed) in a permanent file system (HOME or WORKDIR) by using the command cp.
    • Archive files for a longer time period by saving them on the Ergon archive server, using the command mfput in the archive class of your batch job (example for Ada here or for Turing here).
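The steps above can be sketched as a small shell function. This is a minimal illustration under stated assumptions, not an official IDRIS template: the function name run_staged and its arguments are hypothetical, and a real job would also carry its LoadLeveler directives and, where needed, a separate archive step using mfput.

```shell
#!/bin/bash
# run_staged SRC_DIR RESULT_DIR COMMAND [ARGS...]
# Copy the contents of SRC_DIR into $TMPDIR, run COMMAND there, then copy
# everything present in $TMPDIR (inputs included) to RESULT_DIR, even if
# COMMAND fails. The exit status is that of COMMAND.
run_staged() {
    local src=$1 dest=$2
    shift 2
    (
        cd "${TMPDIR:?TMPDIR is not set}" || exit 1
        # Save the files when this subshell exits, whatever the reason.
        trap 'mkdir -p "$dest" && cp -r "$TMPDIR"/. "$dest"/' EXIT
        cp -r "$src"/. . || exit 1
        "$@"
    )
}

# Example call inside a batch job (paths are placeholders):
#   run_staged "$WORKDIR/case1/input" "$WORKDIR/case1/results" ./my_prog
```

In practice, you may prefer to copy back only the files worth keeping, in order to limit the consumption of your group quotas.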

Comments:

  • As the read/write performance is identical for WORKDIR and TMPDIR, you can avoid making copies between these two directories if the code is able to read or write the files directly in the HOME or the WORKDIR.
  • TMPDIR, like WORKDIR, is a GPFS disk space whose bandwidth (about 50 GiB/s in read/write) is shared by the Adapp, Ada and Turing machines. In consequence, the input/output performance can vary as it can be slowed down in the case of exceptionally high-volume usage.

The ARCHIVE space (from Adapp only)

The ARCHIVE environment variable on Adapp is a link to the HOME of the Ergon archive server. This space is directly accessible in read/write from Adapp, the pre- and post-processing machine, through the intermediary of a GPFS mount.

$ ls -l $ARCHIVE

It is only accessible from the Adapp nodes (interactive or batch). It is not accessible from the Ada or Turing nodes.

HOME and WORKDIR: Data security

To improve the protection of your data stored in your HOME and WORKDIR spaces, we recommend that you respect the security policy put in place at IDRIS.

Summary of disk space characteristics

Note: The last column, entitled ARCHIVE, only concerns Adapp and Ergon.

Characteristic | HOME | WORKDIR | TMPDIR | ARCHIVE
Life span | permanent | permanent | duration of batch job | permanent
Shared spaces | common to Adapp and Ada (HOME is local on Turing) | common to Adapp, Ada and Turing | common to Adapp, Ada and Turing | $ARCHIVE on Adapp, $HOME on Ergon
Backup saved | yes | no | no | no
Automatic deletion | no | no | yes, at end of batch job | no
Access in interactive | yes | yes | no ($TMPDIR is not defined) | yes from Adapp
Access in batch | yes | yes | yes | yes from Adapp
Block size | 256 KiB (262 KB) | 4 MiB (4.2 MB) | 4 MiB (4.2 MB) | 16 MiB (16.8 MB)
Group quotas | quota_u : 5 GiB (5.4 GB) by default | quota_u -w : 1 TiB (1.1 TB) by default | none | quota_u : 1 TiB (1.1 TB) on Ergon

Ergon: The disk space quotas, the quota_u command

The general principle of the disk space quotas is explained here. The specifics of the Ergon quotas are given below.

On Ergon :

  • The $HOME storage space is subject to a group quota both in the number of files (15000) and in the volume: 1 TiB (1.1 TB) by default.
  • There is a “grace period” of up to 14 days for the group to return within the quota limits; the number of days depends on how far the quota has been exceeded.
  • It is possible on Ergon to exceed the quota by a small amount, so that an mfput command issued by a run on a computing machine is not blocked, but only during the grace period.

The quota_u command

  • The Ergon quota_u command allows you to obtain the current disk space consumption of your group (in number of inodes and in volume) as well as the quota limits for both. The information is not updated in real time but is refreshed three times per day.
  • If a quota has been surpassed, the grace period (in number of days) is indicated in the Timeleft column.
  • The Ergon quota_u command is based on the du (disk usage) Unix command used with the --apparent-size option in order to take into account the files migrated to tapes (and, therefore, not present on disk).

Ergon: Commands available

On the Ergon archive server

  • mfdupli: secure duplication of a file or directory
  • mfret : modification of file expiry dates
  • quota_u : consumption and quota limits (volume and number of inodes)
  • bbftp : transfer files to a machine external to IDRIS
  • mfdods : dods_ls, dods_cp, dods_rm

On Adapp

  • mfls : list of contents of the Ergon HOME
  • mfget : transfer files from the Ergon archive server to Adapp.
  • mfput : transfer files from Adapp to the Ergon archive server.
  • mfdupli : secure duplication of an Ergon file or directory.
  • mfret : modification of expiry dates for files in the Ergon HOME.

On the Ada and Turing machines

  • mfls : list of contents for the Ergon HOME space.
  • mfget : transfer files from the Ergon archive server to Ada or Turing.
  • mfput : transfer files from Ada or Turing to the Ergon archive server.



8. Commands for file transfers

Ergon : The mfget and mfput transfer commands

The mfget and mfput commands were developed by IDRIS to secure file transfers between the IDRIS computing machines and the Ergon archive server and to optimise the transfer speed. The following examples show the basic usage for the user login1 from a computing server, machine_i.

  • In interactive :
    • Copy the fic_ergon file from Ergon to the server machine_i under the name fic_calcul :
      $ mfget fic_ergon fic_calcul
    • Copy the file fic_calcul from the server machine_i to Ergon under the name fic_ergon :
      $ mfput fic_calcul fic_ergon
  • In batch, it is recommended to use the archive class to transfer files between the computing machines and the Ergon archive server.
    • Example (in batch) of using the mfget command in the archive class:
 
...           
# @ job_type = serial
# @ class = archive
...       
# Copy the d1 file from the Ergon $HOME to the WORKDIR of the computing machine:       
mfget d1 $WORKDIR        

# Copy the data2 file from the Ergon $HOME to the $WORKDIR of the computing machine under the name d2:       
mfget data2 $WORKDIR/d2        

# Copy the data3 file from the Ergon $HOME subdirectory REP
# to the $WORKDIR of the computing machine under the name d3:       
mfget REP/data3 $WORKDIR/d3        
    • Example (in batch) of using the mfput command in the archive class:
...
# @ job_type = serial
# @ class = archive
... 

cd $WORKDIR       

# Copy the r1 file from the computing machine to the Ergon $HOME:
mfput r1 r1
        
# Copy the file r2 from the computing machine to the Ergon $HOME subdirectory REP under the name result2:       
mfput r2 REP/result2        
...

To obtain more information about the file transfer options and capacities of these commands, we advise you to consult the manuals available on line from the computing servers (man mfget, man mfput).
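When several result files have to be archived in one job, the transfers can be wrapped in a loop which reports any failure. The sketch below assumes that the transfer command returns a non-zero exit status on failure, as is conventional for Unix commands; this behaviour should be checked in man mfput before relying on it. The function name transfer_all is hypothetical, and the transfer command is passed as an argument so that the function can be tried out with cp.

```shell
#!/bin/bash
# transfer_all CMD DEST_DIR FILE...
# Run "CMD FILE DEST_DIR/<name of FILE>" for each FILE.
# Print a message for each failed transfer and return 1 if any failed.
transfer_all() {
    local cmd=$1 dest=$2 failed=0 f
    shift 2
    for f in "$@"; do
        if ! "$cmd" "$f" "$dest/$(basename "$f")"; then
            echo "transfer failed: $f" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Possible call in a job of the archive class (REP must exist on Ergon):
#   cd $WORKDIR
#   transfer_all mfput REP r1 r2 || echo "some files were not archived" >&2
```

Keeping the local copies until the function has returned successfully avoids losing results if a transfer fails.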

File transfers using the bbftp command

To transfer large files from IDRIS to your laboratory, we advise you to use bbftp, a software tool optimised for file transfers.

All the information for using the bbftp command is found HERE.



9. The module command

Ada, Adapp: User instructions for the module command

Introduction

The module command gives you access to the compilers, mathematical libraries and tools without your having to look for them on the disks. At your request, it sets the environment variables according to the product which you wish to use (compiler, debugger, etc.).

Syntax

module (avail [product] | load product[/version] | list | switch product/version1 product/version2 | display product[/version] …)

  • avail: lists the available products and their versions
  • load: loads the product in its default version (noted: default), if no version is specified
  • list: lists the loaded products and their versions
  • switch: changes the version of a product already loaded

product represents the choice of:

  • a work mode
  • a compiler
  • a library
  • an application or a tool

version represents the different evolutions of the same product and can include:

  • default: the version taken if you do not specify one. The default version is generally the most suitable.
  • number: the complete version number, usually in the format X.Y.Z.

List of available products

Are you looking for a specific product, such as a particular version of a library, and need to know whether it is available? To find out, run the module avail command:

  $ module avail

  --------------------------- /smplocal/pub/Modules/IDRIS/modulefiles/environnement ---------------------------
  compilerwrappers/no           modules/1.147
  compilerwrappers/yes(default) modules/3.2.10(default)
  
  --------------------------- /smplocal/pub/Modules/IDRIS/modulefiles/bibliotheques ---------------------------
  arpack/96(default)               lapack/10.3.6(default)           petsc/real-hypre/3.3-p5
  blas/10.3.6(default)             lapack95/10.3.6(default)         petsc/real-mumps/3.1-p8(default)
  blas95/10.3.6(default)           mumps/4.10.0(default)            petsc/real-mumps/3.3-p5
  cmor/2.8.1(default)              nag/23(default)                  phdf5/1.8.9(default)
  fftw/2.1.5                       ncar/6.1(default)                pnetcdf/1.1.1
  fftw/3.2.2                       netcdf/4.1.3                     pnetcdf/1.3.1(default)
  fftw/3.3.2(default)              netcdf/mpi/4.1.3                 scalapack/10.3.6(default)
  fftw/3.3.3                       netcdf/seq/4.1.3(default)        scotch/6.0.0(default)
  hdf5/1.8.9(default)              p3dfft/2.5.1(default)            udunits/2.1.24(default)
  hdf5/mpi/1.8.9                   parmetis/3.2.0(default)          uuid/1.6.2(default)
  hdf5/seq/1.8.9                   parpack/96(default)              vtk/5.10.1(default)
  hypre/2.9.0b(default)            petsc/3.1-p8
  
  --------------------------- /smplocal/pub/Modules/IDRIS/modulefiles/applications ----------------------------
  abinit/7.0.5(default)        espresso/4.3.2               namd/2.8
  adf/2012.01c(default)        espresso/5.0.1(default)      namd/2.9(default)
  adf/2013.01.r36703           espresso/5.0.2               nwchem/6.1.1(default)
  avbp/avbp(default)           gaussian/g03_D02             siesta/2.0.2
  cdo/1.5.9(default)           gaussian/g09_A02(default)    siesta/3.1(default)
  cp2k/2.3_12343(default)      gromacs/4.5.5(default)       siesta/3.1-pl20
  cp2k/2.4_12578               gromacsplumed/4.5.5(default) vasp/4.6.35
  cpmd/3.13.2                  lammps/2012.10.10(default)   vasp/5.2.12(default)
  cpmd/3.15.3(default)         molcas/7.8(default)          vasp/5.2.2
  espresso/4.2.1               molpro/2010.1(default)
  
  ------------------------------ /smplocal/pub/Modules/IDRIS/modulefiles/outils -------------------------------
  cmake/2.8.10.2(default)   idl/8.2(default)          papi/5.1.0.2(default)     python/3.3.0
  ferret/6.84(default)      nco/4.2.3(default)        paraview/3.98(default)    totalview/8.11.0(default)
  fpmpi2/2.2(default)       ncview/2.1.2(default)     python/2.7.3(default)     xmgrace/5.1.23(default)

This list is constantly evolving. (The above list is dated April 2013.)
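A typical sequence built from the load, list and switch subcommands might look as follows. The product and version names are taken from the April 2013 listing and may no longer exist on the machines; the function name setup_fftw_env is purely illustrative.

```shell
#!/bin/bash
# Load a specific FFTW version and the default NetCDF version, then
# display what is loaded. Assumes the module command is available,
# as it is in an IDRIS login session.
setup_fftw_env() {
    module load fftw/3.3.3      # explicit version
    module load netcdf          # no version given: the default is taken
    module list                 # check which products are now loaded
}

# Later, to change the FFTW version without touching the other products:
#   module switch fftw/3.3.3 fftw/3.3.2
```

Placing such a function in a file that your job scripts source is one way of making sure a program is run with the same environment in which it was compiled.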

For more information, consult the user instructions for the module command in our website pages for the Turing and Ada machines: The module command on Turing or The module command on Ada.



10. How to submit a calculation

Interactive and batch

You have two possible ways of working: in interactive and in batch.

You must respect the maximum limits on elapsed (or clock) time and memory for each of these 2 modes. These limits were set by IDRIS in order to optimally manage the computer resources. You will find further information concerning these limits by typing the news class command on the machine in which you are interested or by consulting the pages on our website concerning the machine(s) you are using: Turing class structure or class structure for Ada and Adapp.

Working in interactive

In general, interactive is used for file management (creation, copies, archiving, back-ups, compilation, …). One of the first things you will do is edit a program source and then compile and execute it. All these operations can be carried out directly on the compute machines by using the designated commands for each of the machines. All the local machines which are registered in our filters can access any of the front-ends directly (on Turing, Ada, and Adapp) with the ssh command.

In interactive:

  • You have access to the front-ends.
  • You do not have access to the compute nodes.

Of course, interactive sessions will also be used to prepare batch jobs.

Working in batch

There are several reasons to work in batch mode:

  • The limits in batch for elapsed/clock time and memory are much higher than in interactive.
  • After submitting your batch job, it is still possible to close your interactive session without any risk to the batch job execution.
  • The resources are well managed, as batch jobs are distributed on the machines according to the resources requested (a job requiring a large amount of resources will be run during off-peak hours, such as at night and on weekends).
  • Batch processing is also used for your jobs on the Adapp pre-/post-processing machine.
  • Batch is the only mode used on Turing.

At IDRIS, we use LoadLeveler software to manage batch job scheduling on the IBM compute machines (Ada, Turing) and on the pre-/post-processing machine (Adapp). LoadLeveler schedules the jobs according to the amount of resources requested (memory, elapsed/clock time, files) and the number of active jobs at a given moment (the number of jobs for each user and the total number of jobs).

There are 2 basic steps for working in batch: creation and submission.

Creation: This consists of writing a file containing the submission directives (options) and all the commands which you want to execute. You must begin with the submission directives, such as:

  • Job name
  • Elapsed/clock time limit for the totality of the job
  • Maximum limit for the memory occupied for each job process
  • Number of processes (for MPI and/or OpenMP)

After the submission directives are entered, it is recommended to enter the commands in the following order:

  • Go into the TMPDIR (cd $TMPDIR).
  • Copy the input files which are necessary for the execution into the TMPDIR (from the HOME or the WORKDIR) by using the cp command.
  • Launch the execution.
  • Copy the result files which you wish to save into the HOME or the WORKDIR (from the TMPDIR) by using the cp command.
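Put together, a job file following this order might look like the sketch below. It is only an illustration: the job name, the my_run directory, the program my_prog and the file names are hypothetical, and only a few common LoadLeveler keywords are shown. The memory and process-count directives depend on the machine and should be taken from the IDRIS documentation for Ada, Adapp or Turing.

```shell
#!/bin/bash
# Submission directives (read by LoadLeveler, ignored by the shell).
# @ job_name = my_run
# @ job_type = serial
# @ output = $(job_name).$(jobid)
# @ error = $(job_name).$(jobid)
# @ wall_clock_limit = 01:00:00
# @ queue

# Commands of the job, in the recommended order.
job_steps() {
    cd "$TMPDIR" || return 1
    # Copy the executable and its input file into the TMPDIR.
    cp "$WORKDIR/my_run/my_prog" "$WORKDIR/my_run/input.dat" . || return 1
    # Launch the execution.
    ./my_prog input.dat > output.dat || return 1
    # Save the result file before the TMPDIR is destroyed.
    cp output.dat "$WORKDIR/my_run/"
}

# Run the steps only if the (hypothetical) job directory has been prepared.
if [ -d "${WORKDIR:-}/my_run" ]; then
    job_steps
else
    echo "nothing to do: \$WORKDIR/my_run not found" >&2
fi
```

Grouping the commands in a function is a presentation choice; the same commands can equally be written one after the other below the directives.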

Submission: To submit a job (or script), you must use the following command. (To have more details about the command, you may consult the manual on the machines via the man command.)

$ llsubmit mon-job

Your job will be placed in a batch class according to the values written in the submission directives (see news class on the machine which interests you). We advise you to set the parameters as accurately as possible (concerning the elapsed/clock time and memory) both to avoid reserving resources which will remain unused and to receive the job results as rapidly as possible.

Comments :

  • Batch mode does not allow the user to intervene during the execution of the job commands. Therefore, file transfers must be carried out without needing to enter a password. (You can cancel the job execution only.)
  • Batch classes on Ada and Adapp share the same scheduler. If you want to run a job on the pre-/post-processing machine (Adapp), you must enter the following LoadLeveler keyword in the submission directives. (If this line is not in the submission directives, the job will be run on Ada.)
# @ requirements = (Feature == "prepost")


Detailed information is available on the IDRIS website www.idris.fr/eng for each of the IDRIS machines (Turing, Ada, Adapp) in the section called “Documentation” (code execution/control, compilation, etc.).

If you need further assistance, please contact the IDRIS User Support Team.



11. Training courses offered at IDRIS

IDRIS training courses

IDRIS provides training courses for its own users as well as to others who use scientific computing. Most of these courses are included in the CNRS continuing education catalogue CNRS Formation Entreprises, which makes them accessible to all users of scientific calculation both in the academic and industrial sectors.

These courses are principally oriented towards the methods of parallel programming: MPI, OpenMP and hybrid MPI/OpenMP, the keystones for using the supercomputers of today. Courses are also given on the Fortran and C general scientific programming languages.

A new training course was introduced in 2014 concerning large-scale debugging (detection of programming errors and fine-tuning of applications running on a large number of compute cores).

A catalogue of scheduled IDRIS training courses, regularly updated, is available on our web server: https://cours.idris.fr

These courses are free of charge if you are employed by either the CNRS (France's National Centre for Scientific Research) or the French national education system. In these cases, enrollment is done directly on the web site IDRIS courses. All other users should enroll through CNRS Formation Entreprises.

IDRIS training course materials are available on line here.



12. IDRIS documentation

  • The website: IDRIS maintains a regularly updated website, www.idris.fr/eng/, which brings together all of our documentation (IDRIS news, machine operation, etc.).
  • Manufacturer documentation: Access to complete manufacturer documentation concerning the compilers (f90, C and C++), the scientific libraries, message-passing libraries (MPI), etc.
  • The manuals: Access to all the Unix manuals with your user login on any of the IDRIS machines by using the command man.



13. User support

Contacting the User Support Team

Please contact the User Support Team for any questions, information or for any problems you may have on the IDRIS machines.

The User Support Team may be contacted directly:

  • By telephone at +33 (0)1 69 35 85 55 or by e-mail at
  • Monday through Thursday, 9:00 a.m. - 6:00 p.m. and on Friday, 9:00 a.m. - 5:30 p.m., without interruption.

Note: During certain holiday periods (e.g. Christmas and summer breaks), the support team staffing hours may be reduced as follows:

  • Monday through Friday, 9:00 a.m. - 12:00 (noon) and 1:30 p.m. - 5:30 p.m.

Outside of regularly scheduled hours, the answering machine of the User Support Team phone will indicate the currently relevant staffing hours (normal or holiday).

Administrative management for IDRIS users

For any problems regarding passwords, account opening, access authorisation, or in sending us the forms for account opening (FFCU) or account management (FGC), you must send an e-mail to: