Ada, Adapp, Turing: Disk spaces

Three distinct disk spaces (HOME, WORKDIR and TMPDIR) are accessible to users on the Ada, Adapp and Turing computers. A fourth disk space, the ARCHIVE space, is only accessible on Adapp. Each space has specific characteristics suited to its intended use, which are described on this page. The paths to these spaces are stored in four shell environment variables: $HOME, $WORKDIR, $TMPDIR and $ARCHIVE.

The HOME

$HOME: This is the home directory during an interactive connection. This space is intended for small, frequently used files such as shell environment files, tools and, possibly, sources and libraries. The size of this directory is limited, both in disk space and in number of files.
The characteristics of the HOME are:

  • A permanent space.
  • Backed up daily by the TiNa software application.
  • Shared by the Adapp and Ada machines.
  • Accessible in interactive or batch jobs.
  • It is the user's home directory when beginning an interactive connection. It can also be accessed through the $HOME variable:

    $ cd $HOME
  • Subject to group quotas which are intentionally rather low: 4 GB by default. The IDRIS command quota_u shows your actual disk usage and that of each member of your group.
  • The total HOME size is 4 TiB (4.4 TB) on Ada and Adapp and 2 TiB (2.2 TB) on Turing (default settings).
  • Because the HOME is intended for small files, its block size is 256 KiB (262 KB), as reported by the stat -f $HOME command.
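
The block size mentioned above can also be read directly from a script. The following is a minimal sketch, assuming GNU coreutils stat (the -c format option does not exist in BSD stat); the helper name fs_block_size is hypothetical and is not an IDRIS command.

```shell
# fs_block_size DIR: print the block size, in bytes, of the file system
# that holds DIR. This is the "Block size" field of "stat -f".
# Hypothetical helper; assumes GNU stat.
fs_block_size() {
    stat -f -c '%s' "$1"
}

# Report the block size of the file system holding the HOME.
fs_block_size "$HOME"
```

The same function can be applied to $WORKDIR or, inside a batch job, to $TMPDIR.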

The WORKDIR

$WORKDIR : This is a permanent work and storage space which is usable in batch. In this space, we generally store large-sized files which are used during batch jobs: data files, executable files, result or restart files, submission scripts and very large source files. The characteristics of WORKDIR are:

  • A permanent space
  • Not backed up
  • Common to the 3 machines: Adapp, Ada and Turing
  • Accessible in interactive or in batch jobs
  • Composed of 2 sections:
    • A section in which each user has an individual part; it is accessed with the command:

      $ cd $WORKDIR
    • A section common to the UNIX group to which the user belongs. The files to be shared by all the group members can be placed here; it is accessed with the command:

      $ cd $COMMONDIR
  • Subject to group quotas: 1 TiB (1.1 TB) by default. The IDRIS quota_u -w command shows your actual disk usage and that of each member of your group.
  • The total WORKDIR size is 932 TiB (1024 TB). This size can be very easily increased by request from the project manager (see the Extranet).
  • Because the WORKDIR is intended for large files, its block size is 4 MiB (4.2 MB), as reported by the stat -f $WORKDIR command.
  • The WORKDIR is a GPFS disk space for which the bandwidth is shared between the Adapp, Ada and Turing machines (about 50 GiB/s in read/write). It can occasionally be saturated because of exceptionally intensive usage.

Usage recommendations:

  • Because the WORKDIR is not backed up, its files are not protected against accidental deletion (rm) or disk failure. It is therefore necessary to regularly save sensitive or important files on the Ergon archive server.

Caution:

  • Since batch jobs can run in the WORKDIR, the files are directly accessible in read/write (permanent space) and do not need to be explicitly copied. However, because several of your jobs can run at the same time, you must create a unique execution directory for each job.
  • In addition, the disk space is subject to group quotas and your job can stop suddenly if the quota is reached. Therefore, in the WORKDIR, you must be aware of both your own activity in this disk space and that of your colleagues. For these reasons, you may prefer to run your batch jobs in the TMPDIR.
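
One way to obtain a unique execution directory per job in the WORKDIR is sketched below. It relies on the standard mktemp command; the helper name make_run_dir and the run.XXXXXX naming pattern are assumptions chosen for illustration, not an IDRIS convention.

```shell
# make_run_dir BASE: create a new, uniquely named directory under BASE
# and print its path. Hypothetical helper: mktemp creates the directory
# itself, so two jobs started at the same time cannot get the same one.
make_run_dir() {
    mktemp -d "$1/run.XXXXXX"
}

# In a job script, WORKDIR is defined by the environment. The guard only
# keeps this sketch harmless on a machine where WORKDIR is not set.
if [ -n "${WORKDIR:-}" ]; then
    run_dir=$(make_run_dir "$WORKDIR") && cd "$run_dir"
fi
```

Each job then reads and writes inside its own run_dir, so simultaneous jobs do not overwrite each other's files.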

The TMPDIR

$TMPDIR : An execution directory for batch jobs. The characteristics of the TMPDIR are:

  • It is a temporary directory.
  • It is only accessible in batch jobs by using the $TMPDIR variable.
  • It is automatically created when a batch job begins and is, therefore, unique to each batch job.
  • It is automatically destroyed at the end of the job. You must, therefore, copy the important files to a permanent disk space (WORKDIR, for example) before the job ends.
  • TMPDIR is not subject to group quotas, unlike HOME and WORKDIR. However, some “security” quotas are in place to prevent a user from unintentionally filling up the entire disk space through a usage error.
  • Total TMPDIR size is 466 TiB (512.3 TB).
  • Because the TMPDIR is intended for large files, its block size is identical to that of the WORKDIR: 4 MiB (4.2 MB) (the stat -f $TMPDIR command only works in batch).

Usage recommendations:

Examples of batch jobs using the TMPDIR can be found in the “Code execution/control” documentation (Ada here and Turing here). General advice for using the TMPDIR:

  • For all executions, we assume that the input, restart or executable files which are necessary for the execution have previously been stored on a permanent file system (HOME or WORKDIR). If this is not the case, the first step is to use the archive class and the principle of multi-step jobs (example on Ada here and on Turing here) to copy the necessary files into the WORKDIR by using the mfget command.
  • At the beginning of each batch job execution, we advise you to be in the TMPDIR.
  • Copy the necessary files from WORKDIR into TMPDIR using the cp command.
  • Launch the execution in the TMPDIR.
  • Before the batch job finishes, you must:
    • Save the significant files (if they are used regularly or might be post-processed) on a permanent file system (for example, in WORKDIR) by using the cp command.
    • To archive the files for a longer time period, save them on the Ergon archive server by using the mfput command, implemented in the archive class in the last step of your batch job (example for Ada here or for Turing here).
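
The sequence recommended above (move into the TMPDIR, copy the inputs, run, copy the results back) can be sketched as a small bash function. This is an illustration under stated assumptions, not an official IDRIS job template: the function name run_in_tmpdir and the input/ and output/ subdirectory names are hypothetical, cp -r with the "dir/." form assumes GNU cp, and archiving on Ergon with mfput is left out because its options are not described on this page.

```shell
# run_in_tmpdir WORK TMP CMD [ARGS...]
# Hypothetical helper illustrating the recommended sequence:
#   1. move into the temporary directory TMP,
#   2. copy the inputs from WORK/input (assumed layout),
#   3. run CMD there; CMD is expected to write its results in ./output,
#   4. copy ./output back to WORK/output before the job ends.
# Returns the exit status of CMD.
run_in_tmpdir() {
    local work=$1 tmp=$2 status
    shift 2
    cd "$tmp" || return 1
    cp -r "$work/input/." . || return 1     # stage the input files
    mkdir -p output
    "$@"                                    # execute inside the TMPDIR
    status=$?
    mkdir -p "$work/output"
    cp -r output/. "$work/output/"          # save results to permanent space
    return "$status"
}

# In a batch job, WORKDIR and TMPDIR are set by the system and ./my_code
# stands for the user's executable (placeholder name):
#   run_in_tmpdir "$WORKDIR" "$TMPDIR" ./my_code
```

Copying the results back even when the command fails keeps partial output available for diagnosis, since the TMPDIR disappears when the job ends.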

Comments:

  • As the read/write performance is identical for WORKDIR and TMPDIR, you can avoid making copies between these two directories if the code is able to read or write the files directly in the HOME or the WORKDIR.
  • TMPDIR, like WORKDIR, is a GPFS disk space whose bandwidth (about 50 GiB/s in read and write) is shared by the Adapp, Ada and Turing machines. Consequently, input/output performance can vary and may be slowed down during exceptionally intensive usage.

Specific to Ada and Adapp: Visibility in interactive

The created TMPDIR can be either local or shared, depending on the characteristics of the batch job:

  • If it is local to the execution node of Ada or Adapp, it is not visible in interactive from the connection nodes. This is generally the case for sequential or OpenMP jobs.
  • If it is shared by all the nodes of Ada and Adapp, it is visible in interactive from the connection nodes. This is the case for multi-node and multi-step jobs. To find the name of this directory:
    1. Put the command set -x (in bash) at the beginning of your job so that each executed command is echoed in the job output, and then use the $TMPDIR variable.
    2. Consult the job output during the execution (by using the commands cat, less, or tail, etc., on the file name indicated in the LoadLeveler directive #@ output, which is generally written in your submission directory). In the job output, you will find the value of the $TMPDIR variable and its real location.

Here is an example of a batch job which follows the above procedure:

example
  # Loadleveler directives
  ...
  ...
  #@ queue
 
  # Bash script
  set -x
  cd "$TMPDIR"
  pwd -P      # real (physical) path of the TMPDIR
  ...

The ARCHIVE space (from Adapp only)

The ARCHIVE environment variable on Adapp corresponds to the HOME of the Ergon archive server, which is directly accessible in read/write from Adapp, the pre- and post-processing machine, through a GPFS mount.

$ ls -l $ARCHIVE

The ARCHIVE space is only accessible from the Adapp nodes (in interactive or in batch). It is not accessible from the Ada or Turing nodes.
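
Because ARCHIVE is only usable from Adapp, a script that may run on several machines can test for it before using it. The following is a minimal sketch; the function name archive_available is hypothetical, and the check assumes only that $ARCHIVE is either unset or does not point to a directory on machines where the space is not accessible.

```shell
# archive_available: succeed only if the ARCHIVE variable is set and
# points to an existing directory. Hypothetical helper.
archive_available() {
    [ -n "${ARCHIVE:-}" ] && [ -d "$ARCHIVE" ]
}

if archive_available; then
    ls -l "$ARCHIVE"
else
    echo "ARCHIVE is not accessible on this machine" >&2
fi
```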

HOME and WORKDIR: Data security

To improve the protection of your data stored in your HOME and WORKDIR spaces, we recommend that you respect the security policy put in place at IDRIS.

Summary of disk space characteristics

Note: The last column, entitled ARCHIVE, only concerns Adapp and Ergon.

                        | HOME                                        | WORKDIR                           | TMPDIR                        | ARCHIVE
Life span               | permanent                                   | permanent                         | duration of batch job         | permanent
Shared by the computers | common to Adapp and Ada; local on Turing    | common to Adapp, Ada and Turing   | no                            | $ARCHIVE on Adapp, $HOME on Ergon
Backup saved            | yes                                         | no                                | no                            | no
Automatic deletion      | no                                          | no                                | yes, at end of batch job      | no
Access in interactive   | yes                                         | yes                               | no ($TMPDIR is not defined)   | yes from Adapp
Access in batch         | yes                                         | yes                               | yes                           | yes from Adapp
Block size              | 256 KiB (262 KB)                            | 4 MiB (4.2 MB)                    | 4 MiB (4.2 MB)                | 16 MiB (16.8 MB)
Group quotas            | quota_u: 5 GiB (5.4 GB) by default          | quota_u -w: 1 TiB (1.1 TB) by default | none                      | quota_u: 1 TiB (1.1 TB) on Ergon