Ergon: DODS at IDRIS


Author: Raphaël Medeiros

Date: April 2017 (original document: December 2002; previous revision: February 2015)

Brief theoretical introduction

DODS (Distributed Oceanographic Data System) is client/server software that provides access to oceanographic data, whether local or remote, through an httpd server.

DODS can be viewed as an extension of the Web. Just as HTML allows text, sound and images to be transferred over the network, DODS allows data to be transferred using the same HTTP protocol. With a browser or an application, a researcher in France can therefore access data stored at sites anywhere in the world, simply by opening an HTML page in a Web browser.

The first version of DODS was designed in the 1990s, and the project has been known as OPeNDAP since 2003. An ESGF (Earth System Grid Federation) server is also now in operation at IDRIS.

DODS consists of two parts:

  • The server, which is responsible for:
    • Storing the data
    • Converting the data to the DODS format
    • Compressing the data, if required
    • Sending the data to the client
  • The client, which is responsible for:
    • Decompressing the data, if needed
    • Converting the data into the format expected by the client
    • Making the data available to the user's application

[Figure: DODS client/server architecture (dods_client_serveur.jpg)]

The server side is built on an httpd server (for example, Apache). Several clients are available, including netCDF, HDF, MATLAB and Ferret.

The DODS server of IDRIS

At the request of the climatology community, IDRIS installed a DODS server in order to make available part of the data stored on the IDRIS Ergon archive server.

This archive server is composed of:

  • A cluster of 13 servers
  • A 2 PB disk cache
  • Several high-speed links to and from the computation servers
  • A tape storage robot with a capacity of several thousand tapes
  • Dozens of high-performance tape drives
  • HSM (Hierarchical Storage Manager) software

This equipment provides an on-line (“online” + “nearline”) storage capacity of several thousand terabytes (that is, several petabytes). It is mainly used to archive long-term data produced by numerical models run on the IDRIS computing servers. The owner of the data can decide to make part of it available to the worldwide community: at IDRIS, this is made possible by the DODS server and by the mfdods command developed for Ergon.

The characteristics of our DODS server, compared to a standard installation, are the following:

  • The data and the httpd server are located on different machines.
  • The accessed data are managed by an HSM and may be stored on tape.
  • Access from the DODS server to files stored on the Ergon archive server is read-only.
  • The data owner can, at any time, make the data accessible to the public or remove this access.

The commands available on the Ergon archive server

IDRIS provides a set of commands on Ergon with which archived files can be made public through the DODS server or, conversely, have that access removed. In either case, the option -? or -help, given after one of these commands, prints a summary of the command syntax.

mfdods

The mfdods command is mainly used to make a file in your Ergon HOME accessible through the DODS server. It creates a link between a file (belonging to a user, in his or her HOME) and the disk space visible from the DODS server, /linkhome/DODS/pub/your_login. This operation was intentionally made impossible with the usual Unix commands (cp, ln, etc.) so that the consistency between this DODS space and your HOME on Ergon can be managed: you must use the mfdods command.

The following is an example of its use:

 
$ mfdods Analyse/TS_MO/v5.historicalCMR4_18500101_18591231_1M_TxT.nc
 link file : /linkhome/rech/lab/rlab001/Analyse/TS_MO/v5.historicalCMR4_18500101_18591231_1M_TxT.nc -> /linkhome/DODS/pub/rlab001/v5.historicalCMR4_18500101_18591231_1M_TxT.nc 



We see that, by default, the mfdods command creates the link at the root of the user's own DODS space, /linkhome/DODS/pub/your_login. A different sub-directory can be specified with the option -d (directory):

 
$ mfdods -d test_copy Analyse/TS_MO/v5.historicalCMR4_18500101_18591231_1M_TxT.nc
 link file : /linkhome/rech/lab/rlab001/IGCM_OUT/IPSLCM5A-MR/Analyse/TS_MO/v5.historicalCMR4_18500101_18591231_1M_TxT.nc -> /linkhome/DODS/pub/rlab001/test_copy/v5.historicalCMR4_18500101_18591231_1M_TxT.nc 

Note: If the sub-directory specified with option -d (here, test_copy) does not exist, it is created. However, a complete sub-tree (for example, sous/ARBO/rescence) cannot be created in one step: for this, you must use the dods_cp command.
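The naming pattern visible in the two mfdods examples above can be summarised with a small sketch. The function dods_link_path is hypothetical (it is not an IDRIS tool) and creates nothing: it only prints the path where the link would be expected to appear, assuming the link keeps the base name of the source file and that the optional third argument plays the role of the -d sub-directory.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: print the expected location
# of the link in the DODS space, following the pattern shown in the
# mfdods output above. It does not call mfdods and creates no link.
dods_link_path() {
    login=$1          # your login, e.g. rlab001
    file=$2           # path of the file in your Ergon HOME
    subdir=${3:-}     # optional sub-directory, as given to option -d
    base=${file##*/}  # keep only the file name
    if [ -n "$subdir" ]; then
        printf '/linkhome/DODS/pub/%s/%s/%s\n' "$login" "$subdir" "$base"
    else
        printf '/linkhome/DODS/pub/%s/%s\n' "$login" "$base"
    fi
}

dods_link_path rlab001 Analyse/TS_MO/v5.historicalCMR4_18500101_18591231_1M_TxT.nc
dods_link_path rlab001 Analyse/TS_MO/v5.historicalCMR4_18500101_18591231_1M_TxT.nc test_copy
```

The two calls print the same link targets as the two mfdods examples, without and with the test_copy sub-directory.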


You can also remove public access to a file immediately by using the same mfdods command with the option -r (remove), which destroys the link. For example, for the very first file:

 
$ mfdods -r /linkhome/DODS/pub/rlab001/v5.historicalCMR4_18500101_18591231_1M_TxT.nc
 unlink file /arch/home/DODS/pub/rlab001/v5.historicalCMR4_18500101_18591231_1M_TxT.nc 

or, with a relative path from your DODS directory:

 
$ mfdods -r v5.historicalCMR4_18500101_18591231_1M_TxT.nc
 unlink file /arch/home/DODS/pub/rlab001/v5.historicalCMR4_18500101_18591231_1M_TxT.nc 


A procedure, run at regular intervals, verifies the consistency between the DODS space and the “original” data archived in your HOME on Ergon: if a file on Ergon has been deleted, its link in the DODS space is automatically destroyed. The access rights of the files in the Ergon HOME are verified in the same way.

Note:

  • The mfdods command accepts the standard substitution characters, * and ?, in filenames.
  • The mfdods command also exists on the pre- and post-processing server Adapp.
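Before publishing several files with a wildcard, it can be useful to check which files the pattern actually matches. The following is a minimal sketch, not an IDRIS tool: preview_matches is a hypothetical function that only lists the regular files among its arguments, and the pattern in the usage comment is an illustrative assumption.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: list the regular files that
# a wildcard pattern expands to, so the list can be reviewed before the
# same pattern is given to mfdods.
preview_matches() {
    # "$@" holds the names produced by the shell from the pattern;
    # an unmatched pattern is left as is and is skipped by the -f test.
    for f in "$@"; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Usage, from your HOME on Ergon (the pattern is an example only):
#   preview_matches Analyse/TS_MO/*_1M_*.nc
```

Once the list is what you expect, the same pattern can be passed to mfdods.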

dods_cp

Using this script, you can recursively “graft” an entire directory from the Ergon HOME into the DODS space, while creating, as needed, the necessary sub-directories:

> dods_cp PRACE_2-IP/kick_off/ DODS/pub/rlab001/sub_dir

dods_rm

With this script, you can recursively remove access to an entire directory of the published space in DODS (including all its sub-directories).
Warning: Avoid running this command on the root of your DODS space. This cannot be undone (there is no backup), so you would then have to re-create all the links with mfdods or dods_cp.

Example:

> dods_rm  /linkhome/DODS/pub/rlab001/arbo
 unlink file /arch/home/DODS/pub/rlab001/arbo/alter/nate/expcpl_d01_2002.nc 
/arch/home/DODS/pub/rlab001/arbo/alter/nate removed 
/arch/home/DODS/pub/rlab001/arbo/alter removed 
/arch/home/DODS/pub/rlab001/arbo removed 
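One way to guard against the mistake described in the warning is to wrap the call in a small check. The sketch below is only an illustration: is_dods_root and safe_dods_rm are hypothetical names, the root is assumed to be /linkhome/DODS/pub/your_login (or its relative form DODS/pub/your_login, as used in the dods_cp example), and the final step only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical guard, for illustration only (not an IDRIS tool).
# Succeeds if the given path is the root of the user's DODS space.
is_dods_root() {
    login=$1
    path=${2%/}    # ignore one trailing slash, if any
    case "$path" in
        "/linkhome/DODS/pub/$login"|"DODS/pub/$login") return 0 ;;
        *) return 1 ;;
    esac
}

# Refuse the root of the DODS space; otherwise show the command to run.
safe_dods_rm() {
    login=$1
    target=$2
    if is_dods_root "$login" "$target"; then
        echo "refusing to remove the root of the DODS space: $target" >&2
        return 1
    fi
    # Dry run: replace this echo with the real dods_rm call on Ergon.
    echo "would run: dods_rm $target"
}

safe_dods_rm rlab001 /linkhome/DODS/pub/rlab001/arbo
```

The check compares strings only, so a path written differently (for example, with ../ components) would not be caught.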

dods_ls

Given a file in the DODS space, this script displays the URL that can be used in a browser to access the file directly. To display the content of an ASCII file in a Web browser, the URL can be used as is; for a NetCDF file, you only need to append .html to this string to display the file's OPeNDAP Dataset Access Form in the Web browser.

Example:

> ls -alF /linkhome/DODS/pub/rlab001
total 0
dr-xr-xr-x  4 root    sos  4096 Feb  5 14:47 ./
drwxr-xr-x 88 root    root 4096 Feb  4 16:16 ../
-r--r--r--  2 ssos251 sos    35 Jan 13 17:39 expcpl_d01_2002.nc
> dods_ls  DODS/pub/rlab001/expcpl_d01_2002.nc
https://prodn.idris.fr/thredds/fileServer/ipsl_public/rlab001/expcpl_d01_2002.nc
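The .html suffix rule described for dods_ls can be applied to its output with a few lines of shell. This is a minimal sketch: dods_form_url is a hypothetical function (not one of the dods_ scripts), and it assumes that NetCDF files can be recognised by their .nc extension.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: from a URL printed by
# dods_ls, build the URL to open in a Web browser.
dods_form_url() {
    url=$1
    case "$url" in
        *.nc) printf '%s.html\n' "$url" ;;  # NetCDF (assumed .nc): append .html
        *)    printf '%s\n' "$url" ;;       # other files: use the URL as is
    esac
}

dods_form_url "https://prodn.idris.fr/thredds/fileServer/ipsl_public/rlab001/expcpl_d01_2002.nc"
```

On Ergon, the argument could be taken directly from the command, for example dods_form_url "$(dods_ls DODS/pub/rlab001/expcpl_d01_2002.nc)".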

Note: All the dods_ scripts accept the standard substitution characters * and ?. Likewise, they all print concise usage instructions when run with the -h option alone.