
46 posts tagged with "flashinfo"

Flash Info is the electronic newsletter sent to all IDRIS users. It keeps readers informed of the latest IDRIS news. It is published whenever there is information to share.


Flash Info No 2024-26

IDRIS
Computing center
⚠ INFORMATION
This page was translated by an AI (LLM) with a cursory human check and is awaiting full review.

Power shutdown: Tuesday 8 and Wednesday 9 October 2024

[English version below]

Hello,

Maintenance of the IDRIS technical infrastructure will require the centre to be shut down on Tuesday 8 and Wednesday 9 October. The Jean Zay machine will be unavailable from Tuesday 8 October at 6 am until Thursday 10 October at noon. The STORE space will be unavailable from the afternoon of Monday 7 October.

The user support service will be closed during these two days. The website will remain accessible and you will be able to follow machine availability on the usual page: http://www.idris.fr/statut.html.

We apologise for any inconvenience this operation may cause.

Best regards, The IDRIS support team


Hello,

A maintenance operation on the technical infrastructure of IDRIS will require the centre to be shut down completely on Tuesday 8th and Wednesday 9th October. The Jean Zay machine will be unavailable from Tuesday 8th October at 6 am until Thursday 10th October at noon. The STORE disk space will be unavailable starting from Monday 7th October in the afternoon.

User support will be closed for these two days. The website will remain accessible and you will be able to check the machine availability on the usual page: http://www.idris.fr/status.html.

We are sorry for any inconvenience this maintenance operation might cause.

Best regards, The IDRIS User Support Team

Flash Info No 2024-25

Summary:

  • Panoram'IA: Join us on Friday 27/09 at 10 am
  • IDRIS Training

  • Panoram'IA: Join us on Friday 27/09 at 10 am

IDRIS support invites you this Friday morning 27/09 at 10 am to "Panoram'IA": the monthly live video magazine that covers the scientific and technical news of AI. On the agenda for this session: AI news, TransformerEngine, HyperCloning, and our selection of papers with Papers Storm. The live stream and replays are available on our YouTube channel "Un oeil sur l'IDRIS": https://www.youtube.com/@idriscnrs. The following sessions will take place on 18 October 🎃, 15 November, and 6 December 🎅.

  • IDRIS Training

Register now for the IDRIS training sessions scheduled for the rest of the year:

  • OpenMP, from 16 to 18 October
  • Practical Introduction to Deep Learning, on 4 and 5 November
  • Deep Learning Architectures, on 6 and 7 November
  • Deep Learning Launch Workshop, on 8 November
  • HPC Debugging, on 22 November
  • SIMD Vectorisation, on 26 November
  • Introduction to the PETSc Library, on 5 and 6 December

For more information on the 2024 IDRIS training catalogue and registration procedures: http://www.idris.fr/formations/catalogue.html.


Flash Info No 2024-24

Operation on the Slurm job scheduler on 1 October

[English version below]

Hello,

As part of the integration of the new H100 partition into the Jean Zay machine, an operation on the Slurm job scheduler is planned for Tuesday 1 October 2024 from 8 am to 1 pm. The machine will be unavailable during this period.

Unlike the usual maintenance operations, pending jobs that have not been able to run before the maintenance will be lost. You will need to resubmit them once the machine has restarted.

In addition, the database containing the Slurm job history will be reset. You will therefore no longer have access to information about jobs that predate the maintenance via the "sacct" command. However, this will not affect the accounting of consumed hours visible via the "idracct" command.

We apologise for any inconvenience this operation may cause.

Best regards, The IDRIS support team


Hello,

As part of the integration of the new H100 partition into the Jean Zay machine, an operation on the Slurm job scheduler is planned for Tuesday, October 1st 2024 from 8 am to 1 pm. The machine will be unavailable during that time.

Unlike the usual maintenance operations, the pending jobs that could not run before the maintenance operation will be lost. You will have to resubmit them after the machine is back online.

Moreover, the database that stores the Slurm job history will be reset. This means you will no longer be able to retrieve information about jobs run before the maintenance using the "sacct" command. However, there will be no impact on the accounting of your computing hours visible using the "idracct" command.
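
Since the job history will no longer be retrievable with "sacct" after the reset, one possible precaution (a suggestion here, not something this announcement requests) is to export it beforehand. The sketch below builds such a call with the standard sacct options --starttime, --parsable2 and --format, and parses the pipe-delimited text they produce. The start date and the field selection are illustrative assumptions.

```python
import csv
import io
from typing import Dict, List, Sequence

# Illustrative field selection; any valid sacct format fields could be used.
DEFAULT_FIELDS = ("JobID", "JobName", "State", "Elapsed")


def sacct_export_command(start_date: str,
                         fields: Sequence[str] = DEFAULT_FIELDS) -> List[str]:
    """Build a sacct call that lists jobs started since start_date (YYYY-MM-DD)."""
    return [
        "sacct",
        f"--starttime={start_date}",
        "--parsable2",  # pipe-delimited output with a header line
        f"--format={','.join(fields)}",
    ]


def parse_parsable2(output: str) -> List[Dict[str, str]]:
    """Turn the pipe-delimited sacct text into one dictionary per job line."""
    reader = csv.DictReader(io.StringIO(output), delimiter="|",
                            quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


# The command could be run with subprocess.run(..., capture_output=True, text=True)
# and its stdout saved to a file before the maintenance.
print(sacct_export_command("2024-01-01"))
```

Keeping the raw text file alongside any parsed form preserves the fields exactly as sacct reported them.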

We are sorry for any inconvenience this maintenance operation might cause.

Best regards, The IDRIS User Support Team

Flash Info No 2024-23

Summary:

  • Next IDRIS UC Meeting: Thursday 26 September 2024
  • IDRIS at the Science Festival on 11 and 12 October 2024
  • IDRIS Training

  • Next IDRIS UC Meeting: Thursday 26 September 2024

The next User Committee (UC) meeting of IDRIS will take place on 26 September 2024 at the IDRIS premises. Do not hesitate to send your requests to: cu-elus at idris.fr. More information: http://www.idris.fr/cu.html

  • IDRIS at the Science Festival on 11 and 12 October 2024

Following the success of previous years, IDRIS is renewing the experience and participating again in the CNRS Unusual Visits organised as part of the Science Festival. Registration is open from 2 to 22 September 2024! For more information: http://www.idris.fr/annonces/idris-fete-de-la-science.html.

  • IDRIS Training

Remember to register now for the IDRIS training sessions planned for the rest of the year:

  • MPI, from 24 to 27 September
  • Jean Zay Workshop, on 3 and 4 October
  • OpenMP, from 16 to 18 October
  • Optimised Deep Learning on Jean Zay, from 22 to 25 October
  • Practical Introduction to Deep Learning, on 4 and 5 November
  • Deep Learning Architectures, on 6 and 7 November
  • Deep Learning Launch Workshop, on 8 November
  • HPC Debugging, on 22 November
  • SIMD Vectorisation, on 26 November
  • Introduction to OpenACC and OpenMP GPU, from 27 to 29 November
  • Introduction to the PETSc Library, on 5 and 6 December

For more information on the 2024 IDRIS training catalogue and registration procedures: http://www.idris.fr/formations/catalogue.html.


Flash Info No 2024-22

Dear Jean Zay user,

As part of the renewal of the storage spaces accompanying the H100 extension of Jean Zay, the current SCRATCH disk space will be permanently shut down on Tuesday, September 3rd, 2024.

As previously announced, there will be no automatic copy of the current SCRATCH space to the new one.

To anticipate this shutdown, we invite you to migrate the data you wish to keep as soon as possible to the new space, already accessible through the $NEWSCRATCH environment variable (or $ALL_CCFRNEWSCRATCH for the common project spaces), and to use that disk space in your jobs by modifying your submission scripts accordingly.

The data transfers can be done using one or more jobs on the "archive" partition, or interactively from the login nodes if the volume to be transferred is limited: http://www.idris.fr/eng/jean-zay/modifications-extension-jean-zay-h100-eng.html#scratch_et_all_scratch_copies.
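
As an illustration of such a transfer, here is a minimal sketch that copies one directory tree from the old space to the new one. It assumes the old space is exposed through a $SCRATCH variable (an assumption, since only $NEWSCRATCH is named here) and uses a hypothetical directory name, "my_results". It is not an IDRIS-provided tool.

```python
import os
import shutil
from pathlib import Path


def migrate_directory(relative_dir: str, old_root: str, new_root: str) -> Path:
    """Copy one directory tree from the old SCRATCH root to the new one."""
    source = Path(old_root) / relative_dir
    target = Path(new_root) / relative_dir
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run after an interruption.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


if __name__ == "__main__":
    # $NEWSCRATCH is named in the announcement; $SCRATCH is an assumed name.
    old_root = os.environ.get("SCRATCH")
    new_root = os.environ.get("NEWSCRATCH")
    # "my_results" is a hypothetical directory used for illustration.
    if old_root and new_root and (Path(old_root) / "my_results").is_dir():
        print(migrate_directory("my_results", old_root, new_root))
```

For large volumes, the same script could be launched from a job on the "archive" partition rather than interactively.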

To avoid confusion between the different disk spaces and to make the migration easier to monitor, we suggest that you remove any data that you no longer need or that you have already successfully migrated.

Please be aware that the new SCRATCH space is also subject to the 30-day deletion policy for unused data.

Best regards, The IDRIS support team

Flash Info No 2024-21

Migration of WORK and HOME

[English version below]

Hello,

As previously announced, the arrival of the H100 extension comes with a renewal of the Jean Zay storage spaces with the installation of a new Lustre storage system that will offer increased storage capacity and improved bandwidth.

The migration of the HOME spaces was completed during this morning's maintenance (July 30th, 2024). We invite you to check your scripts to correct any hard-coded paths. Any path of the form "/gpfs7kw/linkhome/..." should become "/linkhome/..." or, if possible, be replaced by the use of the $HOME environment variable.

The migration of the WORK spaces started today. This operation is also handled by the IDRIS teams, so you do not have any specific actions to perform. This operation will be done in batches to avoid a long downtime of the machine. However, it will require suspending the "qos_cpu-t4" and "qos_gpu-t4" QoS, which allow running jobs longer than 20 hours.

For a specific project, the migration process will be as follows:

  • 20 hours before the migration begins, jobs using the project's computing hours will no longer be able to start to avoid jobs trying to access the WORK space during the operation (they will then appear with the status "AssocGrpJobsLimit")
  • just before the migration starts, the project's WORK space will become completely unavailable, including from the login nodes
  • once the migration is completed, the environment variables will be modified to point to the new WORK spaces on the Lustre storage system, and the pending jobs will be able to run again.

Warning: If you have jobs that use the computing hours of one project but access the WORK space of another project, they may fail because we will not be able to block their start appropriately.

A command "idr_migstatus" allows you to monitor the migration of your projects by indicating the current status of each:

  • "pending": the migration has not yet been performed, your jobs can still run, and you have access to your WORK
  • "planned": the migration will start in the next 24 hours, new jobs can no longer start, but you still have access to your WORK
  • "in progress": the migration is in progress, you no longer have access to your WORK
  • "migrated": the migration is completed, you have access to your WORK again, and your jobs can run again.

Note: The absolute path of the WORK spaces will change with the migration, but to simplify the transition, links will be set up so that the old absolute paths remain functional, at least initially. Once the migration is completed, we still invite you to modify any paths of the form "/gpfswork/..." or "/gpfsdswork/projects/..." that may appear in your scripts (if possible by replacing them with the use of the environment variable) or in your symbolic links.

We apologise for any inconvenience these operations may cause.

Best regards, The IDRIS support team


Dear Jean Zay user,

As previously announced, the installation of the new H100 extension comes with a renewal of the Jean Zay storage spaces, with a new Lustre storage system offering increased storage capacity and improved bandwidth.

The migration of the HOME spaces was completed during today's maintenance operation (July 30th, 2024). We invite you to check your scripts and correct any hard-coded paths. Any path starting with "/gpfs7kw/linkhome/..." should become "/linkhome/..." or, if possible, be replaced by the $HOME environment variable.
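
The prefix change described above can be applied mechanically to the text of a script. A minimal sketch, in which the two prefixes come from this announcement and the example path ("rech/abc/user/project") is made up for illustration:

```python
from typing import Tuple

OLD_HOME_PREFIX = "/gpfs7kw/linkhome/"
NEW_HOME_PREFIX = "/linkhome/"


def rewrite_home_paths(script_text: str) -> Tuple[str, int]:
    """Return the updated script text and the number of paths rewritten."""
    count = script_text.count(OLD_HOME_PREFIX)
    return script_text.replace(OLD_HOME_PREFIX, NEW_HOME_PREFIX), count


# "rech/abc/user/project" is a made-up directory layout, not a real one.
example = "cd /gpfs7kw/linkhome/rech/abc/user/project\n"
updated, replaced = rewrite_home_paths(example)
print(updated, end="")
print(replaced)
```

Where a path points inside your own home directory, using $HOME directly remains the more robust option.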

The migration of the WORK spaces started today. This operation is also handled by the IDRIS teams, so you do not need to take any specific action. The migration will be done in batches of projects to avoid a long downtime of the machine. It will, however, require suspending the "qos_cpu-t4" and "qos_gpu-t4" QoS, which allow running jobs of more than 20 hours.

For a specific project, the migration process will be as follows:

  • 20h before the migration begins, the jobs using computing hours allocated to that project will be held in queue (with the "AssocGrpJobsLimit" status) in order to avoid having jobs that use the WORK during the migration operation
  • just before the migration starts, the WORK space will become completely unavailable, including from the login nodes
  • once the migration is done, the environment variables will be modified to point to the new Lustre WORK space and your jobs will be able to run again.

Warning: If you have jobs that use the computing hours from a project but access the WORK disk spaces of another project, they might fail because we have no way to prevent them from starting when they should not.

The "idr_migstatus" command lets you monitor the migration of your projects by indicating the current status of each of them:

  • "pending": the migration has not started yet, there is no impact on your jobs and you can access your WORK
  • "planned": the migration is going to start in the next 20h, jobs that are not yet running will stay pending but you can still access your WORK
  • "in progress": the migration is in progress, you will not have access to your WORK at this point
  • "migrated": the migration is done, you can access your WORK again and your jobs can run.
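
The four statuses above can be summarised as a small lookup table, which may help when deciding whether to submit work for a project. This is a sketch of the semantics only: it does not call or parse "idr_migstatus", whose output format is not described here, and the "in progress" entry assumes new jobs stay held until the migration is done.

```python
from typing import NamedTuple


class MigrationState(NamedTuple):
    new_jobs_can_start: bool
    work_accessible: bool


# Status names come from the announcement; the flags restate its descriptions.
STATES = {
    "pending": MigrationState(new_jobs_can_start=True, work_accessible=True),
    "planned": MigrationState(new_jobs_can_start=False, work_accessible=True),
    # Assumption: jobs remain held while the migration is running.
    "in progress": MigrationState(new_jobs_can_start=False, work_accessible=False),
    "migrated": MigrationState(new_jobs_can_start=True, work_accessible=True),
}


def describe(status: str) -> MigrationState:
    """Look up what a given migration status means for jobs and WORK access."""
    return STATES[status.strip().lower()]


for name in STATES:
    print(name, describe(name))
```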

Note: The absolute paths of the WORK spaces will be modified by the migration. However, to ease the transition, symbolic links will be created to keep the old absolute paths working, at least for some time. Once the migration is completed, we invite you to modify any absolute paths starting with "/gpfswork/..." or "/gpfsdswork/projects/..." that appear in your scripts (use the environment variables whenever possible) or in your symbolic links.
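
To locate the paths that need attention, a script can be scanned for the two old prefixes. The sketch below only reports line numbers and does not rewrite anything, because the new absolute paths are not given here. The sample script and its directory names are hypothetical.

```python
from typing import List, Tuple

# Old WORK prefixes named in the announcement.
LEGACY_PREFIXES = ("/gpfswork/", "/gpfsdswork/projects/")


def find_legacy_paths(script_text: str) -> List[Tuple[int, str]]:
    """Return (line number, prefix) pairs for each old WORK prefix found."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        for prefix in LEGACY_PREFIXES:
            if prefix in line:
                hits.append((lineno, prefix))
    return hits


# Hypothetical script content used only to exercise the function.
sample = "#!/bin/bash\ncd /gpfswork/rech/abc/user\nls /gpfsdswork/projects/rech/abc\n"
for lineno, prefix in find_legacy_paths(sample):
    print(f"line {lineno}: uses {prefix}")
```

The same check can be run over the targets of symbolic links, for example by passing the output of os.readlink to the function.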

We are sorry for the inconvenience those operations might cause.

Best regards, The IDRIS support team

Flash Info No 2024-20

SCRATCH Migration Procedure

[English version below]

Hello,

As previously announced, the arrival of the H100 extension comes with a renewal of the Jean Zay storage spaces with the installation of a new Lustre storage system offering increased storage capacity and improved bandwidth.

To anticipate the shutdown of the current SCRATCH space, we invite you to start using the new space accessible via the environment variable $NEWSCRATCH (or $ALL_CCFRNEWSCRATCH for the common project space) by modifying your submission scripts accordingly. A subsequent communication will specify the date of this shutdown with at least 15 days' notice.

There will be no automatic copying of the current SCRATCH content to the new space. Therefore, if you wish to keep certain data currently stored on the SCRATCH, make sure to copy it to your new space $NEWSCRATCH (for example, via one or more jobs on the "archive" partition or interactively from the login nodes if the volume to be transferred is limited) and delete data you no longer need.

Please note that this new SCRATCH space is also subject to the automatic deletion of data not used for 30 days.

For more information on the ongoing operations on Jean Zay as part of the H100 extension installation: http://www.idris.fr/jean-zay/modifications-extension-jean-zay-h100.html.

Best regards, The IDRIS support team


Dear Jean Zay user,

As previously announced, the installation of the new H100 extension comes with a renewal of the Jean Zay storage spaces, with a new Lustre storage system offering increased storage capacity and improved bandwidth.

To anticipate the shutdown of the current SCRATCH space, we invite you to start using the new SCRATCH space now. It is already accessible through the $NEWSCRATCH environment variable (or $ALL_CCFRNEWSCRATCH for the common project spaces); please modify your submission scripts accordingly. The actual shutdown date of the current SCRATCH will be announced later with at least 15 days' notice.

There will be no automatic copy of the current SCRATCH space to the new one. Therefore, if you wish to keep some data currently stored on the SCRATCH, be sure to copy it to your new space $NEWSCRATCH (for instance through one or more jobs on the "archive" partition, or interactively from the login nodes if the volume to be transferred is limited) and to remove data you do not need anymore.

Please be aware that the new SCRATCH space is also subject to the 30-day deletion policy for unused data.
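
To see which files might fall under such a policy, one can list those that have not been accessed for a given number of days. The sketch below assumes that "unused" is measured by last access time, which this announcement does not specify, so treat the result as indicative only.

```python
import os
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86_400


def stale_files(root: str, days: int = 30,
                now: Optional[float] = None) -> List[Path]:
    """List files under root whose last access is older than the given age."""
    reference = time.time() if now is None else now
    cutoff = reference - days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Assumption: the policy is based on access time (st_atime).
            if path.stat().st_atime < cutoff:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    new_scratch = os.environ.get("NEWSCRATCH")
    if new_scratch and os.path.isdir(new_scratch):
        for path in stale_files(new_scratch):
            print(path)
```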

For more information on the work in progress on Jean Zay for the H100 extension: http://www.idris.fr/eng/jean-zay/modifications-extension-jean-zay-h100-eng.html.

Best regards, The IDRIS support team

Flash Info No 2024-19

Extension of STORE Unavailability

[English version below]

Hello,

Due to technical difficulties during the migration of the STORE space (see our previous communication for more details on the migration: http://www.idris.fr/flash-info-idris/flash-info-de-l-idris-167.html), the unavailability of this disk space must unfortunately be extended. Access to the STORE should be restored by the end of Thursday, 25 July. We invite you to consult the "Machine availability" page of our website (http://www.idris.fr/status.html) and the message of the day displayed when logging in to Jean Zay for the most up-to-date information.

As a reminder, it is still possible to access the old STORE space in read-only mode using the "$OLDSTORE" environment variable.

We apologise for any inconvenience caused.

Best regards, The IDRIS support team


Dear Jean Zay users,

Due to technical difficulties encountered during the migration of the STORE space (cf. our last email for more details regarding the migration: http://www.idris.fr/flash-info-idris/flash-info-de-l-idris-167.html), this disk space will unfortunately be unavailable for a longer period than expected. Access to the STORE should be possible again by the end of Thursday July 25th. You can check the "Machine availability" webpage (http://www.idris.fr/status.html) and the message of the day displayed when logging in on Jean Zay for the latest updates.

Please note that it remains possible to have read-only access to the old STORE using the "$OLDSTORE" environment variable.

We are sorry for the inconvenience this may cause.

Best regards, The IDRIS support team

Flash Info No 2024-18

Migration of the STORE on 22 and 23 July

[English version below]

Hello,

The STORE disk space will be completely unavailable on Monday 22 and Tuesday 23 July to migrate it to the new Lustre storage system installed as part of the Jean Zay H100 extension and the expansion of the WORK space. Therefore, please ensure that you do not submit any jobs using the STORE during this period (other jobs will continue to run normally).

As announced on 18 June, there will be a change in the STORE access policy on this occasion. It will no longer be possible to access this space from the compute nodes. Access to the STORE will be restored on Tuesday 23 July in the evening ONLY on the login nodes and on the "prepost", "visu", "compil" and "archive" partitions.

Until the end of August, it will still be possible to access the old STORE in read-only mode using the "$OLDSTORE" environment variable. This access method should be preferred in the weeks following the migration operation. Indeed, the data of the new STORE will then be available only on magnetic tapes, which could significantly slow down data access (up to several hours), while the rotating disk cache is repopulated with the most recently used data. Note that this new environment variable is already set so that you can anticipate the modification of your scripts.

As a reminder, the STORE is a space dedicated to the secure and long-term storage of archived data. Currently, there is redundancy of all data, stored both on rotating disks and magnetic tapes. The presence of data on rotating disks allows for relatively fast read/write access. In the future, only the most recently used data will be available on the rotating disk cache (still with a security copy on magnetic tapes). The rest of the data will be stored only on magnetic tapes (with two copies on different tapes to ensure data security) with much longer access times, incompatible with direct use from computations.

We invite you to modify your submission scripts if you access the STORE space directly from the compute nodes. To guide you, examples have been added at the end of our documentation on multi-step jobs: http://www.idris.fr/jean-zay/cpu/jean-zay-cpu-exec_cascade.html

Best regards, The IDRIS support team


Dear Jean Zay users,

The STORE disk space will be totally unavailable on Monday July 22nd and Tuesday July 23rd in order to migrate its data onto the new Lustre storage system installed in the framework of the Jean Zay H100 extension and the enlargement of the WORK disk space. Please make sure you do not submit any jobs using the STORE during this time (other jobs will continue to run normally).

As announced on June 18th, the STORE access policy will change after the migration: it will no longer be possible to access this disk space from compute nodes. Access to the STORE will be restored on the evening of Tuesday July 23rd, but only from the login nodes and from the "prepost", "visu", "compil" and "archive" partitions.

Until the end of August, it will remain possible to access the old STORE using the "$OLDSTORE" environment variable. This is the recommended way to access your archived data in the weeks following the migration. Indeed, the data on the new STORE will at first be available only on magnetic tape (with long access times, possibly up to several hours) while the rotating disk cache is repopulated with the most recently used data. Note that this new environment variable is already defined so that you can modify your scripts in advance.

As a reminder, the STORE is a disk space dedicated to long term secured storage of archived data. In the current system, all the data is redundantly stored on rotating disks ("cache") and magnetic tapes, and its availability on rotating disks enables a relatively fast read/write access time. In the future, only the most recently used data will be available on the rotating disk cache, with a security copy on the magnetic tapes. The remainder of the data will be stored only on magnetic tapes (with a double copy on different tapes to guarantee its security) with a much longer access time, incompatible with a direct usage from compute nodes.

We invite you to change your submission scripts if you currently access the STORE space directly from the compute nodes. In order to help you, several examples have been added at the end of the multi-step jobs documentation: http://www.idris.fr/eng/jean-zay/cpu/jean-zay-cpu-exec_cascade-eng.html.
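
One way to adapt a workflow that read the STORE directly from a compute node is to split it into a staging job on a partition that still sees the STORE and a compute job that waits for it. The sketch below only builds the two sbatch command lines. The script names "stage_in.slurm" and "compute.slurm" are hypothetical, only the partition names come from this announcement, and the documentation linked above remains the reference for actual examples.

```python
from typing import List

# Partitions that keep STORE access according to the announcement.
STORE_PARTITIONS = frozenset({"prepost", "visu", "compil", "archive"})


def staging_command(script: str, partition: str = "archive") -> List[str]:
    """Build the sbatch call for the job that copies data out of the STORE."""
    if partition not in STORE_PARTITIONS:
        raise ValueError(f"partition {partition!r} cannot access the STORE")
    # --parsable makes sbatch print the job ID in a machine-readable form.
    return ["sbatch", "--parsable", f"--partition={partition}", script]


def compute_command(script: str, staging_job_id: str) -> List[str]:
    """Build the sbatch call for the compute job, held until staging succeeds."""
    return ["sbatch", f"--dependency=afterok:{staging_job_id}", script]


# Hypothetical usage; the job ID would be read from the first command's output.
print(staging_command("stage_in.slurm"))
print(compute_command("compute.slurm", "123456"))
```

With the "afterok" dependency, the compute job starts only if the staging job finished successfully, so it never runs against partially copied data.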

Best regards, The IDRIS support team

Flash Info No 2024-17

Summary:

  • Panoram'IA: Join us on Friday 5 July at 10 am
  • Reminder: Changes to STORE access
  • IDRIS is recruiting for the 2024 external competitions
  • IDRIS Training

  • Panoram'IA: Join us on Friday 5 July at 10 am

IDRIS support invites you to "Panoram'IA" on Friday morning, 5 July at 10 am: the monthly live video magazine covering scientific and technical AI news. This session will include: AI news, Jean Zay H100 extension, feedback on CVPR2024 and our selection of papers with Papers Storm. The live stream and replays are available on our YouTube channel "Un oeil sur l'IDRIS": https://www.youtube.com/@idriscnrs.

  • Reminder: Changes to STORE access

As announced in the Flash Info of 18 June 2024, access to the STORE from compute nodes will no longer be possible from Tuesday 9 July 2024: http://www.idris.fr/flash-info-idris/flash-info-de-l-idris-165.html.

  • IDRIS is recruiting for the 2024 external competitions

As part of the 2024 CNRS external competitions, IDRIS is recruiting for the following positions:

  • Competition No. 65: 1 Scientific Computing Expert, shared position with Maison de la simulation (IR)
  • Competition No. 67: 1 Scientific Computing Expert (IR)
  • Competition No. 221: 1 Infrastructure Manager (AI)
  • Competition No. 235: 1 Administrative Assistant (AI).

For more information: http://www.idris.fr/annonces/idris-recrute.html.

  • IDRIS Training

Register now for the IDRIS training sessions scheduled for the new academic year:

  • Hybrid MPI/OpenMP Programming, 9 and 10 September
  • MPI, 24 to 27 September
  • Jean Zay Workshop, 3 and 4 October
  • OpenMP, 16 to 18 October
  • Optimised Deep Learning on Jean Zay, 22 to 25 October.

For more information on the 2024 IDRIS training catalogue and registration procedures: http://www.idris.fr/formations/catalogue.html.

