This page was translated by an AI (LLM) with a cursory human check and is awaiting full review.

Optimised Deep Learning on Jean Zay (DLO-JZ)
Manager: Léo Mantegazza
Instructors: members of the IDRIS support team
The aim of this training is to review the main current techniques for optimising the training of Deep Learning models, with the goal of scaling up on a supercomputer. The associated issues of acceleration and distribution across multiple GPUs are addressed from both a system and an algorithmic point of view.
- Target audience
- Prerequisites
- Duration and practical info
- Course content
- Course materials
- Upcoming sessions
Target audience
This training is intended for AI users of Jean Zay and for anyone who has mastered the fundamentals of Deep Learning and wishes to learn about the challenges of scaling up.
Prerequisites
Participants are expected to:
- have mastered the concepts of training in Deep Learning
- be proficient in Python
- have basic knowledge of PyTorch, needed to carry out the practical work
Duration and practical info
This training lasts 4 days:
- from 09:00 to 12:00 and
- from 13:00 to 17:00.
It takes place exclusively in person at the IDRIS premises in Orsay (91).
Attendance
Minimum: 10 people;
Maximum: 16 people.
Course content
This training focuses on scaling the training of a Deep Learning model across multiple GPUs. The common thread of the practical work is the optimized scaling (acceleration, distribution) of a model trained on the ImageNet dataset with PyTorch. Participants write code and submit their jobs on Jean Zay, applying the concepts covered in each part of the course.
Plan
Day 1
- Presentation of the DLO-JZ training
- The Jean Zay supercomputer
- The challenges of scaling up
- GPU acceleration
- Mixed precision
- Tensor format optimization (channels last memory format)
- PyTorch Profiler
Day 2
- Dataloader optimization
- Distributed training: general concepts and data parallelism
- Data augmentation
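As a rough illustration of the data-parallelism idea listed above, the sketch below splits dataset indices across processes so that each GPU works on its own shard of every epoch. It is plain Python and not taken from the course materials; `shard_indices`, `world_size`, `rank`, `epoch`, and `seed` are hypothetical names chosen for this example. In PyTorch, this role is normally played by `torch.utils.data.distributed.DistributedSampler`.

```python
import random


def shard_indices(num_samples, world_size, rank, epoch=0, seed=0):
    """Return the sample indices handled by one process for one epoch.

    Hypothetical helper for illustration only.
    """
    indices = list(range(num_samples))
    # Every rank uses the same seed and epoch, so all ranks build the
    # same shuffled order before taking their own slice.
    random.Random(seed + epoch).shuffle(indices)
    # Pad by wrapping around so that every rank gets the same number of
    # samples (a few samples may be seen twice in an epoch).
    per_rank = -(-num_samples // world_size)  # ceiling division
    total = per_rank * world_size
    padded = [indices[i % num_samples] for i in range(total)]
    # Strided slicing: rank r takes positions r, r + world_size, ...
    return padded[rank:total:world_size]


if __name__ == "__main__":
    for r in range(4):
        print(f"rank {r}: {shard_indices(10, world_size=4, rank=r)}")
```

Changing `epoch` reshuffles the order identically on all ranks, which is why distributed samplers usually need to be told the current epoch.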
Day 3
- Storage and format of input data (webdataset)
- Large batch training techniques (lr scheduler, large batch optimizers, ...)
- Compilation (JIT)
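To give a flavour of the large-batch techniques listed above, here is a minimal sketch of one widely used heuristic: scaling the learning rate linearly with the global batch size, then applying a linear warmup followed by cosine decay. This is an assumption about what such a schedule can look like, not the schedule taught in the course; `scaled_lr` and `lr_at_step` are hypothetical names, and the numeric values are arbitrary.

```python
import math


def scaled_lr(base_lr, base_batch, global_batch):
    """Linear scaling rule: grow the learning rate with the global batch size."""
    return base_lr * global_batch / base_batch


def lr_at_step(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup up to peak_lr, then cosine decay down to zero."""
    if step < warmup_steps:
        # Ramp up so the first updates with a large batch stay stable.
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Arbitrary example: reference lr 0.1 at batch 256, scaled to batch 2048.
    peak = scaled_lr(base_lr=0.1, base_batch=256, global_batch=2048)
    for step in (0, 49, 50, 500, 1000):
        print(step, lr_at_step(step, total_steps=1000, warmup_steps=50, peak_lr=peak))
```

In practice, the global batch size is the per-GPU batch size multiplied by the number of GPUs, so the peak learning rate depends on how many GPUs the job uses.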
Day 4
- Best practices (small models)
- Managing large models and Fully Sharded Data Parallelism (FSDP)
- Model parallelism
- Training visualization tools (TensorBoard, MLflow, Weights & Biases, ...)
- Optimization and hyperparameter search techniques
The practical sessions run on the Jean Zay supercomputer. A workstation with access to the IDRIS supercomputer is provided to each participant. Neither experience in using a supercomputer nor prior access to it is required.
Course materials
All course materials, including slides, notes, and practical exercises, are provided under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). For more details on the license, please consult this page.
Slides
- 1- Introduction, Jean Zay & GPU DL
- 2- Profiler
- 3- DataLoader optimization
- 4- Distribution and Data Parallelism
- 5- Storage spaces and data format
- 6- Optimizers & large batches
- 7- Compiler
- 8- Good Practice and State Of The Art
- 9- Parallelisms and large models
- 10- Visualization Tools
- 11- Hyperparameters optimization
- Conclusion
- Practical
Upcoming sessions
To view the dates of the upcoming sessions for this training, go to the following page:
Registration
CNRS/French university staff | External participants |
Are you a staff member of CNRS or of a French university? Your registration is free via our server. | Our training is open to professionals from companies and public bodies, as well as to individuals. |