Job submission on Dalia

⚠ INFORMATION
This page was translated by an AI (LLM) with a cursory human check and is awaiting full review.

Slurm Job Manager

Jobs on Dalia are managed by Slurm, as on Jean Zay.

The usual Slurm commands are used to control jobs:

  • sbatch: submit a batch script
  • srun: run a task
  • squeue: list the jobs in the queue
  • scancel: cancel a job

Important

From the login node, load the slurm module before using Slurm commands (if it is not already loaded):

module load slurm/slurm/24.11
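
As a rough sketch of how the commands listed above fit together, the snippet below submits a batch script, shows the job in the queue, and indicates how it could be cancelled. The file name job.slurm and the helper parse_job_id are hypothetical. With --parsable, sbatch prints the job ID, optionally followed by ";" and a cluster name, which the helper strips. If sbatch is not on the PATH, the snippet only prints a reminder.

```shell
#!/usr/bin/env bash
# Hypothetical batch script name; replace it with your own file.
job_script="job.slurm"

# Keep only the job ID from "sbatch --parsable" output,
# which is either "<jobid>" or "<jobid>;<cluster>".
parse_job_id() {
    printf '%s\n' "${1%%;*}"
}

if command -v sbatch >/dev/null 2>&1 && [ -f "$job_script" ]; then
    job_id=$(parse_job_id "$(sbatch --parsable "$job_script")")
    echo "Submitted job $job_id"
    squeue --jobs="$job_id"      # show the job's state in the queue
    # scancel "$job_id"          # uncomment to cancel the job
else
    echo "sbatch or $job_script not found: load the slurm module and check the script path"
fi
```

Capturing the job ID at submission time avoids having to look it up in the squeue output before calling scancel.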

Submission via an Apptainer container

It is recommended to work in Apptainer containers on Dalia.

Example of a submission script:

#!/usr/bin/env bash
#SBATCH --job-name=test_dalia
#SBATCH --output=slurm_log/%x_%j.out
#SBATCH --error=slurm_log/%x_%j.out
## Reserve all the resources of one node: 144 CPUs and 4 GPUs
#SBATCH --nodes=1 # Number of nodes
#SBATCH --gpus-per-node=4 # Max 4 GPUs per node
#SBATCH --ntasks-per-node=4 # Number of tasks per node
#SBATCH --cpus-per-task=36 # Number of CPUs per task: 4 * 36 = 144 CPUs
## Time limit for the job (HH:MM:SS)
#SBATCH --time=0:40:00

cd "$PROJECT_DIR"
export APPTAINER_CACHEDIR=/lustre/work/<project_group>/<login>/<cache_directory>
srun apptainer exec --nv --pwd /my_project_dir --bind "$PROJECT_DIR:/my_project_dir" mon_container.sif <command> # Use --nv to enable NVIDIA support

In the apptainer exec command above:

  • --nv enables the use of NVIDIA GPUs inside the container;
  • --pwd sets the working directory inside the container;
  • --bind mounts the host directory $PROJECT_DIR at /my_project_dir inside the container.
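
Before submitting the example script, it can help to check the resource request and to create the log directory, since Slurm generally does not create a missing directory for the --output and --error files. The following is a minimal sketch under those assumptions: the variable names are illustrative only, and the figures (144 CPUs per node, 4 tasks, 36 CPUs per task, slurm_log) are copied from the script above.

```shell
#!/usr/bin/env bash
# Values copied from the example script; the variable names are illustrative only.
cpus_per_node=144
ntasks_per_node=4
cpus_per_task=36

# Check that the requested tasks fit within one node's CPU count.
requested=$((ntasks_per_node * cpus_per_task))
if [ "$requested" -le "$cpus_per_node" ]; then
    echo "OK: $requested of $cpus_per_node CPUs requested"
else
    echo "Too many CPUs requested: $requested > $cpus_per_node" >&2
fi

# Create the directory named in --output/--error before calling sbatch.
log_dir="slurm_log"
mkdir -p "$log_dir"
```

Run this from the directory where sbatch will be called, because the relative path slurm_log is resolved from the submission directory.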
