Jean Zay: Nsight Systems

Description

Nsight Systems is an NVIDIA performance analysis tool.

It has a graphical user interface (GUI) but can also be used from the command line.

Versions installed

The module command gives access to different versions of Nsight Systems.

To display the available versions:

$ module avail nvidia-nsight-systems
nvidia-nsight-systems/2021.1.1  nvidia-nsight-systems/2021.4.1  nvidia-nsight-systems/2022.5.1  
nvidia-nsight-systems/2021.2.1  nvidia-nsight-systems/2022.1.1 

Usage

Example: To use version 2022.1.1 of Nsight Systems, you must load the corresponding module:

$ module load nvidia-nsight-systems/2022.1.1

After the correct module is loaded, using Nsight Systems takes two steps:

  1. Execution of your program in Nsight Systems (from the command line).
  2. Visualisation/Analysis of the results with the graphical interface.

Execution

The easiest way is to launch the execution from the command line in your Slurm scripts: simply add the nsys profile command just before the name of your executable file (possibly with options selecting the type of sampling to perform).

Comments:

  • For the nsys profile command to be recognized, you need to have loaded the correct module beforehand (see above), either in the environment of your interactive session or in the environment of your job.
  • To obtain help concerning the nsys profile command options, type nsys profile --help.
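The first comment above can be turned into a small guard at the top of a job script. The following is only a sketch: require_cmd is a hypothetical helper name, not part of Nsight Systems or of the module environment, and it relies solely on the standard command -v shell builtin.

```shell
# Hypothetical helper: succeed only if the given command is on the PATH.
# Intended to catch a forgotten "module load" before the job does any work.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$1 not found: load the corresponding module first" >&2
        return 1
    fi
}

# Example use in a job script (commented out so the sketch has no side effects):
# require_cmd nsys || exit 1
# srun nsys profile ./my_bin_exe
```

Failing early this way avoids spending part of the job allocation before discovering that nsys is not available.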

During execution, Nsight Systems writes its files in the current directory. By default, these files are named report#.qdrep where # is a number incremented to avoid overwriting existing files. The file name can be specified with the -o <report_file> option. The name can contain %q{ENVIRONMENT_VARIABLE} markers, each of which is replaced by the value of the specified environment variable.

Important: If the file already exists, the execution will fail in order to prevent overwriting previous results. Before launching the execution, you must therefore ensure that the file specified by the -o option does not already exist, or use the -f option (with caution) to force overwriting of existing files.
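One way to apply this check before launching is sketched below. report_is_free is a hypothetical helper, not an Nsight Systems command. It assumes the report is written as <name>.qdrep, as described on this page; if your version writes a different extension, pass it as the second argument.

```shell
# Hypothetical helper: succeed only if no report named <base>.<ext> exists yet.
# The default extension (qdrep) follows this page and is an assumption;
# override it with a second argument if needed.
report_is_free() {
    base="$1"
    ext="${2:-qdrep}"
    [ ! -e "${base}.${ext}" ]
}

# Example use in a job script (commented out so the sketch has no side effects):
# report_is_free "my_report" || { echo "my_report.qdrep already exists" >&2; exit 1; }
# srun nsys profile -o my_report ./my_bin_exe
```

Compared with -f, this keeps earlier results intact and stops the job before the profiled run starts.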

Important: By default, Nsight Systems stores its temporary data in the system /tmp directory, which is very limited in size. To give Nsight Systems a larger work space, you must define the TMPDIR variable. For example, to use the JOBSCRATCH directory (specific to each job and destroyed at the end of the job):

export TMPDIR=$JOBSCRATCH
# To circumvent a bug in the current versions of Nsight Systems,
# it is also necessary to create a symbolic link which allows 
# pointing the /tmp/nvidia directory to TMPDIR.
ln -s $JOBSCRATCH /tmp/nvidia
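The two commands above can be wrapped in a slightly more defensive form. This is a sketch only: setup_nsys_tmp is a hypothetical helper name, and the link path is a parameter that defaults to /tmp/nvidia as on this page. The guard is there because ln -s places the new link inside the destination when that destination is already an existing directory, which is probably not what is wanted here.

```shell
# Hypothetical helper: point TMPDIR at a scratch directory and create the
# workaround symbolic link only if nothing exists at the link path yet.
#   $1: scratch directory (for example "$JOBSCRATCH")
#   $2: link path (optional, defaults to /tmp/nvidia as on this page)
setup_nsys_tmp() {
    scratch="$1"
    link="${2:-/tmp/nvidia}"
    export TMPDIR="$scratch"
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        ln -s "$scratch" "$link"
    fi
}

# Example use in a job script (commented out so the sketch has no side effects):
# setup_nsys_tmp "$JOBSCRATCH"
```

Calling the helper a second time leaves an existing link untouched instead of failing or nesting a new link.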

The following is an example of a submission script for an MPI + OpenACC code launching 4 processes:

job_vtune_mpi.slurm
#!/bin/bash
#SBATCH --job-name=nsight_systems   # Arbitrary name of the Slurm job
#SBATCH --output=%x.%j.out          # Standard output file of the job
#SBATCH --error=%x.%j.err           # Standard error file of the job
#SBATCH --ntasks=4                  # Number of MPI processes requested
#SBATCH --ntasks-per-node=4         # Number of MPI tasks per node (= number of GPUs per node)
#SBATCH --gres=gpu:4                # Number of GPUs per node
#SBATCH --cpus-per-task=10          # Number of CPU cores per task (a quarter of the node here)
# /!\ Important:  The following line is misleading but in Slurm vocabulary, 
# "multithread" refers to hyperthreading.
#SBATCH --hint=nomultithread        # 1 MPI process per physical core (no hyperthreading)
#SBATCH --time=00:20:00             # Maximum job time hh:mm:ss (20 min here)
 
# Loading the modules of your choice
module load ...
# Loading Nsight Systems
module load nvidia-nsight-systems/2021.2.1
 
# Echo of commands 
set -x
 
# Avoid using /tmp for temporary data
export TMPDIR=$JOBSCRATCH
# To circumvent a bug in the current versions of Nsight Systems,
# it is also necessary to create a symbolic link which allows
# pointing the /tmp/nvidia directory to TMPDIR
ln -s $JOBSCRATCH /tmp/nvidia
 
# Profiling in OpenACC mode with the generation of a results file
# per process ("report_rank0.qdrep", "report_rank1.qdrep", etc)
srun nsys profile -t openacc -o "report_rank%q{SLURM_PROCID}" ./my_bin_exe
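
Once the job has finished, it can be useful to check that every process actually produced its report. The sketch below assumes the naming scheme of the script above (report_rank<N>.qdrep); missing_reports is a hypothetical helper, and both the prefix and the extension are parameters in case your setup differs.

```shell
# Hypothetical helper: print the name of each expected per-rank report
# that does not exist.
#   $1: number of MPI processes (the --ntasks value)
#   $2: file name prefix (optional, defaults to "report_rank")
#   $3: extension (optional, defaults to "qdrep" as on this page)
missing_reports() {
    ntasks="$1"
    prefix="${2:-report_rank}"
    ext="${3:-qdrep}"
    rank=0
    while [ "$rank" -lt "$ntasks" ]; do
        f="${prefix}${rank}.${ext}"
        [ -e "$f" ] || echo "$f"
        rank=$((rank + 1))
    done
}

# Example use from the submission directory (commented out on purpose):
# missing_reports 4
```

An empty output means that all expected reports are present.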

Visualisation/analysis of results

To visualise the results, run the nsys-ui <report_file> command, replacing <report_file> with the name of a previously generated analysis report.

Important: The graphical interface can be slow when it is used from a Jean Zay front end with X11 forwarding enabled (ssh -X). You can instead use a Jean Zay visualisation node, or install Nsight Systems on your own machine and transfer the reports to it for analysis.

Documentation

Complete documentation is available on the NVIDIA site.