⚠ INFORMATION
This page was translated by an AI (LLM) with a cursory human check and is awaiting full review.

HPCToolkit: GPU Profiling

For general advice on performance analysis on Jean Zay, see the best practices for code profiling.

Description

HPCToolkit profiles GPU (CUDA) applications by sampling; the results are then analysed with hpcviewer.

For an analysis oriented towards timelines and CUDA kernels on Jean Zay, see also:

Installed Versions

HPCToolkit (including its -cuda variants) and HPCViewer are available through the module command.

To display the available versions:

$ module avail hpctoolkit hpcviewer
hpctoolkit/2020.08.03       hpctoolkit/2024.01.1       hpctoolkit/2024.01.1-python3.9
hpctoolkit/2020.08.03-cuda  hpctoolkit/2024.01.1-cuda  hpctoolkit/2024.01.1-python3.10

hpcviewer/2020.07 hpcviewer/2024.02

For GPU profiling, load a -cuda version, for example:

$ module load hpctoolkit/2024.01.1-cuda
$ module load hpcviewer/2024.02

Execution and Collection

Example of a Slurm script for an MPI + CUDA code:

job_hpctoolkit_gpu.slurm
#!/bin/bash
#SBATCH --job-name=hpctoolkit_gpu
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:4
#SBATCH --cpus-per-task=10
#SBATCH --hint=nomultithread
#SBATCH --time=00:20:00

module purge
module load ...
module load hpctoolkit/2024.01.1-cuda

set -x

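# Collect the measurements.
# No event is selected here, so hpcrun uses its default sampling; GPU metrics
# usually require an explicit GPU event option (see "hpcrun --help").
srun hpcrun ./my_gpu_exe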

After execution, a measurement directory with a name of the form hpctoolkit-*-measurements is created in the working directory.
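When several runs have been made in the same directory, it can help to locate the newest measurement directory automatically. The sketch below does this with a hypothetical helper, latest_measurements, which is not part of HPCToolkit. The trailing wildcard in the pattern allows for a suffix (such as a job ID) after "measurements"; that suffix is an assumption about naming, not something stated on this page.

```shell
# Hypothetical helper (not part of HPCToolkit): print the most recently
# modified directory matching hpctoolkit-*-measurements* under a base directory.
latest_measurements() {
    base="${1:-.}"
    newest=""
    for d in "$base"/hpctoolkit-*-measurements*; do
        [ -d "$d" ] || continue          # skip files and unmatched patterns
        if [ -z "$newest" ] || [ "$d" -nt "$newest" ]; then
            newest="$d"
        fi
    done
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    fi
}

# Usage: list the content of the latest measurement directory, if any.
meas="$(latest_measurements .)"
if [ -n "$meas" ]; then
    ls "$meas"
fi
```

The function prints nothing when no matching directory exists, so the caller can test for an empty result before going further.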

Note

The choice of events/metrics depends on the version of HPCToolkit and your application. Consult hpcrun --help to adapt the collection.
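As an illustration of adapting the collection, the sketch below builds an hpcrun command line with one -e option per requested event. The helper hpcrun_cmd is hypothetical, and the event name gpu=nvidia is an assumption: it is the name used for NVIDIA GPUs in several HPCToolkit releases, but it should be checked against the list printed by hpcrun -L for the module you have loaded.

```shell
# Hypothetical helper: print an hpcrun command line for an executable,
# adding one "-e <event>" option per extra argument.
hpcrun_cmd() {
    exe="$1"
    shift
    cmd="hpcrun"
    for ev in "$@"; do
        cmd="$cmd -e $ev"
    done
    printf '%s %s\n' "$cmd" "$exe"
}

# List the events supported by the loaded version, if hpcrun is available.
if command -v hpcrun >/dev/null 2>&1; then
    hpcrun -L | head -n 40
fi

# Assumed event name; verify it in the list above before using it.
hpcrun_cmd ./my_gpu_exe gpu=nvidia
```

In a Slurm script, the printed command would be placed after srun, in place of the plain hpcrun call.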

Building the Analysis Database

Once the collection is complete, build the analysis database:

$ hpcstruct ./my_gpu_exe
$ hpcprof -S ./my_gpu_exe.hpcstruct -o hpctoolkit-my_gpu_exe-database ./hpctoolkit-my_gpu_exe-measurements

Adjust the names to match the files and directories actually generated by your run.
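To keep the names consistent, the database name can be derived from the measurement directory name. The following dry-run sketch prints the two commands rather than executing them. The helper db_name_for is hypothetical, and the paths are the placeholder names used above.

```shell
# Hypothetical helper: derive a database directory name from a measurement
# directory name by replacing the first "-measurements" with "-database".
db_name_for() {
    m="${1%/}"        # drop a trailing slash, if any
    m="${m##*/}"      # keep only the last path component
    printf '%s\n' "$m" | sed 's/-measurements/-database/'
}

# Dry run: print the commands instead of executing them.
exe=./my_gpu_exe
meas=./hpctoolkit-my_gpu_exe-measurements
echo "hpcstruct $exe"
echo "hpcprof -S $exe.hpcstruct -o $(db_name_for "$meas") $meas"
```

Once the printed commands look right for your run, remove the echo wrappers to execute them.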

Visualisation with HPCViewer

Using hpcviewer requires an SSH connection with X11 forwarding (ssh -X).

Launch the graphical interface on the analysis database:

$ module load hpcviewer/2024.02
$ hpcviewer hpctoolkit-my_gpu_exe-database

If necessary, adjust the JVM heap size with the -jh option:

$ hpcviewer -jh 3g hpctoolkit-my_gpu_exe-database

Warning

The graphical interface may be slow over X11 forwarding from a login node. In that case, use a visualisation node or transfer the analysis database to your local machine.
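For the transfer option, packing the database into a single archive first is usually more convenient than copying many small files. This is a minimal sketch using a hypothetical helper, pack_database. It assumes hpcviewer is also installed on the local machine, and it writes the archive to the current directory.

```shell
# Hypothetical helper: pack an analysis database directory into
# <name>.tar.gz in the current directory and print the archive name.
pack_database() {
    db="${1%/}"
    if [ ! -d "$db" ]; then
        echo "not a directory: $db" >&2
        return 1
    fi
    name="$(basename "$db")"
    # -C makes the paths stored in the archive relative to the parent directory.
    tar -czf "$name.tar.gz" -C "$(dirname "$db")" "$name" || return 1
    printf '%s\n' "$name.tar.gz"
}

# Usage: pack the placeholder database from this page, if it exists.
if [ -d hpctoolkit-my_gpu_exe-database ]; then
    pack_database hpctoolkit-my_gpu_exe-database
fi
```

The archive can then be copied with scp or rsync, unpacked locally with tar -xzf, and opened with a local hpcviewer.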
