⚠ INFORMATION
This page was translated by an AI (LLM) with a cursory human check and is awaiting full review.

HPCToolkit: CPU/MPI Profiling

For general advice on performance analysis on Jean Zay, see the best practices for code profiling.

Description

HPCToolkit is a suite of performance measurement and analysis tools based on statistical sampling. It attributes performance costs (time, hardware counters, inefficiencies) to the calling context in which they occur.

The toolchain mainly relies on:

  • hpcrun to collect measurements during execution;
  • hpcstruct to analyse the structure of the executable;
  • hpcprof to build the final analysis database;
  • hpcviewer to visualise the results.

For GPU profiling with HPCToolkit, see the HPCToolkit: GPU Profiling page.

Installed Versions

The module command provides access to the various versions of HPCToolkit and HPCViewer.

To display the available versions:

$ module avail hpctoolkit hpcviewer
hpctoolkit/2020.08.03 hpctoolkit/2024.01.1 hpctoolkit/2024.01.1-python3.9
hpctoolkit/2020.08.03-cuda hpctoolkit/2024.01.1-cuda hpctoolkit/2024.01.1-python3.10

hpcviewer/2020.07 hpcviewer/2024.02

For the CPU/MPI case, load a version of HPCToolkit without the -cuda suffix, for example:

$ module load hpctoolkit/2024.01.1-python3.10
$ module load hpcviewer/2024.02

After loading the modules, the following commands are available (listed here with shell tab completion on the hpc prefix):

$ hpc
hpcprof hpcrun hpcstruct hpcviewer
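
In a batch script, where tab completion is not available, the same check can be scripted. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of HPCToolkit.

```shell
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper, not an HPCToolkit command.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# After loading both modules, this should print nothing.
missing_tools hpcrun hpcstruct hpcprof hpcviewer
```

Any name printed by the last line indicates a module that has not been loaded in the current environment.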

Execution and Collection

Example of a Slurm script for an MPI code:

job_hpctoolkit_mpi.slurm
#!/bin/bash
#SBATCH --job-name=hpctoolkit_mpi
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=10
#SBATCH --hint=nomultithread
#SBATCH --time=00:20:00

module purge
module load ...
module load hpctoolkit/2024.01.1-python3.10

set -x

# Collect the measurements
srun hpcrun ./my_mpi_exe

After execution, a measurement directory matching the pattern hpctoolkit-*-measurements is created in the working directory.
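
If several runs have left measurement directories side by side, the most recent one can be located with a glob. This is a sketch only: the function name latest_measurements_dir is hypothetical, and the trailing * in the pattern is an assumption that allows for a suffix appended to the directory name.

```shell
# Print the most recently modified directory matching
# hpctoolkit-*-measurements* under the given directory (default: .).
# latest_measurements_dir is a hypothetical helper name.
latest_measurements_dir() {
    base="${1:-.}"
    # The trailing slash restricts the glob to directories;
    # ls -t sorts by modification time, newest first.
    ls -td "$base"/hpctoolkit-*-measurements*/ 2>/dev/null | head -n 1
}

# Example: show the newest measurement directory in the current directory.
latest_measurements_dir .
```

The function prints nothing when no directory matches, so its output can be tested with [ -n ... ] before post-processing.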

Building the Analysis Database

Once the collection is complete, build the analysis database:

$ hpcstruct ./my_mpi_exe
$ hpcprof -S ./my_mpi_exe.hpcstruct -o hpctoolkit-my_mpi_exe-database ./hpctoolkit-my_mpi_exe-measurements

Adjust the names according to the directories/files actually generated during your execution.
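
To keep the names consistent, they can be derived from the executable path. The sketch below assumes the naming pattern shown above (my_mpi_exe.hpcstruct, hpctoolkit-my_mpi_exe-measurements, hpctoolkit-my_mpi_exe-database); the function name postprocess and the DRY_RUN variable are hypothetical conveniences, not part of HPCToolkit.

```shell
# Run (or, with DRY_RUN=1, only print) the two post-processing commands
# for a given executable. postprocess and DRY_RUN are hypothetical names.
postprocess() {
    exe="${1:-./my_mpi_exe}"
    name="$(basename "$exe")"
    struct="./${name}.hpcstruct"               # written by hpcstruct
    meas="./hpctoolkit-${name}-measurements"   # written by hpcrun
    db="hpctoolkit-${name}-database"           # written by hpcprof

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "hpcstruct $exe"
        echo "hpcprof -S $struct -o $db $meas"
        return 0
    fi

    # Stop early if the assumed measurement directory does not exist.
    if [ ! -d "$meas" ]; then
        echo "measurement directory not found: $meas" >&2
        return 1
    fi
    hpcstruct "$exe" && hpcprof -S "$struct" -o "$db" "$meas"
}
```

Running DRY_RUN=1 postprocess ./my_mpi_exe first prints the two commands, which makes it easy to compare the assumed names with the files and directories actually present before launching the analysis.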

Visualisation with HPCViewer

Using hpcviewer requires an SSH connection with X11 forwarding (ssh -X).

Launch the graphical interface on the analysis database:

$ module load hpcviewer/2024.02
$ hpcviewer hpctoolkit-my_mpi_exe-database

If the database is large, you can increase the JVM heap size with the -jh option (3 GB in this example):

$ hpcviewer -jh 3g hpctoolkit-my_mpi_exe-database
Warning

The graphical interface may be slow with X11 forwarding from a login node. You can use a visualisation node or transfer the analysis database to your local machine.
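
For the local option, the database directory can be packed into a single archive before copying. This is a sketch under stated assumptions: pack_database is a hypothetical helper, the login, hostname and path in the comments are placeholders, and hpcviewer is assumed to be installed on the local machine.

```shell
# Pack a database directory into <dir>.tar.gz next to it and print
# the archive path. pack_database is a hypothetical helper name.
pack_database() {
    db="${1%/}"   # drop a trailing slash, if any
    tar -czf "${db}.tar.gz" -C "$(dirname "$db")" "$(basename "$db")" &&
        echo "${db}.tar.gz"
}

# On Jean Zay:
#   pack_database hpctoolkit-my_mpi_exe-database
# Then, on the local machine (placeholders in angle brackets):
#   scp <login>@<jean-zay-host>:<path>/hpctoolkit-my_mpi_exe-database.tar.gz .
#   tar -xzf hpctoolkit-my_mpi_exe-database.tar.gz
#   hpcviewer hpctoolkit-my_mpi_exe-database
```

Copying one compressed archive is usually more convenient than transferring the many files of a database directory one by one.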

Documentation
