Turing: TotalView

TotalView is a graphical parallel debugger. It supports applications that use the MPI message-passing library, OpenMP, and multithreading with pthreads.


Compilation

Your program must be compiled with the -g option. On Turing, this option is enabled by default, so you do not need to specify it. We also advise using the -qfullpath option, which records the full path of the source files in the debugging information. A complete list of IBM compiler debugging options is available on this page.
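As an illustration, a compile line with these options could be assembled as in the sketch below. The wrapper name mpixlf90_r and the file names are assumptions chosen for the example, not values taken from this page; substitute the compiler wrapper and files of your own project. The command is only printed, so it can be checked before being run.

```shell
# Hypothetical names: adapt the compiler wrapper and the file names to your project.
COMPILER="mpixlf90_r"          # assumed IBM XL MPI Fortran wrapper
DEBUG_FLAGS="-g -qfullpath"    # -g is already the default on Turing; -qfullpath is advised
SOURCE="my_program.f90"        # hypothetical source file
OUTPUT="my_executablefile"     # executable name used in the job script example

# Build the command and print it so that it can be checked first.
COMPILE_CMD="$COMPILER $DEBUG_FLAGS -o $OUTPUT $SOURCE"
echo "$COMPILE_CMD"
# $COMPILE_CMD   # uncomment to actually run the compilation
```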

Launching TotalView

Remote display

TotalView is a graphical debugger based on the X Window System. To use the graphical mode, you must set up a remote display from IDRIS to your terminal. If you cannot set up this display, please contact the User Support Team. TotalView can also be launched without a graphical interface by using the tvcli command, but this greatly reduces its benefits. Command-line use is described in the Reference Guide.
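Before submitting a debugging job, it can be useful to verify that a display is actually configured in the current session. The following is a minimal sketch; the function name check_display is hypothetical, and the mention of ssh -X is only one common way of obtaining X forwarding, not necessarily the method recommended for IDRIS.

```shell
# check_display is a hypothetical helper, not an IDRIS or TotalView command.
# It returns 0 if an X display is configured in this session, 1 otherwise.
check_display() {
  if [ -z "${DISPLAY:-}" ]; then
    echo "DISPLAY is not set: reconnect with X forwarding (for example ssh -X) before submitting." >&2
    return 1
  fi
  echo "DISPLAY is set to $DISPLAY"
}
```

The job could then be submitted only when the check succeeds, for example with check_display && llsubmit job.ll, where job.ll is the script shown in the next section.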

Launching in parallel batch

Interactive execution of a parallel program is not possible on Turing. To debug, you must therefore submit a job. The following is an example of a batch submission script for an MPI debugging session on 64 cores:

job.ll
  # @ job_name = TotalView
  # @ job_type = BLUEGENE
  # @ output = $(job_name).$(jobid)
  # @ error = $(output)
  # @ wall_clock_limit = 00:30:00
  # @ bg_size = 64
  # @ environment = $DISPLAY
  # @ queue
 
  # load the default version
  module load tv
 
  # after the module load, to avoid a verbose output
  set -x
 
  xterm -sb -e 'totalview runjob -a -np 64 --ranks-per-node 16 --exe ./my_executablefile --args arg1 arg2'

This script uses the module command to access TotalView. TotalView is launched in an xterm terminal so that the console output of the program can be displayed.

Remember: Because of Turing's hardware architecture, reservations are made in blocks of 64 nodes (1024 cores). As in the example, it is possible to debug using only some of the allocated nodes, but all of the reserved resources are billed. In addition, be careful not to leave a debugging session open needlessly once you have finished (you can check with llq -u my_login).
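The gap between the resources used and the resources billed in the example can be worked out with a few lines of shell arithmetic. This is only a sketch: the variable names are illustrative, and the figure of 16 cores per node is derived from the 64 nodes and 1024 cores stated above.

```shell
RANKS=64            # value passed to -np in the example
RANKS_PER_NODE=16   # value passed to --ranks-per-node in the example
BG_SIZE=64          # size of the reserved block, in nodes
CORES_PER_NODE=16   # 1024 cores / 64 nodes

# Nodes actually used by the run, rounded up to a whole node.
NODES_USED=$(( (RANKS + RANKS_PER_NODE - 1) / RANKS_PER_NODE ))
# Cores billed: the whole reserved block, whatever the run uses.
CORES_BILLED=$(( BG_SIZE * CORES_PER_NODE ))

echo "nodes used: $NODES_USED of $BG_SIZE; cores billed: $CORES_BILLED"
# prints "nodes used: 4 of 64; cores billed: 1024"
```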

Warning: A single run requesting 4096 tasks uses up all of the license tokens. If you repeatedly encounter a license error message, or if you need a debugging session with 4096 or more tasks, please contact the User Support Team.

Usage of TotalView

When the parameters (number of cores, name of the executable file, and arguments) have been given on the command line, it is not necessary to complete the Arguments and Parallel tabs of the Startup Parameter window, which is displayed at startup. This window can therefore be closed immediately.

[Figure: TotalView initialization window]

To start debugging, click the green Go arrow in the toolbar of the main window. TotalView then asks whether you want to stop just before the program starts executing. If you click Yes, you can navigate the source code (right-click a function name, then left-click Dive) and add breakpoints (by clicking on the line numbers).

Additional resources

The complete documentation of TotalView (in English) is available here:

Recent updates to the TotalView documentation are available on the vendor's site.