Turing: addr2line

Description

The addr2line tool translates an address taken from the call stack into the corresponding file name and line number in the source code, which lets you locate the line where a program failed.

Usage

  • Usage: addr2line -e executable_file stack_address
  • Compile the program with the -g option so that the executable contains debugging information.

If the application does not generate core files, or if you wish to force their creation (useful for an MPI application that hangs, for example), set the environment variable BG_COREDUMPONEXIT to 1. The core files are then written at the end of the execution.
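One way to set this variable, assuming the job is launched with runjob, is to pass it with the --envs option so that it reaches the processes on the compute nodes. The line below is only a sketch of a job-script line: the executable name ./my_mpi_app and the rank counts are placeholders, not values from this page.

```shell
# Sketch of a job-script line (./my_mpi_app is a placeholder executable).
# --envs passes NAME=VALUE to the application started on the compute nodes.
runjob --envs BG_COREDUMPONEXIT=1 --ranks-per-node 16 --np 16 : ./my_mpi_app
```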

Example

Here is an example of a program with an out-of-bounds array access (the array tab has n=5000 elements but is read at index maxidx=10000):

  1  program addr2line
  2    implicit none
  3
  4    integer               :: i
  5    integer, parameter    :: n=5000, maxidx=10000
  6    integer, dimension(n) :: tab
  7
  8    tab(:) = [ (2*i-1, i=1,n) ]
  9
 10    print *, 'tab(maxidx)=', tab(maxidx)
 11  end program
 

This program is compiled with the following command:

> bgxlf90_r -g test_addr2line.f90 -o test_addr2line

The program must then be run through a batch job submission:

> cat submit.ll
# @ job_name = job_simple
# @ job_type = BLUEGENE
# @ output = $(job_name).$(jobid)
# @ error = $(output)
# @ wall_clock_limit = 00:10:00
# @ bg_size = 64
# @ queue
set -x
cd $workdir
runjob --ranks-per-node 16  --np 1 : ./test_addr2line
 
> llsubmit submit.ll

A core file (core.0) is then generated. Its +++STACK section lists the addresses that were on the call stack at the time of the crash:

> cat core.0
+++PARALLEL TOOLS CONSORTIUM LIGHTWEIGHT COREFILE FORMAT version 1.0
+++LCB 1.0
Program   : ./test_addr2line
Job ID    : 15869
Personality:
   ABCDET coordinates : 0,0,0,0,0,0
   Rank               : 0
   Ranks per node     : 16
   DDR Size (MB)      : 16384
+++ID Rank: 0, TGID: 1, Core: 0, HWTID:0 TID: 1 State: RUN
***FAULT Encountered unhandled signal 0x0000000b (11) (SIGSEGV)
  ...
  ...
  ...
+++STACK
Frame Address     Saved Link Reg
0000001fbfff6b40  0000000001012ce4
0000001fbfff6be0  0000000001000468
0000001fbfffbaa0  000000000105d328
0000001fbfffbd80  000000000105d624
0000001fbfffbe40  0000000000000000
---STACK
Interrupt Summary:
  System Calls 27
  External Input Interrupts 1
  Data TLB Interrupts 1
---ID
---LCB

The addr2line command maps a stack address to the corresponding line of the source code. The addresses must first be transformed slightly before they are passed to addr2line: simply replace the first eight zeros of each value in the Saved Link Reg column by 0x. For the second value in the example above, this gives:

> powerpc64-bgq-linux-addr2line -e ./test_addr2line 0x01000468
~rlab001/turing/formation_bg/trunk/demos/addr2line/test_addr2line.f90:10
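The transformation can also be done by a small shell function rather than by hand. This is a minimal sketch: the function name lr_to_addr is hypothetical, and it assumes the value is copied exactly as it appears in the Saved Link Reg column (16 hexadecimal digits starting with eight zeros).

```shell
# Hypothetical helper: turn a Saved Link Reg value from the core file into
# the 0x form given to addr2line by replacing the first eight zeros with 0x.
lr_to_addr() {
    # ${1#00000000} removes the prefix of eight zeros, if present
    printf '0x%s\n' "${1#00000000}"
}

# Second value of the example core file; prints 0x01000468
lr_to_addr 0000000001000468
```

The result can be passed directly on the command line:

> powerpc64-bgq-linux-addr2line -e ./test_addr2line $(lr_to_addr 0000000001000468)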

Note that the stack contains several addresses (a call stack). It may be necessary to try addresses deeper in the stack to find the origin of the crash, especially if the program calls external libraries.
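To examine the whole call stack at once, the addresses can be extracted from the core file with a short script. The sketch below assumes the core file has the layout shown above (a +++STACK line, a header line, one frame per line, then ---STACK). The function name stack_addresses is hypothetical. Entries whose Saved Link Reg is all zeros are skipped, since they do not point to any code.

```shell
# Hypothetical helper: print every non-zero Saved Link Reg value found between
# the +++STACK and ---STACK markers of a core file, rewritten in the 0x form.
stack_addresses() {
    awk '
        index($0, "+++STACK") == 1 { in_stack = 1; next }   # section starts
        index($0, "---STACK") == 1 { in_stack = 0 }         # section ends
        # Keep lines whose second column is a hexadecimal, non-zero value;
        # this also skips the "Frame Address  Saved Link Reg" header line.
        in_stack && $2 ~ /^[0-9a-fA-F]+$/ && $2 !~ /^0+$/ {
            sub(/^00000000/, "", $2)    # replace the first eight zeros...
            print "0x" $2               # ...by 0x
        }
    ' "$1"
}
```

Since addr2line accepts several addresses on the same command line, the output can be passed to it directly. The -f option also prints the name of the function containing each address:

> powerpc64-bgq-linux-addr2line -f -e ./test_addr2line $(stack_addresses core.0)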