Ada: FPMPI2

Description

FPMPI2 is an MPI call profiling library developed by ANL (Argonne National Laboratory). Version 2.2 is available on Ada.

It collects the following in a single text file:

  • the list of MPI subprograms called,
  • the message sizes,
  • the time spent in MPI calls for each subprogram, as well as the time lost to desynchronization,
  • the amount of data transferred between all processes.

Usage

  • Load the FPMPI2 module with the command: module load fpmpi2
  • Compile with the -g option.
  • Run the application as usual; a file named fpmpi_profile.txt is generated at the end of the run.

Example

$ cat fpmpi_profile.txt

MPI Routine Statistics (FPMPI2 Version 2.2)
Options: FPMPI enabled, Collective sync, Collect destinations,
Explanation of data:
Times are the time to perform the operation, e.g., the time for MPI_Send
Average times are the average over all processes, e.g., sum (time on each
process) / number of processes
Min and max values are over all processes
(Data is always average/min/max)
Amount of data is computed in bytes.  For point-to-point operations,
it is the data sent or received.  For collective operations, it is the
data contibuted to the operation.  E.g., for an MPI_Bcast, the amount of
data is the number of bytes provided by the root, counted only at the root.
For synchronizing collective operations, the average, min, and max time
spent synchronizing is shown next.
Calls by message size shows the fraction of calls that sent messages of a
particular size.  The bins are
0 bytes, 1-4 bytes, 5-8 bytes, 9-16, 17-32, 33-64, -128, -256, -512, -1024
 -4K, -8K, -16K, -32K, -64K, -128K, -256K, -512K, -1M, -4M, -8M, -16M,
 -32M, -64M, -128M, -256M, -512M, -1GB, >1GB.
Each bin is represented by a single digit, representing the 10's of percent
of messages within this bin.  A 0 represents precisely 0, a . (period)
represents more than 0 but less than 10%.  A * represents 100%.
Messages by message size shows similar information, but for the total
message size.

The experimental topology information shows the 1-norm distance that the
longest point-to-point message travelled, by process.

MPI_Pcontrol may be used to control the collection of data.  Use the values
defined in fpmpi.h, such as FPMPI_PROF_COLLSYNC, to control what data is
collected or reported by FPMPI2.
Command: ...

Date:           Tue Jan 22 11:16:31 2013
Processes:      16
Execute time:   37.37
Timing Stats: [seconds] [min/max]       [min rank/max rank]
  wall-clock: 37.37 sec 37.365183 / 37.387304   14 / 2
        user: 36.63 sec 36.608434 / 36.649428   11 / 6
         sys: 0.2375 sec        0.216967 / 0.262960     6 / 8

Memory Usage Stats (RSS) [min/max KB]:  225192/233804

                  Average of sums over all processes
Routine                 Calls       Time Msg Length    %Time by message length
                                                    0.........1........1........
                                                              K        M
MPI_Allreduce       :       5   0.000242         40 00*0000000000000000000000000
MPI_Gather          :       2   0.000596         16 00*0000000000000000000000000
MPI_Sendrecv        :      40       1.36    6.4e+06 0000000000000000*00000000000

Details for each MPI routine
                  Average of sums over all processes
                                                   % by message length
                                (max over          0.........1........1........
                                 processes [rank])           K        M
MPI_Allreduce:
        Calls     :          5            5 [   0] 00*0000000000000000000000000
        Time      :   0.000242     0.000279 [  15] 00*0000000000000000000000000
        Data Sent :         40           40 [   0]
        SyncTime  :       1.68         3.19 [   3] 00*0000000000000000000000000
        By bin    : 5-8 [5,5]   [  0.000197,  0.000279] [     0.188,      3.19]
MPI_Gather:
        Calls     :          2            2 [   0] 00*0000000000000000000000000
        Time      :   0.000596      0.00416 [   8] 00*0000000000000000000000000
        Data Sent :         16           16 [   0]
        By bin    : 5-8 [2,2]   [  1.48e-05,   0.00416]
MPI_Sendrecv:
        Calls     :         40           40 [   0] 0000000000000000*00000000000
        Time      :       1.36         2.92 [  10] 0000000000000000*00000000000
        Data Sent :    6.4e+06      6400000 [   0]
        By bin    : 131073-262144       [40,40] [     0.164,      2.92]
        Partners  :          3 max 4(at 5) min 2(at 0)

Summary of target processes for point-to-point communication:
1-norm distance of point-to-point with an assumed 2-d topology
(Maximum distance for point-to-point communication from each process)
  1  1  1  1
  1  1  1  1
  1  1  1  1
  1  1  1  1
Data volume for each rank:   source     dest       bytes,...
0       1       1600000,        4       1600000,
1       0       1600000,        2       1600000,        5       1600000,
2       1       1600000,        3       1600000,        6       1600000,
3       2       1600000,        7       1600000,
4       0       1600000,        5       1600000,        8       1600000,
5       1       1600000,        4       1600000,        6       1600000,        9       1600000,
6       2       1600000,        5       1600000,        7       1600000,        10      1600000,
7       3       1600000,        6       1600000,        11      1600000,
8       4       1600000,        9       1600000,        12      1600000,
9       5       1600000,        8       1600000,        10      1600000,        13      1600000,
10      6       1600000,        9       1600000,        11      1600000,        14      1600000,
11      7       1600000,        10      1600000,        15      1600000,
12      8       1600000,        13      1600000,
13      9       1600000,        12      1600000,        14      1600000,
14      10      1600000,        13      1600000,        15      1600000,
15      11      1600000,        14      1600000,
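
The "Data volume for each rank" section above lends itself to post-processing. The following is a minimal Python sketch, not part of FPMPI2, that reads this section and sums the bytes sent by each rank. The function name parse_data_volume is hypothetical, and the parser assumes the layout shown in the example above (a source rank followed by comma-terminated destination/bytes pairs); other FPMPI2 versions or options may lay the section out differently.

```python
from collections import defaultdict


def parse_data_volume(lines):
    """Return {source_rank: {dest_rank: bytes}} from the 'Data volume' section.

    Hypothetical helper: assumes the layout of the example profile, i.e. one
    line per source rank containing the rank followed by (dest, bytes) pairs.
    """
    volumes = defaultdict(dict)
    in_section = False
    for line in lines:
        if line.startswith("Data volume for each rank"):
            in_section = True
            continue
        if not in_section:
            continue
        fields = line.replace(",", " ").split()
        # A data line holds an odd number of integers: source + (dest, bytes) pairs.
        is_data = (
            len(fields) >= 3
            and len(fields) % 2 == 1
            and all(f.isdigit() for f in fields)
        )
        if not is_data:
            break  # a blank or non-numeric line ends the section
        source = int(fields[0])
        for dest, nbytes in zip(fields[1::2], fields[2::2]):
            volumes[source][int(dest)] = int(nbytes)
    return dict(volumes)


# Two lines copied from the example profile above (ranks 0 and 5).
sample = """\
Data volume for each rank:   source     dest       bytes,...
0       1       1600000,        4       1600000,
5       1       1600000,        4       1600000,        6       1600000,        9       1600000,
""".splitlines()

volumes = parse_data_volume(sample)
totals = {src: sum(dests.values()) for src, dests in volumes.items()}
print(totals)
```

To process a real profile, an open file can be passed instead of the sample lines, for example parse_data_volume(open("fpmpi_profile.txt")).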

Documentation