Walkthrough: MiniMD Trace Collection
Prepare environment with paths for Score-P and Vampir:
$ source ~k78q0039/env.sh
$ cp -r ~k78q0039/Tools-Material ~
Build MiniMD without LIKWID:
$ cd ~/Tools-Material/MINIMD
$ make
Run the uninstrumented MiniMD proxy app with 2 ranks and 10 threads each (make sure that you are in the 'MINIMD/data' directory):
$ cd data
$ OMP_NUM_THREADS=10 srun -n 2 ../miniMD-ICC -t 10 --half_neigh 1
Modify the build settings to use the Score-P instrumenter 'scorep' in all $(CC), $(CXX), and $(FC) commands:
$ cd ..
$ cat include_ICC.scorep.mk
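The contents of include_ICC.scorep.mk are not reproduced here. As a minimal sketch of what such a change can look like, the following prepends 'scorep' to the compiler definitions of a hypothetical Makefile fragment. The file name demo.mk, the compiler names mpiicc, mpiicpc and mpiifort, and the flags are illustrative assumptions, not taken from the MiniMD sources:

```shell
# Hypothetical Makefile fragment; the real include_ICC.mk may differ.
workdir=$(mktemp -d)
cat > "$workdir/demo.mk" <<'EOF'
CC = mpiicc
CXX = mpiicpc
FC = mpiifort
CCFLAGS = -O3 -qopenmp
EOF

# Prepend 'scorep' to the CC, CXX and FC definitions only;
# other variables such as CCFLAGS are left untouched.
sed -E 's/^(CC|CXX|FC)([[:space:]]*=[[:space:]]*)/\1\2scorep /' \
    "$workdir/demo.mk" > "$workdir/demo.scorep.mk"

cat "$workdir/demo.scorep.mk"
```

Keeping the instrumented settings in a separate file, as the walkthrough does with TAG=ICC.scorep, lets the uninstrumented and instrumented binaries coexist.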
Build with modified Makefile configuration:
$ make TAG=ICC.scorep
$ cd data
Run instrumented MiniMD proxy app:
$ export SCOREP_EXPERIMENT_DIRECTORY=scorep-minimd-2x10-profile
$ OMP_NUM_THREADS=10 srun -n 2 -c 10 ../miniMD-ICC.scorep -t 10 --half_neigh 1
Examine scoring:
$ scorep-score scorep-minimd-2x10-profile/profile.cubex
Estimated aggregate size of event trace: 157MB
Estimated requirements for largest trace buffer (max_buf): 79MB
Estimated memory requirements (SCOREP_TOTAL_MEMORY): 99MB
(hint: When tracing set SCOREP_TOTAL_MEMORY=99MB to avoid intermediate flushes
or reduce requirements using USR regions filters.)
flt    type  max_buf[B]     visits  time[s]  time[%]  time/visit[us]  region
        ALL  82,096,353  6,311,108   112.52    100.0           17.83  ALL
        USR  79,570,842  6,120,244    95.80     85.1           15.65  USR
        OMP   1,752,280    134,650    13.25     11.8           98.41  OMP
        COM     696,670     53,590     3.03      2.7           56.46  COM
        MPI      76,520      2,622     0.44      0.4          169.19  MPI
     SCOREP          41          2     0.00      0.0           97.98  SCOREP
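The memory hint in the report header can be picked out with standard text tools. A minimal sketch, using the three header lines from the report as stand-in input (in practice the text would come from scorep-score itself; the variable names are illustrative):

```shell
# Stand-in for the header that scorep-score printed for this profile.
score_header='Estimated aggregate size of event trace: 157MB
Estimated requirements for largest trace buffer (max_buf): 79MB
Estimated memory requirements (SCOREP_TOTAL_MEMORY): 99MB'

# Keep only the value that follows "(SCOREP_TOTAL_MEMORY): ".
total_memory=$(printf '%s\n' "$score_header" |
    sed -n 's/^Estimated memory requirements (SCOREP_TOTAL_MEMORY): //p')

echo "SCOREP_TOTAL_MEMORY=$total_memory"
```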
Also list the individual regions that cause the trace buffer requirements:
$ scorep-score -r scorep-minimd-2x10-profile/profile.cubex
Estimated aggregate size of event trace: 157MB
Estimated requirements for largest trace buffer (max_buf): 79MB
Estimated memory requirements (SCOREP_TOTAL_MEMORY): 99MB
(hint: When tracing set SCOREP_TOTAL_MEMORY=99MB to avoid intermediate flushes
or reduce requirements using USR regions filters.)
flt    type  max_buf[B]     visits  time[s]  time[%]  time/visit[us]  region
        ALL  82,096,353  6,311,108   112.52    100.0           17.83  ALL
        USR  79,570,842  6,120,244    95.80     85.1           15.65  USR
        OMP   1,752,280    134,650    13.25     11.8           98.41  OMP
        COM     696,670     53,590     3.03      2.7           56.46  COM
        MPI      76,520      2,622     0.44      0.4          169.19  MPI
     SCOREP          41          2     0.00      0.0           97.98  SCOREP
        USR  36,642,554  2,817,976     0.46      0.4            0.16  Neighbor::coord2bin
        USR  30,670,848  2,359,296     0.33      0.3            0.14  random
        USR   5,115,604    393,154     0.06      0.1            0.15  Atom::pack_border
        USR   5,115,396    393,154     0.05      0.0            0.14  Atom::unpack_border
        USR   1,703,936    131,072     0.02      0.0            0.17  Atom::addatom
        OMP     156,312     12,024     0.07      0.1            6.19  !$omp for @atom.cpp:170
        OMP     156,312     12,024    10.24      9.1          851.86  !$omp implicit barrier @atom.cpp:175
        OMP     156,312     12,024     0.12      0.1            9.70  !$omp for @atom.cpp:182
        OMP     156,312     12,024     0.03      0.0            2.66  !$omp implicit barrier @atom.cpp:188
        OMP     156,312     12,024     1.53      1.4          127.07  !$omp barrier @comm.cpp:372
        COM     156,312     12,024     0.00      0.0            0.40  Atom::pack_reverse
        COM     156,312     12,024     0.00      0.0            0.33  Atom::unpack_reverse
        OMP     148,200     11,400     0.11      0.1            9.58  !$omp for @atom.cpp:158
        OMP     148,200     11,400     0.02      0.0            1.61  !$omp implicit barrier @atom.cpp:163
        OMP     148,200     11,400     0.27      0.2           24.12  !$omp barrier @comm.cpp:322
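Good candidates for filtering are USR regions that are visited very often but take little time per visit. As a rough sketch, the share of max_buf caused by the five USR regions in the report can be summed with awk. The rows are copied from the report; the variable names are illustrative:

```shell
# USR rows copied from the 'scorep-score -r' report.
usr_rows='USR 36,642,554 2,817,976 0.46 0.4 0.16 Neighbor::coord2bin
USR 30,670,848 2,359,296 0.33 0.3 0.14 random
USR 5,115,604 393,154 0.06 0.1 0.15 Atom::pack_border
USR 5,115,396 393,154 0.05 0.0 0.14 Atom::unpack_border
USR 1,703,936 131,072 0.02 0.0 0.17 Atom::addatom'
all_max_buf=82096353   # max_buf[B] of the ALL row

# Strip the thousands separators from column 2 and add the values up.
usr_sum=$(printf '%s\n' "$usr_rows" |
    awk '{ gsub(",", "", $2); sum += $2 } END { printf "%d\n", sum }')

echo "five USR regions: $usr_sum of $all_max_buf bytes ($((usr_sum * 100 / all_max_buf))%)"
```

With the numbers from this profile, these five regions account for well over 90% of max_buf while contributing less than 1% of the run time, which is why the filter targets them.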
Generate an initial filter file based on the default heuristic:
$ scorep-score -g scorep-minimd-2x10-profile/profile.cubex
An initial filter file template has been generated: 'initial_scorep.filter'
To use this file for filtering at run-time, set the respective Score-P variable:
SCOREP_FILTERING_FILE=initial_scorep.filter
For compile-time filtering 'scorep' has to be provided with the '--instrument-filter' option:
$ scorep --instrument-filter=initial_scorep.filter
Compile-time filtering depends on support in the used Score-P installation.
The filter file is annotated with comments, please check if the selection is
suitable for your purposes and add or remove functions if needed.
Examine filter file:
$ cat initial_scorep.filter
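A filter file can also be written by hand. The sketch below assumes the Score-P filter syntax with a SCOREP_REGION_NAMES_BEGIN/END block and excludes the five USR regions that dominate the scoring report. The file name manual_scorep.filter is a hypothetical choice, and the generated initial_scorep.filter may look different (for example, it may list mangled names). Whether the plain C++ names match should be checked with 'scorep-score -f' before the file is used in a measurement:

```shell
# Hypothetical hand-written filter; compare with the generated file before use.
cat > manual_scorep.filter <<'EOF'
SCOREP_REGION_NAMES_BEGIN
  EXCLUDE
    Neighbor::coord2bin
    random
    Atom::pack_border
    Atom::unpack_border
    Atom::addatom
SCOREP_REGION_NAMES_END
EOF

# Show what was written.
cat manual_scorep.filter
```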
Apply filter file to determine expected reduction:
$ scorep-score -f initial_scorep.filter scorep-minimd-2x10-profile/profile.cubex
Estimated aggregate size of event trace: 3969kB
Estimated requirements for largest trace buffer (max_buf): 1985kB
Estimated memory requirements (SCOREP_TOTAL_MEMORY): 23MB
(hint: When tracing set SCOREP_TOTAL_MEMORY=23MB to avoid intermediate flushes
or reduce requirements using USR regions filters.)
flt    type  max_buf[B]     visits  time[s]  time[%]  time/visit[us]  region
-       ALL  82,096,353  6,311,108   112.52    100.0           17.83  ALL
-       USR  79,570,842  6,120,244    95.80     85.1           15.65  USR
-       OMP   1,752,280    134,650    13.25     11.8           98.41  OMP
-       COM     696,670     53,590     3.03      2.7           56.46  COM
-       MPI      76,520      2,622     0.44      0.4          169.19  MPI
-    SCOREP          41          2     0.00      0.0           97.98  SCOREP
*       ALL   2,031,667    152,876   111.53     99.1          729.57  ALL-FLT
+       FLT  80,064,686  6,158,232     0.99      0.9            0.16  FLT
-       OMP   1,752,280    134,650    13.25     11.8           98.41  OMP-FLT
*       USR     117,806      9,062    94.83     84.3        10464.72  USR-FLT
*       COM      85,020      6,540     3.01      2.7          459.98  COM-FLT
-       MPI      76,520      2,622     0.44      0.4          169.19  MPI-FLT
-    SCOREP          41          2     0.00      0.0           97.98  SCOREP-FLT
+       USR  36,642,554  2,817,976     0.46      0.4            0.16  Neighbor::coord2bin
+       USR  30,670,848  2,359,296     0.33      0.3            0.14  random
+       USR   5,115,604    393,154     0.06      0.1            0.15  Atom::pack_border
+       USR   5,115,396    393,154     0.05      0.0            0.14  Atom::unpack_border
+       USR   1,703,936    131,072     0.02      0.0            0.17  Atom::addatom
-       OMP     156,312     12,024     0.07      0.1            6.19  !$omp for @atom.cpp:170
-       OMP     156,312     12,024    10.24      9.1          851.86  !$omp implicit barrier @atom.cpp:175
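In this report, rows marked '+' in the first column are the ones the filter removes, rows marked '-' are kept, and '*' marks groups that are only partly filtered. As a small sketch, the filtered regions can be listed by selecting the '+' rows. The sample rows are copied from the report; the variable names are illustrative:

```shell
# Rows copied from the 'scorep-score -f' report.
report='- OMP 1,752,280 134,650 13.25 11.8 98.41 OMP-FLT
* USR 117,806 9,062 94.83 84.3 10464.72 USR-FLT
+ USR 36,642,554 2,817,976 0.46 0.4 0.16 Neighbor::coord2bin
+ USR 30,670,848 2,359,296 0.33 0.3 0.14 random
+ USR 5,115,604 393,154 0.06 0.1 0.15 Atom::pack_border
+ USR 5,115,396 393,154 0.05 0.0 0.14 Atom::unpack_border
+ USR 1,703,936 131,072 0.02 0.0 0.17 Atom::addatom
- OMP 156,312 12,024 0.07 0.1 6.19 !$omp for @atom.cpp:170'

# Print the region name (last column) of every row marked '+'.
filtered=$(printf '%s\n' "$report" | awk '$1 == "+" { print $NF }')
printf '%s\n' "$filtered"
```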
Apply filter file to measurement and re-run MiniMD:
$ export SCOREP_EXPERIMENT_DIRECTORY=scorep-minimd-2x10-profile+filter
$ export SCOREP_FILTERING_FILE=initial_scorep.filter
$ OMP_NUM_THREADS=10 srun -n 2 -c 10 ../miniMD-ICC.scorep -t 10 --half_neigh 1
Re-examine effect of filter file:
$ scorep-score scorep-minimd-2x10-profile+filter/profile.cubex
Estimated aggregate size of event trace: 3969kB
Estimated requirements for largest trace buffer (max_buf): 1985kB
Estimated memory requirements (SCOREP_TOTAL_MEMORY): 23MB
(hint: When tracing set SCOREP_TOTAL_MEMORY=23MB to avoid intermediate flushes
or reduce requirements using USR regions filters.)
flt    type  max_buf[B]     visits  time[s]  time[%]  time/visit[us]  region
        ALL   2,031,667    152,876   115.94    100.0          758.39  ALL
        OMP   1,752,280    134,650    15.33     13.2          113.86  OMP
        USR     117,806      9,062    97.80     84.4        10792.11  USR
        COM      85,020      6,540     2.12      1.8          324.89  COM
        MPI      76,520      2,622     0.69      0.6          261.65  MPI
     SCOREP          41          2     0.00      0.0          118.20  SCOREP
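To put the effect of the filter into one number, the max_buf values of the ALL rows from the unfiltered and the filtered profile can be compared. A minimal sketch with the values copied from the two reports (the variable names are illustrative):

```shell
max_buf_unfiltered=82096353   # ALL row of the first profile
max_buf_filtered=2031667      # ALL row of the filtered profile

# Floating-point division via awk, rounded to one decimal place.
factor=$(awk -v a="$max_buf_unfiltered" -v b="$max_buf_filtered" \
    'BEGIN { printf "%.1f", a / b }')

echo "max_buf shrank by a factor of about $factor"
```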
Enable trace file collection:
$ export SCOREP_EXPERIMENT_DIRECTORY=scorep-minimd-2x10-tracing
$ export SCOREP_ENABLE_TRACING=true
$ export SCOREP_TOTAL_MEMORY=23MB
$ OMP_NUM_THREADS=10 srun -n 2 -c 10 ../miniMD-ICC.scorep -t 10 --half_neigh 1
Examine experiment result:
$ ls -la scorep-minimd-2x10-tracing
Open the trace in Vampir (on login nodes or cshpc):
$ vampir scorep-minimd-2x10-tracing/traces.otf2
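Before starting Vampir it can be worth checking that the tracing run actually produced the traces.otf2 file used in the command above. A minimal sketch; the helper name has_trace is hypothetical, and the default directory is the one set in this walkthrough:

```shell
# Succeed if the given experiment directory contains a traces.otf2 file.
has_trace() {
    [ -f "$1/traces.otf2" ]
}

# Fall back to the walkthrough's directory name if the variable is unset.
exp_dir=${SCOREP_EXPERIMENT_DIRECTORY:-scorep-minimd-2x10-tracing}
if has_trace "$exp_dir"; then
    echo "trace found: $exp_dir/traces.otf2"
else
    echo "no traces.otf2 in $exp_dir (was SCOREP_ENABLE_TRACING=true exported?)"
fi
```

If the file is missing, the most likely causes in this walkthrough are a tracing variable that was not exported or a run that wrote into a different experiment directory.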
Last modified: Thursday, 29 June 2023, 1:54 PM