Solution: Load imbalance: SMxV
Prepare code:
$ source ~/Tools/env.sh
$ cd ~/Tools/SMXV
$ unset ${!SCOREP_*}
Build code:
$ make smxv-omp
Exercise 1: Run benchmark with number of threads between 1 and 16:
$ for t in {1..16}; do OMP_NUM_THREADS=$t ./smxv-omp yax_large.bin; done | nl
1 Runtime: 8.506669 sec
2 Runtime: 5.182805 sec
3 Runtime: 4.042448 sec
4 Runtime: 3.167305 sec
5 Runtime: 2.837955 sec
6 Runtime: 2.741896 sec
7 Runtime: 2.482917 sec
8 Runtime: 2.316289 sec
9 Runtime: 2.110901 sec
10 Runtime: 1.974172 sec
11 Runtime: 1.849776 sec
12 Runtime: 1.802529 sec
13 Runtime: 1.732268 sec
14 Runtime: 1.617819 sec
15 Runtime: 1.604507 sec
16 Runtime: 1.564077 sec
Ideal scaling would give about 0.53 sec with 16 threads (8.506669 sec / 16). The measured 1.564077 sec corresponds to only ~34 % parallel efficiency with 16 threads. Largest number of threads with >80 % parallel efficiency: 2 (82 %)
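The efficiency figures can be reproduced directly from the benchmark output. The following is a minimal sketch, not part of the original exercise: the function name efficiency is hypothetical, and it assumes that line n of the input holds the runtime measured with n threads, in the "Runtime: <seconds> sec" format shown above. Parallel efficiency is computed as T(1) / (n * T(n)).

```shell
# Hypothetical helper (not part of the SMXV sources).
# Reads "Runtime: <seconds> sec" lines from stdin, where the n-th matching
# line is assumed to be the run with n threads, and prints the speedup and
# the parallel efficiency T(1) / (n * T(n)) for each thread count.
efficiency() {
  awk '/Runtime:/ {
         n++                           # n-th runtime line = run with n threads
         t = $(NF-1)                   # seconds: second-to-last field, so "nl" output works too
         if (n == 1) t1 = t            # single-thread runtime is the baseline
         printf "%2d %9.6f sec  speedup %5.2f  efficiency %3.0f %%\n", n, t, t1 / t, 100 * t1 / (n * t)
       }'
}

# Small demonstration using the first two runtimes from the listing above
printf 'Runtime: 8.506669 sec\nRuntime: 5.182805 sec\n' | efficiency
```

To apply it to a full run, pipe the benchmark loop into the function in place of nl, for example: for t in {1..16}; do OMP_NUM_THREADS=$t ./smxv-omp yax_large.bin; done | efficiency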
Exercise 3: Validate effect of optimization:
$ for t in {1..16}; do OMP_NUM_THREADS=$t OMP_SCHEDULE=dynamic,512 ./smxv-omp yax_large.bin; done | nl
1 Runtime: 8.512925 sec
2 Runtime: 4.504325 sec
3 Runtime: 3.240145 sec
4 Runtime: 2.416859 sec
5 Runtime: 2.073413 sec
6 Runtime: 2.006542 sec
7 Runtime: 1.659394 sec
8 Runtime: 1.563580 sec
9 Runtime: 1.470130 sec
10 Runtime: 1.407530 sec
11 Runtime: 1.332805 sec
12 Runtime: 1.316528 sec
13 Runtime: 1.267057 sec
14 Runtime: 1.259499 sec
15 Runtime: 1.250060 sec
16 Runtime: 1.228585 sec
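With dynamic scheduling (chunk size 512), parallel efficiency rises to ~43 % with 16 threads (8.512925 sec / (16 × 1.228585 sec)). Largest number of threads with >80 % parallel efficiency: 5 (82 %)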
Last modified: Thursday, 20 June 2024, 8:43 PM