1. Slow computing. Assume that a parallel program behaves according to Amdahl's Law plus communication overhead (serial fraction s, communication time c(N) = kN). We compare the scalability of this program on two machines. Both have the same communication network, but machine 1 executes code three times faster than machine 2.
    (a) Prove that the slow machine "shows better speedup" than the fast machine.
    (b) Does it matter?
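    Hint for (a): with normalized serial runtime t_s, the model gives the speedup S(N) = t_s / (t_s·(s + (1−s)/N) + kN); the communication term kN does not shrink on the faster machine. A minimal numeric sketch of this model follows — the values s = 0.05 and k = 0.005 are hypothetical, chosen only for illustration:

    ```c
    #include <stdio.h>

    /* Speedup model: normalized serial runtime t_s, serial fraction s,
     * communication overhead c(N) = k*N (independent of CPU speed):
     *   S(N) = t_s / (t_s*(s + (1-s)/N) + k*N)                        */
    static double speedup(double t_s, double s, double k, int N) {
        return t_s / (t_s * (s + (1.0 - s) / N) + k * N);
    }

    int main(void) {
        const double s = 0.05, k = 0.005;  /* hypothetical parameters */
        for (int N = 1; N <= 64; N *= 2) {
            double S_slow = speedup(1.0,       s, k, N); /* slow machine  */
            double S_fast = speedup(1.0 / 3.0, s, k, N); /* 3x faster one */
            printf("N=%2d  slow: %6.2f  fast: %6.2f\n", N, S_slow, S_fast);
        }
        return 0;
    }
    ```

    For the faster machine the fixed communication time kN weighs three times as heavily relative to the (shorter) compute time, which is why its speedup curve falls below the slow machine's for N > 1.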

2. A performance model. On slide 22 of Lecture 3 we analyzed a simple loop:

    #pragma omp parallel for
    for (i = 0; i < n; ++i)
        a[i] = a[i] + s * c[i];

    In contrast to our analysis in the lecture, in reality the OpenMP parallelization does incur some overhead. In this case it amounts to roughly 2000 CPU cycles (with 8 cores at a clock speed of 3 GHz).

    Make a model of the expected performance in Gflop/s of this loop with respect to n, the loop length, on an 8-core CPU with a peak performance of 192 Gflop/s and a memory bandwidth of 40 Gbyte/s. You can assume that all the arrays (even if they are short) are in main memory. At which n do we get half of the asymptotic (i.e., large-n) performance?
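    As a starting point, here is a sketch of such a model in C. It assumes (this is not stated on the sheet) 2 flops and 24 bytes of main-memory traffic per iteration — a[] read and written, c[] read, with no extra write-allocate transfer since a[] is loaded anyway — so the loop is memory-bound and the asymptotic performance is set by bandwidth, while the parallelization overhead determines at which n half of it is reached:

    ```c
    #include <stdio.h>

    /* Assumed per-iteration cost (not from the slides):
     * 2 flops (add + multiply), 24 bytes of memory traffic.          */
    #define BW     40e9           /* memory bandwidth [bytes/s]       */
    #define T_OVH  (2000.0 / 3e9) /* OpenMP overhead: 2000 cy @ 3 GHz */
    #define BYTES  24.0           /* memory traffic per iteration     */
    #define FLOPS  2.0            /* flops per iteration              */

    /* model: runtime and performance as functions of loop length n  */
    static double model_time(double n)   { return T_OVH + n * BYTES / BW; }
    static double model_gflops(double n) { return FLOPS * n / model_time(n) / 1e9; }

    int main(void) {
        double p_inf  = FLOPS / BYTES * BW / 1e9; /* asymptotic Gflop/s  */
        double n_half = T_OVH * BW / BYTES;       /* overhead == streaming */
        printf("asymptotic: %.2f Gflop/s (memory-bound, far below peak)\n", p_inf);
        printf("half of asymptotic performance at n ~ %.0f\n", n_half);
        for (double n = 1e2; n <= 1e6; n *= 10)
            printf("n=%8.0f  P=%.2f Gflop/s\n", n, model_gflops(n));
        return 0;
    }
    ```

    Under these assumptions the 192 Gflop/s peak never matters: the bandwidth ceiling lies far below it, and half of the asymptotic performance is reached exactly where the overhead time equals the data-streaming time.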

Last modified: Wednesday, 4 November 2020, 4:23 PM