Assignment 2
- Slow computing. Assume that a parallel program behaves in accordance with Amdahl's Law plus communication overhead (serial fraction s, c(N)=kN). We compare the scalability of this program on two machines.
Both have the same communication network, but machine 1 is three times faster in terms of code execution than machine 2.
(a) Prove that the slow machine "shows better speedup" than the fast machine.
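One possible way to set up part (a), given as a sketch rather than the intended solution. It assumes that the fast machine's serial runtime is normalized to 1, that the communication term kN is measured in those same time units and is identical on both machines, and that each speedup is taken relative to the serial runtime of the same machine. The subscripts "fast" and "slow" are labels introduced here for machines 1 and 2.

```latex
\begin{align}
T_\mathrm{fast}(N) &= s + \frac{1-s}{N} + kN,
  & S_\mathrm{fast}(N) &= \frac{1}{s + \frac{1-s}{N} + kN}, \\
T_\mathrm{slow}(N) &= 3\left(s + \frac{1-s}{N}\right) + kN,
  & S_\mathrm{slow}(N) &= \frac{3}{T_\mathrm{slow}(N)}
                        = \frac{1}{s + \frac{1-s}{N} + \frac{kN}{3}}.
\end{align}
% For k > 0 the communication term in the denominator is smaller on the
% slow machine (kN/3 < kN), hence S_slow(N) > S_fast(N) for all N >= 1.
```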
(b) Does it matter?
- A performance model. On slide 22 of Lecture 3 we analyzed a simple loop:
#pragma omp parallel for
for(i=0; i<n; ++i) a[i] = a[i] + s * c[i];
As opposed to our analysis in the lecture, in reality the OpenMP parallelization does have some overhead. In this case it amounts to roughly 2000 CPU cycles (with 8 cores at a clock speed of 3 GHz).
Make a model of the expected performance in Gflop/s of this loop with respect to n, the loop length, on an 8-core CPU with a peak performance of 192 Gflop/s and a memory bandwidth of 40 Gbyte/s. You can assume that all the arrays (even if they are short) are in main memory. At which n do we get half of the asymptotic (i.e., large-n) performance?
Last modified: Wednesday, 4 November 2020, 4:23 PM