We are asked to find the CPU cycles per 16 iterations (one cache line length), but shouldn't it be 8 iterations? One cache line = 64 bytes = 8 double-precision data points. Is there something I am missing here? Thanks in advance!
The data is single precision.
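To spell out the arithmetic behind this answer, here is a minimal sketch. It assumes the 64-byte cache line from the question, the standard IEEE 754 sizes (4 bytes for single precision, 8 bytes for double precision), and a unit-stride loop that consumes one element per iteration. The helper name `elements_per_cache_line` is hypothetical and only used for illustration.

```python
import struct

CACHE_LINE_BYTES = 64  # assumption: cache line length taken from the question


def elements_per_cache_line(fmt: str, cache_line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Return how many elements of the given struct format fit in one cache line."""
    # "<f" and "<d" use the standard sizes: 4 bytes (single) and 8 bytes (double).
    return cache_line_bytes // struct.calcsize(fmt)


single = elements_per_cache_line("<f")  # single precision
double = elements_per_cache_line("<d")  # double precision
print(single, double)  # prints "16 8"
```

So with single-precision data, one cache line covers 16 iterations; the count of 8 would only apply to double-precision data.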
Oops, I missed that. Thank you, Dr. Hager.