Hi,
I am confused about how performance is calculated. In Assignment 1, Question 2, performance was calculated using the bottleneck latency of the add-mult instructions. However, in slide 3b, page 27, performance was calculated using the bottleneck throughput of the store instruction. Is performance independent of latency on hardware that has SIMD vectorization? Likewise, is performance independent of latency on hardware that does not have SIMD vectorization?
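For concreteness, here is a rough sketch of the two kinds of calculation I am contrasting. All numbers and names below are hypothetical placeholders, not the values from the assignment or the slides, and the sketch assumes the usual textbook model: a chain of dependent operations costs one full latency per operation, while independent operations are limited only by how many can be issued per cycle.

```python
def latency_bound_cycles(num_ops, latency_cycles):
    """Cycles needed when every operation waits for the previous result
    (a dependent chain), so each one pays the full instruction latency."""
    return num_ops * latency_cycles


def throughput_bound_cycles(num_ops, ops_per_cycle):
    """Cycles needed when operations are independent, so the unit can
    start ops_per_cycle new operations every cycle."""
    return num_ops / ops_per_cycle


# Hypothetical example values (assumptions, not taken from the course material).
NUM_OPS = 1000
ADD_MULT_LATENCY = 4      # cycles per dependent add-mult step (assumed)
STORE_THROUGHPUT = 1      # stores issued per cycle (assumed)

print("latency-bound cycles:   ", latency_bound_cycles(NUM_OPS, ADD_MULT_LATENCY))
print("throughput-bound cycles:", throughput_bound_cycles(NUM_OPS, STORE_THROUGHPUT))
```

What I am unsure about is which of these two formulas is the right one to use in each case, and whether SIMD vectorization changes that choice.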
Best regards.