Assignment 3 Task 2 a


by Mohammadmoein Moradi
Number of replies: 2

Dear PTFS team,

In the solution to this part, an ADD latency of 4 cycles was assumed. But when calculating the maximum performance, shouldn't we ignore the latency and assume full pipelining for large N?

I have attached my solution and would appreciate it if you could check my justification.



In reply to Mohammadmoein Moradi

Re: Assignment 3 Task 2 a

by Georg Hager

It was assumed that the compiler applies no unrolling, so the full ADD latency applies: the sum is accumulated into the same register with every ADD, so each ADD has to wait for the result of the previous one and the ADDs cannot be pipelined.
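
As a rough illustration of this dependency argument, the following Python sketch models a pipelined ADD unit. It is a toy model under stated assumptions, not the assignment's reference solution: the function name `reduction_cycles` and its parameters are hypothetical, the unit is assumed to accept one new ADD per cycle in program order, and the final combination of the partial sums is not counted.

```python
def reduction_cycles(n, latency=4, accumulators=1):
    """Cycles until the last of n ADDs has delivered its result.

    Assumptions (not from the thread): one ADD can be issued per cycle,
    in order, and an ADD into accumulator k can only start once the
    previous ADD into the same accumulator has finished, `latency`
    cycles after it was issued.
    """
    ready = [0] * accumulators  # cycle at which each partial sum is available
    next_issue = 0              # earliest cycle the ADD unit accepts a new ADD
    for i in range(n):
        k = i % accumulators    # round-robin over the partial sums
        start = max(next_issue, ready[k])  # wait for issue slot AND operand
        ready[k] = start + latency
        next_issue = start + 1
    return max(ready)


if __name__ == "__main__":
    for acc in (1, 2, 4):
        print(acc, reduction_cycles(1000, latency=4, accumulators=acc))
```

With a single accumulator (no unrolling), every ADD waits for its predecessor, so 1000 ADDs take 4000 cycles in this model, i.e. the full latency per ADD. With four independent partial sums, as a compiler could generate by unrolling four ways, the same 1000 ADDs take 1003 cycles, which approaches one ADD per cycle for large N.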