Hands-on #1: Dot product manual throughput analysis
Perform a manual throughput analysis of a dot product kernel on ICX. The code is compiled using AVX2 and uses 4x unrolling. Move the micro-ops on the right into the appropriate execution slots on the left. Ignore instruction dependencies.
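For orientation, the source loop behind such a kernel might look like the sketch below. The exercise does not show the source code, so the function name `dot4`, the use of double precision, and the requirement that `n` be a multiple of 4 are assumptions made for illustration only. In the compiled AVX2 code the unrolling applies to vector instructions rather than scalar ones, but the structure is the same: two loads and one multiply-add per element, with independent partial sums.

```c
#include <stddef.h>

/* Illustrative sketch only: dot product with 4-way unrolling.
   Assumes n is a multiple of 4 (no remainder loop).
   Four independent partial sums keep the additions from
   forming a single dependency chain. */
double dot4(const double *a, const double *b, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    for (size_t i = 0; i < n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    /* Combine the partial sums once, after the loop. */
    return (s0 + s1) + (s2 + s3);
}
```

Each unrolled iteration contributes loads, arithmetic, and a small amount of loop overhead (index update, compare, branch); these are the micro-ops to be distributed over the ports.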
Make sure that each port is occupied for the minimum possible number of cycles (see the STREAM Triad example in the lecture).
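The balancing rule can be sketched as a small calculation. This is a simplified model, not a description of the actual scheduler: it assumes that a group of identical micro-ops can run only on a fixed set of ports, that each port accepts one micro-op per cycle, and that no other micro-ops compete for those ports. The helper name `min_port_cycles` is hypothetical.

```c
/* Illustrative sketch only: fewest cycles for which the busiest port
   is occupied when `uops` identical micro-ops are spread as evenly as
   possible over `ports` eligible ports (ports must be > 0). */
unsigned min_port_cycles(unsigned uops, unsigned ports)
{
    return (uops + ports - 1) / ports;  /* ceiling of uops / ports */
}
```

For example, six micro-ops spread over three eligible ports occupy each port for two cycles, while seven micro-ops force one port to be busy for a third cycle. Under this model, the predicted throughput of the loop body is set by the port with the longest occupancy.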