Hands-on #1: Dot product manual throughput analysis
Do a manual throughput analysis of a dot product kernel on Sapphire Rapids. The code is compiled with AVX2 and uses 4x unrolling. Move the micro-ops on the right into the appropriate execution slots on the left; ignore instruction dependencies.
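For orientation, the kernel under analysis might look like the following source-level loop. This is only a sketch: the function name `dot_product`, the use of double precision, and the remarks about the generated instructions are assumptions, not taken from the exercise.

```c
#include <stddef.h>

/* Scalar source form of a dot product (hypothetical name and signature).
 * Assumption: with AVX2 enabled and 4x unrolling, a compiler would
 * typically process four 256-bit vectors per unrolled loop body,
 * i.e. 16 doubles, using vector loads from a[] and b[] feeding
 * multiply-add work into separate accumulators. The exact instruction
 * mix depends on the compiler and its flags. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];  /* two loads, one multiply, one add per element */
    }
    return sum;
}
```

The micro-ops to place in the exercise correspond to the loads and arithmetic of one unrolled loop body, not of a single scalar iteration.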
Make sure that each port is occupied for the minimum possible number of cycles (see the STREAM Triad example in the lecture).
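The balancing rule can be sketched in a few lines of C. This is a simplified model, not the method from the lecture: it assumes every class of micro-ops has its own set of equivalent ports, that these sets do not overlap, and that each port accepts one micro-op per cycle. The names `uop_class` and `port_bound_cycles` are hypothetical, and the port counts must be taken from the port model you are using.

```c
#include <stddef.h>

/* One class of micro-ops that can issue on any of `nports` equivalent
 * ports (hypothetical type; nports is assumed to be greater than zero). */
typedef struct {
    const char *name;
    unsigned uops;    /* micro-ops of this class per loop body */
    unsigned nports;  /* ports able to execute this class */
} uop_class;

/* Spreading each class evenly over its ports occupies every port of
 * that group for uops / nports cycles. The busiest group sets the
 * throughput bound in cycles per loop body. */
double port_bound_cycles(const uop_class *classes, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double cycles = (double)classes[i].uops / classes[i].nports;
        if (cycles > worst) {
            worst = cycles;
        }
    }
    return worst;
}
```

For example, with made-up numbers, a class of 6 micro-ops on 3 ports and a class of 3 micro-ops on 2 ports give occupancies of 2 and 1.5 cycles, so the bound is 2 cycles. When port sets overlap, this simple maximum no longer applies and the micro-ops have to be distributed by hand, as in the exercise.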