The analysis is very similar to the dense MVM case we covered in the lecture. If the x[] vector fits into a cache, the data traffic it causes to main memory is negligible, since it is read from memory once and then reused from cache in the remaining N-1 iterations of the outer loop. If it does not fit into a cache, it must be read again from memory in every iteration of the outer loop.
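The two cases can be put into numbers with a small traffic model. The following is only a rough sketch for the dense N x N case, not a measurement: the function names, the 8-byte element size, and the example cache size are assumptions made for illustration. It also assumes that the whole cache is available to x[] and ignores cache-line granularity and partial reuse.

```python
def x_traffic_bytes(n, cache_bytes, elem_bytes=8):
    """Modeled main-memory traffic (in bytes) caused by x[] in an n x n MVM.

    Idealized assumption: x[] either stays in cache completely or is
    evicted completely between two outer-loop iterations.
    """
    x_bytes = n * elem_bytes
    if x_bytes <= cache_bytes:
        # x[] fits: loaded once, reused from cache in the other n-1 iterations
        return x_bytes
    # x[] does not fit: reloaded from memory in each of the n outer iterations
    return n * x_bytes


def matrix_traffic_bytes(n, elem_bytes=8):
    """Traffic for the matrix itself: every element is read exactly once."""
    return n * n * elem_bytes


if __name__ == "__main__":
    cache = 32 * 1024  # hypothetical 32 KiB cache
    for n in (1000, 10000):
        x_t = x_traffic_bytes(n, cache)
        a_t = matrix_traffic_bytes(n)
        print(f"n={n}: x[] traffic {x_t} B, matrix traffic {a_t} B")
```

Under these assumptions, with n = 1000 the x[] vector (8000 bytes) fits and adds only 0.1% to the matrix traffic. With n = 10000 it no longer fits, and the modeled x[] traffic becomes as large as the matrix traffic itself.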