ccNUMA analysis

Perform a scaling run of the MFPCG code, first on 16 cores and then on all 32 cores of the node, filling the ccNUMA domains consecutively.

$ for i in 16 32; do likwid-perfctr -C E:N:${i} -g MEM_DP ./perf 2500 40000; done

Does the performance scale across the ccNUMA domains? If it doesn't, what could be the reason?
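
To put a number on "does it scale", one option is to compare the speedup between the two runs with the ideal speedup given by the core counts. The following is a minimal sketch: the helper name scaling_ratio is hypothetical, and the values in the example call are placeholders rather than measurements. Replace them with a performance metric taken from your own likwid-perfctr output, for instance the floating-point rate or the memory bandwidth that the MEM_DP group reports.

```shell
#!/bin/sh
# scaling_ratio PERF_A CORES_A PERF_B CORES_B  (hypothetical helper)
# Prints the measured speedup from run A to run B, the ideal speedup
# under linear scaling with core count, and the resulting efficiency.
scaling_ratio() {
    awk -v pa="$1" -v ca="$2" -v pb="$3" -v cb="$4" 'BEGIN {
        speedup = pb / pa    # measured speedup between the two runs
        ideal   = cb / ca    # speedup expected for linear scaling
        printf "speedup=%.2f ideal=%.2f efficiency=%.0f%%\n",
               speedup, ideal, 100 * speedup / ideal
    }'
}

# Placeholder values, NOT measurements: substitute your own numbers
# for the 16-core and the 32-core run (same metric and unit for both).
scaling_ratio 1000 16 1500 32
```

An efficiency close to 100% means the performance scales across the ccNUMA domains. A value near 50% means the second set of 16 cores adds almost nothing, which is the case the question asks you to explain.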

Last modified: Tuesday, 23 July 2024, 1:58 PM