Hands-on: ccNUMA (Part 1)
ccNUMA analysis
Perform a scaling run of the MFPCG code on 16 and then on all 32 cores of the node, filling the ccNUMA domains consecutively.
$ for i in 16 32; do likwid-perfctr -C E:N:${i} -g MEM_DP ./perf 2500 40000; done
Does the performance scale across the ccNUMA domains? If it doesn't, what could be the reason?
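To judge scaling quantitatively, compare the performance reported for the 32-core run with that of the 16-core run: ideal scaling would double it. The helper below is a minimal sketch, not part of the MFPCG code or of LIKWID. The function name `scaling_efficiency` is hypothetical, and it assumes you copy the two performance numbers by hand from the program output (any consistent unit works).

```shell
# Hypothetical helper: scaling_efficiency <perf_16_cores> <perf_32_cores>
# Prints the speedup from 16 to 32 cores and the parallel efficiency,
# where a speedup of 2 corresponds to 100% efficiency.
scaling_efficiency() {
    awk -v p16="$1" -v p32="$2" 'BEGIN {
        speedup = p32 / p16
        printf "speedup: %.2f  efficiency: %.0f%%\n", speedup, 100 * speedup / 2
    }'
}
```

Call it as `scaling_efficiency P16 P32` with your own measured values. An efficiency well below 100% indicates that the second half of the cores contributes little additional performance.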
Last modified: Tuesday, 23 July 2024, 1:58 PM