MPI+OpenMP: he-hy - Hello Hybrid! - pinning
Recap from previous exercises:
cd ~; cp -a ~xwwclabs/MPIX-HLRS . # copy the exercises
cd ~/MPIX-HLRS/he-hy # change into your he-hy directory
Contents:
job_*.sh # job-scripts to run the provided tools, 2 x job_*_exercise.sh
*.[c|f90] # various codes (hello world & tests) - NO need to look into these!
hawk/*.o* # hawk job output files
In the online course, he-hy is done in two parts:
first exercise = 1. + 2. + 3. + 4. (already done before)
second exercise = 5. + 6. + 7. (after the talk on pinning)
4. Recap from previous exercise: get to know the hardware
→ Find out about the hardware of compute nodes:
→ solution output = hawk/check-hw.o*
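The check-hw job script itself is not reproduced here. As a rough sketch of the kind of queries such a script may run, the following uses standard Linux tools to report sockets, cores, hardware threads and NUMA domains of the node it runs on. That lscpu and numactl are installed is an assumption, so both are guarded.

```shell
# Sketch: summarise the CPU layout of the node this runs on.
# Assumes a Linux node; lscpu and numactl may be missing, hence the guards.
logical_cpus=$(getconf _NPROCESSORS_ONLN)   # logical CPUs currently online
echo "logical CPUs online: $logical_cpus"

if command -v lscpu >/dev/null 2>&1; then
    # sockets, cores per socket, threads per core, NUMA nodes
    lscpu | grep -E '^(Socket|Core|Thread|NUMA node)' || true
fi

if command -v numactl >/dev/null 2>&1; then
    # NUMA domains with their CPUs and memory sizes
    numactl --hardware || true
fi
```

Run inside a batch job, this describes the compute node; run on the login node, it describes the login node, which may differ.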
5. Pure MPI: compile and run the MPI "Hello world!" program (pinning)
job_he-mpi_[default|ordered].sh, he-mpi.[c|f90], help_fortran_find_core_id.c
compile he-mpi - either C or Fortran - precompiled version = C:
C: mpicc -o he-mpi he-mpi.c
Fortran: gcc -c help_fortran_find_core_id.c
Fortran: mpif08 -o he-mpi he-mpi.f90 help_fortran_find_core_id.o
run he-mpi twice on login node with only 4 procs:
mpirun -n 4 ./he-mpi # unsorted
mpirun -n 4 ./he-mpi | sort -n # sorted
? Why is the output (most of the time) unsorted ? ==> here (he-mpi) you can use: ... | sort -n
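One way to see why the order varies: every MPI process runs and writes on its own, so lines reach the terminal in whatever order the processes happen to get there. The sketch below imitates this with four background shell processes instead of MPI ranks. The line format with a leading rank number is an assumption, chosen so that sort -n can order it; the real he-mpi output may look different.

```shell
# Sketch: four independent "ranks" (plain background processes, not MPI)
# each print one line after a different short delay.
ranks_out=$(
    for rank in 0 1 2 3; do
        # delay differs per rank, so arrival order is unrelated to rank order
        ( sleep "0.$(( (7 * rank + 3) % 4 ))"
          echo "$rank: hello from rank $rank" ) &
    done
    wait    # collect all four background processes before returning
)

echo "--- as the lines arrived ---"
printf '%s\n' "$ranks_out"

# sort -n orders the lines by the leading rank number
sorted=$(printf '%s\n' "$ranks_out" | sort -n)
echo "--- after sort -n ---"
printf '%s\n' "$sorted"
```

Sorting only works after all output has arrived, which is why it is done with a pipe after the run and not inside the program.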
submit he-mpi to a compute node (mpirun):
qsub -q R_mpix job_he-mpi_default.sh # hawk--> okayish
qsub -q R_mpix job_he-mpi_ordered.sh # hawk--> pinning is perfect
? Can you rely on the defaults for pinning ? ==> Always take care of & check correct pinning yourself !
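Besides the core_id that he-mpi prints, the operating system can be asked which cores a process is allowed to use. The following is a minimal sketch for Linux nodes that reads the affinity list from /proc; it is an addition for illustration and not part of the provided job scripts.

```shell
# Sketch: list the cores this shell and its child processes may run on.
# A well pinned process typically shows a short list (one core for pure MPI,
# as many cores as threads for hybrid) rather than all cores of the node.
if [ -r /proc/self/status ]; then
    allowed=$(awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status)
else
    allowed="unknown"    # no /proc file system on this platform
fi
echo "$(uname -n): allowed cores = $allowed"
```

For a process that is already running, taskset -cp <pid> reports the same list.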
6. MPI+OpenMP: :TODO: compile and run the Hybrid "Hello world!" program
job_he-hy_exercise.sh, he-hy.[c|f90], help_fortran_find_core_id.c
compile he-hy - either C or Fortran - precompiled version = C:
C: mpicc -fopenmp -o he-hy he-hy.c # -fopenmp for GCC (as in the Fortran line); Intel compilers use -qopenmp
Fortran: gcc -c help_fortran_find_core_id.c
Fortran: mpif08 -fopenmp -o he-hy he-hy.f90 help_fortran_find_core_id.o
run he-hy twice on login node with only 4 procs & 4 threads:
export OMP_NUM_THREADS=4
mpirun -n 4 ./he-hy # unsorted
mpirun -n 4 ./he-hy | sort -n # sorted
? Why is the output (most of the time) unsorted ? ==> here (he-hy) you can use: ... | sort -n
TODO:
→ Run he-hy on a compute node, i.e.: qsub -q R_mpix job_he-hy_exercise.sh
→ Oh no, this is not going to fly... --> we have to do better !
→ Look into: job_he-hy_exercise.sh
→ Do NOT do the pinning exercise yet; see 7. below.
? Can you rely on the defaults for pinning ? ==> Always take care of & check correct pinning yourself !
7. MPI+OpenMP: :TODO: how to do pinning
job_he-hy_[exercise|solution].sh, he-hy.[c|f90]
TODO (see below for info):
→ Do the pinning exercise in: job_he-hy_exercise.sh
→ one possible solution = job_he-hy_solution.sh
# hawk --> pinning of MPI procs & OMP threads done perfectly
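A common target for hybrid pinning is one block of neighbouring cores per MPI process, with one OpenMP thread per core. The arithmetic behind such a layout can be sketched as follows. The core count is a hypothetical value (take the real one from the check-hw output), and the assumption that consecutive core ids are neighbours must be checked against the hardware as well.

```shell
# Sketch: which cores each MPI rank would get with block-wise pinning.
cores_per_node=16      # hypothetical value, NOT the real node size
threads_per_rank=4     # = OMP_NUM_THREADS
ranks_per_node=$(( cores_per_node / threads_per_rank ))

layout=$(
    rank=0
    while [ "$rank" -lt "$ranks_per_node" ]; do
        first=$(( rank * threads_per_rank ))      # first core of this rank
        last=$(( first + threads_per_rank - 1 ))  # last core of this rank
        echo "rank $rank -> cores $first-$last"
        rank=$(( rank + 1 ))
    done
)
printf '%s\n' "$layout"
```

Comparing such a table with the core_id values printed by he-hy shows quickly whether processes overlap or threads share a core.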
PINNING: (→ see also slides ##-##)
Pinning depends on the interaction between these three:
  batch system = PBSPro
  MPI library  = MPT
  startup      = mpirun
Always check your pinning !
→ job_he-hy...sh (he-hy.[c|f90] prints core_id)
→ print core_id in your application (see he-hy.*)
→ turn on debugging info & verbose output in job
→ monitor your job → login to nodes: top (press 1 for per-core load, q to quit)
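The checks above can be sketched with the standard OpenMP environment variables and ps. This is an illustration under assumptions: whether the MPI library and its startup command honour or override these OpenMP settings depends on the combination listed under "Pinning depends on", and the provided solution script may use a different mechanism. show_thread_cores is a hypothetical helper name.

```shell
# Sketch: standard OpenMP settings for thread placement and verbose output.
export OMP_NUM_THREADS=4       # threads per MPI process
export OMP_PLACES=cores        # one place per physical core
export OMP_PROC_BIND=close     # keep the threads of a process next to each other
export OMP_DISPLAY_ENV=true    # runtime prints its settings at program start

# Hypothetical helper: show on which core (PSR column) each thread of a
# process last ran. Needs a ps that supports -L (procps on Linux).
show_thread_cores() {
    ps -L -o pid,tid,psr,comm -p "$1" 2>/dev/null \
        || echo "ps -L is not supported on this system"
}

show_thread_cores $$           # demo on the current shell
```

On a compute node, calling show_thread_cores with the process id of a running he-hy process gives a snapshot that can be compared with the core_id values the program prints.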