Hands-On: logging in, editing, compiling, and starting a job at LRZ
The CoolMUC-4 Linux Cluster at LRZ
Exercises will be done on the new CoolMUC-4 Linux cluster at LRZ.
See https://doku.lrz.de/coolmuc-4-1082337877.html for further information.
Logging in to the CoolMUC-4 cluster
Login under Linux:
- Open xterm
- ssh -Y cool.hpc.lrz.de -l username
- Use the username, password, and second factor (2FA) provided by LRZ staff.
Login under Windows:
- We recommend using MobaXterm https://mobaxterm.mobatek.net/ (comes with built-in X11 support).
- Open a new session via Session -> SSH.
- Specify remote host: cool.hpc.lrz.de and username as provided by LRZ staff.
- Enter the password and second factor (2FA), provided by LRZ staff, into the opened console.
- As an alternative to MobaXterm you can also use the terminal software PuTTY (https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html) together with the Xming X11 server for Windows: https://sourceforge.net/projects/xming/ .
Login under Mac:
- Install X11 support for MacOS XQuartz: https://www.xquartz.org/ .
- Open Terminal.
- ssh -Y cool.hpc.lrz.de -l username
- Use the username, password, and second factor (2FA) provided by LRZ staff.
See https://doku.lrz.de/access-and-login-to-the-linux-cluster-10745974.html for further information on how to access the Linux cluster.
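A minimal login session under Linux or Mac could look like this (replace "username" with the account provided by LRZ staff; the echo is just a quick way to check that X11 forwarding via -Y is active):
$ ssh -Y cool.hpc.lrz.de -l username
$ echo $DISPLAY    # a non-empty value (e.g., localhost:10.0) means X11 forwarding works
$ exit             # log out again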
Copying example code to your home directory
You will mostly be working on example code that we provide. The file(s) for the examples are available in a central directory. In order to copy a specific code, e.g., "HELLO", you just type:
$ cp -a /lrz/sys/courses/PPHPS25/HELLO/ ~
This copies the whole folder, including any subfolders, into your $HOME.
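For example, to fetch the "HELLO" example and verify that it arrived (the listed contents depend on the example):
$ cp -a /lrz/sys/courses/PPHPS25/HELLO/ ~
$ ls ~/HELLO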
Editing and compiling code
Standard editors like vim and emacs are available on the frontend nodes. It is good practice to have at least two terminal windows open: one with a connection to the frontend to do editing, compiling, and other interactive stuff, and one for submitting a job (see below).
Most software, including compilers and some libraries, is available via the "modules" system. The "module load <x>" command loads software module "<x>", i.e., it sets some environment variables (e.g., PATH) so you can use the software.
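A few module commands that are useful for exploring what is installed (the module name passed to "module show" is just an example):
$ module avail                 # list all available modules
$ module list                  # show the modules currently loaded
$ module show intel-toolkit    # display what loading a module would set (PATH, libraries, ...)
$ module unload intel-toolkit  # unload a module again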
On CoolMUC-4, the Intel compiler, MPI, and MKL modules are no longer loaded by default. Users can activate the Intel oneAPI toolkit environment by inserting the following command into their corresponding SLURM scripts:
$ module load intel-toolkit
After that, you have a C (icx), C++ (icpx), and a Fortran (ifx) compiler at your disposal. In addition, an Intel MPI module is loaded. The example codes you will be working with are quite simple. The Intel compiler accepts the same options (well, mostly) as the GCC (here shown for a C code):
$ icx -Ofast my_source.c
In order to make the compiler recognize OpenMP directives, you have to add "-qopenmp" to the command line. For MPI code you have to use one of the provided wrapper scripts (mpiicx, mpiicpx, mpiifx). They behave like normal compilers except that they "know" where to find the MPI headers and libraries.
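For illustration, assuming hypothetical source files hello_omp.c (OpenMP) and hello_mpi.c (MPI), the compile lines could look like this:
$ icx -Ofast -qopenmp hello_omp.c -o hello_omp    # OpenMP code: add -qopenmp
$ mpiicx -Ofast hello_mpi.c -o hello_mpi          # MPI code: use the wrapper instead of icx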
It is good practice to do all editing and compiling on a frontend node. Not all tools and libraries you need may be available on the compute nodes.
Running a batch job
In order to run something on the cluster, you have to start a "batch job," i.e., you specify some amount of resources and the system decides when these resources will be available to you. During the course, 10 nodes (1120 cores) of CoolMUC-4 are reserved for the course participants.
To run jobs on CoolMUC-4 during the course you have to use the SLURM batch scheduler, see
- https://doku.lrz.de/job-processing-on-the-linux-cluster-10745970.html
- https://doku.lrz.de/running-parallel-jobs-on-the-linux-cluster-11484078.html
Sample OpenMP job file for allocating 1 node (112 physical cores) on CoolMUC-4 and running 112 OpenMP threads:
#!/bin/bash
#SBATCH -D ./
#SBATCH -o ./%x.%j.%N.out
#SBATCH -e ./%x.%j.%N.err
#SBATCH -J cmtest
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_tiny
#SBATCH --qos=cm4_tiny
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=112
#SBATCH --get-user-env
#SBATCH --export=NONE
##SBATCH --reservation=hppb1w24
#SBATCH --time=00:10:00
module load slurm_setup
module load intel-toolkit
#for OpenMP
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./openmp-program
Sample MPI job file for allocating 1 node (112 physical cores) on CoolMUC-4 and running 112 MPI tasks:
#!/bin/bash
#SBATCH -D ./
#SBATCH -o ./%x.%j.%N.out
#SBATCH -e ./%x.%j.%N.err
#SBATCH -J cmtest
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_tiny
#SBATCH --qos=cm4_tiny
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=112
#SBATCH --cpus-per-task=1
#SBATCH --get-user-env
#SBATCH --export=NONE
##SBATCH --reservation=hppb1w24
#SBATCH --time=00:10:00
module load slurm_setup
module load intel-toolkit
#for MPI
mpiexec -n $SLURM_NTASKS ./mpi-program
Please do not use "#SBATCH --mail-type=…", as this could be considered a denial-of-service attack on the LRZ mail hub!
- Submit a job
sbatch --reservation=hppb1w24 job.sh
- List own jobs
squeue -M cm4
- Cancel jobs
scancel -M cm4 jobid
- Information about the cm4 cluster segment
sinfo -M cm4
- Show reservation
scontrol -M cm4 show reservation
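Putting it all together, a typical course workflow could look like this ("job.sh" is one of the sample job files above; the output file name follows the %x.%j.%N pattern from the script, so the actual job ID and node name will differ):
$ sbatch --reservation=hppb1w24 job.sh   # submit into the course reservation; sbatch prints the job ID
$ squeue -M cm4                          # watch the job (PD = pending, R = running)
$ cat cmtest.*.out                       # inspect the output once the job has finished
$ scancel -M cm4 <jobid>                 # cancel the job if something went wrong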