# Hands-on #2: The divide instruction

First you need to start an interactive job on the Fritz cluster, which you should have done as "pre-homework" already.
Here's a quick reminder.
After logging into the frontend, you request one cluster node:

```bash
$ salloc -C hwperf -p singlenode --time=01:00:00
```

We have reservations in place so it should be no problem getting a node during normal working hours.

Now you're good to go.
Remember that it's a good idea to keep two shells open: one for running jobs on a cluster node (see above) and a second one for editing, compiling, etc. on the frontend.
The home directory is shared among all frontends and compute nodes.
The number of editors available on the compute nodes is limited, and the environment for compiling and linking may not be complete there.

------------------------------------------------------------------------------------------

We want to calculate the value of π by numerically integrating the function f(x)=4/(1+x^2) from 0 to 1:

```c
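int SLICES = 2000000000;
double delta_x = 1.0/SLICES;
double x, sum = 0.0;                    // accumulator must start at zero
for (int i=0; i < SLICES; i++) {
      x = (i+0.5)*delta_x;              // midpoint of slice i
      sum += (4.0 / (1.0 + x * x));     // one divide per iteration
}
double Pi = sum * delta_x;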
```
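As background for the loop above, here is a short worked derivation of why the integral yields π and which sum the loop evaluates. The symbol N is introduced here only for illustration; it corresponds to `SLICES` in the code, and Δx corresponds to `delta_x`.

$$
\int_0^1 \frac{4}{1+x^2}\,dx = 4\arctan(x)\Big|_0^1 = 4\cdot\frac{\pi}{4} = \pi
$$

$$
\pi \approx \Delta x \sum_{i=0}^{N-1} \frac{4}{1+x_i^2}, \qquad x_i = \left(i+\tfrac{1}{2}\right)\Delta x, \qquad \Delta x = \frac{1}{N}
$$

Each term of the sum requires exactly one floating-point division, which is why this loop is a convenient test case for the divide instruction.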

You can find example programs in C and Fortran in the DIV folder.

Compile the code with the Intel compiler for AVX-512 vectorization:
```bash
$ module load intel
$ icx -std=c99 -Ofast -xHOST -qopt-zmm-usage=high div.c -o div.exe
```
or:
```bash
$ ifx -Ofast -xHOST -qopt-zmm-usage=high div.f90 -o div.exe
```
These options compile the code with the largest possible SIMD width on this CPU (512 bits).

Run the code with a fixed clock frequency of 2 GHz (the `--cpu-freq` values are given in kHz):
```bash
$ srun --cpu-freq=2000000-2000000:performance ./div.exe
```
Does it produce a decent approximation of π?
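
One way to judge the quality of the result is to compare it with a reference value of π. The following is a minimal sketch, not the code from the DIV folder: the names `midpoint_pi`, `pi_abs_error`, and `PI_REF` are hypothetical and chosen for illustration only. The reference constant is spelled out because `M_PI` is not guaranteed to be defined when compiling with `-std=c99`.

```c
#include <math.h>

/* Reference value of pi (assumption: written out by hand, since M_PI
   is a POSIX extension and may be missing under -std=c99). */
#define PI_REF 3.14159265358979323846

/* Midpoint-rule approximation of the integral of 4/(1+x^2) over [0,1].
   Hypothetical helper; mirrors the loop shown earlier in this hands-on. */
double midpoint_pi(long slices)
{
    double delta_x = 1.0 / (double)slices;
    double sum = 0.0;
    for (long i = 0; i < slices; i++) {
        double x = ((double)i + 0.5) * delta_x;  /* midpoint of slice i */
        sum += 4.0 / (1.0 + x * x);              /* one divide per slice */
    }
    return sum * delta_x;
}

/* Absolute deviation of the approximation from the reference value. */
double pi_abs_error(long slices)
{
    return fabs(midpoint_pi(slices) - PI_REF);
}
```

Calling `pi_abs_error` with an increasing number of slices should show the deviation shrinking, until floating-point rounding in the summation starts to dominate.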

