Exercise: Parallelization of a raytracer code (optional)
The RAY folder contains a serial ray tracer (in a Fortran 90 and a C version) that renders a picture and writes it to a file called "result.pnm". View the file with, e.g., the "gm display" command available on the frontend:
$ gm display result.pnm
The central function is calc_tile(), which computes one tile of the picture. The sizes of the whole picture and of one tile are hardcoded at the start of the main program. Note that the code assumes that the picture size is a multiple of the tile size. For this exercise, set the picture size to 6000x6000 and the tile size to 1000x1000 to begin with. The program prints its runtime and its performance in million pixels per second (MPixel/s).
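The tile decomposition and the MPixel/s metric described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from the RAY folder: render_tile_stub() is a hypothetical stand-in for calc_tile() (whose real signature may differ), and render_all_tiles() and mpixels_per_second() are names made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for calc_tile(): fills one tilesize x tilesize
   tile of a size x size grayscale picture with a dummy pattern. */
static void render_tile_stub(unsigned char *picture, int size,
                             int xstart, int ystart, int tilesize)
{
    for (int y = ystart; y < ystart + tilesize; ++y)
        for (int x = xstart; x < xstart + tilesize; ++x)
            picture[(size_t)y * size + x] = (unsigned char)((x + y) & 0xFF);
}

/* Serial loop over all tiles; returns the number of tiles rendered. */
static int render_all_tiles(unsigned char *picture, int size, int tilesize)
{
    /* Same assumption as in the exercise text:
       the picture size is a multiple of the tile size. */
    assert(tilesize > 0 && size % tilesize == 0);

    int tiles = 0;
    for (int ystart = 0; ystart < size; ystart += tilesize) {
        for (int xstart = 0; xstart < size; xstart += tilesize) {
            render_tile_stub(picture, size, xstart, ystart, tilesize);
            ++tiles;
        }
    }
    return tiles;
}

/* Performance in million pixels per second for a size x size picture. */
static double mpixels_per_second(int size, double seconds)
{
    return (double)size * (double)size / seconds / 1.0e6;
}
```

With a 6000x6000 picture and 1000x1000 tiles, the tile loop has 6 x 6 = 36 iterations, which is worth keeping in mind when thinking about how the work can be distributed among threads.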
- Parallelize the code with OpenMP-parallel loops. You can deactivate the output for testing, but make sure that your parallel code computes the correct result (this is easy since you can always display the picture). What speedup does your program achieve when going from 1 to 36 cores on one Fritz socket?
- Do you see any optimization potential in the way the code is parallelized? Think about how the work is distributed among the threads, and how you can influence this distribution.
- (optional) Parallelize the code with OpenMP tasks. Is there a performance difference compared with the loop-parallel version?
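As a starting point for the tasks above, the two parallelization styles might look roughly like the following C sketch. It is not a reference solution: tile_stub() is a hypothetical stand-in for calc_tile() (the real signature may differ), render_loop() and render_tasks() are made-up names, and the collapse and schedule clauses are just one possible choice whose effect has to be measured on the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for calc_tile(): writes a dummy pattern into one
   tile. Different tiles touch disjoint parts of the picture, so tiles can
   be computed independently. */
static void tile_stub(unsigned char *pic, int size, int xs, int ys, int ts)
{
    for (int y = ys; y < ys + ts; ++y)
        for (int x = xs; x < xs + ts; ++x)
            pic[(size_t)y * size + x] = (unsigned char)((x + y) & 0xFF);
}

/* Variant 1: worksharing loop. collapse(2) merges both tile loops into one
   iteration space; schedule(dynamic, 1) hands out one tile at a time.
   Both clauses are assumptions to experiment with, not a recommendation. */
static void render_loop(unsigned char *pic, int size, int ts)
{
    assert(ts > 0 && size % ts == 0);
    #pragma omp parallel for collapse(2) schedule(dynamic, 1)
    for (int ys = 0; ys < size; ys += ts)
        for (int xs = 0; xs < size; xs += ts)
            tile_stub(pic, size, xs, ys, ts);
}

/* Variant 2: one thread creates a task per tile, and all threads of the
   team execute the tasks. */
static void render_tasks(unsigned char *pic, int size, int ts)
{
    assert(ts > 0 && size % ts == 0);
    #pragma omp parallel
    #pragma omp single
    {
        for (int ys = 0; ys < size; ys += ts)
            for (int xs = 0; xs < size; xs += ts) {
                /* Each task gets its own copy of the tile coordinates. */
                #pragma omp task firstprivate(xs, ys)
                tile_stub(pic, size, xs, ys, ts);
            }
    } /* all tasks are complete after the barrier ending the parallel region */
}
```

Compiled without OpenMP support, the pragmas are ignored and both functions run serially, which gives a convenient reference for checking that the parallel versions produce the same picture.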
Last modified: Wednesday, 21 February 2024, 11:42 AM