Exercise: J1D-PtP-Blocking
In this exercise we consider the Laplace equation in one dimension, \( \frac{d^2V}{dx^2}=0 \), with Dirichlet boundary conditions \( V|_{x=0}=0 \) and \( V|_{x=1}=1 \). It has a very simple analytical solution, \( V(x)=x \). Here we solve it numerically with the Jacobi method, which amounts to repeatedly applying the following update in a loop over all interior grid points:
\( V_i=\frac{V_{i+1}+V_{i-1}}{2} \)
By repeating this loop for many iterations, the numerical solution converges; in this exercise, however, we are not concerned with how efficient or accurate the method is. Instead, we investigate how domain decomposition can be used to distribute the workload across the MPI processes by splitting the array of the potential into chunks of grid points. The ghost-cell exchange is implemented with MPI point-to-point communication. The halo exchange is trivial in this exercise, since there is only one element (a scalar) on each side of a chunk.
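To make the pattern concrete, here is a minimal C sketch (an illustration, not the skeleton code shipped with the exercise): each rank holds a chunk of the grid with one ghost cell on each side, exchanges the ghost cells with its neighbors using blocking MPI_Send/MPI_Recv, and then applies the Jacobi update to its interior points. All names and values (n, nstep, nloc, v, vnew, the block split) are illustrative assumptions, not the identifiers used in the exercise sources.

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n = 101;       /* total number of grid points (illustrative)   */
    const int nstep = 10000; /* number of Jacobi iterations (illustrative)   */

    /* Share of grid points for this rank; the remainder is spread over the
     * first ranks (similar in spirit to get_share_per_proc).                */
    int nloc = n / size + (rank < n % size ? 1 : 0);

    /* Local chunk plus one ghost cell on each side (indices 0 and nloc+1).  */
    double *v    = calloc(nloc + 2, sizeof(double));
    double *vnew = calloc(nloc + 2, sizeof(double));

    /* Dirichlet boundary conditions: V=0 at x=0 (first point of rank 0) is
     * already set by calloc; V=1 at x=1 (last point of the last rank).      */
    if (rank == size - 1) v[nloc] = 1.0;

    /* Neighboring ranks; MPI_PROC_NULL at the two physical boundaries turns
     * the corresponding send/receive calls below into no-ops.               */
    int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    for (int step = 0; step < nstep; step++) {
        /* Halo exchange with blocking point-to-point calls.  All ranks post
         * MPI_Send first; for a single double this normally completes via
         * the MPI library's internal buffering.                             */
        MPI_Send(&v[1],        1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD);
        MPI_Recv(&v[nloc + 1], 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Send(&v[nloc],     1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD);
        MPI_Recv(&v[0],        1, MPI_DOUBLE, left,  1, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        /* Jacobi update on the interior points of the local chunk; the two
         * physical boundary points keep their Dirichlet values.             */
        int ibeg = (rank == 0)        ? 2        : 1;
        int iend = (rank == size - 1) ? nloc - 1 : nloc;
        for (int i = ibeg; i <= iend; i++)
            vnew[i] = 0.5 * (v[i - 1] + v[i + 1]);
        for (int i = ibeg; i <= iend; i++)
            v[i] = vnew[i];
    }

    free(v);
    free(vnew);
    MPI_Finalize();
    return 0;
}
```

The same ingredients, neighbor ranks that fall back to MPI_PROC_NULL at the physical boundaries and the single halo element passed to MPI_Send and MPI_Recv, are what the FIXME markers in the exercise ask you to supply.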
- In the directory J1D-PtP-Blocking, there are f and c subdirectories for Fortran and C, respectively. If you have loaded the modules as described in the exercise Hello, the compilers and MPI wrappers are ready for use. A Makefile is provided to simplify compilation.
- MPI_PROC_NULL is a constant that represents a dummy MPI process rank. A send to MPI_PROC_NULL succeeds and returns as soon as possible; a receive from MPI_PROC_NULL succeeds and returns as soon as possible without modifying the receive buffer. This makes it a convenient neighbor rank for the chunks at the two ends of the domain.
- The workload distribution is done by the function/subroutine get_share_per_proc, which determines the size of each chunk (and, in the Fortran version, also the indices of the start and end grid points).
- There are 6 FIXME markers which should be replaced with the correct code. They are related to
- defining the ranks of the left and right neighboring processes,
- specifying the halo elements to be exchanged in the calls to MPI_Send and MPI_Recv.
- You can validate your solution to the exercise by comparing your domain output file to the reference J1D-PtP-Blocking/ref/domain-00100000.pgm. The reference file was generated with nstep=100000 and n=100001 (for Fortran, n=100000), so the comparison is meaningful only if you use these values. If the cmp command produces no output, the files are identical and your result is correct. The command should be run as
- cmp file1 file2
- In the case of many grid points, a huge number of iterations (a very large value of the nstep variable) is needed to obtain a converged potential. errmax is the largest error over all grid points compared with the analytical solution (one way to compute such a maximum across ranks is sketched after the questions below). What is the value of errmax for n=1001 and nstep=\( 10^7 \)?
- What would happen if we replace MPI_Send with MPI_Ssend?
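As mentioned above, errmax can be obtained by taking the maximum error over the local chunk on each rank and then reducing the per-rank maxima with MPI_MAX. Below is a minimal sketch of one way to do this, not necessarily how the exercise code does it; the parameters x0, h, and the use of MPI_Allreduce are assumptions.

```c
#include <math.h>
#include <mpi.h>

/* Maximum deviation of the local chunk from the analytical solution V(x)=x,
 * reduced to a global maximum over all ranks.  x0 (coordinate of the first
 * interior point of this chunk) and h (grid spacing) are assumed inputs.
 * Usage: double errmax = max_error(v, nloc, x0, h, MPI_COMM_WORLD);        */
double max_error(const double *v, int nloc, double x0, double h,
                 MPI_Comm comm)
{
    /* Largest error over the interior points of this chunk.                */
    double errloc = 0.0;
    for (int i = 1; i <= nloc; i++) {
        double err = fabs(v[i] - (x0 + (i - 1) * h));
        if (err > errloc) errloc = err;
    }

    /* Combine the per-rank maxima into one global maximum.                 */
    double errmax = 0.0;
    MPI_Allreduce(&errloc, &errmax, 1, MPI_DOUBLE, MPI_MAX, comm);
    return errmax;
}
```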