Next: Evaluation Up: Partial Differential Equations Previous: Partial Differential Equations   Contents

## An Application

In this section we will discuss a full application of HPJava--a multigrid solver. The particular solver was adapted from an existing Fortran program (called PDE2), taken from the Genesis parallel benchmark suite [5]. The whole of this program has been ported to HPJava (it is about 800 lines), but in this section we will only consider a few critical routines.

The Multigrid [10] method is a fast algorithm for solution of linear and nonlinear problems using restrict and interpolate operations between current grids (fine grid) and restricted grids (coarse grid). As applied to basic relaxation methods for PDEs, it hugely accelerates elimination of the residual by restricting a smoothed version of the error term to a coarse grid, computing a correction term on the coarse grid, then interpolating this correction back to the original fine grid where it is used to improve the original approximation to the solution. Multigrid methods can be developed as a V-cycle method for simple linear iterative methods. As we can see in Figure 6.1, there are three characteristic phases in a V-cycle method; pre-relaxation, multigrid, and post-relaxation. The pre-relaxation phase makes the error smooth by performing a relaxation method. The multigrid phase restricts the current problem to a subset of the grid points and solves a restricted problem for the correction term. The post-relaxation phase perform some steps of the relaxation method again after interpolating results back to the original grid.

As an example we take red-black relaxation for the Laplace equation as our relaxation method. The relax operation, the restrict operation, and the interpolate operation are three critical parts of a solution of 2D Poisson equation using a multigrid method. Domain decomposition, which assigns part of the grid to each processor, is used for parallelization. Boundary values of the subgrids are exchanged with nearest neighbors before each calculation. Red-black relaxation on array uf is illustrated in Figure 6.2. In this red-black relaxation, the values of the subgrids boundary are exchanged by using the writeHalo method. We assume that all distributed arrays in our examples were created with ghost regions.

The implementation makes use of the integer global loop index i (read i-primed'') associated with distributed index i. This value is used in computing the lower bound of the inner overall. The modulo 2 operation including i and it, in conjunction with the loop stride of 2, ensures that sites of a single parity are updated in a given iteration it, and that this parity is swapped in successive iterations. This is a short way to write the algorithm, although it might be more efficient to split into several overall constructs dealing with sites of different party.

The restrict operation (Figure 6.3) computes the residual and restricts it to a coarser grid. The residual is computed only at points of the fine grid matching points in the coarse grid (hence the strides of 2 in the overall constructs). A work-array, tf, aligned with uf, is used to hold the residual values. We then use a remap operation to copy these values to the coarse grid.

The important remap operation copies one distributed array to another of the same shape which may have unrelated distribution format. We have made no assumptions of any relationship between the distribution of the fine grid array uf and the coarse grid array uc. In our code a subset of the elements of the work-array tf is copied into a subset of the array in coarse grid. Behavior of the restrict operation is illustrated in Figure 6.4.

The interpolate operation (Figure 6.5) is the opposite of restriction. It sends the correction computed on the coarse grid to the finer grid, and corrects current values on that grid. The remap and the array section expression are used to copy correction, uc, from coarse grid to matching point of a work-array, tf, on the fine grid. After the copying, we need to update boundary values using the writeHalo to get most up-to-date values. By the two nested overall constructs, it corrects current values of grid with the work-array tf. The first overall deals with fine grid points on horizontal links of the coarse grid and the second deals with those of vertical links--this is sufficient because it turns out that the correction is only needed on fine grid sites of a single parity. Behavior of the interpolate operation is illustrated in Figure 6.6.

Next: Evaluation Up: Partial Differential Equations Previous: Partial Differential Equations   Contents
Bryan Carpenter 2004-06-09