HPJava follows HPF in allowing several different distribution formats for the dimensions of its distributed arrays. The new formats are provided without further extension to the syntax of the language. Instead the Range class hierarchy is extended. The full hierarchy is shown in Figure 3.1.
The BlockRange subclass should be quite familiar by now. The Dimension class is also familiar, although previously it wasn't presented as a range class--later examples will demonstrate how it can be convenient to use process dimensions directly as array ranges. CyclicRange is directly analogous to the cyclic distribution format available in HPF.
Cyclic distributions are sometimes favored because they can lead to
better load balancing than the simple block distributions
introduced so far. Some algorithms (for example dense matrix
algorithms) don't have the kind of locality that favours block
distribution for stencil updates, but they do involve phases where
parallel computations are limited to subsections of the whole
array. In block distributions these sections may map to
only a fraction of the available processes, leaving the remaining
processes idle. Here is a contrived example
In cyclic distribution format, the
index space is mapped to the process dimension in wraparound
fashion. If we change our example to
As a more graphic example of how cyclic distribution can improve load balancing, consider the Mandelbrot set code of Figure 3.4.
The Mandelbrot set is defined in terms of points in a plane (actually the plane of complex numbers). Mathematically, the set is a subset of the points in the plane--those points for which a certain iteration does not converge. Figure 3.4 runs this iteration at each point of a grid embedded in the plane. If the iteration of the while loop terminates normally, the current point is not in the set. If the while loop would carry on indefinitely, the current point is in the set. In practice one approximates the set as being those points for which the while loop iterates a sufficiently large number of times--CUTOFF should be some moderately large integer.
We represent the set as a distributed array of char. On completion
points outside the set have value
' ' and point inside the
set have value
'#'. This representation is chosen just because
it makes a recognizable display when the array is printed out with
Points away from the set are generally eliminated in a few iterations, whereas those inside the set or close to it take much more computation. In the case N = 64 (with CUTOFF = 100), the block decomposition of the set is shown in Figure 3.5. The middle column of processors will do most of the work, because they hold most of the set. Changing the class of the ranges to CyclicRange gives the much more even distribution shown in Figure 3.6.
We are not suggesting, by the way, that Mandelbrot computation is a serious application of HPJava. It is just a meant as a simple illustration of an issue about distribution format and load balancing.
The ExtBlockRange subclass represents block-distributed ranges extended with ghost regions.