Figure 4 shows an HPJava version of red-black relaxation for the
two-dimensional Laplace equation. Here we use a
different class of distributed range: the class ExtBlockRange adds
ghost regions [4] to the distributed arrays that use it.
A library function called Adlib.writeHalo updates the cached
values in the ghost regions with proper element values from
neighboring processes.
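The effect of a halo update can be illustrated with a minimal single-process sketch in plain Java. This is not the Adlib implementation: the class HaloSketch, the method writeHaloSketch, and the block layout (ghost cells stored at both ends of each local block) are assumptions made purely for illustration, and ordinary array copies stand in for inter-process communication.

```java
import java.util.Arrays;

public class HaloSketch {
    // Hypothetical layout of each local block (an assumption for this sketch):
    //   [ w low ghost cells | n owned cells | w high ghost cells ]
    // Copies the edge values owned by each neighbour into the ghost cells.
    static void writeHaloSketch(double[][] blocks, int w) {
        for (int p = 0; p < blocks.length; p++) {
            double[] b = blocks[p];
            int n = b.length - 2 * w;              // owned elements in block p
            if (p > 0) {
                double[] left = blocks[p - 1];
                int nl = left.length - 2 * w;      // owned elements in left neighbour
                // last w owned elements of the left neighbour start at index nl
                System.arraycopy(left, nl, b, 0, w);
            }
            if (p < blocks.length - 1) {
                double[] right = blocks[p + 1];
                // first w owned elements of the right neighbour start at index w
                System.arraycopy(right, w, b, w + n, w);
            }
        }
    }

    public static void main(String[] args) {
        double[][] blocks = {
            {0, 1, 2, 3, 0},   // "process" 0 owns 1, 2, 3
            {0, 4, 5, 6, 0}    // "process" 1 owns 4, 5, 6
        };
        writeHaloSketch(blocks, 1);
        System.out.println(Arrays.toString(blocks[0]));
        System.out.println(Arrays.toString(blocks[1]));
    }
}
```

After the call, each block can read its neighbour's edge value from its own ghost cell without further communication, which is what makes shifted subscripts cheap inside the loop.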
There are a few additional pieces of syntax here. The range of iteration of the overall construct can be restricted by adding a general triplet after the for keyword. The expression i` is read ``i-primed'', and yields the integer global index value for the distributed loop (i itself does not have a numeric value--it is a symbolic subscript). Finally, if the array ranges have ghost regions, the general policy that an array subscript must be a simple distributed index is relaxed slightly--a subscript can be a shifted index, as here. The value of the numeric shift--symbolically added to or subtracted from the index--must not exceed the width of the ghost regions, and the index that is shifted must be a location in the distributed range of the array, as before.
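For readers unfamiliar with the algorithm, the following is a sequential plain-Java sketch of one red-black iteration. It is not the HPJava code of Figure 4: the class and method names are hypothetical, ordinary loops with explicit bounds and a stride of 2 stand in for triplet-restricted overall constructs, and the boundary rows and columns of the local array play the role that ghost regions play in the distributed version.

```java
public class RedBlackSketch {
    // One red-black iteration over the interior of a square grid.
    static void relax(float[][] a) {
        int n = a.length;
        for (int parity = 0; parity < 2; parity++) {
            // Interior rows only (1 .. n-2), analogous to restricting the range.
            for (int i = 1; i <= n - 2; i++) {
                // Start column depends on the row index and the parity, and the
                // stride is 2, so every point updated in one sweep satisfies
                // (i + j) % 2 == (parity + 1) % 2.
                for (int j = 1 + (i + parity) % 2; j <= n - 2; j += 2) {
                    // Subscripts shifted by +/-1 reach the four neighbours.
                    a[i][j] = 0.25f * (a[i - 1][j] + a[i + 1][j]
                                     + a[i][j - 1] + a[i][j + 1]);
                }
            }
        }
    }

    public static void main(String[] args) {
        float[][] a = new float[4][4];
        for (int j = 0; j < 4; j++) a[0][j] = 4.0f;   // fixed boundary values
        relax(a);
        System.out.println(a[1][1] + " " + a[1][2] + " " + a[2][1] + " " + a[2][2]);
    }
}
```

Because each sweep only reads points of the opposite colour, the updates within a sweep are independent of one another, which is why the loop can be distributed once the ghost cells are up to date.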
Figure 5 gives the translation of the ``loop
nest'' from Figure 4. Because the subscripting expressions here
are particularly complex, strength reduction is likely to
be very effective.
There is a relatively expensive method call:
|
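The kind of strength reduction referred to above can be sketched in plain Java. This does not reproduce the translation in Figure 5: the names (base, s0, s1 for a base offset and two strides) are hypothetical, and the example only shows the general idea of replacing per-element multiplications in a linearized subscript with running additions.

```java
public class StrengthReductionSketch {
    // Naive form: the full linear offset is recomputed for every element.
    static double sumNaive(double[] data, int base, int s0, int s1, int n0, int n1) {
        double sum = 0;
        for (int i = 0; i < n0; i++) {
            for (int j = 0; j < n1; j++) {
                sum += data[base + i * s0 + j * s1];
            }
        }
        return sum;
    }

    // Strength-reduced form: multiplications become incremental additions.
    static double sumReduced(double[] data, int base, int s0, int s1, int n0, int n1) {
        double sum = 0;
        int rowOff = base;                      // offset of element (i, 0)
        for (int i = 0; i < n0; i++, rowOff += s0) {
            int off = rowOff;                   // offset of element (i, j)
            for (int j = 0; j < n1; j++, off += s1) {
                sum += data[off];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] data = new double[12];
        for (int k = 0; k < data.length; k++) data[k] = k;
        // 3 x 4 row-major view: both forms visit the same elements.
        System.out.println(sumNaive(data, 0, 4, 1, 3, 4));
        System.out.println(sumReduced(data, 0, 4, 1, 3, 4));
    }
}
```

The more terms a subscript expression contains, the more arithmetic the incremental form removes from the inner loop, which is consistent with the expectation that the optimization pays off most for complex subscripts.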
The table shows that in this more complex example the relative performance of the optimized HPJava node code is actually somewhat better than in the previous test case--85% of the performance of straightforward C++ or Fortran. Perhaps the reference codes could be improved, but note that whenever we quote C++ or Fortran performance we are using the highest optimization level supported by the compiler (-O5). Because we are only interested in the performance of the generated node code, the HPJava timings omit the call to Adlib.writeHalo().