Next: An Application
Up: Collective Communication for the
HPJava is an environment for parallel programming, especially suitable for
programming massively parallel, distributed memory computers.
HPJava is a strict extension of Java--it incorporates all of Java as a subset.
For dealing with distributed arrays, it adds some pre-defined classes and
some extra syntax. We aim to provide a hybrid of the data parallel and the
low-level SPMD (Single Program Multiple Data) approaches through
what we call the HPspmd model.
So HPF-like distributed arrays appear as language primitives.
In the spirit of SPMD programming, a design decision is made that all access
to non-local array elements should
go through calls to library functions in the source program.
These library calls must be placed in the original HPJava program by the
programmer. This requirement may be
surprising to people used to programming in high-level parallel languages
like HPF, but it should not seem particularly unnatural to programmers
used to writing parallel programs using MPI or other SPMD
libraries. The exact nature of the communication library used is not part of
the HPJava language design. An appropriate communication library might perform
collective operations on whole distributed arrays (like the one described
in this paper), or it might provide some kind of get and put
functions for access to remote blocks of a distributed array, similar to the
functions provided in the Global Array Toolkit.
A subscripting syntax can be used to directly access local elements of
distributed arrays. A well-defined set of rules--automatically checked by the
translator--ensures that references to these elements can only be made on
processors that hold copies of the elements concerned.
The HPJava Language
Figure 1 is a simple HPJava program. It illustrates
creation of distributed arrays, and access to their elements. An
HPJava program is started concurrently in some set of processes which
are named through ``grid'' objects. The class Procs2 is a standard
library class, and represents a two-dimensional grid of processes.
When the grid p is created, its processes are selected from the
active process group. The class Procs2 extends the special base
class Group which has a privileged status in the HPJava
language. An object that inherits this class can be used in various
special places. For example, it can be used to parameterize an on
construct. The on(p) construct is a new control construct
specifying that the enclosed actions are performed only by processes in
group p.
Range is another special class with privileged status. It
represents an integer interval beginning at 0, distributed somehow over
a process dimension (a dimension or axis of a grid like p).
BlockRange is a particular subclass of Range. The
arguments of the BlockRange constructor represent the total size
of the range and the target process dimension. Thus, x has M
elements distributed over the first dimension of p, and y has
N elements distributed over the second dimension of p.
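
The idea of a block-distributed range can be sketched in plain Java. This is only an illustration, not the HPJava library: the class BlockRangeSketch and its methods are hypothetical, and the choice of a block size of ceil(size / procs) is an assumed convention in the style of HPF block distribution.

```java
// Hypothetical helper, not part of HPJava: models how a range of `size`
// elements might be split into contiguous blocks over `procs` coordinates.
final class BlockRangeSketch {
    final int size;       // total number of elements in the range
    final int procs;      // extent of the target process dimension
    final int blockSize;  // assumed convention: ceil(size / procs)

    BlockRangeSketch(int size, int procs) {
        if (size <= 0 || procs <= 0) {
            throw new IllegalArgumentException("size and procs must be positive");
        }
        this.size = size;
        this.procs = procs;
        this.blockSize = (size + procs - 1) / procs;
    }

    // First global index held at process coordinate c.
    int localLow(int c) {
        return Math.min(c * blockSize, size);
    }

    // One past the last global index held at process coordinate c.
    int localHigh(int c) {
        return Math.min((c + 1) * blockSize, size);
    }

    // Process coordinate that holds global index i.
    int owner(int i) {
        return i / blockSize;
    }

    public static void main(String[] args) {
        BlockRangeSketch x = new BlockRangeSketch(10, 4);
        for (int c = 0; c < x.procs; c++) {
            System.out.println("coordinate " + c + " holds ["
                    + x.localLow(c) + ", " + x.localHigh(c) + ")");
        }
    }
}
```

With 10 elements over 4 coordinates, this convention gives blocks of 3, 3, 3 and 1 elements; other conventions for uneven sizes are possible.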
The three array variables declared in Figure 1 are all distributed array variables.
The distributed array is the most important feature HPJava adds to Java.
A distributed array is a collective object shared by a number of processes.
Like an ordinary array, a distributed array has some index space and stores
a collection of elements of fixed type.
The type signature of a distributed array involves
double brackets surrounding comma-separated slots, one for each dimension.
A hyphen in one of these slots indicates the dimension is distributed.
Asterisks are also allowed in these slots, specifying that
some dimensions of the array are not to be distributed, i.e. they are
``sequential'' dimensions (if all dimensions have asterisks,
the array is actually an ordinary, non-distributed, Fortran-like,
multidimensional array--a valuable adjunct to Java in its own right,
as many people have noted [13,12]).
The constructors on the right hand side of the initializers specify that
the arrays here all have ranges x and y--they are all M by
N arrays, block-distributed over p. We see that the mapping of
distributed arrays in HPJava is described in terms of the two special
classes Group and Range.
A second new control construct, overall, implements a distributed
parallel loop. It shares some characteristics of the forall
construct of HPF. The symbols i and j scoped by these
constructs are called distributed indexes. The indexes iterate
over all locations (selected here by the degenerate interval ``:'') of
ranges x and y.
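
The behaviour of a pair of nested overall constructs can be imitated in ordinary Java by letting each process of a simulated grid visit only the block of elements it holds. The sketch below is an assumption-laden illustration, not translator output: the class OverallSketch, its method names, and the ceiling-based block bounds are all hypothetical, and the processes are simulated one after another in a single JVM.

```java
// Hypothetical simulation of a distributed parallel loop for matrix addition.
// Each (pr, pc) pair stands for one process of a P-by-Q grid.
final class OverallSketch {
    // Lower bound of the block held at coordinate c (assumed block convention).
    static int low(int n, int procs, int c) {
        int block = (n + procs - 1) / procs;
        return Math.min(c * block, n);
    }

    // Upper bound (exclusive) of the block held at coordinate c.
    static int high(int n, int procs, int c) {
        int block = (n + procs - 1) / procs;
        return Math.min((c + 1) * block, n);
    }

    // Adds two non-empty matrices, visiting elements process by process.
    static double[][] add(double[][] left, double[][] right, int P, int Q) {
        int m = left.length;
        int n = left[0].length;
        double[][] sum = new double[m][n];
        for (int pr = 0; pr < P; pr++) {
            for (int pc = 0; pc < Q; pc++) {
                // Loop body as seen by process (pr, pc): local elements only.
                for (int i = low(m, P, pr); i < high(m, P, pr); i++) {
                    for (int j = low(n, Q, pc); j < high(n, Q, pc); j++) {
                        sum[i][j] = left[i][j] + right[i][j];
                    }
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] left = {{1, 2, 3}, {4, 5, 6}};
        double[][] right = {{10, 20, 30}, {40, 50, 60}};
        double[][] sum = add(left, right, 2, 2);
        System.out.println(java.util.Arrays.deepToString(sum));
    }
}
```

Because every element belongs to exactly one block, the simulated processes together cover the whole index space without touching each other's elements, which is why no communication is needed for this operation.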
In HPJava the subscripts in distributed array element references must
normally be distributed indexes (the only exceptions to this rule are
subscripts in sequential dimensions, and subscripts in arrays with
ghost regions, discussed later). The indexes must be in the
distributed range associated with the array dimension. This strict
requirement ensures that referenced array elements are held by the
process that references them.
Figure 1: Matrix addition using HPJava.
Figure 2 is an HPJava program for the Laplace problem
that uses ghost regions. It illustrates the use of the standard
library class ExtBlockRange to create arrays with ghost
extensions. ExtBlockRange is a library class derived from the
special class Range; it describes a block-distributed range
with ghost extensions. In this case,
the extensions are of width 1 on either side of the locally held
``physical'' segment. Figure 3 illustrates this situation.
Figure 2: Solution of Laplace equation by Jacobi relaxation.
From the point of view of this paper the most important feature of this
example is the appearance of the function Adlib.writeHalo().
This is a collective communication operation used to fill the
ghost cells or overlap regions surrounding the "physical"
segment of a distributed array. A call to a collective operation
must be invoked simultaneously by all members of some active process
group (which may or may not be the entire set of processes executing
the program). The effect of writeHalo is to overwrite the ghost
region with values from processes holding the corresponding elements
in their physical segments. Figure 4 illustrates the effect
of executing the writeHalo function. More general forms of
writeHalo may specify that only a subset of the available ghost area
is to be updated, or may select cyclic wraparound for updating ghost
cells at the extreme ends of the array.
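
The effect described above can be imitated for a one-dimensional array in plain Java. This is a minimal sketch under stated assumptions, not the Adlib implementation: the class HaloSketch and the method fillGhosts are hypothetical names, the ghost width is fixed at 1, every segment is assumed to hold at least one physical cell, and all "processes" are simulated as rows of a single two-dimensional Java array.

```java
import java.util.Arrays;

// Hypothetical simulation of a ghost-cell update on a 1D block-distributed array.
// Each row local[p] has the layout [ghost, physical cells..., ghost].
final class HaloSketch {
    // Overwrites ghost cells with the neighbouring segments' edge values.
    // With cyclic == false the ghosts at the extreme ends are left untouched.
    static void fillGhosts(double[][] local, boolean cyclic) {
        int procs = local.length;
        for (int p = 0; p < procs; p++) {
            int left = p - 1;
            int right = p + 1;
            if (cyclic) {
                left = (left + procs) % procs;
                right = right % procs;
            }
            int last = local[p].length - 1;
            if (left >= 0) {
                double[] neighbour = local[left];
                // Copy the left neighbour's last physical cell.
                local[p][0] = neighbour[neighbour.length - 2];
            }
            if (right < procs) {
                // Copy the right neighbour's first physical cell.
                local[p][last] = local[right][1];
            }
        }
    }

    public static void main(String[] args) {
        double[][] local = {
            {0, 1, 2, 3, 0},
            {0, 4, 5, 6, 0}
        };
        fillGhosts(local, false);
        for (double[] segment : local) {
            System.out.println(Arrays.toString(segment));
        }
    }
}
```

Only ghost cells are written and only physical cells are read, so the in-place sequential update here gives the same result a simultaneous exchange would.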
If an array has ghost regions the rule that the subscripts must
be simple distributed indices is relaxed; shifted indices, including
a positive or negative integer offset, allow access to elements at
locations neighboring the one defined by the overall index.
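
A one-dimensional analogue of a relaxation sweep shows why shifted indices need ghost cells. The sketch below is illustrative only: ShiftedIndexSketch and relax are hypothetical names, the segment is assumed to carry one ghost cell at each end that has already been filled, and the averaging formula is the 1D counterpart of the Jacobi update rather than the two-dimensional one used in the paper's example.

```java
import java.util.Arrays;

// Hypothetical 1D relaxation step on one process's local segment.
// Layout of u: [ghost, physical cells..., ghost], ghosts already filled.
final class ShiftedIndexSketch {
    // Returns a new segment in which every physical cell is the average
    // of its two neighbours; the ghost cells are copied unchanged.
    static double[] relax(double[] u) {
        double[] next = u.clone();
        for (int i = 1; i < u.length - 1; i++) {
            // u[i - 1] and u[i + 1] play the role of shifted indices:
            // at the block edges they read from the ghost cells.
            next[i] = 0.5 * (u[i - 1] + u[i + 1]);
        }
        return next;
    }

    public static void main(String[] args) {
        double[] u = {1, 0, 0, 0, 5};
        System.out.println(Arrays.toString(relax(u)));
    }
}
```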
Figure 3: Example of a distributed array with ghost regions.
The final component of the basic HPJava syntax that we will discuss
here is support for Fortran-like array sections. An array section
expression has a similar syntax to a distributed array element
reference, but uses double brackets. It yields a reference to a new array
containing a subset of the elements of the parent array. Those elements
can subsequently be accessed either through the parent array or through
the array section--HPJava sections behave something like array pointers
in Fortran, which can reference an arbitrary regular section of a target
array. As in Fortran, subscripts in section expressions can be index
triplets. HPJava also has built-in ideas of subranges and
restricted groups. These describe the range and distribution group of
sections, and can also be used in array constructors on the same footing
as the ranges and grids introduced earlier. They allow HPJava arrays
to reproduce any mapping allowed by the ALIGN directive of HPF.
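
The aliasing behaviour of sections can be illustrated with a small view class in plain Java. This is a sketch for an ordinary one-dimensional, non-distributed array only: SectionSketch is a hypothetical class, and the triplet is taken as lo:hi:step with an inclusive upper bound, following the Fortran convention.

```java
// Hypothetical view over a parent array, selected by a triplet lo:hi:step.
// The view stores no elements of its own; it aliases the parent's storage.
final class SectionSketch {
    private final double[] parent;
    private final int lo;
    private final int step;
    private final int length;

    SectionSketch(double[] parent, int lo, int hi, int step) {
        if (step <= 0 || lo < 0 || hi >= parent.length || lo > hi) {
            throw new IllegalArgumentException("bad triplet");
        }
        this.parent = parent;
        this.lo = lo;
        this.step = step;
        this.length = (hi - lo) / step + 1;
    }

    int length() {
        return length;
    }

    double get(int k) {
        checkIndex(k);
        return parent[lo + k * step];
    }

    void set(int k, double value) {
        checkIndex(k);
        parent[lo + k * step] = value;
    }

    private void checkIndex(int k) {
        if (k < 0 || k >= length) {
            throw new IndexOutOfBoundsException("section index " + k);
        }
    }

    public static void main(String[] args) {
        double[] parent = {0, 1, 2, 3, 4, 5, 6, 7};
        SectionSketch section = new SectionSketch(parent, 1, 7, 2);
        section.set(0, 100);
        // The write through the section is visible in the parent array.
        System.out.println(parent[1] + " " + section.get(2));
    }
}
```

A write through the view changes the parent, and a write to the parent is seen through the view, which is the pointer-like behaviour the text attributes to sections.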
The examples here have covered the basic syntax of HPJava. The language
itself is relatively simple. Complexities associated with varied or
irregular patterns of communication are meant to be handled by
communication libraries like the ones discussed in the remainder of
this paper.
Figure 4: Illustration of the effect of executing the writeHalo function.