The HPJava Language

HPJava is an environment for parallel programming, especially suitable for programming massively parallel, distributed memory computers. HPJava is a strict extension of Java: it incorporates all of Java as a subset. For dealing with distributed arrays, it adds some pre-defined classes and some extra syntax. We aim to provide a hybrid of the data parallel and the low-level SPMD (Single Program Multiple Data) approaches through what we call the HPspmd model. HPF-like distributed arrays therefore appear as language primitives. In the spirit of SPMD programming, however, we made the design decision that all access to non-local array elements should go through calls to library functions in the source program. These library calls must be placed in the original HPJava program by the programmer. This requirement may be surprising to people used to programming in high-level parallel languages like HPF, but it should not seem particularly unnatural to programmers used to writing parallel programs using MPI or other SPMD libraries.

The exact nature of the communication library used is not part of the HPJava language design. An appropriate communication library might perform collective operations on whole distributed arrays (like the one described in this paper), or it might provide some kind of get and put functions for access to remote blocks of a distributed array, similar to the functions provided in the Global Array Toolkit [14], for example.

A subscripting syntax can be used to directly access local elements of distributed arrays. A well-defined set of rules, automatically checked by the translator, ensures that references to these elements can only be made on processors that hold copies of the elements concerned.
Figure 1: Matrix addition using HPJava.
\begin{figure}
\small
\begin{verbatim}
Procs2 p = new Procs2(P, P) ;
on(p...
...c [i, j] = a [i, j] + b [i, j] ;
}\end{verbatim}
\normalsize
\end{figure}
Figure 1 is a simple HPJava program. It illustrates the creation of distributed arrays and access to their elements. An HPJava program is started concurrently in some set of processes, which are named through ``grid'' objects. The class Procs2 is a standard library class that represents a two-dimensional grid of processes. During the creation of $p$, $P$ by $P$ processes are selected from the active process group. Procs2 extends the special base class Group, which has a privileged status in the HPJava language. An object of a class that inherits from Group can be used in various special places. For example, it can be used to parameterize an on construct. The on(p) construct is a new control construct specifying that the enclosed actions are performed only by processes in group $p$.

Range is another special class with privileged status. It represents an integer interval $0, \ldots, N - 1$, distributed somehow over a process dimension (a dimension or axis of a grid like $p$). BlockRange is a particular subclass of Range. The arguments of the BlockRange constructor represent the total size of the range and the target process dimension. Thus, $x$ has $M$ elements and is distributed over the first dimension of $p$, and $y$ has $N$ elements and is distributed over the second dimension of $p$.

The variables $a$, $b$, and $c$ are all distributed array variables. The distributed array is the most important feature HPJava adds to Java. A distributed array is a collective object shared by a number of processes. Like an ordinary array, a distributed array has some index space and stores a collection of elements of fixed type. The type signature of an $r$-dimensional distributed array involves double brackets surrounding $r$ comma-separated slots. A hyphen in one of these slots indicates that the dimension is distributed. Asterisks are also allowed in these slots, specifying that some dimensions of the array are not to be distributed, i.e. they are ``sequential'' dimensions. (If all dimensions have asterisks, the array is actually an ordinary, non-distributed, Fortran-like, multidimensional array, which is a valuable adjunct to Java in its own right, as many people have noted [13,12].) The constructors on the right hand side of the initializers specify that the arrays here all have ranges $x$ and $y$: they are all $M$ by $N$ arrays, block-distributed over $p$. We see that the mapping of distributed arrays in HPJava is described in terms of the two special classes Group and Range.

A second new control construct, overall, implements a distributed parallel loop. It shares some characteristics of the forall construct of HPF. The symbols $i$ and $j$ scoped by these constructs are called distributed indexes. The indexes iterate over all locations (selected here by the degenerate interval ``:'') of ranges $x$ and $y$. In HPJava the subscripts in distributed array element references must normally be distributed indexes (the only exceptions to this rule are subscripts in sequential dimensions, and subscripts in arrays with ghost regions, discussed later). The indexes must be in the distributed range associated with the array dimension. This strict requirement ensures that referenced array elements are held by the process that references them.
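To make the idea of a block-distributed range concrete, the following is a minimal sketch in plain Java (not HPJava). It assumes the HPF-style rule in which each process coordinate receives a contiguous block of ceil(N / P) indexes; the text above does not spell out the exact rule used by BlockRange, so this is an illustration only. The class and method names (BlockRangeSketch, owner, localLow, localHigh) are hypothetical and are not part of the HPJava library.

```java
// Plain-Java sketch (not HPJava): one plausible block distribution of the
// integer interval 0, ..., n - 1 over a process dimension of extent procs.
// Assumption: HPF-style blocks of size ceil(n / procs).
public final class BlockRangeSketch {

    // Block size under the assumed rule ceil(n / procs).
    public static int blockSize(int n, int procs) {
        return (n + procs - 1) / procs;
    }

    // Coordinate of the process that holds global index i.
    public static int owner(int i, int n, int procs) {
        return i / blockSize(n, procs);
    }

    // First global index held by process coordinate c (equals n if c holds nothing).
    public static int localLow(int c, int n, int procs) {
        return Math.min(n, c * blockSize(n, procs));
    }

    // One past the last global index held by process coordinate c.
    public static int localHigh(int c, int n, int procs) {
        return Math.min(n, (c + 1) * blockSize(n, procs));
    }

    public static void main(String[] args) {
        int n = 10, procs = 4;
        // Each process coordinate visits only the indexes it holds, which is
        // the locality property that the overall construct relies on.
        for (int c = 0; c < procs; c++) {
            System.out.println("process " + c + " holds ["
                    + localLow(c, n, procs) + ", " + localHigh(c, n, procs) + ")");
        }
    }
}
```

Under this assumed rule, a loop on process coordinate c that runs from localLow to localHigh touches only locally held elements, which is the situation the subscripting rules are designed to guarantee.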
Figure 2: Solution of Laplace equation by Jacobi relaxation.
\begin{figure}
\small
\begin{verbatim}
Procs2 p = new Procs2(P, P) ;
on(p...
...
} while(Adlib.maxval(r) > EPS);
}\end{verbatim}
\normalsize
\end{figure}
Figure 2 is an HPJava program that solves the Laplace equation using ghost regions. It illustrates the use of the standard library class ExtBlockRange to create arrays with ghost extensions. The distributed range class ExtBlockRange is a library class derived from the special class Range; it provides a block distribution format with ghost extensions. In this case, the extensions are of width 1 on either side of the locally held ``physical'' segment. Figure 3 illustrates this situation.
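The storage layout implied by ghost extensions can be sketched in plain Java (not HPJava) as follows. The sketch models one process's share of a one-dimensional range: a physical segment of global indexes low, ..., high - 1, padded by ghost cells on either side. The class GhostSegmentSketch and its methods are hypothetical names chosen for illustration; the actual internal layout used by ExtBlockRange is not described in this section.

```java
// Plain-Java sketch (not HPJava): local storage for one block of a range,
// extended by ghost cells on both sides of the physical segment.
public final class GhostSegmentSketch {
    private final int low;       // first global index of the physical segment
    private final int high;      // one past the last global index of the physical segment
    private final int ghost;     // ghost width on each side
    private final double[] data; // ghost cells, physical cells, ghost cells

    public GhostSegmentSketch(int low, int high, int ghost) {
        if (high < low || ghost < 0) {
            throw new IllegalArgumentException("invalid segment bounds");
        }
        this.low = low;
        this.high = high;
        this.ghost = ghost;
        this.data = new double[(high - low) + 2 * ghost];
    }

    // Number of locally stored cells, including ghost cells.
    public int storageSize() {
        return data.length;
    }

    // Storage offset of global index g; valid for low - ghost <= g < high + ghost.
    public int offset(int g) {
        if (g < low - ghost || g >= high + ghost) {
            throw new IndexOutOfBoundsException("global index " + g + " is not held locally");
        }
        return g - low + ghost;
    }

    // True if g lies in the ghost extension rather than the physical segment.
    public boolean isGhost(int g) {
        offset(g); // range check only
        return g < low || g >= high;
    }

    public double get(int g) {
        return data[offset(g)];
    }

    public void set(int g, double value) {
        data[offset(g)] = value;
    }
}
```

For example, a segment holding global indexes 3, 4, 5 with ghost width 1 stores five cells, and the cells for global indexes 2 and 6 are ghost copies of elements whose physical home is on neighboring processes.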
Figure 3: Example of a distributed array with ghost regions.
From the point of view of this paper, the most important feature of this example is the appearance of the function Adlib.writeHalo(). This is a collective communication operation used to fill the ghost cells, or overlap regions, surrounding the ``physical'' segment of a distributed array. A collective operation must be invoked simultaneously by all members of some active process group (which may or may not be the entire set of processes executing the program). The effect of writeHalo is to overwrite the ghost region with values from the processes that hold the corresponding elements in their physical segments. Figure 4 illustrates the effect of executing the writeHalo function. More general forms of writeHalo may specify that only a subset of the available ghost area is to be updated, or may select cyclic wraparound for updating ghost cells at the extreme ends of the array.

If an array has ghost regions, the rule that the subscripts must be simple distributed indexes is relaxed: shifted indexes, which include a positive or negative integer offset, allow access to elements at locations neighboring the one defined by the overall index.
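The effect described above can be imitated in a single JVM with a plain-Java sketch (not HPJava, and not the Adlib implementation). Each "process" is modelled as one array with layout [left ghost, physical cells, right ghost] and ghost width 1; the method named writeHalo here is a hypothetical stand-in that copies neighboring edge values without any real communication, and it does not use cyclic wraparound, so the outermost ghost cells are left unchanged. The helper neighbourAverage illustrates how shifted indexes can then read adjacent elements locally.

```java
// Plain-Java sketch (not HPJava or Adlib): a sequential imitation of the
// effect of a halo update on a 1D block-distributed array with ghost width 1.
public final class HaloSketch {

    // seg[c] models the local storage of process coordinate c, laid out as
    // [left ghost, physical cells..., right ghost]. Every segment is assumed
    // to hold at least one physical cell (length >= 3). Only ghost cells are
    // written and only physical cells are read, so updating in place is safe.
    public static void writeHalo(double[][] seg) {
        for (int c = 0; c < seg.length; c++) {
            if (c > 0) {
                double[] left = seg[c - 1];
                seg[c][0] = left[left.length - 2];       // left neighbour's last physical cell
            }
            if (c < seg.length - 1) {
                seg[c][seg[c].length - 1] = seg[c + 1][1]; // right neighbour's first physical cell
            }
        }
    }

    // Average of the two neighbours of storage position k, in the manner of
    // shifted-index accesses at offsets -1 and +1. k must be a physical cell.
    public static double neighbourAverage(double[] s, int k) {
        return 0.5 * (s[k - 1] + s[k + 1]);
    }

    public static void main(String[] args) {
        double[][] seg = {
            {0, 1, 2, 3, 0},   // process 0: physical cells 1, 2, 3
            {0, 4, 5, 6, 0},   // process 1: physical cells 4, 5, 6
        };
        writeHalo(seg);
        // After the update, process 0's right ghost holds 4 and process 1's
        // left ghost holds 3, so edge cells can be averaged without further
        // access to the neighbouring segment.
        System.out.println(neighbourAverage(seg[0], 3));
    }
}
```

In a real SPMD setting each segment lives in a different process and the copies become messages; the sketch only shows which values end up in which ghost cells.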
Figure 4: Illustration of the effect of executing the writeHalo function.
The final component of the basic HPJava syntax that we will discuss here is support for Fortran-like array sections. An array section expression has a similar syntax to a distributed array element reference, but uses double brackets. It yields a reference to a new array containing a subset of the elements of the parent array. Those elements can subsequently be accessed either through the parent array or through the array section. HPJava sections behave something like array pointers in Fortran, which can reference an arbitrary regular section of a target array. As in Fortran, subscripts in section expressions can be index triplets. HPJava also has built-in notions of subranges and restricted groups. These describe the range and distribution group of sections, and can also be used in array constructors on the same footing as the ranges and grids introduced earlier. They allow HPJava arrays to reproduce any mapping allowed by the ALIGN directive of HPF.

The examples here have covered the basic syntax of HPJava. The language itself is relatively simple. Complexities associated with varied or irregular patterns of communication are meant to be handled by communication libraries like the ones discussed in the remainder of this paper.
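The aliasing behavior of sections, where elements remain reachable through both the parent array and the section, can be sketched in plain Java (not HPJava) as a strided view over shared storage. The class SectionSketch is a hypothetical illustration for a one-dimensional, non-distributed array; it assumes a Fortran-style triplet lo:hi:step with an inclusive upper bound and a positive step, and it says nothing about how HPJava represents sections internally.

```java
// Plain-Java sketch (not HPJava): a 1D array section as a strided view that
// shares storage with its parent array.
public final class SectionSketch {
    private final double[] parent; // storage shared with the parent array
    private final int start;       // parent index of section element 0
    private final int stride;      // distance in the parent between section elements
    private final int length;      // number of elements in the section

    // Section lo:hi:step of parent, with hi inclusive (Fortran-style assumption).
    public SectionSketch(double[] parent, int lo, int hi, int step) {
        if (step <= 0 || lo < 0 || hi >= parent.length) {
            throw new IllegalArgumentException("invalid triplet");
        }
        this.parent = parent;
        this.start = lo;
        this.stride = step;
        this.length = hi < lo ? 0 : (hi - lo) / step + 1;
    }

    public int length() {
        return length;
    }

    public double get(int k) {
        return parent[parentIndex(k)];
    }

    // Writes through to the parent array; no copy is made.
    public void set(int k, double value) {
        parent[parentIndex(k)] = value;
    }

    private int parentIndex(int k) {
        if (k < 0 || k >= length) {
            throw new IndexOutOfBoundsException("section index " + k);
        }
        return start + k * stride;
    }
}
```

For instance, the triplet 1:7:2 over an eight-element parent selects parent elements 1, 3, 5 and 7, and an assignment through the section is visible through the parent afterwards.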
Bryan Carpenter 2003-01-23