Next: Variable usage rules Up: Semantic Checking in HPJava Previous: HPJava an HPspmd language

Array sections

HPJava has a syntax for representing subarrays. An array section expression has a similar syntax to a distributed array element reference, but uses double brackets. Whereas an element reference is a variable, an array section is an expression representing a new distributed array object. The new array contains a subset of the elements of the parent array. Those elements can subsequently be accessed either through the parent array or through the array section. The HPJava implementation of an array section is closely related to the Fortran 90 idea of an array pointer: in Fortran, an array pointer can reference an arbitrary regular section of a target array.

In the previous section it was stated that the subscripts in a distributed array element reference are either locations or (with restrictions) integer expressions. The options for subscripts in array section expressions are less restricted. For example, a section subscript is allowed to be a triplet. In the simplest kinds of array section the rank of the result is equal to the number of triplet subscripts. If the section also has some scalar subscripts, this rank will be lower than the rank of the parent array. We would like to be able to define the mapping of an arbitrary array section.

The mapping of a distributed array is defined by its distribution group and its list of ranges. In earlier examples the distribution group of arrays defaulted to the process group specified by the surrounding on construct. In general the distribution group can be specified explicitly by appending an "on" clause to the constructor itself:

    new type [[$x_0$, $x_1$, ...]] on $p$
Here each of $x_0, x_1, \ldots$ is a range object or an integer (in which case the dimension is sequential), and $p$ is a group contained within the set of processes that create the array. The ranges must be distributed over distinct dimensions of $p$. If there is any dimension of $p$ which is not a distribution target of some range from $x_0, x_1, \ldots$, the array is replicated over that process dimension. The inquiries grp() and rng(int r) return the group and ranges for any array.

Figure 4: A two-dimensional section of a two-dimensional array (shaded area).

Now consider array section b defined by

    float [[,]] a = new float [[x, y]] on p ;

    float [[,]] b = a [[0 : N - 1 : 2, 0 : N / 2 - 1]] ;
(see Figure 4). What are the ranges of b? In fact they are a new kind of range called a subrange. For completeness there is a special syntax for constructing subranges directly:
    Range u = x [0 : N - 1 : 2] ;
    Range v = y [0 : N / 2 - 1] ;
The extents of both u and v are N / 2.
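
The extent of a subrange follows directly from its triplet. The sketch below is plain Java rather than HPJava, and the class and method names are hypothetical; it assumes Fortran-style triplets with inclusive bounds and a positive stride, and simply counts the selected indices.

```java
// Hypothetical helper, not part of HPJava: counts the indices selected by a
// triplet lower : upper : stride with inclusive bounds and positive stride.
final class TripletExtent {
    static int extent(int lower, int upper, int stride) {
        if (stride <= 0) {
            throw new IllegalArgumentException("stride must be positive");
        }
        if (upper < lower) {
            return 0;                          // empty section
        }
        return (upper - lower) / stride + 1;   // integer division rounds down
    }

    public static void main(String[] args) {
        int n = 8;  // assumed (even) extent of the parent ranges x and y
        System.out.println(extent(0, n - 1, 2));      // extent of u = x [0 : N - 1 : 2]
        System.out.println(extent(0, n / 2 - 1, 1));  // extent of v = y [0 : N / 2 - 1]
    }
}
```

With N = 8 both calls give 4, that is N / 2. For odd N the first triplet selects (N + 1) / 2 indices, so the statement above implicitly takes N to be even.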

The distribution group of b can be identified with the distribution group of the parent array a. But sections constructed using a scalar subscript, e.g.

    float [[]] c = a [[0, :]] ;
(see Figure 5) present another problem. Clearly c.rng(0) is y. But identifying the distribution group of c with p is not right. It would imply that c is replicated over the first dimension of p. Where does the information that c is localized to the top row of processes go?

Figure 5: A one-dimensional section of a two-dimensional array (shaded area).

We are driven to define a new kind of group: a restricted group is the subset of processes in some parent group to which a particular location is mapped. The distribution group of c is defined to be the subset of processes in p to which the location x[0] is mapped. It can be written explicitly as

    p / x [0]
An equivalent definition of a restricted group is as some slice of a process grid, chosen by restricting some of the coordinates to single values.

The idea of a restricted group may look slightly ad hoc, but the implementation is quite elegant. A restricted group is uniquely specified by its set of effective process dimensions and the identity of the lead process in the group, that is, the process with coordinate zero relative to the effective dimensions. The dimension set can be specified as a subset of the dimensions of the parent grid using a simple bitmask. The identity of the lead process can be specified through a single integer ranking the processes of the parent grid. So a general (restricted) HPJava group can be fully parametrized by a reference to the parent process grid together with just two int fields.
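
The two-int parametrization can be sketched in plain Java. This is an illustration only, not the HPJava runtime: the class RestrictedGroup and its members are hypothetical, the parent grid is reduced to its shape, processes are assumed to be ranked in row-major order, and restrict takes a process coordinate directly, whereas the HPJava expression p / x [0] takes a location and derives the coordinate from the distribution of x.

```java
// Hypothetical sketch: a group is a parent grid shape plus two ints,
// a bitmask of effective dimensions and the rank of the lead process.
final class RestrictedGroup {
    final int[] shape;   // extents of the parent process grid
    final int dimMask;   // bit d set: dimension d is still effective
    final int lead;      // row-major rank (assumed ordering) of the lead process

    RestrictedGroup(int[] shape, int dimMask, int lead) {
        this.shape = shape.clone();
        this.dimMask = dimMask;
        this.lead = lead;
    }

    // The unrestricted group: every dimension effective, lead process 0.
    static RestrictedGroup whole(int... shape) {
        return new RestrictedGroup(shape, (1 << shape.length) - 1, 0);
    }

    // Rank increment for a unit step along dimension dim (row-major).
    int stride(int dim) {
        int s = 1;
        for (int d = dim + 1; d < shape.length; d++) s *= shape[d];
        return s;
    }

    // Coordinate of a parent-grid rank along dimension dim.
    int coord(int rank, int dim) {
        return (rank / stride(dim)) % shape[dim];
    }

    // Fix dimension dim to a single process coordinate.
    RestrictedGroup restrict(int dim, int coordinate) {
        if ((dimMask & (1 << dim)) == 0) {
            throw new IllegalArgumentException("dimension already restricted");
        }
        if (coordinate < 0 || coordinate >= shape[dim]) {
            throw new IllegalArgumentException("coordinate out of range");
        }
        return new RestrictedGroup(shape, dimMask & ~(1 << dim),
                                   lead + coordinate * stride(dim));
    }

    // A process belongs to the group if it agrees with the lead process
    // on every dimension that is no longer effective.
    boolean contains(int rank) {
        int total = 1;
        for (int extent : shape) total *= extent;
        if (rank < 0 || rank >= total) return false;
        for (int d = 0; d < shape.length; d++) {
            boolean effective = (dimMask & (1 << d)) != 0;
            if (!effective && coord(rank, d) != coord(lead, d)) return false;
        }
        return true;
    }

    // Number of processes in the group.
    int size() {
        int n = 1;
        for (int d = 0; d < shape.length; d++) {
            if ((dimMask & (1 << d)) != 0) n *= shape[d];
        }
        return n;
    }
}
```

For a 2 by 3 grid, RestrictedGroup.whole(2, 3).restrict(0, 0) keeps only the second dimension effective and has lead rank 0, so under the assumed ranking it contains the three processes of the top row. Restriction only clears one bit and adds one offset, which is why the representation stays at two ints however many coordinates are fixed.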

Now we can formally define the mapping of a typical array section. As a matter of definition, an integer subscript $n$ in dimension $r$ of array $a$ is equivalent to a location-valued subscript $a$.rng($r$)[$n$]. Likewise, a triplet subscript $l$:$u$:$s$ in the same dimension is equivalent to a range-valued subscript $a$.rng($r$)[$l$:$u$:$s$]. If all integer and triplet subscripts in a section are replaced by their equivalent location or range subscripts, and the location-valued subscripts are $i, j, \ldots$, then the distribution group of the section is

    $a$.grp() / $i$ / $j$ / ...
and the $s$th range of the section is equal to the $s$th range-valued subscript.
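
The rewriting rule above can be traced symbolically. The following plain Java sketch is hypothetical throughout (the class SectionMapping is not part of HPJava): it treats each subscript as text, assumes that any subscript containing a colon is a triplet and anything else is an integer expression, and takes a bare ":" to select the whole parent range, consistent with c.rng(0) being y in the earlier example. Subscripts that are already locations or ranges are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds the textual group and range expressions
// that the definition assigns to an array section.
final class SectionMapping {
    final String group;         // distribution group of the section
    final List<String> ranges;  // ranges of the section, in order

    private SectionMapping(String group, List<String> ranges) {
        this.group = group;
        this.ranges = ranges;
    }

    static SectionMapping of(String array, String... subscripts) {
        StringBuilder group = new StringBuilder(array + ".grp()");
        List<String> ranges = new ArrayList<>();
        for (int r = 0; r < subscripts.length; r++) {
            String sub = subscripts[r].trim();
            String parent = array + ".rng(" + r + ")";
            if (sub.equals(":")) {
                ranges.add(parent);                     // whole parent range
            } else if (sub.contains(":")) {
                ranges.add(parent + " [" + sub + "]");  // triplet: a subrange
            } else {
                // integer subscript: a location that restricts the group
                group.append(" / ").append(parent).append(" [").append(sub).append("]");
            }
        }
        return new SectionMapping(group.toString(), ranges);
    }

    public static void main(String[] args) {
        SectionMapping c = SectionMapping.of("a", "0", ":");
        System.out.println(c.group + "  " + c.ranges);
        SectionMapping b = SectionMapping.of("a", "0 : N - 1 : 2", "0 : N / 2 - 1");
        System.out.println(b.group + "  " + b.ranges);
    }
}
```

For the section c the sketch yields the group a.grp() / a.rng(0) [0] and the single range a.rng(1), matching p / x [0] and y when a is created over ranges x and y on p. For b the group is left as a.grp() and both ranges become subranges.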

Subranges and restricted groups can be used in array constructors on the same footing as the ranges and grids introduced earlier. This enables HPJava arrays to reproduce any alignment option allowed by the ALIGN directive of HPF.

Bryan Carpenter 2002-07-11