next up previous contents index
Next: Block Parameters Up: Low level SPMD programming Previous: An Example   Contents   Index

Dimension Splitting

This style goes some way toward forging a link between low-level SPMD programming and the higher-level data-parallel style of HPJava, but by itself it doesn't help if we are presented with an existing one-dimensional, block-distributed array and are required to do some low-level processing on its blocks. To allow for this situation, the language is extended to support dimension splitting. Dimension splitting is introduced as an extension of the array section mechanism described at length in Chapter 4. The extra syntactic baggage is minimal, but it buys a lot of new flexibility.

First we note that a particular element in a distributed array can be identified in one of two ways. It can be identified by giving a global subscript in the distributed range, which is effectively what we have done in HPJava in earlier chapters. Alternatively it can be identified by giving a process coordinate and a local subscript--a subscript within the array segment held on the associated process. Dimension splitting provides a way of accessing an element through its process coordinate and local subscript, by allowing a distributed dimension of an array to be temporarily viewed as a pair of dimensions--a coordinate dimension plus a local subscript dimension.
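The correspondence between the two addressing schemes is simple arithmetic in the common case of a block distribution where the number of processes P evenly divides the array extent N: with block size B = N / P, global subscript i lives at process coordinate i / B, local subscript i % B. A plain-Java sketch of this mapping (the class BlockMap and its method names are illustrative only, not part of HPJava):

```java
// Illustrative sketch: converting between a global subscript and a
// (process coordinate, local subscript) pair for a simple block
// distribution with block size B = N / P (P assumed to divide N).
public class BlockMap {
    final int blockSize;   // B = N / P

    BlockMap(int n, int p) { this.blockSize = n / p; }

    // Global subscript -> coordinate of the process holding the element.
    int coord(int global) { return global / blockSize; }

    // Global subscript -> local subscript within that process's block.
    int local(int global) { return global % blockSize; }

    // Inverse mapping: (coordinate, local subscript) -> global subscript.
    int global(int coord, int local) { return coord * blockSize + local; }

    public static void main(String[] args) {
        BlockMap m = new BlockMap(12, 3);  // N = 12 elements, P = 3 processes
        // Element 7 lives on process 1 at local offset 3 (block size 4).
        System.out.println(m.coord(7) + " " + m.local(7));     // prints "1 3"
        System.out.println(m.global(m.coord(7), m.local(7)));  // prints "7"
    }
}
```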

If the subscript in a particular dimension of a section expression is the special symbol <>, that dimension of the array is split. Whereas a triplet subscript in a section expression yields one dimension in the result array, a splitting subscript yields two--a distributed dimension and a sequential dimension. The range of the distributed dimension is the process dimension over which the original array dimension was distributed; the local blocks of the original array are embedded in the sequential dimension of the result. The two new dimensions appear consecutively in the signature of the result array, distributed dimension first.

Now we can combine the examples from Figures 7.1 and 7.5 of the last section. The version in Figure 7.6 initially creates the arrays as distributed, one-dimensional arrays, and uses this convenient form to initialize them. It then uses a split representation of the same arrays to compute the force array. Note that, because as and fs are semantically sections of a and f, they share common elements--they provide aliases through which the same element variables can be accessed. So when the computation loop is complete, the vector of forces can be accessed again through the one-dimensional array f. This is likely to be what is needed in this case.
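The aliasing between a one-dimensional array and its split section can be mimicked in plain Java: a two-dimensional (coordinate, local subscript) view that delegates to shared backing storage accesses the very same element variables, so writes through either view are visible through the other. (The class SplitView below is a hypothetical illustration, not an HPJava construct.)

```java
// Hypothetical illustration of the aliasing that dimension splitting
// provides: a (coordinate, local) view over the same backing storage
// as a one-dimensional array.
public class SplitView {
    final double[] data;   // backing storage shared with the 1-D array
    final int blockSize;   // local block size B

    SplitView(double[] data, int blockSize) {
        this.data = data;
        this.blockSize = blockSize;
    }

    // Element at process coordinate r, local subscript l.
    double get(int r, int l)           { return data[r * blockSize + l]; }
    void   set(int r, int l, double v) { data[r * blockSize + l] = v; }

    public static void main(String[] args) {
        double[] f = new double[8];          // plays the role of the 1-D array
        SplitView fs = new SplitView(f, 4);  // split view, block size 4

        fs.set(1, 2, 42.0);       // a write through the split view...
        System.out.println(f[6]); // ...is visible through the 1-D array: 42.0
    }
}
```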

Figure 7.6: Version of the N-body force computation using dimension splitting.

[Listing for Figure 7.6 truncated in extraction; the visible tail of the main loop is: HPutil.copy(bs, tmp) ;]

As a similar but slightly more complicated example, Figure 7.7 contains an optimized version of the pipelined matrix multiplication from Figure 3.15. Here the arithmetic is done in local blocks by a method matmul, which implements the matrix multiplication c = a * b on sequential two-dimensional arrays. It could be written in elementary style as

   static ...
      ...
            c [i, j] += a [i, k] * b [k, j] ;
      }
   }

or it could be an optimized library routine. The operation of the parallel algorithm for P = 2 is illustrated in Figure 7.8.
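The structure of the pipelined algorithm can be mimicked in a single address space. The plain-Java sketch below (a simulation, not HPJava; all names are illustrative) block-distributes the rows of a, b and c over P simulated processes; in each of P steps every process multiplies its rows of a against the block of b rows it currently holds, and the b blocks then shift cyclically, as in Figure 7.8. Because every process eventually sees every b block exactly once, the result agrees with a direct sequential multiplication.

```java
// Single-address-space simulation of the pipelined block matrix
// multiplication: rows of a, b and c are block-distributed over P
// "processes"; the b blocks circulate, one block per step.
public class PipeMatmul {

    // Plain sequential reference: returns c = a * b.
    static double[][] matmul(double[][] a, double[][] b) {
        int n = a.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                for (int j = 0; j < n; j++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    // Simulated pipelined version; p ranges over the simulated processes
    // and q identifies the owner of the b block resident at p in step s.
    static double[][] pipelined(double[][] a, double[][] b, int P) {
        int n = a.length, blk = n / P;     // P assumed to divide n
        double[][] c = new double[n][n];
        for (int s = 0; s < P; s++)        // P pipeline steps
            for (int p = 0; p < P; p++) {  // each simulated process
                int q = (p + s) % P;       // b block currently held by p
                for (int i = p * blk; i < (p + 1) * blk; i++)
                    for (int k = q * blk; k < (q + 1) * blk; k++)
                        for (int j = 0; j < n; j++)
                            c[i][j] += a[i][k] * b[k][j];
            }
        return c;
    }

    public static void main(String[] args) {
        int n = 4;
        double[][] a = new double[n][n], b = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                a[i][j] = i + 2 * j;       // arbitrary integer-valued test data
                b[i][j] = i * j + 1;
            }
        boolean same = java.util.Arrays.deepEquals(matmul(a, b),
                                                   pipelined(a, b, 2));
        System.out.println(same);          // prints "true"
    }
}
```

The shift direction chosen here (block owned by (p + s) % P at step s) is one of the two possible ring orientations; either works, since each process just needs to visit all P blocks of b.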

Figure 7.7: Pipelined matrix multiplication program using dimension splitting.

[Listing for Figure 7.7 truncated in extraction; the visible tail of the main loop is: HPutil.copy(bs, tmp) ;]

Figure 7.8: Operation of efficient pipelined matrix computation. The matrix b is shifted one block to the right in each iteration.
[Image: matmul1d.eps]


Bryan Carpenter 2003-04-15