next up previous
Next: Improving the performance Up: Translation scheme Previous: Java packages for HPspmd


Programming in the adJava interface

In this section we illustrate through an example--Fox's algorithm [11] for matrix multiplication--how to program in the adJava interface. We assume $A$ and $B$ are square matrices of order $n$, so $C=AB$ is also a square matrix of order $n$. Fox's algorithm organizes $A$, $B$ and $C$ into sub-matrices on a $P$ by $P$ process array. It takes $P$ steps. In each step, a sub-matrix of $A$ is broadcast across each row of the processes, a local block matrix product is computed, and array $B$ is shifted for computation in the next step.

We can program this algorithm in HPJava, using Adlib.remap to broadcast submatrices, Adlib.shift to shift array $B$, and Adlib.copy to copy data back after shifting. The HPJava program is given in figure 4. The subroutine matmul for local matrix multiplication will be given in the next section.

This HPJava program is slightly atypical: it uses arrays distributed explicitly over process dimensions, rather than using higher-level ranges such as BlockRange to describe the distribution of the arrays. Hence, two-dimensional matrices are represented as four dimensional arrays with two distributed ranges (actually process dimensions) and two collapsed ranges (spanning the local block). This simplifies the initial discussion.

Figure 4: Algorithm for matrix multiplication in HPJava
\begin{figure}\center{
\footnotesize\begin{verbatim}Procs2 p = new Procs2(P,P...
...b' in first dim, amount 1
Adlib.copy(b, tmp);
}
}\end{verbatim}}
\end{figure}

We can rewrite the program in pure Java language using our adJava interface. A translation is given in figure 5. This is an executable Java program. One can use (for example) mpirun to start Java virtual machines on $P^2$ processors and let them simultaneously load the Fox class. This naive translation uses for loops plus at constructs to simulate the overall constructs. The function pairs on,no and at,ta adjust the field spmd.apg, which records the current active process group. The dynamic alteration of this group plays an non-trivial role in this program. The call to remap implements a broadcast because the temporary sub is replicated over the process group active at it's point of declaration. Within the overall(i = x) construct, the locally effective APG is a row of the process grid. The rather complex code for section construction exposes various low-level inquiries (and one auxilliary class, Map) from the adJava runtime. The details are not particulary important here.

Figure 5: Algorithm for matrix multiplication in adJava
\begin{figure}\center{
\scriptsize\begin{verbatim}class Fox {
final static in...
... first dim, amount 1
Adlib.copy(b, tmp);
}
}
}
}\end{verbatim}}
\end{figure}


next up previous
Next: Improving the performance Up: Translation scheme Previous: Java packages for HPspmd
Bryan Carpenter 2002-07-11