next up previous
Next: Summary Up: Programming examples in HPJava Previous: Choleski decomposition

Fox's algorithm for matrix multiplication

Recall that if $A$ and $B$ are square matrix of order $n$, then $C=AB$ is also a square matrix of order $n$, and $c_{ij}$ is obtained by taking the dot product of the $i$th row of $A$ and $j$th column of $B$. That is,


Fox's algorithm [3] for multiplication organize $A$, $B$ and $C$ into submatrix on a P by P process array. It take $P$ steps, in each step, broadcast corresponding submatrix of $A$ on each row of the processes, do local computation and then shift array $B$ for the next step computation. Write this algorithm in HPJava, we still use Adlib.remap to broadcast submatrix, matmul is a subroutine for local matrix multiplication. Adlib.shift is used to shift array $B$, and Adlib.copy copies data back after shift, it can also be implemented as nested over and for loops.

  Group p=new Procs2(P,P);

  Range x=p.dim(0);
  Range y=p.dim(1);

  on(p) {
    float [[#,#, , ]] a = new float [[x,y,B,B]]; 
    float [[#,#, , ]] b = new float [[x,y,B,B]]; 
    //input a, b here;

    float [[#,#, , ]] c = new float [[x,y,B,B]]; 
    float [[#,#, , ]] temp = new float [[x,y,B,B]]; 
    for (int k=0; k<p; k++) {
      over(Location i=x|:) {
        float [[,]] sub = new float [[B,B]];    

        //Broadcast submatrix in 'a' ...
        Adlib.remap(sub, a[[i, (i+k)%P, z, z]]);

        over(Location j=y|:) { 
          //Local matrix multiplication ...
          matmul(c[[i, j, z, z]], sub, b[[i, j, z, z]]);
      //Cyclic shift 'b' in 'y' dimension ...
      Adlib.shift(tmp, b, 1, 0, 0); // dst, src, shift, dim, mode;
      Adlib.copy(b, tmp);

Bryan Carpenter 2002-07-12