Next: Distributed Array Sections Up: More on Mapping Arrays Previous: Collapsed Distributions and Sequential   Contents   Index

Distribution Groups and Replication

Allowing collapsed array dimensions means that an array can be distributed over a process grid having smaller rank than the array itself. Conversely it is also allowed to distribute an array over a process grid whose rank is larger than the array.

The array b has a dimension distributed over the first dimension of p, but none distributed over the second. The interpretation is that b is replicated over the second process dimension. Independent copies of the whole array are created at each coordinate where replication occurs. Usually programs maintain identical values for the elements in each copy (although there is nothing in the language definition itself to require this).

Replication and collapsing can both occur in a single array, for example

The range of c is sequential, and the array is replicated over both dimensions of p. This makes it essentially the same as the kind of sequential multiarray described in Chapter 1, although actually there is a subtle distinction between a multiarray created in HPspmd code and a multiarray created in non-HPspmd code. A multiarray created in HPspmd code always has a well-defined distribution group. For c this distribution group is p. A multiarray created in non-HPspmd code does not have a distribution group (in fact the way this is implemented is that the non-HPspmd multiarray has distribution group of null).

In the last section we gave a pipelined'' matrix multiply algorithm. A simpler and potentially more efficient implementation of matrix multiplication can be given if the operand arrays have carefully chosen replicated/collapsed distributions. The program is given in Figure 3.16. As illustrated in Figure 3.17, the rows of a are replicated in the process dimension associated with y. Similarly the columns of b are replicated in the dimension associated with x. Hence all arguments for the inner scalar product are already in place for the computation--no communication is needed.

We would be very lucky to come across three arrays with such a special alignment relation (distribution format relative to one another). There is an important function in the Adlib library called remap(), which takes a pair of arrays as arguments. These must have the same shape and type, but they can have unrelated distribution formats. The elements of the source array are copied to the destination array. In particular, if the destination array has a replicated mapping, the values in the source array are broadcast appropriately.

Figure 3.18 shows how we can use remap to adapt the program in Figure 3.16 and create a general purpose matrix multiplication routine. Besides the remap function, this example introduces two inquiries grp() and rng() that are defined for any distributed array. The inquiry grp() returns the distribution group of the array, and the inquiry returns the th range of the array. The argument is in the range , where is the rank (dimensionality) of the array.

The example also illustrates the most general form of the distributed array creation expression. In all earlier examples arrays were distributed over the whole of the active process group, defined by an enclosing on construct. In general an on clause attached to an distributed array creation expression itself itself can specify that the array is distributed over some subset of the active group. This allows one to create an array outside the on construct that will processes its elements. Through communication functions like remap, values can then be exchanged between different process grids. A sketch example is given in Figure 3.19.

To allow for this kind of situation, where arguments might be distributed over distinct process groups--not the active process group--the generic matrix multiplication of Figure 3.18 included on p clauses in the constructors of its temporary arrays, and explicitly restricts control with an on(p) construct before processing. As we will see in the next section, this refinement also allows the arguments of matmul to be arbitrary array sections.

A special case of the pattern of multiple grids singles out one process of the program as a control'' processor, responsible for things like I/O. This can be done by creating a Procs0 singleton grid:

Code that is only to be executed by the control process is done in a suitable on(control) construct. For example input or output of distributed data can be done in such a construct, reading to or writing from an array with sequential ranges, and distribution group control. This array can then be remapped to or from a real'' distributed array with distributed ranges.

Next: Distributed Array Sections Up: More on Mapping Arrays Previous: Collapsed Distributions and Sequential   Contents   Index
Bryan Carpenter 2003-04-15