
Array mapping

Although CM Fortran looked syntactically like standard Fortran, the programmer had to be aware of many nuances. Like the ILLIAC IV, the Connection Machine allowed Fortran arrays either to be distributed across the processing nodes (called CM arrays, or distributed arrays) or to be allocated in the memory of the front-end computer (called front-end arrays, or sequential arrays). Unlike the control unit of the ILLIAC, the Connection Machine front-end was a conventional, general-purpose computer--typically a VAX or Sun. But there were still significant restrictions on how arrays could be manipulated, reflecting the two possible homes.

As noted above, the shape of a distributed array in CM Fortran was not constrained by the number of physical processing elements in the parallel machine. The new freedom was achieved as follows. First a Virtual Processor (VP) grid was defined. By default it would have some minimal shape such that

  1. the VP grid was large enough to contain the desired array shape,
  2. the extents of the grid were powers of two, and
  3. the total number of VPs was an exact multiple of the number of physical PEs.
For example, if the array was declared
      REAL A (10, 64, 100)
the compiler might have chosen a VP set with shape (16,64,128). The total number of VPs is 128K--exactly twice the number of physical PEs, assuming one was targeting a 64K-processor CM-2. Two virtual processors from the VP set would be assigned to each physical PE. Now the array was mapped into the VP set. The array element at the lower bounds of the dimensions, for example array element (1,1,...), was placed in processor (0,0,...). The extra processors that pad the VP set out to the next legal size were deactivated or ignored during operations on the array. Typically many VP sets would have coexisted in a given program--one for each distinct array shape occurring in the program.
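
The three rules for the default VP grid can be made concrete with a small model. The following Python sketch is not CM Fortran and is not claimed to be the compiler's actual algorithm. The function names, the requirement that the PE count be a power of two, and the tie-breaking choice (doubling the last extent when more VPs are needed) are all assumptions made for illustration.

```python
from math import prod

def next_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p

def default_vp_shape(array_shape, num_pes):
    """Hypothetical model of the default VP grid for an array shape.

    Rule 1 and 2: round every extent up to a power of two.
    Rule 3: grow the grid until the VP count is a multiple of num_pes.
    Assumption: num_pes is a power of two and the last extent is the
    one that gets doubled; the real compiler may have chosen differently.
    """
    if num_pes < 1 or num_pes & (num_pes - 1):
        raise ValueError("this sketch assumes num_pes is a power of two")
    shape = [next_pow2(e) for e in array_shape]
    while prod(shape) % num_pes != 0:
        shape[-1] *= 2
    return tuple(shape)

def vp_of_element(index):
    """Map a 1-based Fortran index to 0-based VP coordinates."""
    return tuple(i - 1 for i in index)

def is_active(vp, array_shape):
    """True if the VP holds a real array element rather than padding."""
    return all(c < e for c, e in zip(vp, array_shape))

array_shape = (10, 64, 100)
num_pes = 64 * 1024                        # a 64K-processor machine
vp_shape = default_vp_shape(array_shape, num_pes)
vp_ratio = prod(vp_shape) // num_pes       # VPs assigned to each physical PE
print(vp_shape, vp_ratio)
```

For the declaration in the text this model reproduces the shape (16,64,128) and a ratio of two VPs per PE. VPs with a coordinate at or beyond the array extent, such as (10,0,0), are the padding processors that stay inactive.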

In the absence of special syntax to specify the home of an array in its declaration, CM Fortran adopted the pragmatic convention that any array used anywhere in a Fortran 90 whole-array operation was a CM array. An array that was only ever accessed through Fortran 77 scalar element references was a front-end array. This convention was applied on a per-procedure basis, so the compiler would only have to scan the body of a single procedure to determine if a locally-used array was to be treated as a CM array or a front-end array.

This convention had drawbacks. A CM array did not obey the usual Fortran rules on sequence and storage association. For example, it could not appear in an EQUIVALENCE statement, and the old Fortran 77 ways of passing parts of arrays to procedures or reshaping arrays across procedure boundaries did not work for CM arrays. Moreover, it was illegal to pass a CM array as an actual argument to a procedure that expected a front-end array as its dummy, or vice versa. So the legality of a procedure call might depend not only on the interface of the procedure and the form of the CALL statement, but also on whether caller or callee happened to use the argument in a Fortran 90 array assignment anywhere in their respective bodies--presumably an error-prone arrangement.
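
The interaction between the per-procedure convention and the argument-passing rule can be modelled in a few lines. This Python sketch is purely illustrative: the string labels and function names are hypothetical, each procedure body is reduced to a list of the kinds of reference it makes to one array, and LAYOUT directives are ignored.

```python
def classify_home(uses):
    """Home of an array within one procedure, under the default convention.

    uses: list of "whole_array" or "scalar_element" labels (a made-up
    encoding of how the procedure body references the array).
    Any whole-array use makes it a CM array; otherwise it is front-end.
    """
    return "CM" if any(u == "whole_array" for u in uses) else "front-end"

def call_is_legal(uses_in_caller, uses_in_callee):
    """A call is legal only if actual and dummy argument share a home."""
    return classify_home(uses_in_caller) == classify_home(uses_in_callee)

# Caller only touches X element by element, so X is a front-end array there.
caller_uses = ["scalar_element", "scalar_element"]
# Callee assigns to the whole dummy array once, so its dummy is a CM array.
callee_uses = ["scalar_element", "whole_array"]

print(classify_home(caller_uses), classify_home(callee_uses))
print(call_is_legal(caller_uses, callee_uses))
```

In this model a single whole-array statement added to the callee flips the call from legal to illegal, even though neither the interface nor the CALL statement has changed.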

The LAYOUT compiler directive allowed the programmer to explicitly specify the home of an array. The LAYOUT directive also gave the programmer more refined control over the layout of a distributed array, overriding the default scheme. In general a dimension of a distributed array could be distributed over a dimension of a VP set, or it could be serial--mapped into the memory of a single processing element. Of course similar options had been available in earlier data-parallel languages including CFD and DAP FORTRAN, though CM Fortran was a little more flexible in how serial dimensions could be interspersed with parallel dimensions. In

      REAL A (10, 64, 100)
CMF$  LAYOUT A (:SERIAL, :NEWS, :NEWS)
the directive specifies that A is to be laid out with one serial dimension and two parallel (``NEWS-ordered'') dimensions. By convention if an array was specified to have all serial dimensions, it was allocated on the front-end. LAYOUT directives could appear in a Fortran 90 interface specification for a procedure, providing one way to avoid the pitfalls described in the last paragraph.
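
As a rough illustration of what such a layout implies, the Python sketch below separates the serial extents from the parallel ones and applies the convention that an all-serial array lives on the front-end. The tuple-of-strings encoding of a layout and the function names are assumptions of this sketch, not CM Fortran syntax.

```python
from math import prod

def home_of(layout):
    """Front-end if every dimension is serial, otherwise a CM array."""
    return "front-end" if all(d == "SERIAL" for d in layout) else "CM"

def split_extents(shape, layout):
    """Return (parallel extents, serial extents) for a shape and layout.

    The parallel extents determine the VP set; the serial extents say how
    many elements are stored in the memory of each virtual processor.
    """
    parallel = tuple(e for e, d in zip(shape, layout) if d != "SERIAL")
    serial = tuple(e for e, d in zip(shape, layout) if d == "SERIAL")
    return parallel, serial

shape_a = (10, 64, 100)
layout_a = ("SERIAL", "NEWS", "NEWS")      # one serial, two parallel dims

parallel, serial = split_extents(shape_a, layout_a)
elements_per_vp = prod(serial)             # prod(()) is 1 for no serial dims
print(home_of(layout_a), parallel, elements_per_vp)
```

With one serial and two parallel dimensions, A occupies a two-dimensional VP set derived from the extents (64,100), and each active VP stores ten elements of A.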

Two arrays with the same extents and layouts for all their parallel dimensions have corresponding elements aligned. For example in

      REAL A (10, 64, 100)
      REAL B (64, 100)
CMF$  LAYOUT A (:SERIAL, :NEWS, :NEWS)
both arrays are placed in the same VP set, and the elements A(:,i,j) are resident on the same processor as the elements B(i,j). Alignment relations like this are extremely important in practice. In constructing practical parallel algorithms, programmers must be very aware of alignment relations between data structures, and should exploit them to minimize communication overheads. Accordingly CM Fortran provided a second directive to explicitly specify that elements of a pair of arrays be aligned.
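
The co-residency claim for A and B can be checked mechanically in a toy model. The Python sketch below assumes that the first dimension of A is serial and that both dimensions of B are parallel; the `owner` function and the string encoding of layouts are hypothetical conveniences, not part of CM Fortran.

```python
def owner(index, layout):
    """0-based VP coordinates that hold the element at a 1-based index.

    Serial dimensions do not contribute a coordinate, because all
    positions along a serial dimension live in the same VP's memory.
    """
    return tuple(i - 1 for i, d in zip(index, layout) if d != "SERIAL")

layout_a = ("SERIAL", "NEWS", "NEWS")   # assumed layout of A(10, 64, 100)
layout_b = ("NEWS", "NEWS")             # assumed layout of B(64, 100)

# Every A(s, i, j) should sit on the same VP as B(i, j).
aligned = all(
    owner((s, i, j), layout_a) == owner((i, j), layout_b)
    for s in range(1, 11)
    for i in range(1, 65)
    for j in range(1, 101)
)
print(aligned)
```

Because the owners coincide, an elementwise combination of a slice A(s,:,:) with B needs no interprocessor communication in this model, which is the practical benefit of alignment that the text describes.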
     REAL V(100), B(64, 100)
CMF$ ALIGN V(I) WITH B(1, I)
This forces the elements of V to be aligned with the first row of B. Note that this layout for V cannot be obtained with the LAYOUT directive alone, because it distributes the elements of a one-dimensional array over a two-dimensional VP grid. Even more complex alignment patterns were allowed
     REAL C(32,50)
These situations are visualized in Figure 2. There were well-defined limits to the complexity of alignment relations. Transposed alignments, for example, were not allowed--alignment dummies like I, J must appear in the same order in the alignee and alignment target dimensions.
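
The alignment of V with the first row of B, and the rule against transposed alignments, can be sketched as follows. This Python model is an illustration only: the function names, the use of `None` to mark a constant subscript, and the simplification that every alignment dummy appears exactly once in the target are assumptions of the sketch.

```python
def vp_of_b(i, j):
    """0-based VP coordinates of B(i, j), both dimensions parallel."""
    return (i - 1, j - 1)

def vp_of_v(i):
    """0-based VP coordinates of V(i) when V is aligned with row 1 of B."""
    return vp_of_b(1, i)

def dummies_in_same_order(alignee_dummies, target_subscripts):
    """Check the ordering rule for alignment dummies.

    alignee_dummies: dummy names in the alignee, e.g. ("I", "J").
    target_subscripts: for each target dimension, the dummy it uses,
    or None where the subscript is a constant.
    """
    used = [d for d in target_subscripts if d is not None]
    return used == list(alignee_dummies)

print(vp_of_v(5), vp_of_b(1, 5))
print(dummies_in_same_order(("I",), (None, "I")))        # V with row 1 of B
print(dummies_in_same_order(("I", "J"), ("J", "I")))     # transposed
```

The one-dimensional V ends up spread along the second axis of the two-dimensional VP grid, while a target that lists the dummies in the order J, I is rejected as a transposition.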

Figure 2: Array layouts and alignments in CM Fortran.

Bryan Carpenter 2002-07-12