Although CM Fortran looked syntactically like standard Fortran, the programmer had to be aware of many nuances. Like the ILLIAC IV, the Connection Machine allowed Fortran arrays to either be distributed across the processing nodes (called CM arrays, or distributed arrays), or allocated in memory of the front-end computer (called front-end arrays, or sequential arrays). Unlike the control unit of the ILLIAC, the Connection Machine front-end was a conventional, general-purpose computer--typically a VAX or Sun. But there were still significant restrictions on how arrays could be manipulated, reflecting the two possible homes.
As noted above, the shape of a distributed array in CM Fortran was not constrained by the number of physical processing elements in the parallel machine. This new freedom was achieved as follows. First, a Virtual Processor (VP) grid was defined. By default it would have some minimal legal shape large enough to hold the array. For example, for the declaration
REAL A (10, 64, 100)
the compiler might have chosen a VP set with shape (16,64,128).
The total number of VPs is 128K--exactly twice the number of physical PEs,
assuming one was targeting a 64K-processor CM-2. Two virtual
processors from the VP set would be assigned to each physical PE.
Now the array was mapped into the VP set. The array element at the lower
bounds of the dimensions, for example array element (1,1,...),
was placed in processor (0,0,...). The extra processors that
pad the VP set out to the next legal size were deactivated or ignored
during operations on the array. Typically many VP sets would
have coexisted in a given program--one for each distinct array shape
occurring in the program.
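The arithmetic in this example can be checked with a minimal Python sketch. It is only a model, not CM Fortran, and it assumes that a "legal" VP set extent is simply the next power of two at or above the array extent; that assumption matches the (16,64,128) example, but the actual rules used by the compiler may have been more involved. The function names are hypothetical.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def vp_set_shape(array_shape):
    """Pad each extent up to a power of two (assumed rule for a 'legal' size)."""
    return tuple(next_power_of_two(extent) for extent in array_shape)


def vp_ratio(vp_shape, physical_pes):
    """Number of virtual processors assigned to each physical PE."""
    total = 1
    for extent in vp_shape:
        total *= extent
    # This sketch assumes the VP count is a whole multiple of the PE count.
    assert total % physical_pes == 0
    return total // physical_pes


def vp_coordinate(index, lower_bounds):
    """Map an array index to a zero-based VP coordinate."""
    return tuple(i - lb for i, lb in zip(index, lower_bounds))


shape = vp_set_shape((10, 64, 100))
print(shape)                                # (16, 64, 128)
print(vp_ratio(shape, 64 * 1024))           # 2
print(vp_coordinate((1, 1, 1), (1, 1, 1)))  # (0, 0, 0)
```

Under these assumptions, 64000 of the 131072 virtual processors hold an element of A; the rest are the padding that is deactivated or ignored.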
In the absence of special syntax to specify the home of an array in its declaration, CM Fortran adopted the pragmatic convention that any array used anywhere in a Fortran 90 whole-array operation was a CM array. An array that was only ever accessed through Fortran 77 scalar element references was a front-end array. This convention was applied on a per-procedure basis, so the compiler would only have to scan the body of a single procedure to determine if a locally-used array was to be treated as a CM array or a front-end array.
This convention had drawbacks. A CM array did not obey the usual Fortran rules on sequence and storage association. For example, it could not appear in an EQUIVALENCE statement, and the old Fortran 77 ways of passing parts of arrays to procedures or reshaping arrays across the procedure boundaries did not work for CM arrays. Moreover it was illegal to pass a CM array as an actual argument to a procedure that expected a front-end array as its dummy, or vice versa. So the legality of a procedure call might depend not only on the interface of the procedure and the form of the CALL statement, but also on whether caller or callee happened to use the argument in a Fortran-90 array assignment anywhere in their respective bodies--presumably an error-prone arrangement.
The LAYOUT compiler directive allowed the programmer to explicitly specify the home of an array. The LAYOUT directive also gave the programmer some more refined control over the layout of a distributed array, overriding the default scheme. In general a dimension of a distributed array could be distributed over a dimension of a VP set, or it could be serial--mapped into the memory of a single processing element. Of course, similar options had been available in earlier data-parallel languages including CFD and DAP FORTRAN, though CM Fortran was a little more flexible in how serial dimensions could be interspersed with parallel dimensions. In
REAL A (10, 64, 100)
CMF$ LAYOUT A(:SERIAL, :NEWS, :NEWS)
the directive specifies that A is to be laid out with one serial
dimension and two parallel ("NEWS-ordered") dimensions. By convention
if an array was specified to have all serial dimensions, it was allocated
on the front-end. LAYOUT directives could appear in a Fortran 90
interface specification for a procedure, providing one way to
avoid the pitfalls described in the last paragraph.
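The effect of a layout specification can be modelled with a short Python sketch. It is not CM Fortran, and the names describe_layout, SERIAL and NEWS are hypothetical labels introduced only for illustration. It simply separates the parallel extents, which are spread over the VP set, from the serial extents, which are held in a single processor's memory, and applies the convention that an all-serial array lives on the front end.

```python
SERIAL, NEWS = "SERIAL", "NEWS"


def describe_layout(shape, layout):
    """Summarise where an array lives under a LAYOUT-style specification."""
    # Extents of the dimensions that are distributed over the VP set.
    parallel = tuple(n for n, kind in zip(shape, layout) if kind == NEWS)
    # Number of elements stored together in one processor's memory.
    serial_elements = 1
    for n, kind in zip(shape, layout):
        if kind == SERIAL:
            serial_elements *= n
    # By the convention in the text, an all-serial array is a front-end array.
    home = "CM" if parallel else "front-end"
    return {"home": home,
            "parallel_extents": parallel,
            "serial_elements": serial_elements}


# Models: CMF$ LAYOUT A(:SERIAL, :NEWS, :NEWS) for REAL A(10, 64, 100)
print(describe_layout((10, 64, 100), (SERIAL, NEWS, NEWS)))

# An array whose dimensions are all serial is allocated on the front end.
print(describe_layout((10, 64, 100), (SERIAL, SERIAL, SERIAL)))
```

For the layout of A above, each virtual processor in a 64 by 100 grid would hold the 10 elements of one serial column.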
Two arrays with the same extents and layouts for all their parallel dimensions have corresponding elements aligned. For example in
REAL A (10, 64, 100)
REAL B (64, 100)
CMF$ LAYOUT A(:SERIAL, :NEWS, :NEWS)
CMF$ LAYOUT B(:NEWS, :NEWS)
both arrays are placed in the same VP set, and the elements
A(:,i,j) are resident on the same processor as the elements
B(i,j). Alignment relations like this are extremely important
in practice. In constructing practical parallel algorithms, programmers
must be very aware of alignment relations between data structures,
and should exploit them to minimize communication overheads.
Accordingly CM Fortran provided a second directive to explicitly
specify that elements of a pair of arrays be aligned.
REAL V(100), B(64, 100)
CMF$ ALIGN V(I) WITH B(1, I)
This forces the elements of V to be aligned with the first
row of B. Note that this layout for V cannot be
obtained with the LAYOUT directive alone, because it distributes the
elements of a one-dimensional array over a two-dimensional VP grid.
Even more complex alignment patterns were allowed:
REAL C(32,50)
CMF$ ALIGN C(I,J) WITH B(I+5, J+2)
These situations are visualized in Figure 2.
There were well-defined limits to the complexity of alignment relations.
Transposed alignments, for example, were not allowed--alignment dummies
like I, J must appear in the same order in the alignee and
alignment target dimensions.
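The two ALIGN examples, and the rule against transposed alignments, can be illustrated with a small Python model. This is a sketch under stated assumptions rather than a description of the compiler: the encoding of a target subscript as either a constant or a (dummy position, offset) pair, and the function names, are hypothetical.

```python
def align_target(alignee_index, target_subscripts):
    """Index of the target element that an alignee element is co-located with.

    target_subscripts has one entry per target dimension: either an int
    (a constant subscript) or a pair (dummy_position, offset), meaning
    alignee_index[dummy_position] + offset.
    """
    result = []
    for sub in target_subscripts:
        if isinstance(sub, int):
            result.append(sub)
        else:
            position, offset = sub
            result.append(alignee_index[position] + offset)
    return tuple(result)


def preserves_dummy_order(target_subscripts):
    """True if dummies appear in the target in the same order as in the alignee."""
    positions = [sub[0] for sub in target_subscripts if not isinstance(sub, int)]
    return positions == sorted(positions)


# Models: CMF$ ALIGN V(I) WITH B(1, I)
v_with_b = (1, (0, 0))
print(align_target((37,), v_with_b))     # (1, 37)

# Models: CMF$ ALIGN C(I,J) WITH B(I+5, J+2)
c_with_b = ((0, 5), (1, 2))
print(align_target((32, 50), c_with_b))  # (37, 52)

# A transposed alignment, such as C(I,J) WITH B(J, I), fails the order check.
print(preserves_dummy_order(c_with_b))          # True
print(preserves_dummy_order(((1, 0), (0, 0))))  # False
```

In this model the last element of C, C(32,50), lands on B(37,52), which is still inside the 64 by 100 extents of B.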