

In the early 60s Daniel Slotnick worked at Westinghouse on a project called Solomon. If it had been completed, the project would have produced the first massively parallel computer. In fact the project folded, and Slotnick moved to the University of Illinois and started the ILLIAC IV project with Burroughs, in the latter half of the 60s.

Amongst its technological innovations, ILLIAC IV was the first large system to employ semiconductor primary memory. Development of the system was a very large project for a university, with a final cost of around $30 million. This cost meant the machine had to be made available to a large community of users, and Slotnick observes [10] that development of ILLIAC IV and ARPANET (which ultimately evolved into the Internet) became closely linked. After many problems, including political instabilities associated with the Vietnam war that caused the hardware to be transferred from its campus base to NASA Ames, the first successful runs occurred in 1973. The machine was not fully operational until 1975. Between that time and 1981 it was the world's fastest computer.

The ILLIAC IV was a SIMD computer for array processing. It consisted of a control unit (CU) and 64 processing elements (PEs). Each processing element had 2048 (2K) 64-bit words of memory associated with it. The CU could access all 128K words of memory through a bus, but each PE could directly access only its local memory. An 8 by 8 grid interconnect joined each PE to 4 neighbours. The CU interpreted program instructions scattered across the memory, and broadcast them to the PEs (Figure 1). Neither the PEs nor the CU were general-purpose computers in the modern sense; the CU had quite limited arithmetic capabilities.

Figure 1: Architecture of the ILLIAC IV. CU = Control Unit, PE = Processing Element, PEM = PE Memory module. The routing network also directly connected PEs at distance 8.
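
The memory arithmetic and the interconnect can be pictured with a small Python sketch. It assumes that the four neighbours of PE i are PEs i - 8, i - 1, i + 1 and i + 8, with end-around wrap, which is what the caption of Figure 1 suggests. The 0-based numbering and the names NUM_PES, WORDS_PER_PE and neighbours are illustrative choices, not ILLIAC IV terminology.

```python
NUM_PES = 64              # processing elements, as stated in the text
WORDS_PER_PE = 2 * 1024   # "2K" 64-bit words of local memory per PE


def neighbours(pe):
    """Return the four PEs assumed to be wired directly to `pe` (0-based).

    Assumption: links at distance 1 and distance 8, wrapping end-around.
    """
    return [(pe + d) % NUM_PES for d in (-8, -1, 1, 8)]


# Total memory visible to the control unit over the bus.
total_words = NUM_PES * WORDS_PER_PE

print(total_words // 1024, "K words in total")
print("neighbours of PE 0:", neighbours(0))
```

Under these assumptions the PE array behaves like a ring of 64 elements with extra links at stride 8, which is one way to read the "8 by 8 grid" description.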

In general, software was problematic for the ILLIAC IV. Various ambitious, high-level programming languages were proposed. We will discuss in detail one of the more pragmatic proposals, which seems to have been successfully implemented and used. This language made the architectural features of the ILLIAC IV very apparent to the programmer, but it also contained the seeds of some practical programming language abstractions for data-parallel programming.

CFD [11] was a language developed in the early 70s at the Computational Fluid Dynamics Branch of Ames Research Center. CFD was a ``FORTRAN-like'' language, rather than a FORTRAN dialect in the modern sense (it did not strictly include normal FORTRAN as a subset). The language design was extremely pragmatic. No attempt was made to hide the hardware peculiarities from the user; in fact, every attempt was made to give programmers access to, and control of, all of the Illiac hardware so they could construct efficient programs.

CFD had five basic datatypes: CU INTEGER, CU REAL, CU LOGICAL, PE REAL, and PE INTEGER. The type of a variable statically encoded its home: either on the control unit or on the processing elements. Apart from restrictions on their home, the two INTEGER and REAL types behaved like the corresponding types in ordinary FORTRAN. The CU LOGICAL type was more idiosyncratic: it had 64 independent bits that acted as flags controlling activity of the PEs. We will not discuss it further here.

Scalars and arrays of the five types could be declared as in FORTRAN. An ordinary variable or array of type CU REAL, for example, would be allocated in the (very small) control unit memory. An ordinary variable or array of type PE REAL would be allocated somewhere in the collective memory of the processing elements (accessible by the control unit over the data bus). Here are some ordinary variable declarations:

      CU REAL A, B(100)
      PE REAL D(25), E(1000)
Apparently both the CU and PE variables here could be manipulated in ordinary scalar assignments, executed directly by the control unit. There were ad hoc restrictions on what kind of arithmetic operations could appear in such assignments, reflecting the limited processing power of the control unit.

The last data structure available in CFD was a new kind of array called a vector-aligned array. This was a very early language instantiation of the distributed array concept, which will be a focus of these lectures.

Compared with the distributed arrays of later languages, the CFD vector-aligned array was a primitive structure. Only the first dimension could be distributed, and the extent of that dimension had to be exactly 64. A vector-aligned array would be of PE INTEGER or PE REAL type, and the syntax for the distributed dimension involved an asterisk:

      PE INTEGER J(*)
      PE REAL X(*,4), Y(*,2,8)
These are parallel arrays. J(1) is stored on the first PE, J(2) is stored on the second PE, and so on. Similarly X(1,1), X(1,2), X(1,3), X(1,4) are stored on PE 1, X(2,1), X(2,2), X(2,3), X(2,4) are stored on PE 2, etc.
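
A rough Python model of this layout may help. It assumes that the local memory of each PE can be pictured as a dictionary keyed by the remaining (sequential) subscripts; the helper name make_vector_aligned is hypothetical and says nothing about how CFD actually laid out storage within a PE.

```python
import itertools

NUM_PES = 64  # fixed extent of the distributed first dimension


def make_vector_aligned(*seq_extents):
    """Model a declaration such as PE REAL X(*, e1, e2, ...).

    Returns a list with one entry per PE. Entry p holds every element
    whose first subscript is p + 1, keyed by the remaining 1-based
    subscripts.
    """
    keys = list(itertools.product(*(range(1, e + 1) for e in seq_extents)))
    return [{k: 0.0 for k in keys} for _ in range(NUM_PES)]


J = make_vector_aligned()       # PE INTEGER J(*)     : 1 element per PE
X = make_vector_aligned(4)      # PE REAL X(*,4)      : 4 elements per PE
Y = make_vector_aligned(2, 8)   # PE REAL Y(*,2,8)    : 16 elements per PE

# X(2,1) .. X(2,4) all live in the local block of PE 2 (index 1 here).
print(sorted(X[1].keys()))
```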

Parallel computation occurred only in vector assignments. The left-hand side of the assignment had to be a vector-aligned array, with first subscript *. The right-hand side could be a mix of scalar and vector expressions, with scalar expressions broadcast to all PEs.
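
The broadcast rule can be sketched in Python as follows. The statement being modelled, a vector-aligned Y(*) assigned the product of a CU scalar S and a vector-aligned X(*), is a hypothetical example rather than one taken from the CFD documentation, and the helpers broadcast and vector_assign are illustrative names.

```python
NUM_PES = 64


def broadcast(value):
    """Replicate a CU scalar so that every PE sees the same operand."""
    return [value] * NUM_PES


def vector_assign(op, *operands):
    """Apply `op` elementwise, as if all PEs executed it in lockstep."""
    return [op(*args) for args in zip(*operands)]


s = 2.5                                          # scalar held on the CU
x = [float(i) for i in range(1, NUM_PES + 1)]    # models X(1) .. X(64)

# Models a vector assignment whose right-hand side mixes a scalar and a vector.
y = vector_assign(lambda a, b: a * b, broadcast(s), x)
```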

A vector expression was a vector-aligned array with a * subscript in the first dimension. Communication between neighbouring PEs was captured by allowing the * to have some shift added, as in:

      DIFP(*) = P(* + 1) - P(* - 1)
All shifts were cyclic (end-around) shifts, so this parallel statement is equivalent to the sequential statements
      DIFP(1)  = P(2) - P(64)
      DIFP(2)  = P(3) - P(1)
      ...
      DIFP(64) = P(1) - P(63)
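
The end-around behaviour can be checked with a short Python sketch. Lists here are 0-based, so p[0] stands for P(1); the function name shifted and the test data are illustrative assumptions.

```python
NUM_PES = 64


def shifted(vec, k):
    """Values seen by each PE for VEC(* + k): PE i reads element i + k, end-around."""
    return [vec[(i + k) % NUM_PES] for i in range(NUM_PES)]


# Arbitrary test data standing in for P(1) .. P(64).
p = [float(i * i) for i in range(1, NUM_PES + 1)]

# Models DIFP(*) = P(* + 1) - P(* - 1)
difp = [a - b for a, b in zip(shifted(p, 1), shifted(p, -1))]

# The boundary elements wrap around, matching the sequential statements.
assert difp[0] == p[1] - p[63]     # DIFP(1)  = P(2) - P(64)
assert difp[63] == p[0] - p[62]    # DIFP(64) = P(1) - P(63)
```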

Essential flexibility was added by allowing vector assignments to be executed conditionally with a vector test, e.g.:

      IF(A(*) .LT. 0)  A(*) = -A(*)
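
In Python terms, the vector test yields one flag per PE, and the assignment takes effect only where the flag is set. The following is a minimal sketch of that idea; masked_assign is a hypothetical helper and the data are arbitrary.

```python
NUM_PES = 64


def masked_assign(target, mask, new_values):
    """Return `target` with `new_values` stored only where `mask` is true."""
    return [n if m else t for t, m, n in zip(target, mask, new_values)]


# Arbitrary data with alternating signs: 0, -1, 2, -3, ...
a = [(-1.0) ** i * i for i in range(NUM_PES)]

mask = [v < 0 for v in a]                    # models the test A(*) .LT. 0
a = masked_assign(a, mask, [-v for v in a])  # models A(*) = -A(*) under the mask
```

Every PE evaluates the test, but only the PEs whose flag is set store a result, so the statement computes an elementwise absolute value.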
Less structured methods of masking operations, by explicitly assigning PE activity flags in CU LOGICAL variables, were also available; there were special primitives for restricting activity to simply specified ranges of PEs.

PEs could concurrently access different addresses in their local memory by using vector subscripts:
      DIAG(*) = RHO(*, X(*))
Vector subscripts could only be used in ``sequential'' dimensions, not in the distributed, first dimension. The language had no built-in syntax for arbitrary reshuffling of data around the PE array.
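
A Python sketch of the vector-subscript statement follows. It assumes the subscript vector holds integer values and that the sequential dimension of RHO has extent 4; both choices, and the test data, are made up for illustration.

```python
NUM_PES = 64
N = 4  # extent of the sequential dimension, chosen for illustration

# rho[p][j] models RHO(p + 1, j + 1); row p is local to PE p + 1.
rho = [[10 * (p + 1) + (j + 1) for j in range(N)] for p in range(NUM_PES)]

# x[p] models X(p + 1): a 1-based subscript chosen independently on each PE.
x = [(p % N) + 1 for p in range(NUM_PES)]

# Models DIAG(*) = RHO(*, X(*)): each PE indexes only its own local row,
# so no data moves between PEs.
diag = [rho[p][x[p] - 1] for p in range(NUM_PES)]
```

Because the subscript varies only in the sequential dimension, every access stays within a PE's own memory, which is consistent with the restriction described above.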

In spite of (perhaps even because of) its simplicity and ad hoc machine-dependencies, CFD allowed researchers at Ames to develop a range of application programs that efficiently used the ILLIAC IV. It was thus quite successful.

Another important early language on the ILLIAC was Glypnir [5]. Glypnir was Algol-like rather than FORTRAN-like. It was also quite machine specific and had some similarities to CFD. The distributed array idea was not so explicit; instead there was a concept of a sword, which was more suggestive of the pvar in the later SIMD language *LISP (see Section 3.4.2). Another parallel FORTRAN dialect for the ILLIAC, called IVTRAN, is discussed in [7]. According to [1] the compiler was never fully debugged.

Bryan Carpenter 2002-07-12