All the previous examples considered patterns of communication occurring in array parallel statements--array assignments or FORALL statements. These communication patterns are quite naturally treated as generalized collective operations. But there are situations in HPF--and in general SPMD programming--where this approach is not readily applicable.
One example is the INDEPENDENT DO loop of HPF, which takes the form:
!HPF$ INDEPENDENT
      DO i = 1, 10
        ...
      END DO

The INDEPENDENT directive asserts that there are no data dependences between individual iterations of the following loop, and the iterations may therefore be executed in parallel. Unlike the FORALL statement, which explicitly limits the code executed in parallel to simple assignments, the body of an INDEPENDENT DO can involve any Fortran construct, including conditionals, loops and procedure calls. The patterns of access to remote data inside parallel ``iterations'' may therefore vary in unpredictable ways from one iteration to the next, making it difficult to do any advance orchestration of data exchanges. An HPF compiler is free to ignore the INDEPENDENT directive if it decides the loop is too complex to parallelize, but this may deprive the programmer of one of the few options in HPF for expressing the task-farming style of parallelism.
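As a rough analogue (in Python rather than HPF, with an invented loop body), iterations of arbitrary code can be farmed out to a pool of workers precisely because no iteration depends on another:

```python
# A rough model of an INDEPENDENT DO loop: each iteration is a
# self-contained task that may branch and call procedures, with no
# data dependences between iterations, so a pool can farm them out.
from concurrent.futures import ThreadPoolExecutor

def body(i):
    # Arbitrary per-iteration work: the control flow and the data
    # touched can differ from one i to the next, which is what
    # defeats advance orchestration of communication.
    if i % 2 == 0:
        return i * i
    return sum(range(i))

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(body, range(1, 11)))

print(results)
```

Because the iterations are independent, the pool is free to execute them in any order or concurrently, yet `pool.map` still returns results in iteration order.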
Actually there is at least one other way to express task parallelism in HPF. A user-defined procedure with the PURE attribute can be called from within a FORALL statement:
      PURE REAL FUNCTION FOO(I)
        INTEGER, INTENT(IN) :: I
        ...
      END

      ...

      FORALL (I = 1 : N) RES(I) = FOO(I)

There are quite strict restrictions on PURE procedures, but nothing prevents such a procedure from reading elements of global distributed data--a distributed array in a COMMON block, for example. Unfortunately this makes it difficult or impossible for the compiler to determine, at the point of call of FOO, exactly which remote variables it will access. For example, the actual behaviour of the program might be similar to the first example of Section 1.4:
      PURE REAL FUNCTION FOO(I)
        INTEGER, INTENT(IN) :: I
        REAL RES(50)
        INTEGER IND(50)
!HPF$   DISTRIBUTE RES(BLOCK) ONTO P
!HPF$   DISTRIBUTE IND(BLOCK) ONTO P
        COMMON /GLOBALS/ RES, IND
        FOO = RES(IND(I))
      END

But by the time
RES(IND(I)) is accessed, instances of the function FOO have already been dispatched to execute independently across the available set of processors. In a real sense, once inside FOO the processors are no longer sharing a single ``loosely synchronous'' thread of control. It is difficult to see how the parallel invocations of FOO can behave collectively. In particular, if the underlying model is MPI point-to-point communication, it is difficult to see how the owner of a particular array element can always be ready to send that element when its value is accessed by a peer processor.
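To see why the access pattern is opaque until run time, consider a small Python model (not HPF; the `owner` function and the index formulas are invented for illustration) of a 50-element BLOCK-distributed array on 5 processors: the processor owning RES(IND(I)) depends entirely on the runtime contents of IND, so no fixed communication schedule can be prepared at the call site.

```python
# Model of why the call site of FOO cannot know which remote elements
# are needed: with RES distributed BLOCK over P processors, the owner
# of RES(IND(I)) depends on the runtime contents of IND.
N, P = 50, 5
block = N // P                 # BLOCK distribution: 10 elements each

def owner(global_index):
    """Processor owning element `global_index` (0-based) of the array."""
    return global_index // block

# Two different runtime index arrays induce two different patterns of
# remote accesses -- known only once IND has been computed.
ind_a = [(3 * i) % N for i in range(N)]
ind_b = [(7 * i + 1) % N for i in range(N)]

pattern_a = [owner(j) for j in ind_a]
pattern_b = [owner(j) for j in ind_b]
print(pattern_a[:10], pattern_b[:10])
```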
INDEPENDENT DO loops raise similar problems, compounded because they are not subject to the restrictions on PURE procedures that prevent writes to global variables. If this sort of code is to be compiled to run in parallel, the most practical approach is probably to assume the availability of one-sided communication. The MPI 2 standard added this functionality to MPI, but it is still not widely implemented.
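The one-sided model can be sketched with an emulation in Python, using threads as stand-ins for processors and a plain list as the shared ``window'' (in real MPI 2 this would be memory exposed with MPI_Win_create and read with MPI_Get); the point is that each reader fetches res[ind[i]] at an arbitrary moment, with no matching send posted by the element's owner:

```python
# Emulation of one-sided communication: threads stand in for
# processors, and the list `res` plays the role of a shared window
# (MPI_Win in MPI 2 terms).  Each "get" below corresponds to an
# MPI_Get: the reader pulls the value directly, and the owning side
# never has to participate -- exactly what irregular accesses like
# RES(IND(I)) require.
from concurrent.futures import ThreadPoolExecutor

N = 50
res = [float(2 * j) for j in range(N)]   # globally visible "window"
ind = [(3 * i + 1) % N for i in range(N)]

def foo(i):
    # One-sided read at a data-dependent index: no cooperation is
    # needed from whichever processor owns res[ind[i]].
    return res[ind[i]]

with ThreadPoolExecutor(max_workers=4) as pool:
    out = list(pool.map(foo, range(N)))

print(out[:5])
```

In a real MPI 2 program the reads would additionally be bracketed by synchronization calls (for example MPI_Win_fence) to delimit the access epoch; the emulation omits this because the window is never written during the loop.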