It is very important for the efficiency of HPJava programs that inner loops in the translated code--especially those derived from overall constructs in the source program--be kept as simple as possible. For example, the body of the emitted loop should avoid calls to methods associated with the runtime descriptor of the distributed array (unless, say, the programmer forced appearance of these calls by including them in the source). In particular runtime checking code associated with references to distributed array element should be lifted out of loops wherever possible.
Rule 1 can be enforced at compile time by a trivial extension to normal type checking. So far as Rule 2 is concerned, the intention is that run-time checks needed to enforce this rule should be lifted outside of the loop body. Consider this fragment
whereoverall(=for::) {......[...,, ...] ...}
with the runtime test appearing immediately before the element reference. What we want to achieve instead is something likeoverall(=for::) {...ASSERT(.rng().containsLocation(,`)) ;...[...,, ...] ...}
In fact we expect that in typical HPJava programs this transformation will be legal. In simple cases there may even be enough static information about the ranges ofASSERT(.rng().containsSubrange(,,,)) ;overall(=for::) {......[...,, ...] ...}
There are cases where lifting the usage check may be problematic. Consider this example
Range y = x [1 : N - 2] ;
float [[]] a = float [[y]] ;
overall(i = x for 0 : N - 1) {
...
if(i` > 0 && i` < N - 1)
... a [i] ...
}
The array is defined over some subrange of x. We iterate
over the whole of x, but mask the disallowed accesses with a
nested conditional construct. Such usage would block naive lifting of
runtime range-checks out of the loop. The lifted range-check would
cause an exception although the code is apparently legal.
Another potential difficultly is that the expression
float [] [[]] s = new float [n] [[]] ;
... allocate individual distributed arrays in `s' ...
overall(i = x for :) {
...
for(int j = 0 ; j < n ; j++)
s [j] [i] = foo(j, i`) ;
}
The variable s is a Java array of HPJava distributed arrays.
The array expression, We assume these uses are idiosynchratic. In the first case the desired effect could be normally achieved more directly by changing the triplet in the overall header7:
overall(i = x for 1 : N - 2) {
...
... a [i] ...
}
In the second example the appearance of i as a common subscript for all
distributed arrays in s strongly suggests that these have a
common range, probably x. So replacing s with a two-dimensional
distributed array,
float [[*,]] s = new float [[n, x]] ;
would probably be acceptable, and would remove the blockage to lifting
the range-check.
Because efficient translation is a primary concern, and the examples above are difficult to translate efficiently, we outlaw them by adding a new rule
Next we consider Rule 3 of section 4. We assume that the translator makes the current state of the active process group visible in a variable of type Group called apg9. Most naively, the range checks for the example in Figure 6 could be inserted as follows:
float [[,]] a = new float [[x, y]] on p ;
float [[]] c = a [[0, :]] ;
on(p / x [N - 1])
overall(j = y for :) {
ASSERT((c.grp() / j).contains(apg)) ;
c [j] = j` ;
}
We note, however, that this can be optimized by lifting the assertion
through the overall scoping j:
on(p / x [N - 1]) {
ASSERT(c.grp().contains(apg)) ;
overall(j = y for :)
c [j] = j` ;
}
and the lifted assertion is even simpler to compute, because the
restriction by j is no longer needed.
In general in
provided the ellided code between the overall header and the assertion contains no statements that change apg (ie, no other overall or on headers) the assertion can be lifted and simplified as follows:overall(=for::) {...ASSERT((.grp() / ... //// ... ).contains(apg)) ;...[...,, ...] ......}
We can legitimize the lifting, even if the element reference appears in conditional code inside the loop, by appealing to Rule 4. If overall constructs are suitably nested this transformation can be applied recursively.ASSERT((.grp() / ... /// ... ).contains(apg)) ;overall(=for::) {......[...,, ...] ......}
Exact computation of the group contains operation may be expensive--in principle it is a set containment test. In the present case, where containment of the active process group is the issue, an alternative is available. If all members of the active process group make the assertion
this is essentially equivalent to them all making the assertionASSERT(.contains(apg)) ;
The amMember method on a group returns true if the the local process is a member of the group and false otherwise. If the contains assertion is false, then at least one member of the active process group will throw an exception. If an exception is expected to abort the program globally, this is probably good enough. The amMember function can be computed very efficiently--in fact an extra boolean field should be added to the Group record containing this precomputed value, so it is simply a lookup.ASSERT(.amMember()) ;
Whether this approximation to the exact contains test is acceptable in practise is a slightly open issue. In ``normal'' styles of HPJava programming we expect that it is essentially equivalent to the exact set containment test. But in principle its adoption alters the semantics of the program. Replacing contains by amMember only weakens the requirement of Rule 3, which may not be a problem, unless Java-like exact exception handling is expected. In general our policy of lifting runtime checks is somewhat at odds with the Java ethos that exceptions should be thrown and control switched at exactly the point where the error (eg, the erroneous subscripting operation) occurred. In HPJava the most important thing is to get good performance, and probably some aspects of the strict Java exception model have to be sacrificed.
We did not discuss use of shifted locations as subscripts. They do not introduce any fundamental issues. A runtime inquiry on array ranges will be needed giving the size of the ghost extensions. Rule 4 can be interpretted to mean that use of non-invariant expressions for the shift-amounts is disallowed.