
Introduction

HPJava [3] is a language for parallel programming, especially suitable for programming massively parallel, distributed memory computers.

Several of the ideas in HPJava are lifted from the High Performance Fortran (HPF) programming language. However, the programming model of HPJava is "lower level" than the programming model of HPF. HPJava sits somewhere between the explicit SPMD (Single Program Multiple Data) programming style--often implemented using communication libraries like MPI--and the higher level, data-parallel model of HPF. An HPF compiler generally guarantees an equivalence between the parallel code it generates and a sequential Fortran program obtained by deleting all distribution directives. An HPJava program, on the other hand, is defined from the start to be a distributed MIMD program, with multiple threads of control operating in different address spaces. In this sense the HPJava programming model is "closer to the metal" than the HPF programming model. HPJava does provide special syntax for HPF-like distributed arrays, but the programming model may best be viewed as an incremental improvement on the programming style used in many hand-coded applications: explicitly parallel programs exploiting collective communication libraries or collective arithmetic libraries.

We call this general programming model--essentially direct SPMD programming supported by additional syntax for HPF-like distributed arrays--the HPspmd model. In general, SPMD programming has been very successful. Many high-level parallel programming environments and libraries assume the SPMD style as their basic model. Examples include ScaLAPACK [1], DAGH [14], Kelp [6] and the Global Array Toolkit [13]. While there remains a prejudice that HPF is best suited for problems with rather regular data structures and regular data access patterns, SPMD frameworks like DAGH and Kelp have been designed to deal directly with irregularly distributed data, and other libraries like CHAOS/PARTI [5] and Global Arrays support unstructured access to distributed arrays. Presently, however, the library-based SPMD approach to data-parallel programming lacks the uniformity and elegance that HPF promised. The various environments referred to above all have some idea of a distributed array, but they all describe those arrays differently. Because the arrays are managed entirely in libraries, the compiler offers little support and no safety net of compile-time or compiler-generated run-time checking. The HPspmd model is one attempt to address such shortcomings.

HPJava is a particular instantiation of this HPspmd idea. As the name suggests, the base language in this case is the Java™ programming language. To some extent the choice of base language is incidental, and clearly we could have added equivalent extensions to another language, such as Fortran itself. But Java does seem to be a better language in various respects, and it seems plausible that in the future more software will be available for modern object-oriented languages like Java than for Fortran.

HPJava is a strict extension of Java. It incorporates all of Java as a subset. Any existing Java class library can be invoked from an HPJava program without recompilation. As explained above, HPJava adds to Java a concept of multi-dimensional, distributed arrays, closely modelled on the arrays of HPF. Regular sections of distributed arrays are fully supported. The multi-dimensional arrays can have any rank, and the elements of distributed arrays can have any standard Java type, including Java class types and ordinary Java array types.

A translated and compiled HPJava program is a standard Java class file, which will be executed by a distributed collection of Java Virtual Machines. All externally visible attributes of an HPJava class--e.g. existence of distributed-array-valued fields or method arguments--can be automatically reconstructed from Java signatures stored in the class file. This makes it possible to build libraries operating on distributed arrays, while maintaining the usual portability and compatibility features of Java. The libraries themselves can be implemented in HPJava, or in standard Java, or through Java Native Interface (JNI) wrappers to code implemented in other languages. The HPJava language specification carefully documents the mapping between distributed arrays and the standard-Java components they translate to.

While HPJava does not incorporate HPF-like "sequential" semantics for manipulating its distributed arrays, it does add a small number of high-level features designed to support direct programming with distributed arrays, including a distributed looping construct called overall. To directly support lower-level SPMD programming, it also provides a complete set of inquiry functions that allow the local array segments in distributed arrays to be manipulated directly, where necessary.

In the current system, syntax extensions are handled by a preprocessor that emits an ordinary SPMD program in the base language. The HPspmd syntax provides a relatively thin veneer on low-level SPMD programming, and the transformations applied by the translator are correspondingly direct--little non-trivial analysis should be needed to obtain good parallel performance. What the language does provide is a uniform model of a distributed array. This model can be targeted by reusable libraries for parallel communication and arithmetic. The specific model adopted very closely follows the distributed array model defined in the High Performance Fortran standard.
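
To make the idea of local array segments and a distributed loop more concrete, the following is a minimal sketch in plain Java. It is not HPJava syntax and not the output of the HPJava translator; the class name BlockDistributionSketch and all of its methods are hypothetical, introduced only for illustration. It assumes a one-dimensional array of n elements divided into contiguous blocks over p processes, with the block size obtained by ceiling division, and it shows the kind of per-process loop that an SPMD program might run over its own segment.

```java
// Hypothetical illustration only: plain Java, not HPJava and not translator output.
final class BlockDistributionSketch {

    // Block size when n elements are split over p processes (ceiling division).
    static int blockSize(int n, int p) {
        return (n + p - 1) / p;
    }

    // First global index held by process `rank` (clamped to n for trailing ranks).
    static int localLow(int n, int p, int rank) {
        return Math.min(n, rank * blockSize(n, p));
    }

    // Number of elements held by process `rank`; may be 0 for trailing ranks.
    static int localCount(int n, int p, int rank) {
        int lo = localLow(n, p, rank);
        int hi = Math.min(n, lo + blockSize(n, p));
        return hi - lo;
    }

    // The loop each process would execute: it touches only its own segment,
    // converting each local subscript to the corresponding global index.
    static double[] fillLocalSegment(int n, int p, int rank) {
        int lo = localLow(n, p, rank);
        double[] local = new double[localCount(n, p, rank)];
        for (int l = 0; l < local.length; l++) {
            int g = lo + l;          // global index of local element l
            local[l] = 2.0 * g;      // arbitrary function of the global index
        }
        return local;
    }

    public static void main(String[] args) {
        int n = 10, p = 4;
        // Simulate the p processes one after another in a single JVM.
        for (int rank = 0; rank < p; rank++) {
            double[] seg = fillLocalSegment(n, p, rank);
            System.out.println("rank " + rank
                    + ": low=" + localLow(n, p, rank)
                    + " count=" + seg.length);
        }
    }
}
```

In a real SPMD run each process would call something like fillLocalSegment with its own rank; here main simply iterates over the ranks in one JVM. The three small helper methods play the role of inquiry functions on the local segment, under the assumptions stated above.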

This article describes ongoing work on refinement of the HPJava language definition, and the development of a translator for this language.


Bryan Carpenter 2002-07-12