

Research Objectives

Historically, data parallel programming and data parallel languages have played a major role in high-performance computing. By now we understand many of the implementation issues, but it remains unclear what the high-level programming environment should be. Ten years ago the High Performance Fortran Forum published the first standardized definition of a language for data parallel programming. Since then, substantial progress has been made in HPF compilers, and the language definition has been extended in response to requests from compiler writers and end users. Most parallel application developers today, however, do not use HPF. This does not necessarily mean that they prefer the lower-level SPMD programming style. More likely, we suggest, the problem lies either in the implementations of HPF compilers or in the rigidity of the HPF programming model. Research on compilation techniques for HPF continues to the present time, but efficient compilation of the language still appears to be an incompletely solved problem. Meanwhile, many successful applications of parallel computing have been produced by hand-coding parallel algorithms using low-level approaches based on libraries like Parallel Virtual Machine (PVM), the Message-Passing Interface (MPI), and ScaLAPACK.

Based on the observation that compiling HPF is such a difficult problem, while library-based approaches more easily attain acceptable levels of efficiency, a hybrid approach called HPspmd has been proposed [8]. Our HPspmd model combines the data parallel and low-level SPMD paradigms. HPspmd provides a language framework that facilitates direct calls to established libraries for parallel programming with distributed data. A preprocessor supports special syntax to represent distributed arrays, together with language extensions for common operations such as distributed parallel loops. Unlike HPF, but as in MPI, communication is always explicit.
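For concreteness, the kind of extended syntax such a preprocessor accepts can be sketched in an HPJava-style notation. The fragment below should be read as illustrative only: the names Procs2 and BlockRange and the on, [[-,-]], and overall constructs follow the published HPJava notation, but the exact forms are not being specified here.

```
Procs2 p = new Procs2(2, 2);                 // a 2 x 2 grid of processes
on(p) {                                      // code executed by members of p
    Range x = new BlockRange(N, p.dim(0));   // block-distributed index ranges
    Range y = new BlockRange(N, p.dim(1));
    float [[-,-]] a = new float [[x, y]];    // a distributed array (multiarray)
    overall(i = x for :)                     // distributed parallel loops: each
        overall(j = y for :)                 // process visits only its local elements
            a[i, j] = 0.0F;
}
```

Note that no communication is implied by this fragment; any data movement between processes would be expressed by explicit calls to a communication library.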
Thus, irregular patterns of access to array elements can be handled by making explicit bindings to libraries. The major goal of the system we are building is to provide a programming model that is a flexible hybrid of an HPF-like data parallel language and the popular, library-oriented SPMD style. We refer to this model as the HPspmd programming model. It incorporates syntax for representing multiarrays, for expressing that some computations are localized to some processors, and for writing a distributed form of the parallel loop. Crucially, it also supports binding from the extended language to various communication and arithmetic libraries. These bindings might involve simply new interfaces to some subset of PARTI, Global Arrays, Adlib, MPI, and so on. Providing libraries for irregular communication may well be important, and evaluating the HPspmd programming model on large-scale applications is also an important issue.

What would be a good candidate for the base language of our HPspmd programming model? We need a base language that is popular, has the simplicity of Fortran, and enables both modern object-oriented programming and high-performance computing with SPMD libraries. Java is an attractive base language for our HPspmd model for several reasons: it has clean and simple object semantics, cross-platform portability, security, and an increasingly large pool of adept programmers. Our research aim is to power up Java for the data parallel SPMD environment. We take a more modest approach than a full-scale data parallel compiler like HPF; we believe this is the proper approach for Java, where the situation is changing rapidly and one needs to be flexible.
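The SPMD style underlying this model can be illustrated in ordinary Java. In an SPMD run of P processes, each process computes which slice of a block-distributed array it owns and operates only on that slice; this is exactly the bookkeeping that distributed-array syntax automates. The helper below is a hypothetical sketch, not part of any named library, and uses the conventional block distribution (block size = ceiling(n/P)).

```java
// Sketch: local index bounds owned by one SPMD process under a block
// distribution of n elements over nprocs processes. Hypothetical helper,
// shown only to illustrate the bookkeeping the HPspmd syntax hides.
public class BlockDist {

    // Returns {lo, hi}: the calling process of the given rank owns
    // global indices lo <= i < hi (possibly an empty range).
    public static int[] blockBounds(int n, int nprocs, int rank) {
        int size = (n + nprocs - 1) / nprocs;     // ceiling(n / nprocs)
        int lo = Math.min(rank * size, n);
        int hi = Math.min(lo + size, n);
        return new int[] { lo, hi };
    }

    public static void main(String[] args) {
        // In a real SPMD run the rank would come from the messaging
        // library (e.g. MPI); here we just print the full decomposition
        // of 10 elements over 4 processes.
        for (int rank = 0; rank < 4; rank++) {
            int[] b = blockBounds(10, 4, rank);
            System.out.println("rank " + rank + " owns [" + b[0] + ", " + b[1] + ")");
        }
    }
}
```

Each process would then loop only from its own lo to hi, with any access outside that range handled by an explicit communication call.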


Bryan Carpenter 2004-06-09