Integration of high-level libraries

Libraries are at the heart of our HPspmd model. From one point of view, the language extensions are simply a framework for invoking libraries that operate on distributed arrays. Hence an essential component of the ongoing work is the definition of a series of bindings from HPspmd languages to established SPMD libraries and environments. Because the language model is explicitly SPMD, such bindings are a more straightforward proposition than in HPF, where one typically has to pass through some extrinsic interface barrier before invoking SPMD-style functions.

We can group the existing SPMD libraries for data-parallel programming into three categories. In the first category we have libraries like ScaLAPACK [4] and PETSc [2], where the primary focus is similar to that of conventional numerical libraries: providing implementations of standard matrix algorithms (say), but operating on elements in regularly distributed arrays. We assume that designing HPspmd interfaces to this kind of package will be relatively straightforward. ScaLAPACK, for example, provides linear algebra routines for distributed-memory computers. These routines operate on distributed arrays, specifically distributed matrices. The distribution formats supported are restricted to two-dimensional block-cyclic distribution for dense matrices and one-dimensional block distribution for narrow-band matrices. Since both these distribution formats are supported by HPspmd, using ScaLAPACK routines from the HPspmd framework should present no fundamental difficulties.
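To make the block-cyclic format concrete, the following sketch (plain Java; the class and method names are illustrative only, not part of ScaLAPACK or any HPspmd API) computes which process owns a given global index under a block-cyclic distribution, and the corresponding local index on that process:

```java
// Sketch of block-cyclic index arithmetic along one dimension, as in
// the 2-D block-cyclic format ScaLAPACK uses for dense matrices.
// Method names are illustrative, not a real library API.
public class BlockCyclic {

    // Process coordinate (along one dimension of the process grid)
    // owning global index i, for block size b and p processes.
    static int owner(int i, int b, int p) {
        return (i / b) % p;
    }

    // Local index of global index i on its owning process.
    static int localIndex(int i, int b, int p) {
        int block = i / b;                 // which block i falls in
        return (block / p) * b + i % b;    // blocks cycle over processes
    }

    public static void main(String[] args) {
        // 8x8 matrix, 2x2 blocks, 2x2 process grid: element (5, 2)
        // lives at grid coordinates (owner of row 5, owner of column 2).
        System.out.println(owner(5, 2, 2));      // prints 0 (row owner)
        System.out.println(owner(2, 2, 2));      // prints 1 (column owner)
        System.out.println(localIndex(5, 2, 2)); // prints 3 (local row)
    }
}
```

Applying `owner` independently along rows and columns gives the two-dimensional block-cyclic map; the one-dimensional block distribution used for narrow-band matrices is the special case where the block size covers a whole contiguous segment per process.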

In a second category we place libraries conceived primarily as
underlying support for general parallel programs with regular
distributed arrays. They emphasize high-level communication primitives
for particular styles of programming, rather than specific numerical
algorithms. These libraries include compiler runtime libraries like
Multiblock PARTI [1] and Adlib [21], and
application-level libraries like the Global Array toolkit
[17]. Adlib is a runtime library that was
designed to support HPF translation. It provides communication
primitives similar to Multiblock PARTI, plus the Fortran 90
transformational intrinsics for arithmetic on distributed arrays. The
Global Array (GA) toolkit, developed at Pacific Northwest National Lab,
provides an efficient and portable ``shared-memory'' programming
interface for distributed-memory computers. Each process in a MIMD
parallel program can asynchronously access logical blocks of
distributed arrays, without the need for explicit cooperation by other
processes (``one-sided communication''). Besides providing a more
tractable interface for creation of multidimensional distributed
arrays, our syntax extensions should provide a more convenient
interface to primitives like `ga_get`, which copies a patch of
a global array to a local array.
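The flavor of `ga_get` can be conveyed by a minimal sequential mock-up. The `GlobalArray` class below is a hypothetical stand-in of our own, not the real GA toolkit API; in GA the storage is physically distributed and the get is a one-sided communication, but the patch-copy semantics are the same:

```java
// Minimal mock-up of ga_get-style patch access.  GlobalArray is a
// hypothetical stand-in, NOT the real Global Array toolkit API; here
// the "global" storage is just a local 2-D array.
public class GlobalArray {
    private final double[][] data;   // stand-in for distributed storage

    public GlobalArray(int rows, int cols) {
        data = new double[rows][cols];
    }

    public void put(int i, int j, double v) { data[i][j] = v; }

    // Copy the patch [ilo..ihi] x [jlo..jhi] (inclusive bounds, as in
    // GA's Fortran interface) into the caller's local buffer buf.
    public void get(int ilo, int ihi, int jlo, int jhi, double[][] buf) {
        for (int i = ilo; i <= ihi; i++)
            for (int j = jlo; j <= jhi; j++)
                buf[i - ilo][j - jlo] = data[i][j];
    }

    public static void main(String[] args) {
        GlobalArray g = new GlobalArray(8, 8);
        g.put(3, 4, 7.0);
        double[][] local = new double[2][2];
        g.get(3, 4, 4, 5, local);        // fetch the 2x2 patch at (3, 4)
        System.out.println(local[0][0]); // prints 7.0
    }
}
```

An HPspmd binding would let the patch bounds and the local buffer shape be expressed through the distributed array syntax itself, rather than as a list of scalar arguments.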

Regular problems (such as the linear algebra examples in section
4) are an important subset of parallel applications, but of course
they are far from the only kind. Many important problems involve
data structures too irregular to represent purely through HPF-style
distributed arrays. Our third category of libraries therefore includes
libraries designed to support irregular problems. These include CHAOS
[8] and DAGH [19]. We anticipate that
irregular problems will still benefit from regular data-parallel
language extensions--at some level they usually resort to
representations involving regular arrays. But lower level SPMD
programming, facilitated by specialized class libraries, is likely to
play a more important role. For an HPspmd binding of the CHAOS/PARTI library,
for example, the simplest assumption is that the preprocessing phases
yield new arrays. Indirection arrays may well be left as HPspmd
distributed arrays; data arrays may be reduced to ordinary Java arrays
holding local elements. Parallel loops of an executor phase can then
be expressed using *overall* constructs. More advanced schemes may
incorporate irregular maps into generalized array descriptors
[11,9,7] and require extensions to
the baseline HPspmd language model.
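The inspector/executor division underlying CHAOS/PARTI can be sketched sequentially as follows (the class and method names are ours, not the CHAOS API): a preprocessing (inspector) pass over the indirection array builds a reusable fetch schedule, and the executor loop then performs the gather y[i] = x[ind[i]], which in HPspmd would be written with an *overall* construct:

```java
import java.util.Arrays;

// Sequential sketch of the inspector/executor pattern behind
// CHAOS/PARTI.  Names are illustrative, not the CHAOS API.
public class IrregularGather {
    private final int[] ind;   // indirection array, e.g. mesh connectivity
    private int[] schedule;    // distinct elements to fetch, built once

    public IrregularGather(int[] ind) { this.ind = ind; }

    // Inspector (preprocessing) phase: scan the indirection array once,
    // recording the distinct elements needed.  In a distributed setting
    // this list would drive the communication schedule.
    public void inspect() {
        schedule = Arrays.stream(ind).distinct().sorted().toArray();
    }

    public int[] schedule() { return schedule; }

    // Executor phase: the actual computation y[i] = x[ind[i]], reusing
    // the schedule across many iterations without re-inspection.
    public double[] execute(double[] x) {
        double[] y = new double[ind.length];
        for (int i = 0; i < ind.length; i++)
            y[i] = x[ind[i]];
        return y;
    }

    public static void main(String[] args) {
        IrregularGather g = new IrregularGather(new int[] {3, 0, 3, 2});
        g.inspect();                        // schedule = [0, 2, 3]
        double[] x = {10, 11, 12, 13};
        System.out.println(Arrays.toString(g.execute(x)));
        // prints [13.0, 10.0, 13.0, 12.0]
    }
}
```

In the HPspmd picture above, `ind` could remain an HPspmd distributed array, while `x` and `y` would be ordinary Java arrays holding the local elements on each process.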