
Derived datatype vs Object serialization

In MPI new derived types of class Datatype can be created using suitable library functions. The derived types allow one to treat contiguous, strided, or indirectly indexed segments of program arrays as individual message elements. The corresponding array subsections can then be communicated in a single function call, potentially exploiting any special hardware or software the platform provides for exchanging scattered data between user space and the communication system.
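To make the idea of an indirectly indexed segment concrete, here is a minimal sketch in plain Java that gathers the elements such a type would select from a one-dimensional array. It does not use mpiJava at all; the class name IndexedSection and the method gather are hypothetical, and the explicit copy is only for illustration (an MPI implementation may be able to avoid it).

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names belong to mpiJava.
final class IndexedSection {

    // Collect the blocks described by (blockLengths[k], displacements[k])
    // into one contiguous array, in the order the blocks are listed.
    static int[] gather(int[] array, int[] blockLengths, int[] displacements) {
        int total = 0;
        for (int len : blockLengths) {
            total += len;
        }
        int[] out = new int[total];
        int pos = 0;
        for (int k = 0; k < blockLengths.length; k++) {
            System.arraycopy(array, displacements[k], out, pos, blockLengths[k]);
            pos += blockLengths[k];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] data = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
        // Three blocks: two elements at index 0, one at index 4, three at index 7.
        int[] section = gather(data, new int[] {2, 1, 3}, new int[] {0, 4, 7});
        System.out.println(Arrays.toString(section));
    }
}
```

With a derived datatype the same selection is described once and then handed to a single send or receive call, rather than being copied by hand as above.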

Currently mpiJava provides all the derived datatype constructors of standard MPI, with one limitation: it places significant restrictions on its binding of MPI_TYPE_STRUCT. In C or Fortran this function can be used to describe an entity combining fields of different primitive (or derived) types. Because of the assumption that buffers are one-dimensional arrays with elements of primitive type, mpiJava imposes a restriction that all the types combined by its Datatype.Struct member must have the same base type, which must agree with the element type of the buffer array. Also, mpiJava does not provide an analogue of the MPI_BOTTOM buffer address, or of the MPI_ADDRESS function for finding offsets relative to this absolute memory base. In C or Fortran these facilities allow buffers to include fields from separately declared variables or arrays, but the mechanism does not sit very well with the pointer-free Java language model.

Approaches based on the MPI derived datatype model do not seem to be the best way to alleviate this restriction. A better option is probably to exploit the run-time type information already provided in Java objects. We are developing a version of mpiJava that adds one new predefined datatype:

  MPI.Object
A message buffer can then be an array of any serializable Java objects. The objects are serialized automatically in the wrapper of send operations, and deserialized at their destination.
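The following sketch shows, using only the standard java.io serialization classes, the kind of work such a wrapper might do on each side of a message. It is not taken from the mpiJava sources: the class ObjectBufferSketch, the methods pack and unpack, and the Particle element type are all hypothetical names chosen for illustration, and the byte array stands in for whatever is actually passed to the underlying MPI library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical illustration only; none of these names belong to mpiJava.
final class ObjectBufferSketch {

    // An element mixing fields of different types, which the restricted
    // Datatype.Struct binding could not describe.
    static final class Particle implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final double mass;
        final String label;

        Particle(int id, double mass, String label) {
            this.id = id;
            this.mass = mass;
            this.label = label;
        }
    }

    // Sender side: flatten buf[offset .. offset + count - 1] into bytes.
    static byte[] pack(Object[] buf, int offset, int count) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeObject(buf[offset + i]);
            }
        }
        return bytes.toByteArray();
    }

    // Receiver side: rebuild count objects into buf starting at offset.
    static void unpack(byte[] data, Object[] buf, int offset, int count)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                buf[offset + i] = in.readObject();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Object[] sendBuf = {new Particle(1, 0.5, "a"), new Particle(2, 2.5, "b")};
        byte[] wire = pack(sendBuf, 0, sendBuf.length);

        Object[] recvBuf = new Object[2];
        unpack(wire, recvBuf, 0, 2);
        Particle p = (Particle) recvBuf[1];
        System.out.println(p.id + " " + p.mass + " " + p.label);
    }
}
```

Because the run-time type travels with each serialized object, the receiver needs no datatype description beyond the element count.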

The absence of true multi-dimensional arrays in Java limits another use of derived datatypes. In MPI the MPI_TYPE_VECTOR function creates a derived datatype representing a strided section of an array. In C or Fortran this strided section can be identified with a section of a multi-dimensional array. (It could describe, say, an edge of the local patch of a two-dimensional distributed array.) In Java there is no equivalence between a multi-dimensional array and a contiguous patch of memory, or a one-dimensional array. The programmer may choose to linearize all multi-dimensional arrays in the algorithm, representing them as one-dimensional arrays with suitable index expressions. In this case derived datatypes can be used to send and receive sections of the array. Alternatively the programmer may use Java arrays of arrays to represent multi-dimensional arrays. This simplifies the index arithmetic in the program. Sections of the array are then explicitly copied to one-dimensional buffers for communication. The latter option seems to be more popular with programmers.
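The two options can be compared in a small plain-Java sketch. It contains no mpiJava calls; the class ColumnSection and its methods are hypothetical names. It takes one column of a small matrix both ways: by explicit copy from an array of arrays, and as a strided section of the linearized array, with the count, offset and stride that a vector-style datatype with block length 1 would describe.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names belong to mpiJava.
final class ColumnSection {

    // Row-major linearization: element (i, j) is stored at i * cols + j.
    static double[] linearize(double[][] a) {
        int rows = a.length;
        int cols = a[0].length;
        double[] flat = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            System.arraycopy(a[i], 0, flat, i * cols, cols);
        }
        return flat;
    }

    // Array-of-arrays option: copy column j into a one-dimensional buffer.
    static double[] copyColumn(double[][] a, int j) {
        double[] buf = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            buf[i] = a[i][j];
        }
        return buf;
    }

    // Linearized option: the elements selected by a strided section with
    // the given offset, element count and stride (block length 1).
    static double[] gatherStrided(double[] flat, int offset, int count, int stride) {
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = flat[offset + i * stride];
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}};
        double[] flat = linearize(a);
        // Column 1: offset 1, one element per row, stride equal to the row length.
        System.out.println(Arrays.toString(copyColumn(a, 1)));
        System.out.println(Arrays.toString(gatherStrided(flat, 1, a.length, a[0].length)));
    }
}
```

In the linearized form the program pays for the index expression i * cols + j on every access; in the array-of-arrays form it pays for the copy in copyColumn before each communication.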

Although, for reasons of conformance with MPI standards, we expect to continue supporting derived datatypes in mpiJava, their value in the Java domain is less clear-cut than in C or Fortran. Allowing serializable objects as buffer elements is probably a more powerful facility.


Bryan Carpenter 2002-07-11