In this section we will discuss the other option for representing complex data buffers in the Java API of [#!MPIPOSITION!#]--introduction of an MPJ.OBJECT datatype.
It is natural to assume that the elements of buffers passed to send and other output operations are objects whose classes implement the Serializable interface. There are at least two ways one may consider communicating object types in the MPI interface
The first implementation scheme is more straightforward, and this approach will be considered in the remainder of this section. We discuss an implementation based on the mpiJava wrappers, combining standard JDK object serialization methods with a JNI interface to native MPI. Benchmark results presented in the next section suggest that something like the second approach (or some suitable combination of the two) deserves serious consideration, hence section 5.5 describes one realization of this scheme.
The original version of mpiJava was a direct Java wrapper for standard MPI. Apart from adopting an object-oriented framework, it added a modest amount of code to the underlying C implementation of MPI. Derived datatype constructors, for example, simply called the datatype constructors of the underlying implementation and returned a Java object containing a representation of the C handle. A send operation or a wait operation, say, dispatched a single C MPI call. Even exploiting standard JDK object serialization and a native MPI package, uniform support for the MPJ.OBECT basic type complicates the wrapper code significantly.
In the new version of the wrapper, every send, receive, or collective communication operation tests if the base type of the datatype argument describing a buffer is OBJECT. If not--if the buffer element type is a primitive type--the native MPI operation is called directly, as in the old version. If the buffer is an array of objects, special actions must be taken in the wrapper. If the buffer is a send buffer, the objects must be serialized. We also support MPI-like derived datatypes as described in the previous section. On grounds of uniformity, these should be definable with base type OBJECT, just as for primitive elements. The message is then some subset of the array of objects passed in the buffer argument, selected according to the displacement sequence of the derived datatype. In the implementation, a method
byte [] Object_Serialize(Object buf,
int offset,
int count,
Datatype type)
takes the send buffer and descriptor, and returns a byte vector
containing the serialized data. At the receiving end a corresponding
Object_deserialize method is called.
Making the Java wrapper responsible for handling derived data types
when the base type is OBJECT requires additional fields in the
Java-side Datatype class. In particular the Java object
explicitly maintains the displacement sequence as an array of
integers.
A further set of changes to the implementation arises because the size of the serialized data is not known in advance, and cannot be computed at the receiving end from type information available there. Before the serialized data is sent, the size of the data must be communicated to the receiver, so that a byte receive buffer can be allocated. We send two physical messages--a header containing size information, followed by the data5.2. This, in turn, complicates the implementation of the various wait and test methods on communication request objects, and the start methods on persistent communication requests, and ends up requiring extra fields in the Java Request class. Comparable changes are needed in the collective communication wrappers. A gather operation, for example, involving object types is implemented as an MPI_GATHER operation to collect all message lengths, followed by an MPI_GATHERV to collect possibly different-sized data vectors.