Data descriptor and global data.

The concept of a data descriptor is not new. It exists in the Java language itself. For example, the length field of a Java array reflects the fact that an array is accessed through a data descriptor.

On a single processor, an array variable might be parametrized by a simple record containing a memory address and an int value for the length. On a multi-processor, a more complicated structure is needed to describe a distributed array. The data descriptor specifies where the data is created and how it is distributed. The logical structure of a descriptor is shown in figure 2.

Figure 2: Descriptor
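The contrast between the single-processor record and the distributed descriptor can be sketched in plain Java. This is only an illustration under stated assumptions: the class and field names (LocalArrayDescriptor, DistributedDescriptor, ownerGroup, extents, formats) are hypothetical and do not reflect the actual layout used by the HPJava runtime.

```java
import java.util.Arrays;

/** Illustrative only: all names here are assumptions, not HPJava's runtime layout. */
final class DescriptorSketch {

    /** Single-processor view: where the elements start and how many there are. */
    static final class LocalArrayDescriptor {
        final long baseAddress; // stand-in for a memory address
        final int length;       // number of elements

        LocalArrayDescriptor(long baseAddress, int length) {
            this.baseAddress = baseAddress;
            this.length = length;
        }
    }

    /** Multi-processor view: who holds the data and how each dimension is laid out. */
    static final class DistributedDescriptor {
        final int[] ownerGroup; // ids of the processes that hold the data
        final int[] extents;    // global extent of each dimension
        final String[] formats; // per-dimension format, e.g. "block", "cyclic", "collapsed"

        DistributedDescriptor(int[] ownerGroup, int[] extents, String[] formats) {
            if (extents.length != formats.length) {
                throw new IllegalArgumentException("one format is needed per dimension");
            }
            this.ownerGroup = ownerGroup.clone();
            this.extents = extents.clone();
            this.formats = formats.clone();
        }

        /** Number of dimensions described. */
        int rank() {
            return extents.length;
        }

        /** True if the given process is a member of the owner group. */
        boolean isHeldBy(int processId) {
            return Arrays.stream(ownerGroup).anyMatch(id -> id == processId);
        }
    }

    public static void main(String[] args) {
        DistributedDescriptor d = new DistributedDescriptor(
                new int[] {0, 1, 2, 3}, new int[] {100}, new String[] {"block"});
        System.out.println("rank=" + d.rank() + ", held by process 2: " + d.isHeldBy(2));
    }
}
```

The point of the sketch is that the distributed form carries placement information (a group and a format per dimension) in addition to the extent, which the single-processor record does not need.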

New syntax is added in HPJava to define data with descriptors.

on(p)
  int # s = new int #;
creates a global scalar on the currently executing process group. In the statement, s is a data descriptor handle or, in HPJava terms, a global scalar reference. The scalar contains an integer value. Global scalar references can be defined for any primitive type (or, in principle, class type) of Java. The symbol # on the right-hand side of the assignment indicates that a data descriptor is allocated as the scalar is created.

For a global scalar, the field value is used to read or assign the value it holds.

on(p) {
  int # s = new int #;
  s.value = 100;
}
Figure 3 shows a possible memory mapping for this scalar on different processes.
Figure 3: Memory mapping
Note that the value field of s is identical in every process of the currently executing process group. Replicated value variables are different from local variables with replicated names: the associated descriptors can be used to ensure that the value is maintained identically in each process throughout program execution.
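The replication property described here can be modelled in ordinary Java by simulating a group of processes inside one program. This is a toy model under stated assumptions, not how HPJava implements global scalars: the class GlobalScalarSketch and its methods are hypothetical, and process ids are plain integers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a replicated scalar; names are hypothetical, not HPJava API. */
final class GlobalScalarSketch {
    // One replica of the value per process in the group, keyed by process id.
    private final Map<Integer, Integer> replicas = new LinkedHashMap<>();

    GlobalScalarSketch(int... group) {
        for (int pid : group) {
            replicas.put(pid, 0);
        }
    }

    /** Assigning the value updates every replica, so all processes stay in step. */
    void setValue(int v) {
        replicas.replaceAll((pid, old) -> v);
    }

    /** Value as seen by one process; fails for processes outside the group. */
    int valueOn(int pid) {
        Integer v = replicas.get(pid);
        if (v == null) {
            throw new IllegalArgumentException("process " + pid + " holds no replica");
        }
        return v;
    }

    /** True when every replica holds the same value. */
    boolean isConsistent() {
        return replicas.values().stream().distinct().count() <= 1;
    }

    public static void main(String[] args) {
        GlobalScalarSketch s = new GlobalScalarSketch(0, 1, 2, 3);
        s.setValue(100);
        System.out.println("value on process 2: " + s.valueOn(2));
        System.out.println("consistent: " + s.isConsistent());
    }
}
```

In contrast, four independent local variables that merely share a name would have no shared record tying them together, so nothing could check or enforce that they agree.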

The group inside a descriptor is called the data owner group. It defines where the global values are held.

on(p)
  int # s = new int # on q;
sets the data owner field in the descriptor to group q. In general this may be a subset of the default, p (the whole of the current active process group).

When defining a global array, it is not necessary to allocate a data descriptor for each array element, so the syntax for defining a global array is not derived directly from the one for a scalar. An array can be defined with the different kinds of ranges introduced earlier. Suppose we still have

Range x = new BlockRange(100, p.dim(0)) ;
and the process group defined in figure 1, then
on(q) 
  float [[ ]] a = new float [[x]];
creates a global array with range x on group q. Here a is a descriptor handle describing a one-dimensional array of float. It is block distributed on group q. In HPJava terms, a is also called a global or distributed array reference.
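To make "block distributed" concrete, the following Java sketch maps a global index to the process that holds it. It assumes the common convention that each process receives a contiguous block of ceil(N/P) elements; the text does not state the exact formula HPJava's BlockRange uses, so the class BlockRangeSketch and its arithmetic should be read as an assumption.

```java
/** Assumed block layout: contiguous blocks of ceil(extent / procs) elements. */
final class BlockRangeSketch {
    final int extent;    // global number of elements, e.g. 100
    final int procs;     // number of processes along this dimension
    final int blockSize; // elements per process (the last block may be shorter)

    BlockRangeSketch(int extent, int procs) {
        if (extent <= 0 || procs <= 0) {
            throw new IllegalArgumentException("extent and procs must be positive");
        }
        this.extent = extent;
        this.procs = procs;
        this.blockSize = (extent + procs - 1) / procs; // ceiling division
    }

    /** Process coordinate that holds global index i. */
    int ownerOf(int i) {
        checkIndex(i);
        return i / blockSize;
    }

    /** Position of global index i inside its owner's local block. */
    int localIndexOf(int i) {
        checkIndex(i);
        return i % blockSize;
    }

    /** Number of elements held by process coordinate p (zero if its block is empty). */
    int localCount(int p) {
        int lo = p * blockSize;
        int hi = Math.min(extent, lo + blockSize);
        return Math.max(0, hi - lo);
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= extent) {
            throw new IndexOutOfBoundsException("index " + i + " outside 0.." + (extent - 1));
        }
    }

    public static void main(String[] args) {
        BlockRangeSketch x = new BlockRangeSketch(100, 4);
        System.out.println("index 30 is held by process " + x.ownerOf(30)
                + " at local position " + x.localIndexOf(30));
    }
}
```

Under this assumption, a range of extent 100 over 4 processes gives each process 25 consecutive elements, and only the owner of an index stores the corresponding array element.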

A distributed array range can also be collapsed (or sequential). In that case an integer extent is specified instead of a range, e.g.

on(p)
  float [[*]] b = new float [[100]];
When defining an array with collapsed dimensions, * can optionally be added in the type signature to mark these dimensions.

The typical method of accessing global array elements is not exactly the same as for local array elements, or for global scalar references. Since global arrays may have position information in their dimensions, we often use locations as their subscripts:

Location i=x[3];
at(i)
  a[i]=3;
Here the fourth element of array a is assigned the value 3. We will leave discussion of the at construct to section 2.3, and give a simpler example here: if a global array is defined with a collapsed dimension, accessing its elements can be modelled on local arrays. For example:
for(int i=0; i<100; i++)
  b[i]=i;
assigns the loop index to each corresponding element in the array.

When defining a multi-dimensional global array, a single descriptor parametrizes a rectangular array with any number of dimensions.

Range x = new BlockRange(100, p.dim(0)) ;  
Range y = new CyclicRange(100, p.dim(1)) ;
float [[,]] c = new float [[x, y]];
This creates a two-dimensional global array with the first dimension block distributed and the second cyclically distributed. Now c is a global array reference. Its elements can be accessed using single brackets with two suitable locations inside.
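The combination of a block-distributed first dimension and a cyclically distributed second dimension can be sketched as a mapping from a global index pair to a coordinate in a two-dimensional process grid. The sketch assumes blocks of ceil(N/P) elements for the block dimension and round-robin assignment (index modulo P) for the cyclic dimension; the class GridLayoutSketch and the grid shape used in the example are hypothetical, since the process grid of figure 1 is not reproduced here.

```java
/** Assumed layouts: block = contiguous chunks, cyclic = round-robin by index. */
final class GridLayoutSketch {

    /** Block layout: process coordinate holding index i, with blocks of ceil(extent / procs). */
    static int blockOwner(int i, int extent, int procs) {
        int blockSize = (extent + procs - 1) / procs; // ceiling division
        return i / blockSize;
    }

    /** Cyclic layout: indices are dealt out to processes in round-robin order. */
    static int cyclicOwner(int j, int procs) {
        return j % procs;
    }

    /**
     * Grid coordinate {row, column} of the process holding element (i, j)
     * when dimension 0 is block distributed over procs0 processes and
     * dimension 1 is cyclically distributed over procs1 processes.
     */
    static int[] ownerOf(int i, int j, int extent0, int procs0, int procs1) {
        return new int[] { blockOwner(i, extent0, procs0), cyclicOwner(j, procs1) };
    }

    public static void main(String[] args) {
        // Hypothetical 2 x 2 process grid and a 100 x 100 array.
        int[] coord = ownerOf(75, 4, 100, 2, 2);
        System.out.println("element (75, 4) is held at grid coordinate ("
                + coord[0] + ", " + coord[1] + ")");
    }
}
```

The two dimensions are mapped independently, which is why one descriptor with one range per dimension is enough to describe the whole rectangular array.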

The global array introduced here is a Fortran-style multi-dimensional array, not a Java-like array-of-arrays. Java-style arrays-of-arrays are still useful. For example, one can define a local array of distributed arrays:

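int[] size = {100, 200, 400};
// Local (Java-style) array whose elements are two-dimensional distributed arrays.
float [[,]] d[] = 
        new float [size.length][[,]] ;
// The range arrays must be allocated before their elements are assigned.
Range x[] = new Range [size.length];
Range y[] = new Range [size.length];
for (int l = 0; l < size.length; l++) {
  final int n = size [l] ;
  x[l] = new BlockRange(n, p.dim(0)) ;  
  y[l] = new BlockRange(n, p.dim(1)) ;  
  d[l] = new float [[x[l], y[l]]];
}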
This creates the array of distributed arrays shown in figure 4.

Figure 4: Array of distributed arrays
Bryan Carpenter 2002-07-11