- Idealized picture of a distributed memory parallel computer
- A data distribution leading to excessive communication
- A data distribution leading to poor load balancing
- An ideal data distribution
- Alignment of the three arrays in the LU decomposition example
- Collective communications with 6 processes.
- Simple sequential irregular loop.
- PARTI code for simple parallel irregular loop.
- The process grids illustrated by `p`
- The `Group` hierarchy of HPJava.
- The process dimension and coordinates in `p`.
- A two-dimensional array distributed over `p`.
- A parallel matrix addition.
- Mapping of `x` and `y` locations to the grid `p`.
- Solution for Laplace equation by Jacobi relaxation in HPJava.
- The HPJava Range hierarchy.
- Work distribution for block distribution.
- Work distribution for cyclic distribution.
- Solution for Laplace equation using red-black with ghost regions in HPJava.
- A two-dimensional array, `a`, distributed over one-dimensional grid, `q`.
- A direct matrix multiplication in HPJava.
- Distribution of array elements in example of Figure 3.13. Array `a` is replicated in every column of processes, array `b` is replicated in every row.
- A general matrix multiplication in HPJava.
- A one-dimensional section of a two-dimensional array (shaded area).
- A two-dimensional section of a two-dimensional array (shaded area).
- The dimension splitting in HPJava.
- N-body force computation using dimension splitting in HPJava.
- The HPJava Architecture.
- Part of the lattice of types for multiarrays of `int`.
- Example of HPJava AST and its nodes.
- Hierarchy for statements.
- Hierarchy for expressions.
- Hierarchy for types.
- The architecture of HPJava Front-End.
- Example source program before pre-translation.
- Example source program after pre-translation.
- Translation of a multiarray-valued variable declaration.
- Translation of multiarray creation expression.
- Translation of `on` construct.
- Translation of `at` construct.
- Translation of `overall` construct.
- Translation of multiarray element access.
- Translation of array section with no scalar subscripts.
- Translation of array section without any scalar subscripts in *distributed* dimensions.
- Translation of array section allowing scalar subscripts in distributed dimensions.
- Partial Redundancy Elimination
- Simple example of Partial Redundancy of loop-invariant
- Before and After Strength Reduction
- Translation of `overall` construct, applying Strength Reduction.
- The innermost for loop of naively translated direct matrix multiplication
- Inserting a landing pad to a loop
- After applying PRE to direct matrix multiplication program
- Optimized HPJava program of direct matrix multiplication program by PRE and SR
- The innermost overall loop of naively translated Laplace equation using red-black relaxation
- Optimized HPJava program of Laplace equation using red-black relaxation by HPJOPT2
- Performance for Direct Matrix Multiplication in HPJOPT2, PRE, Naive HPJava, Java, and C on the Linux machine
- Performance comparison between 150 × 150 original Laplace equation from Figure 3.11 and split version of Laplace equation.
- Memory locality for loop unrolling and split versions.
- Performance comparison between 512 × 512 original Laplace equation from Figure 3.11 and split version of Laplace equation.
- 3D Diffusion Equation in Naive HPJava, PRE, HPJOPT2, Java, and C on Linux machine
- Q3 Index Algorithm in HPJava
- Performance for Q3 on Linux machine
- Performance for Direct Matrix Multiplication on SMP.
- 512 × 512 Laplace Equation using Red-Black Relaxation without `Adlib.writeHalo()` on shared memory machine
- Laplace Equation using Red-Black Relaxation on shared memory machine
- Laplace Equation using Red-Black Relaxation on distributed memory machine
- 3D Diffusion Equation on shared memory machine
- 3D Diffusion Equation on distributed memory machine
- Q3 on shared memory machine
- A simple Jacobi program in ZPL.
- Running an HPJava Swing program, `Wolf.hpj`.

Bryan Carpenter 2004-06-09