For the purposes of this paper, and to understand which directions are most
profitable for future development, the simplest and most practical
optimization strategies have been applied ``manually'' to the HPJava
matrix multiplication programs. We believe this was done honestly: it
seems straightforward to apply these strategies automatically, with a
modest amount of extra work on the translator. The matrix sizes are
100 by 100. The results were obtained with the IBM Developer Kit 1.3 (JIT)
with the -O flag on Pentium 4 1.5GHz machines running Red Hat 7.2 Linux.
We also compared sequential Java, C++, and Fortran versions of the HPJava
program, all compiled with the -O or -O5 (i.e. maximum optimization)
flag.
Table 1: Matrix multiplication performance. HPJava code for single
processor. Other codes sequential. All speeds in MFLOPS.

|        | Naive | Strength Reduction | Loop unrolling |
|--------|-------|--------------------|----------------|
| HPJava | 125.0 | 166.7              | 333.4          |
| Java   | 251.3 | 403.2              | 388.3          |
| C++    | 533.3 | 436.7              | 552.5          |
| F77    | 536.2 | 531.9              | 327.9          |

Table 1 shows that the strength reduction optimization
helps in the Java case. For C++ and Fortran, the compilers can be
expected to implement similar optimizations themselves, and we
do not see any improvement from the source-level ``optimizations''.
The main message is that, with a little help in the form of
source-to-source transformations, Java can get 70-75% of the
performance of C++ or Fortran. After similar optimizations, HPJava,
with its more complex subscripting expressions, has similar but marginally
slower performance. Of course we expect that the HPJava results will
scale on suitable parallel platforms, so a modest penalty
in node performance is considered acceptable.
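
To make the two source-level transformations concrete, the following is a minimal sketch in plain Java, not the code that was actually benchmarked. It assumes row-major n-by-n matrices stored in flat double[] arrays, so that the subscript arithmetic is explicit; the class name MatMulSketch, its method names, and the unrolling factor of 4 are hypothetical choices made for illustration only.

```java
import java.util.Arrays;

// Illustrative sketch only (hypothetical names): three equivalent
// matrix multiplication kernels on row-major n-by-n matrices held in flat arrays.
final class MatMulSketch {

    // Naive form: each access recomputes i*n + k and k*n + j.
    static double[] naive(double[] a, double[] b, int n) {
        double[] c = new double[n * n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < n; k++) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
        return c;
    }

    // Strength reduction: subscript multiplications become running offsets
    // that are advanced by addition on each iteration.
    static double[] strengthReduced(double[] a, double[] b, int n) {
        double[] c = new double[n * n];
        for (int i = 0, rowA = 0; i < n; i++, rowA += n) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                int ia = rowA; // offset of a[i][k]
                int ib = j;    // offset of b[k][j]
                for (int k = 0; k < n; k++, ia++, ib += n) {
                    sum += a[ia] * b[ib];
                }
                c[rowA + j] = sum;
            }
        }
        return c;
    }

    // Strength reduction plus unrolling of the inner loop by 4,
    // with a remainder loop for n not divisible by 4.
    static double[] unrolled(double[] a, double[] b, int n) {
        double[] c = new double[n * n];
        final int n2 = 2 * n, n3 = 3 * n, n4 = 4 * n;
        for (int i = 0, rowA = 0; i < n; i++, rowA += n) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                int ia = rowA;
                int ib = j;
                int k = 0;
                for (; k + 3 < n; k += 4, ia += 4, ib += n4) {
                    sum += a[ia] * b[ib];
                    sum += a[ia + 1] * b[ib + n];
                    sum += a[ia + 2] * b[ib + n2];
                    sum += a[ia + 3] * b[ib + n3];
                }
                for (; k < n; k++, ia++, ib += n) {
                    sum += a[ia] * b[ib];
                }
                c[rowA + j] = sum;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int n = 5; // not a multiple of 4, so the remainder loop is exercised
        double[] a = new double[n * n];
        double[] b = new double[n * n];
        for (int i = 0; i < n * n; i++) {
            a[i] = i % 7; // small integer values keep the sums exact
            b[i] = i % 5;
        }
        double[] ref = naive(a, b, n);
        System.out.println(Arrays.equals(ref, strengthReduced(a, b, n)));
        System.out.println(Arrays.equals(ref, unrolled(a, b, n)));
    }
}
```

All three kernels accumulate the products in the same order, so they should agree exactly on the same input. The sketch only checks that the transformations preserve the result; it makes no claim about the speeds any particular JIT would achieve.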
Bryan Carpenter
2004-04-24