

Ising Model Evaluation

In this section we evaluate the performance of some applications written in mpiJava. To compare performance, we developed both sequential and parallel message-passing programs for Monte Carlo simulation of the two-dimensional Ising model. The sequential programs were written in C, Fortran 77, and Java; the message-passing programs in MPI-C, MPI-Fortran 77, and mpiJava. This Ising model test can also be used to assess the quality of parallel random number generators [KO].
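For reference, each lattice site $i$ carries a spin $s_i = \pm 1$, and with ferromagnetic coupling $J$ the energy of a configuration is $E = -J \sum_{\langle ij \rangle} s_i s_j$, where the sum runs over nearest-neighbor pairs. Flipping a single spin therefore changes the energy by $\Delta E = 2 J s_i \sum_{j \in nn(i)} s_j$, which is the quantity tested in the Metropolis accept/reject step used below.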

Two different methods, the Metropolis and Swendsen-Wang cluster algorithms, are used with a standard block domain decomposition. As in the Potts model, a red/black updating scheme is used in the parallel Metropolis Ising simulation [FoxBook88].

Metropolis is easy to parallelize: it requires only local nearest-neighbor communication under a standard domain decomposition, so load balancing is straightforward. Swendsen-Wang requires non-local communication, but still achieves fairly good load balance. We would therefore expect Metropolis to give the best speedups, with Swendsen-Wang not quite as good. A sketch of the parallel Metropolis kernel follows.
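To make the communication pattern concrete, here is a minimal mpiJava sketch of a red/black Metropolis sweep over a one-dimensional strip decomposition with ghost rows. The class name, lattice size, sweep count, and per-rank random seeding are illustrative assumptions, not the thesis benchmark code (which uses a block decomposition):

import mpi.*;

// Sketch: red/black Metropolis for the 2-D Ising model on a strip
// decomposition. Each rank owns L/size consecutive rows plus two
// ghost rows, exchanged with its neighbors before each half-sweep.
public class IsingMetropolisSketch {
    static final double BETA = Math.log(1.0 + Math.sqrt(2.0)) / 2.0; // beta_c

    public static void main(String[] args) throws MPIException {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank(), size = MPI.COMM_WORLD.Size();
        int L = 64;                          // global lattice edge, assumed divisible by size
        int rows = L / size;                 // rows owned by this rank
        int[][] spin = new int[rows + 2][L]; // rows 0 and rows+1 are ghosts
        java.util.Random rng = new java.util.Random(1234 + rank); // toy seeding only
        for (int i = 1; i <= rows; i++)
            for (int j = 0; j < L; j++)
                spin[i][j] = rng.nextBoolean() ? 1 : -1;

        int up = (rank - 1 + size) % size, down = (rank + 1) % size; // periodic
        for (int sweep = 0; sweep < 2000; sweep++) {
            for (int color = 0; color < 2; color++) { // red sites, then black
                // refresh ghost rows: my edge rows go out, neighbors' edges come in
                MPI.COMM_WORLD.Sendrecv(spin[1], 0, L, MPI.INT, up, 0,
                                        spin[rows + 1], 0, L, MPI.INT, down, 0);
                MPI.COMM_WORLD.Sendrecv(spin[rows], 0, L, MPI.INT, down, 1,
                                        spin[0], 0, L, MPI.INT, up, 1);
                for (int i = 1; i <= rows; i++) {
                    int globalRow = rank * rows + (i - 1);
                    for (int j = 0; j < L; j++) {
                        if ((globalRow + j) % 2 != color) continue; // wrong sublattice
                        int nn = spin[i - 1][j] + spin[i + 1][j]
                               + spin[i][(j + 1) % L] + spin[i][(j - 1 + L) % L];
                        double dE = 2.0 * spin[i][j] * nn;          // J = 1
                        if (dE <= 0.0 || rng.nextDouble() < Math.exp(-BETA * dE))
                            spin[i][j] = -spin[i][j];               // accept the flip
                    }
                }
            }
        }
        MPI.Finalize();
    }
}

Since only nearest-neighbor edge rows cross process boundaries, the communication volume per sweep is O(L) against O(L^2/P) computation per rank, which is why Metropolis is expected to scale well.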

The system environment was a cluster of four dual-processor Sun Solaris machines connected by an ATM network.

For comparison, we completed experiments on three different versions of our programming environment: sequential codes, and parallel codes running under MPICH and under Sun HPC MPI. For the sequential F77 and C codes we used the Sun WorkShop FORTRAN 77 5.0 and C 5.0 compilers, and Sun JDK 1.2.2 for the sequential Java code. For the parallel mpiJava codes we used MPICH 1.2.0 and Sun HPC MPI 2.0 on 1, 2, 4, and 8 nodes. MPICH 1.2.0 was compiled with the -comm=shared option to permit communication over both shared memory and TCP/IP, since each of our four Solaris machines has dual processors. Sun HPC MPI uses the Sun ATM network, which is much faster than the p4 device of MPICH. For best performance, all sequential and parallel Fortran, C, and Java codes were compiled with the -O optimization flag, and all sequential Java and mpiJava codes were run with the JIT compiler enabled.

We performed Monte Carlo Ising simulations on simple 2-D lattices with linear sizes L = 32 to 1024 and periodic boundary conditions. To measure the total execution time, we performed 20 iterations to thermalize the system and then at least 2000 Monte Carlo sweeps. The inverse temperature $\beta$ was taken to be the critical inverse temperature of the 2-D Ising model, $\beta_c = \log(1+\sqrt{2})/2 \approx 0.4406868$. Timings were measured with MPI_Wtime for the parallel codes and with the shell built-in time command for the sequential codes. We repeated each benchmark several times, on quiet machines and when there was little network activity.
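A minimal sketch of how such a parallel timing can be taken with MPI.Wtime in mpiJava follows; the elided sweep loop and the convention of reporting the maximum over ranks are assumptions for illustration, not the thesis harness itself:

import mpi.*;

// Sketch of a timing harness: wall-clock time around the sweep loop,
// reduced to the slowest rank's time, which bounds the parallel run.
public class TimingSketch {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);
        double t0 = MPI.Wtime();
        // ... 20 thermalization iterations + 2000 measurement sweeps here ...
        double[] local = { MPI.Wtime() - t0 };
        double[] max = new double[1];
        MPI.COMM_WORLD.Reduce(local, 0, max, 0, 1, MPI.DOUBLE, MPI.MAX, 0);
        if (MPI.COMM_WORLD.Rank() == 0)
            System.out.println("Elapsed: " + max[0] + " s");
        MPI.Finalize();
    }
}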

The Metropolis timing results for the sequential and parallel tests at different lattice sizes are shown in Figures 6.7 through 6.12, and the Swendsen-Wang results in Figures 6.13 through 6.18. The ratios of Java to F77 and Java to C execution times are summarized in Tables 6.1 and 6.2.

Figure 6.7: Metropolis Performance with Lattice Size $32^2$.
Figure 6.8: Metropolis Performance with Lattice Size $64^2$.
[Figures: Figs/met32log.eps, Figs/met64log.eps]

Figure 6.9: Metropolis Performance with Lattice Size $128^2$.
Figure 6.10: Metropolis Performance with Lattice Size $256^2$.
[Figures: Figs/met128log.eps, Figs/met256log.eps]

Figure 6.11: Metropolis Performance with Lattice Size $512^2$.
Figure 6.12: Metropolis Performance with Lattice Size $1024^2$.
[Figures: Figs/met512log.eps, Figs/met1024log.eps]

Figure 6.13: Swendsen-Wang Performance with Lattice Size $32^2$.
Figure 6.14: Swendsen-Wang Performance with Lattice Size $64^2$.
[Figures: Figs/sw32log.eps, Figs/sw64log.eps]

Figure 6.15: Swendsen-Wang Performance with Lattice Size $128^2$.
Figure 6.16: Swendsen-Wang Performance with Lattice Size $256^2$.
[Figures: Figs/sw128log.eps, Figs/sw256log.eps]

Figure 6.17: Swendsen-Wang Performance with Lattice Size $512^2$.
Figure 6.18: Swendsen-Wang Performance with Lattice Size $1024^2$.
[Figures: Figs/sw512log.eps, Figs/sw1024log.eps]

Figure 6.19: Speedup of Metropolis by using mpiJava as compared with serial Java.
Figure 6.20: Speedup of Swendsen-Wang by using mpiJava as compared with serial Java.
[Figures: Figs/met_speedup.eps, Figs/sw_speedup.eps]
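The speedup plotted in Figures 6.19 and 6.20 is, presumably, the usual ratio of the sequential Java execution time to the mpiJava execution time on $p$ nodes, $S(p) = T_{Java}(1) / T_{mpiJava}(p)$.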


Table 6.1: Metropolis: Comparing Java with F77 and C.

                  SunHPC             MPICH              Sequential
  Nodes      Java/F77  Java/C   Java/F77  Java/C   Java/F77  Java/C

  $L=32^2$
  2            5.17     3.33      2.80     2.34      3.86     3.14
  4            2.25     2.00      1.20     1.21
  8            1.34     1.23      1.13     1.14

  $L=64^2$
  2            4.71     2.74      3.57     2.37      3.51     2.87
  4            2.69     2.06      1.59     1.47
  8            1.99     1.57      1.42     1.22

  $L=128^2$
  2            4.52     2.60      4.00     2.42      3.33     2.74
  4            3.43     2.33      2.18     1.79
  8            2.48     2.08      1.67     1.54

  $L=256^2$
  2            3.94     2.18      4.05     2.36      3.30     2.75
  4            3.88     2.29      2.72     1.90
  8            3.07     2.08      2.40     1.84

  $L=512^2$
  2            3.33     2.20      3.31     2.23      3.01     2.73
  4            3.56     2.21      2.96     2.02
  8            3.38     2.17      2.83     2.00

  $L=1024^2$
  2            2.73     2.08      2.66     2.07      2.43     2.71
  4            2.84     2.14      2.67     2.03
  8            2.85     2.11      2.71     2.02

(The sequential ratios do not depend on the number of nodes, so they are listed once per lattice size.)


Table 6.2: Swendsen-Wang: Comparing Java with F77 and C.

                  SunHPC             MPICH              Sequential
  Nodes      Java/F77  Java/C   Java/F77  Java/C   Java/F77  Java/C

  $L=32^2$
  2            5.16     3.49      2.37     2.13      3.60     3.48
  4            2.27     2.10      1.11     1.10
  8            1.43     1.39      1.08     1.08

  $L=64^2$
  2            4.27     2.71      2.92     2.21      3.29     3.13
  4            2.37     2.11      1.24     1.22
  8            1.80     1.64      1.14     1.15

  $L=128^2$
  2            3.75     2.39      3.24     2.24      3.21     3.18
  4            2.88     2.08      1.51     1.38
  8            2.07     1.74      1.33     1.30

  $L=256^2$
  2            3.07     2.31      3.16     2.41      2.63     3.18
  4            3.10     2.12      2.00     1.67
  8            2.57     1.99      1.59     1.47

  $L=512^2$
  2            2.52     2.32      2.57     2.39      1.73     2.98
  4            2.45     2.12      2.03     1.86
  8            2.44     2.08      1.84     1.71

  $L=1024^2$
  2            1.83     2.15      1.98     2.35      1.14     2.90
  4            1.87     2.04      1.76     1.96
  8            1.85     2.07      1.84     2.03

