next up previous contents
Next: Monte Carlo Simulations of Up: Applications and Performance Previous: Applications and Performance   Contents

Parallel Random Number Generator

Random number generators are widely used for simulations in computational science and engineering application. For some Java Grande applications, such Monte Carlo simulations, it is crucial that the random number generator have good randomness properties. This is especially true in large-scale simulations on parallel supercomputers, which consume huge quantities of random numbers, and require parallel algorithms for random number generation. The techniques for testing the quality of parallel random number are presented in [#!KO!#].

Random number generators use iterative deterministic algorithms for producing a sequence of pseudo-random numbers that approximate a truly random sequence. Probably the most commonly-used algorithms for sequential random number generators are Linear Congruential Generators (LCGs). LCGs produce a sequence of random integers using the relation

\begin{displaymath}
X_{i} = (a * X_{i-1} + b) \bmod M,
\end{displaymath} (6.1)

where $a$ is known as the multiplier, $M$ is the modulus, and $b$ is an additive constant. If $b=0$, these are termed Multiplicative Linear Congruential Generators (MLCG). The parameters ($a,b,M$) must be chosen carefully to ensure a large period and good uniformity and randomness properties. Standard random number generators return a floating point number $x_{i}$ in the interval [0,1), which can be obtained by dividing by $M$.

Many different parallel random number generators have been proposed [#!ANDERSON!#,#!Coddington!#], but most of them use the same basic concept, which is to parallelize a sequential generator by taking the elements of the sequence of pseudo-random numbers it generates and distributing them among the processors in some way. One of the method to achieve this goal is to split the sequence into non-overlapping contiguous sections, each of which is generated by a different processor. This also utilizes the fact that this type of generator allows an arbitrary element of the sequence to be calculated. One can therefore simply divide the period of the generator by the number of processors, and jump ahead in the sequence by this amount for each processor. Alternatively, the length of each section of numbers could be chosen much larger than could be used by any processor. If the length of the sections is $L$, then processor $P$ would generate the sequence

\begin{displaymath}
X_{PL}, X_{PL+1}, X_{PL+2}, \ldots,
\end{displaymath}

This method requires the ability to jump ahead in the sequence by a given amount.

Another simple method for parallelizing random number generators is for processor $P$ of an processor machine to generate the sub-sequence

\begin{displaymath}
X_{P}, X_{P+N}, X_{P+2N}, \ldots,
\end{displaymath}

so that the sequence is spread across processors in the same way as a deck of cards is dealt in turn to players in a card game. This is known as the leapfrog technique, since each processor leap-frogs by in the sequence. In order to use this method we need to be able to easily jump ahead in the sequence. Linear congruential generators are simple enough that it is possible to specify analytically an arbitrary element in the sequence. By recursively applying equation 6.1, it is easy to show that the element $X_{P+N}$ is given by
$\displaystyle X_{P+N}$ $\textstyle =$ $\displaystyle (A * X_{P} + B) \bmod M$ (6.2)

with
$\displaystyle A$ $\textstyle =$ $\displaystyle a^N$ (6.3)
$\displaystyle B$ $\textstyle =$ $\displaystyle \sum_{k=0}^{N-1} a^k = (a^N - 1)/(a - 1)$  

Thus moving through the sequence with stride , rather than 1, is simply accomplished by specifying a different multiplier and additive constant for the generator.
Figure 6.1: Parallel Multiplicative Linear Congruential Generators (MLCG).
\begin{figure}\footnotesize\noindent \mbox{}\hrulefill\mbox{}
\begin{verbatim}...
...);
}
}\end{verbatim}\normalsize\noindent \mbox{}\hrulefill\mbox{}
\end{figure}

In the mpiJava Potts model and Ising model simulations, we use the parallel MLCG using leapfrog algorithm as shown in Figure 6.1. The leapfrog method is consequently very easy to implement on a parallel machine. The idea is to take a standard linear congruential algorithm, and have every processor compute the Nth iterate of that algorithm (where N is the number of processors.) [#!FoxBook88!#]. Each node computes random numbers using Eq. 6.2 and $A$ and $B$ in Eq. 6.3 are calculated only once and stored. The nodes are given a staggered start, so that their sequences don't overlap. Let the seed, or 0th random number, be denoted by . Using this as a seed, a sequential computer would compute the sequence:

$\displaystyle Y_0$ $\textstyle =$ $\displaystyle Y_0$ (6.4)
$\displaystyle Y_1$ $\textstyle =$ $\displaystyle (aY_0 + b) \bmod M$  
$\displaystyle Y_2$ $\textstyle =$ $\displaystyle (aY_1 + b) \bmod M$  
  $\textstyle \ldots$    

The seeds in the nodes are set as follows (subscripts denote position in the random number sequence, superscript denote processing node):

$\displaystyle X^0_0$ $\textstyle =$ $\displaystyle Y_0$ (6.5)
$\displaystyle X^1_0$ $\textstyle =$ $\displaystyle (aY_0 + b) \bmod M = Y_1$  
  $\textstyle \ldots$    
$\displaystyle X^{N-1}_0$ $\textstyle =$ $\displaystyle Y_{N-1}$  

This defines the staggered start. The node now use Eq. 6.2 to compute the next member of each of their sequences. Therefore

$\displaystyle X^0_1$ $\textstyle =$ $\displaystyle Y_{0+N}$ (6.6)
$\displaystyle X^1_1$ $\textstyle =$ $\displaystyle Y_{1+N}$  
  $\textstyle \ldots$    
$\displaystyle X^{N-1}_1$ $\textstyle =$ $\displaystyle Y_{N-1+N}$  

and,
$\displaystyle X^0_2$ $\textstyle =$ $\displaystyle Y_{0+2N}$ (6.7)
$\displaystyle X^1_2$ $\textstyle =$ $\displaystyle Y_{1+2N}$  
  $\textstyle \ldots$    
$\displaystyle X^{N-1}_2$ $\textstyle =$ $\displaystyle Y_{N-1+2N}$  

and so on.

If the seeds are set up properly (staggered) the processors leapfrog over one another, and it is just as good as having used the basic algorithm on a sequential machine.


next up previous contents
Next: Monte Carlo Simulations of Up: Applications and Performance Previous: Applications and Performance   Contents
Bryan Carpenter 2004-06-09