
Discretization and numerical integration

The system of partial differential equations is discretized by a finite volume approach--see for example [7] or [11]. Space is divided into a series of quadrilateral (but not necessarily rectangular) cells labelled $(i,j)$. This reduces the PDEs to a large coupled system of ordinary differential equations. These are integrated by a variant of the well-known 4th order Runge-Kutta scheme. A single time-step involves four stages like:

\begin{displaymath}
U'_{i,j} = U_{i,j} - \alpha \frac{\delta t}{\Omega_{i,j}} R_{i,j}(U) \qquad (2)
\end{displaymath}

where $\alpha$ is a fractional value characteristic of the scheme, and
\begin{displaymath}
R_{i,j}(U) = \sum_{\mbox{faces of cell}} (f\,\delta y - g\,\delta x) \qquad (3)
\end{displaymath}

Here $\Omega_{i,j}$ is the volume of a cell and $\delta x$, $\delta y$ are coordinate differences between the end-points of a face. Since the dependent variables and fluxes are defined at cell centers, their values at a cell face in equation 3 are approximated as the average of the values from the two cells meeting at the face.
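
For concreteness, a typical four-stage cycle of this kind might chain the stages as follows (the particular coefficients $\frac{1}{4}, \frac{1}{3}, \frac{1}{2}, 1$ are the classic Jameson-style choice, quoted here as an illustration rather than taken from this code):
\begin{displaymath}
\begin{array}{rcl}
U^{(1)} & = & U^n - \frac{1}{4} \frac{\delta t}{\Omega} R(U^n) \\
U^{(2)} & = & U^n - \frac{1}{3} \frac{\delta t}{\Omega} R(U^{(1)}) \\
U^{(3)} & = & U^n - \frac{1}{2} \frac{\delta t}{\Omega} R(U^{(2)}) \\
U^{n+1} & = & U^n - \frac{\delta t}{\Omega} R(U^{(3)})
\end{array}
\end{displaymath}
(suppressing the $i,j$ subscripts). Each stage recomputes the residual $R$ from the most recent intermediate state.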

So at its most basic level the program for integrating the Euler equations consists of a series of steps like:

  1. Calculate $p$, $H$ from current $U$ (via equations of state).
  2. Calculate $f$ from $U$, $p$, $H$.
  3. Calculate $g$ from $U$, $p$, $H$.
  4. Calculate $R$ from $f$, $g$.
  5. Update $U$.
To parallelize in HPJava, the discretized field variables are naturally stored in distributed arrays. All the steps above become overall nests. As a relatively simple case, the operation to calculate $f$ (step 2) looks like:

\begin{verbatim}
Statevector [[-,-]] f = new Statevector [[x, y]] ;

overall(i = x for :)
  overall(j = y for :) {
    double u = U [i, j].b / U [i, j].a ;

    f [i, j].a = U [i, j].b ;
    f [i, j].b = U [i, j].b * u + p [i, j] ;
    f [i, j].c = U [i, j].c * u ;
    f [i, j].d = H [i, j] * u ;
  }
\end{verbatim}

The four fields a, b, c, d of Statevector correspond to the four conserved densities. A general observation is that the bodies of overall statements are now more complex than those in the (perhaps artificially simple) Poisson equation example of the previous section. We expect this will often happen in ``real'' applications. This is good for HPJava, because it means that the various overheads associated with starting up a distributed loop are better amortized.
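
For comparison, the update of step 5 (equation 2, for one Runge-Kutta stage) translates into a similar nest. The following is only a sketch under assumed names: it supposes the residual is held in a distributed array r, the state at the start of the time-step in Uold, the cell volumes in vol, and that alpha and dt are ordinary double variables; none of these identifiers come from the original code.

\begin{verbatim}
overall(i = x for :)
  overall(j = y for :) {
    double c = alpha * dt / vol [i, j] ;   // alpha * delta t / Omega

    U [i, j].a = Uold [i, j].a - c * r [i, j].a ;
    U [i, j].b = Uold [i, j].b - c * r [i, j].b ;
    U [i, j].c = Uold [i, j].c - c * r [i, j].c ;
    U [i, j].d = Uold [i, j].d - c * r [i, j].d ;
  }
\end{verbatim}

Like step 2, this is pure computation on aligned data, with no communication.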

Another noteworthy point is that these overall statements operate naturally on aligned data--no communication is needed here. Of the five steps enumerated above, only the computation of $R$ (step 4) involves non-local terms (formally because of the use of averages across adjacent cells for the flow values at the faces). The code can be written easily using ghost regions, shifted indices, and the writeHalo() operation. Again it involves a single overall nest with a long body. A much-elided outline is given in Figure 5. The optional arguments wlo, whi to Adlib.writeHalo() define the widths of the parts of the ghost regions that need updating (the default is to update the whole of the ghost regions of the array, whatever their width). In the current case both vectors have value [1, 1], because shifted indices displace one site in the positive and negative x and y directions.

Figure 5: Outline of computation of $R$.
\begin{figure}\small
\begin{verbatim}
Adlib.writeHalo(f, wlo, whi) ;
Adlib.writeHalo(g, wlo, whi) ;

overall(i = x for :)
  overall(j = y for :) {

    ... Compute E face contribution to R [i, j], averaging f and g
        over cells (i, j) and (i + 1, j) and using coordinate
        differences formed from xnode and ynode ...

    ... Add similar contributions for S, W, N faces ...
  }
\end{verbatim}
\end{figure}

The arrays xnode and ynode hold the coordinates of the cell vertices. Because these arrays are constant throughout the computation, their ghost regions are initialized once during startup.
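
As a sketch of how such arrays might be created and their halos primed--assuming the ExtBlockRange class familiar from other HPJava examples, whose last constructor argument we take to be the ghost-region width, and hypothetical grid dimensions M, N and process grid procs--one could write:

\begin{verbatim}
Procs2 procs = new Procs2(2, 2) ;
on(procs) {
  Range x = new ExtBlockRange(M, procs.dim(0), 1) ;  // width-1 ghosts
  Range y = new ExtBlockRange(N, procs.dim(1), 1) ;

  double [[-,-]] xnode = new double [[x, y]] ;
  double [[-,-]] ynode = new double [[x, y]] ;

  ... initialize vertex coordinates ...

  // Executed once at startup, because the coordinates never change :
  Adlib.writeHalo(xnode) ;
  Adlib.writeHalo(ynode) ;
}
\end{verbatim}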

We will briefly discuss two other interesting complications: handling of so-called artificial viscosity, and imposition of boundary conditions.

Artificial viscosity (or artificial smoothing) is added to damp out a numerical instability in the Runge-Kutta time-stepping scheme, which otherwise causes unphysical oscillatory modes associated with the discretization to grow. An accepted scheme adds small terms proportional to 2nd and 4th order finite difference operators to the update of $U$. From the point of view of HPJava programming, one interesting issue is that the 4th order damping implies an update stencil requiring source elements offset two places from the destination element (unlike Figure 5, for example, where the maximum offset is one). This is handled by creating the $U$ array with ghost regions of width 2.
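
A minimal sketch of the wider stencil, under the same assumed ExtBlockRange constructor, showing only a fourth difference in the x direction for the density component a:

\begin{verbatim}
Range x = new ExtBlockRange(M, procs.dim(0), 2) ;  // width-2 ghosts
Range y = new ExtBlockRange(N, procs.dim(1), 2) ;

Statevector [[-,-]] U = new Statevector [[x, y]] ;
...
Adlib.writeHalo(U) ;

overall(i = x for :)
  overall(j = y for :) {
    // 4th order difference: source elements offset up to 2 places
    double d4a = U [i - 2, j].a - 4 * U [i - 1, j].a + 6 * U [i, j].a
               - 4 * U [i + 1, j].a + U [i + 2, j].a ;

    ... scale by a small coefficient and fold into the update ...
  }
\end{verbatim}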

Implementing numerically stable boundary conditions for the Euler equations is non-trivial. In our implementation the domain of cells is rectangular, though the grid is distorted into an irregular pipe profile by the choice of physical coordinates attached to the grid points (the xnode, ynode distributed arrays). HPJava has an additional control construct called at, which can be used to update edges (among other uses). The at statement is a degenerate form of the overall statement: it ``enumerates'' only a single location in its specified range. To work along the line $x = 0$, for example, one may write code like:

\begin{verbatim}
at(i = x [0])
  overall(j = y for :) {
    ... update U [i, j] in terms of U [i + 1, j], etc ...
  }
\end{verbatim}

The actual code in the body is a fairly complicated interpolation based on Riemann invariants. In general, access to U [i + 1, j] here relies on the ghost regions being up-to-date, exactly as for an index scoped by an overall statement.
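
The opposite edge is handled symmetrically. As a hypothetical sketch (taking the extent of the range x to be M, a name not used in the original):

\begin{verbatim}
at(i = x [M - 1])
  overall(j = y for :) {
    ... update U [i, j] in terms of U [i - 1, j], etc ...
  }
\end{verbatim}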

