$$ \newcommand{\half}{\frac{1}{2}} \newcommand{\halfi}{{1/2}} \newcommand{\tp}{\thinspace .} \newcommand{\uex}{{u_{\small\mbox{e}}}} \newcommand{\xpoint}{\boldsymbol{x}} \newcommand{\normalvec}{\boldsymbol{n}} \newcommand{\Oof}[1]{\mathcal{O}(#1)} \renewcommand{\u}{\boldsymbol{u}} \newcommand{\f}{\boldsymbol{f}} \newcommand{\stress}{\boldsymbol{\sigma}} \newcommand{\I}{\boldsymbol{I}} \newcommand{\T}{\boldsymbol{T}} \newcommand{\ii}{\boldsymbol{i}} \newcommand{\jj}{\boldsymbol{j}} \newcommand{\kk}{\boldsymbol{k}} \newcommand{\Ix}{\mathcal{I}_x} \newcommand{\Iy}{\mathcal{I}_y} \newcommand{\Iz}{\mathcal{I}_z} \newcommand{\It}{\mathcal{I}_t} \newcommand{\setb}[1]{#1^0} % set begin \newcommand{\sete}[1]{#1^{-1}} % set end \newcommand{\setl}[1]{#1^-} \newcommand{\setr}[1]{#1^+} \newcommand{\seti}[1]{#1^i} \newcommand{\Real}{\mathbb{R}} \newcommand{\Integer}{\mathbb{Z}} $$

Implementation

We shall now describe in detail various Python implementations for solving a standard 2D, linear wave equation with constant wave velocity and \( u=0 \) on the boundary. The wave equation is to be solved in the space-time domain \( \Omega\times (0,T] \), where \( \Omega = (0,L_x)\times (0,L_y) \) is a rectangular spatial domain. More precisely, the complete initial-boundary value problem is defined by $$ \begin{align} u_{tt} &= c^2(u_{xx} + u_{yy}) + f(x,y,t),\quad &(x,y)\in \Omega,\ t\in (0,T], \tag{113}\\ u(x,y,0) &= I(x,y),\quad &(x,y)\in\Omega, \tag{114}\\ u_t(x,y,0) &= V(x,y),\quad &(x,y)\in\Omega, \tag{115}\\ u &= 0,\quad &(x,y)\in\partial\Omega,\ t\in (0,T], \tag{116} \end{align} $$ where \( \partial\Omega \) is the boundary of \( \Omega \), in this case the four sides of the rectangle \( \Omega = [0,L_x]\times [0,L_y] \): \( x=0 \), \( x=L_x \), \( y=0 \), and \( y=L_y \).

The PDE is discretized as $$ [D_t D_t u = c^2(D_xD_x u + D_yD_y u) + f]^n_{i,j}, $$ which leads to an explicit updating formula to be implemented in a program: $$ \begin{align} u^{n+1} &= -u^{n-1}_{i,j} + 2u^n_{i,j} + \nonumber\\ &\quad C_x^2( u^{n}_{i+1,j} - 2u^{n}_{i,j} + u^{n}_{i-1,j}) + C_y^2 (u^{n}_{i,j+1} - 2u^{n}_{i,j} + u^{n}_{i,j-1}) + \Delta t^2 f_{i,j}^n, \tag{117} \end{align} $$ for all interior mesh points \( i\in\seti{\Ix} \) and \( j\in\seti{\Iy} \), and for \( n\in\setr{\It} \). The constants \( C_x \) and \( C_y \) are defined as $$ C_x = c\frac{\Delta t}{\Delta x},\quad C_x = c\frac{\Delta t}{\Delta y} \tp $$

At the boundary, we simply set \( u^{n+1}_{i,j}=0 \) for \( i=0 \), \( j=0,\ldots,N_y \); \( i=N_x \), \( j=0,\ldots,N_y \); \( j=0 \), \( i=0,\ldots,N_x \); and \( j=N_y \), \( i=0,\ldots,N_x \). For the first step, \( n=0 \), (117) is combined with the discretization of the initial condition \( u_t=V \), \( [D_{2t} u = V]^0_{i,j} \) to obtain a special formula for \( u^1_{i,j} \) at the interior mesh points: $$ \begin{align} u^{1} &= u^0_{i,j} + \Delta t V_{i,j} + \nonumber\\ &\quad {\half}C_x^2( u^{0}_{i+1,j} - 2u^{0}_{i,j} + u^{0}_{i-1,j}) + {\half}C_y^2 (u^{0}_{i,j+1} - 2u^{0}_{i,j} + u^{0}_{i,j-1}) + \half\Delta t^2f_{i,j}^n, \tag{118} \end{align} $$

The algorithm is very similar to the one in 1D:

  1. Set initial condition \( u^0_{i,j}=I(x_i,y_j) \)
  2. Compute \( u^1_{i,j} \) from (117)
  3. Set \( u^1_{i,j}=0 \) for the boundaries \( i=0,N_x \), \( j=0,N_y \)
  4. For \( n=1,2,\ldots,N_t \):
    1. Find \( u^{n+1}_{i,j} \) from (117) for all internal mesh points, \( i\in\seti{\Ix} \), \( j\in\seti{\Iy} \)
    2. Set \( u^{n+1}_{i,j}=0 \) for the boundaries \( i=0,N_x \), \( j=0,N_y \)

Scalar computations

The solver function for a 2D case with constant wave velocity and boundary condition \( u=0 \) is analogous to the 1D case with similar parameter values (see wave1D_u0.py), apart from a few necessary extensions. The code is found in the program wave2D_u0.py.

Domain and mesh

The spatial domain is now \( [0,L_x]\times [0,L_y] \), specified by the arguments Lx and Ly. Similarly, the number of mesh points in the \( x \) and \( y \) directions, \( N_x \) and \( N_y \), become the arguments Nx and Ny. In multi-dimensional problems it makes less sense to specify a Courant number since the wave velocity is a vector and mesh spacings may differ in the various spatial directions. We therefore give \( \Delta t \) explicitly. The signature of the solver function is then

def solver(I, V, f, c, Lx, Ly, Nx, Ny, dt, T,
           user_action=None, version='scalar'):

Key parameters used in the calculations are created as

x = linspace(0, Lx, Nx+1)                  # mesh points in x dir
y = linspace(0, Ly, Ny+1)                  # mesh points in y dir
dx = x[1] - x[0]
dy = y[1] - y[0]
Nt = int(round(T/float(dt)))
t = linspace(0, N*dt, N+1)                 # mesh points in time
Cx2 = (c*dt/dx)**2;  Cy2 = (c*dt/dy)**2    # help variables
dt2 = dt**2

Solution arrays

We store \( u^{n+1}_{i,j} \), \( u^{n}_{i,j} \), and \( u^{n-1}_{i,j} \) in three two-dimensional arrays,

u   = zeros((Nx+1,Ny+1))   # solution array
u_1 = zeros((Nx+1,Ny+1))   # solution at t-dt
u_2 = zeros((Nx+1,Ny+1))   # solution at t-2*dt

where \( u^{n+1}_{i,j} \) corresponds to u[i,j], \( u^{n}_{i,j} \) to u_1[i,j], and \( u^{n-1}_{i,j} \) to u_2[i,j]

Index sets

It is also convenient to introduce the index sets (cf. The section Index set notation)

Ix = range(0, u.shape[0])
Iy = range(0, u.shape[1])
It = range(0, t.shape[0])

Computing the solution

Inserting the initial condition I in u_1 and making a callback to the user in terms of the user_action function is a straightforward generalization of the 1D code from the section Sketch of an implementation:

for i in Ix:
    for j in Iy:
        u_1[i,j] = I(x[i], y[j])

if user_action is not None:
    user_action(u_1, x, xv, y, yv, t, 0)

The user_action function has additional arguments compared to the 1D case. The arguments xv and yv will be commented upon in the section Vectorized computations.

The key finite difference formula (110) for updating the solution at a time level is implemented in a separate function as

def advance_scalar(u, u_1, u_2, f, x, y, t, n, Cx2, Cy2, dt2,
                   V=None, step1=False):
    Ix = range(0, u.shape[0]);  Iy = range(0, u.shape[1])
    if step1:
        dt = sqrt(dt2)  # save
        Cx2 = 0.5*Cx2;  Cy2 = 0.5*Cy2; dt2 = 0.5*dt2  # redefine
        D1 = 1;  D2 = 0
    else:
        D1 = 2;  D2 = 1
    for i in Ix[1:-1]:
        for j in Iy[1:-1]:
            u_xx = u_1[i-1,j] - 2*u_1[i,j] + u_1[i+1,j]
            u_yy = u_1[i,j-1] - 2*u_1[i,j] + u_1[i,j+1]
            u[i,j] = D1*u_1[i,j] - D2*u_2[i,j] + \ 
                     Cx2*u_xx + Cy2*u_yy + dt2*f(x[i], y[j], t[n])
            if step1:
                u[i,j] += dt*V(x[i], y[j])
    # Boundary condition u=0
    j = Iy[0]
    for i in Ix: u[i,j] = 0
    j = Iy[-1]
    for i in Ix: u[i,j] = 0
    i = Ix[0]
    for j in Iy: u[i,j] = 0
    i = Ix[-1]
    for j in Iy: u[i,j] = 0
    return u

The step1 variable has been introduced to allow the formula to be reused for first step \( u^1_{i,j} \):

u = advance_scalar(u, u_1, u_2, f, x, y, t,
                   n, Cx2, Cy2, dt, V, step1=True)

Below, we will make many alternative implementations of the advance_scalar function to speed up the code since most of the CPU time in simulations is spent in this function.

Finally, we remark that the solver function in the wave2D_u0.py code updates arrays for the next time step by switching references as described in the section Remark on the updating of arrays. If the solution u is returned from solver, which it is not, it is important to set u = u_1 after the time loop, otherwise u actually contains u_2.

Vectorized computations

The scalar code above turns out to be extremely slow for large 2D meshes, and probably useless in 3D beyond debugging of small test cases. Vectorization is therefore a must for multi-dimensional finite difference computations in Python. For example, with a mesh consisting of \( 30\times 30 \) cells, vectorization brings down the CPU time by a factor of 70 (!).

In the vectorized case, we must be able to evaluate user-given functions like \( I(x,y) \) and \( f(x,y,t) \) for the entire mesh in one operation (without loops). These user-given functions are provided as Python functions I(x,y) and f(x,y,t), respectively. Having the one-dimensional coordinate arrays x and y is not sufficient when calling I and f in a vectorized way. We must extend x and y to their vectorized versions xv and yv:

from numpy import newaxis
xv = x[:,newaxis]
yv = y[newaxis,:]
# or
xv = x.reshape((x.size, 1))
yv = y.reshape((1, y.size))

This is a standard required technique when evaluating functions over a 2D mesh, say sin(xv)*cos(xv), which then gives a result with shape (Nx+1,Ny+1). Calling I(xv, yv) and f(xv, yv, t[n]) will now return I and f values for the entire set of mesh points.

With the xv and yv arrays for vectorized computing, setting the initial condition is just a matter of

u_1[:,:] = I(xv, yv)

One could also have written u_1 = I(xv, yv) and let u_1 point to a new object, but vectorized operations often make use of direct insertion in the original array through u_1[:,:], because sometimes not all of the array is to be filled by such a function evaluation. This is the case with the computational scheme for \( u^{n+1}_{i,j} \):

def advance_vectorized(u, u_1, u_2, f_a, Cx2, Cy2, dt2,
                       V=None, step1=False):
    if step1:
        dt = sqrt(dt2)  # save
        Cx2 = 0.5*Cx2;  Cy2 = 0.5*Cy2; dt2 = 0.5*dt2  # redefine
        D1 = 1;  D2 = 0
    else:
        D1 = 2;  D2 = 1
    u_xx = u_1[:-2,1:-1] - 2*u_1[1:-1,1:-1] + u_1[2:,1:-1]
    u_yy = u_1[1:-1,:-2] - 2*u_1[1:-1,1:-1] + u_1[1:-1,2:]
    u[1:-1,1:-1] = D1*u_1[1:-1,1:-1] - D2*u_2[1:-1,1:-1] + \ 
                   Cx2*u_xx + Cy2*u_yy + dt2*f_a[1:-1,1:-1]
    if step1:
        u[1:-1,1:-1] += dt*V[1:-1, 1:-1]
    # Boundary condition u=0
    j = 0
    u[:,j] = 0
    j = u.shape[1]-1
    u[:,j] = 0
    i = 0
    u[i,:] = 0
    i = u.shape[0]-1
    u[i,:] = 0
    return u

def quadratic(Nx, Ny, version):
    """Exact discrete solution of the scheme."""

    def exact_solution(x, y, t):
        return x*(Lx - x)*y*(Ly - y)*(1 + 0.5*t)

    def I(x, y):
        return exact_solution(x, y, 0)

    def V(x, y):
        return 0.5*exact_solution(x, y, 0)

    def f(x, y, t):
        return 2*c**2*(1 + 0.5*t)*(y*(Ly - y) + x*(Lx - x))

    Lx = 5;  Ly = 2
    c = 1.5
    dt = -1 # use longest possible steps
    T = 18

    def assert_no_error(u, x, xv, y, yv, t, n):
        u_e = exact_solution(xv, yv, t[n])
        diff = abs(u - u_e).max()
        tol = 1E-12
        msg = 'diff=%g, step %d, time=%g' % (diff, n, t[n])
        assert diff < tol, msg

    new_dt, cpu = solver(
        I, V, f, c, Lx, Ly, Nx, Ny, dt, T,
        user_action=assert_no_error, version=version)
    return new_dt, cpu


def test_quadratic():
    # Test a series of meshes where Nx > Ny and Nx < Ny
    versions = 'scalar', 'vectorized', 'cython', 'f77', 'c_cy', 'c_f2py'
    for Nx in range(2, 6, 2):
        for Ny in range(2, 6, 2):
            for version in versions:
                print 'testing', version, 'for %dx%d mesh' % (Nx, Ny)
                quadratic(Nx, Ny, version)

def run_efficiency(nrefinements=4):
    def I(x, y):
        return sin(pi*x/Lx)*sin(pi*y/Ly)

    Lx = 10;  Ly = 10
    c = 1.5
    T = 100
    versions = ['scalar', 'vectorized', 'cython', 'f77',
               'c_f2py', 'c_cy']
    print ' '*15, ''.join(['%-13s' % v for v in versions])
    for Nx in 15, 30, 60, 120:
        cpu = {}
        for version in versions:
            dt, cpu_ = solver(I, None, None, c, Lx, Ly, Nx, Nx,
                              -1, T, user_action=None,
                              version=version)
            cpu[version] = cpu_
        cpu_min = min(list(cpu.values()))
        if cpu_min < 1E-6:
            print 'Ignored %dx%d grid (too small execution time)' \ 
                  % (Nx, Nx)
        else:
            cpu = {version: cpu[version]/cpu_min for version in cpu}
            print '%-15s' % '%dx%d' % (Nx, Nx),
            print ''.join(['%13.1f' % cpu[version] for version in versions])

def gaussian(plot_method=2, version='vectorized', save_plot=True):
    """
    Initial Gaussian bell in the middle of the domain.
    plot_method=1 applies mesh function, =2 means surf, =0 means no plot.
    """
    # Clean up plot files
    for name in glob('tmp_*.png'):
        os.remove(name)

    Lx = 10
    Ly = 10
    c = 1.0

    def I(x, y):
        """Gaussian peak at (Lx/2, Ly/2)."""
        return exp(-0.5*(x-Lx/2.0)**2 - 0.5*(y-Ly/2.0)**2)

    if plot_method == 3:
        from mpl_toolkits.mplot3d import axes3d
        import matplotlib.pyplot as plt
        from matplotlib import cm
        plt.ion()
        fig = plt.figure()
        u_surf = None

    def plot_u(u, x, xv, y, yv, t, n):
        if t[n] == 0:
            time.sleep(2)
        if plot_method == 1:
            mesh(x, y, u, title='t=%g' % t[n], zlim=[-1,1],
                 caxis=[-1,1])
        elif plot_method == 2:
            surfc(xv, yv, u, title='t=%g' % t[n], zlim=[-1, 1],
                  colorbar=True, colormap=hot(), caxis=[-1,1],
                  shading='flat')
        elif plot_method == 3:
            print 'Experimental 3D matplotlib...under development...'
            #plt.clf()
            ax = fig.add_subplot(111, projection='3d')
            u_surf = ax.plot_surface(xv, yv, u, alpha=0.3)
            #ax.contourf(xv, yv, u, zdir='z', offset=-100, cmap=cm.coolwarm)
            #ax.set_zlim(-1, 1)
            # Remove old surface before drawing
            if u_surf is not None:
                ax.collections.remove(u_surf)
            plt.draw()
            time.sleep(1)
        if plot_method > 0:
            time.sleep(0) # pause between frames
            if save_plot:
                filename = 'tmp_%04d.png' % n
                savefig(filename)  # time consuming!

    Nx = 40; Ny = 40; T = 20
    dt, cpu = solver(I, None, None, c, Lx, Ly, Nx, Ny, -1, T,
                     user_action=plot_u, version=version)



if __name__ == '__main__':
    test_quadratic()

Array slices in 2D are more complicated to understand than those in 1D, but the logic from 1D applies to each dimension separately. For example, when doing \( u^{n}_{i,j} - u^{n}_{i-1,j} \) for \( i\in\setr{\Ix} \), we just keep j constant and make a slice in the first index: u_1[1:,j] - u_1[:-1,j], exactly as in 1D. The 1: slice specifies all the indices \( i=1,2,\ldots,N_x \) (up to the last valid index), while :-1 specifies the relevant indices for the second term: \( 0,1,\ldots,N_x-1 \) (up to, but not including the last index).

In the above code segment, the situation is slightly more complicated, because each displaced slice in one direction is accompanied by a 1:-1 slice in the other direction. The reason is that we only work with the internal points for the index that is kept constant in a difference.

The boundary conditions along the four sides makes use of a slice consisting of all indices along a boundary:

u[: ,0] = 0
u[:,Ny] = 0
u[0 ,:] = 0
u[Nx,:] = 0

In the vectorized update of u (above), the function f is first computed as an array over all mesh points:

f_a = f(xv, yv, t[n])

We could, alternatively, have used the call f(xv, yv, t[n])[1:-1,1:-1] in the last term of the update statement, but other implementations in compiled languages benefit from having f available in an array rather than calling our Python function f(x,y,t) for every point.

Also in the advance_vectorized function we have introduced a boolean step1 to reuse the formula for the first time step in the same way as we did with advance_scalar. We refer to the solver function in wave2D_u0.py for the details on how the overall algorithm is implemented.

The callback function now has the arguments u, x, xv, y, yv, t, n. The inclusion of xv and yv makes it easy to, e.g., compute an exact 2D solution in the callback function and compute errors, through an expression like u - u_exact(xv, yv, t[n]).

Verification

Testing a quadratic solution

The 1D solution from the section Constructing an exact solution of the discrete equations can be generalized to multi-dimensions and provides a test case where the exact solution also fulfills the discrete equations, such that we know (to machine precision) what numbers the solver function should produce. In 2D we use the following generalization of (30): $$ \begin{equation} \uex(x,y,t) = x(L_x-x)y(L_y-y)(1+{\half}t) \tp \tag{119} \end{equation} $$ This solution fulfills the PDE problem if \( I(x,y)=\uex(x,y,0) \), \( V=\half\uex(x,y,0) \), and \( f=2c^2(1+{\half}t)(y(L_y-y) + x(L_x-x)) \). To show that \( \uex \) also solves the discrete equations, we start with the general results \( [D_t D_t 1]^n=0 \), \( [D_t D_t t]^n=0 \), and \( [D_t D_t t^2]=2 \), and use these to compute $$ \begin{align*} [D_xD_x \uex]^n_{i,j} &= [y(L_y-y)(1+{\half}t) D_xD_x x(L_x-x)]^n_{i,j}\\ &= y_j(L_y-y_j)(1+{\half}t_n)(-2)\tp \end{align*} $$ A similar calculation must be carried out for the \( [D_yD_y \uex]^n_{i,j} \) and \( [D_tD_t \uex]^n_{i,j} \) terms. One must also show that the quadratic solution fits the special formula for \( u^1_{i,j} \). The details are left as Exercise 15: Check that a solution fulfills the discrete model. The test_quadratic function in the wave2D_u0.py program implements this verification as a nose test.

Using classes to implement a simulator

Exercises

Exercise 15: Check that a solution fulfills the discrete model

Carry out all mathematical details to show that (119) is indeed a solution of the discrete model for a 2D wave equation with \( u=0 \) on the boundary. One must check the boundary conditions, the initial conditions, the general discrete equation at a time level and the special version of this equation for the first time level. Filename: check_quadratic_solution.

Project 16: Calculus with 2D mesh functions

The goal of this project is to redo Project 5: Calculus with 1D mesh functions with 2D mesh functions (\( f_{i,j} \)).

Differentiation. The differentiation results in a discrete gradient function, which in the 2D case can be represented by a three-dimensional array df[d,i,j] where d represents the direction of the derivative, and i,j is a mesh point in 2D. Use centered differences for the derivative at inner points and one-sided forward or backward differences at the boundary points. Construct unit tests and write a corresponding test function.

Integration. The integral of a 2D mesh function \( f_{i,j} \) is defined as $$ F_{i,j} = \int_{y_0}^{y_j} \int_{x_0}^{x_i} f(x,y)dxdy,$$ where \( f(x,y) \) is a function that takes on the values of the discrete mesh function \( f_{i,j} \) at the mesh points, but can also be evaluated in between the mesh points. The particular variation between mesh points can be taken as bilinear, but this is not important as we will use a product Trapezoidal rule to approximate the integral over a cell in the mesh and then we only need to evaluate \( f(x,y) \) at the mesh points.

Suppose \( F_{i,j} \) is computed. The calculation of \( F_{i+1,j} \) is then $$ \begin{align*} F_{i+1,j} &= F_{i,j} + \int_{x_i}^{x_{i+1}}\int_{y_0}^{y_j} f(x,y)dydx\\ & \approx \Delta x \half\left( \int_{y_0}^{y_j} f(x_{i},y)dy + \int_{y_0}^{y_j} f(x_{i+1},y)dy\right) \end{align*} $$ The integrals in the \( y \) direction can be approximated by a Trapezoidal rule. A similar idea can be used to compute \( F_{i,j+1} \). Thereafter, \( F_{i+1,j+1} \) can be computed by adding the integral over the final corner cell to \( F_{i+1,j} + F_{i,j+1} - F_{i,j} \). Carry out the details of these computations and implement a function that can return \( F_{i,j} \) for all mesh indices \( i \) and \( j \). Use the fact that the Trapezoidal rule is exact for linear functions and write a test function. Filename: mesh_calculus_2D.

Exercise 17: Implement Neumann conditions in 2D

Modify the wave2D_u0.py program, which solves the 2D wave equation \( u_{tt}=c^2(u_{xx}+u_{yy}) \) with constant wave velocity \( c \) and \( u=0 \) on the boundary, to have Neumann boundary conditions: \( \partial u/\partial n=0 \). Include both scalar code (for debugging and reference) and vectorized code (for speed).

To test the code, use \( u=1.2 \) as solution (\( I(x,y)=1.2 \), \( V=f=0 \), and \( c \) arbitrary), which should be exactly reproduced with any mesh as long as the stability criterion is satisfied. Another test is to use the plug-shaped pulse in the pulse function from the section Building a general 1D wave equation solver and the wave1D_dn_vc.py program. This pulse is exactly propagated in 1D if \( c\Delta t/\Delta x=1 \). Check that also the 2D program can propagate this pulse exactly in \( x \) direction (\( c\Delta t/\Delta x=1 \), \( \Delta y \) arbitrary) and \( y \) direction (\( c\Delta t/\Delta y=1 \), \( \Delta x \) arbitrary). Filename: wave2D_dn.

Exercise 18: Test the efficiency of compiled loops in 3D

Extend the wave2D_u0.py code and the Cython, Fortran, and C versions to 3D. Set up an efficiency experiment to determine the relative efficiency of pure scalar Python code, vectorized code, Cython-compiled loops, Fortran-compiled loops, and C-compiled loops. Normalize the CPU time for each mesh by the fastest version. Filename: wave3D_u0.