Although partial differential equations (PDEs) are our main focus, it
can be advantageous for learning and testing purposes to have some
simpler models to work with.
Numerical differentiation
Given discrete values \( f(x_i) \) of a function \( f(x) \) at uniformly
distributed mesh points
\( x_i=i\Delta x \), \( i=0,\ldots,N_x \), we want to approximate
the derivative \( df/dx \) at the interior mesh points, \( f'(x_i) \),
\( i=1,\ldots,N_x-1 \), by a centered
finite difference:
$$ f'(x_i) = \frac{f(x_{i+1}) - f(x_{i-1})}{2\Delta x}.$$
At the end points we use one-sided differences:
$$ f'(x_0) = \frac{f(x_{1}) - f(x_{0})}{\Delta x},
\quad f'(x_{N_x}) = \frac{f(x_{N_x}) - f(x_{N_x-1})}{\Delta x} .$$
Serial implementations
A straightforward scalar implementation with explicit loops may look like
import numpy as np

def differentiate_scalar(f, a, b, n):
    """
    Compute the discrete derivative of a Python function f
    on [a,b] using n intervals. Internal points apply a
    centered difference, while end points apply a one-sided
    difference.
    """
    x = np.linspace(a, b, n+1)   # mesh
    df = np.zeros_like(x)        # df/dx
    f_vec = f(x)
    dx = x[1] - x[0]
    # Internal mesh points
    for i in range(1, n):
        df[i] = (f_vec[i+1] - f_vec[i-1])/(2*dx)
    # End points
    df[0]  = (f_vec[1]  - f_vec[0]) /dx
    df[-1] = (f_vec[-1] - f_vec[-2])/dx
    return df
A corresponding vectorized implementation takes the form
def differentiate_vec(f, a, b, n):
    """
    Compute the discrete derivative of a Python function f
    on [a,b] using n intervals. Internal points apply a
    centered difference, while end points apply a one-sided
    difference. Vectorized version.
    """
    x = np.linspace(a, b, n+1)   # mesh
    df = np.zeros_like(x)        # df/dx
    f_vec = f(x)
    dx = x[1] - x[0]
    # Internal mesh points
    df[1:-1] = (f_vec[2:] - f_vec[:-2])/(2*dx)
    # End points
    df[0]  = (f_vec[1]  - f_vec[0]) /dx
    df[-1] = (f_vec[-1] - f_vec[-2])/dx
    return df
Parallelization
Assume \( N_x \) is large and that we divide the mesh among processors.
Think of \( N_x=10 \) and three processors: processor 0 has \( x_0 \), \( x_1 \), \( x_2 \);
processor 1 has \( x_3 \), \( x_4 \), \( x_5 \); and processor 2 has the rest, \( x_6 \), \( x_7 \),
\( x_8 \), \( x_9 \), and \( x_{10} \). To compute the derivative at the three
points on processor 1 we need to access \( x_2 \) and \( x_6 \). We add these
ghost points to the set of local mesh points on this processor.
The other processors must also make use of ghost points.
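To make the ghost point idea concrete, below is a minimal sketch of a
distributed differentiation routine, assuming mpi4py is available and
that every process owns at least two mesh points; the name
differentiate_parallel and the even partitioning are choices made for
this sketch, not part of the demo programs.

import numpy as np
from mpi4py import MPI

def differentiate_parallel(f, a, b, n):
    """
    Centered differences with ghost point exchange. Each process
    owns a contiguous slice of the n+1 mesh points and trades one
    ghost value with each neighbor.
    """
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    x = np.linspace(a, b, n+1)
    dx = x[1] - x[0]
    lo = rank*(n+1)//size          # first owned index
    hi = (rank+1)*(n+1)//size      # one past the last owned index
    f_loc = f(x[lo:hi])            # function values at owned points

    # Neighbors; MPI.PROC_NULL makes boundary exchanges no-ops
    left  = rank - 1 if rank > 0 else MPI.PROC_NULL
    right = rank + 1 if rank < size - 1 else MPI.PROC_NULL
    ghost_left  = np.empty(1)
    ghost_right = np.empty(1)
    comm.Sendrecv(f_loc[:1],  dest=left,  recvbuf=ghost_right, source=right)
    comm.Sendrecv(f_loc[-1:], dest=right, recvbuf=ghost_left,  source=left)

    # Pad the owned values with the ghosts and differentiate; the
    # unfilled ghosts at the global end points are overwritten below
    fp = np.concatenate((ghost_left, f_loc, ghost_right))
    df = (fp[2:] - fp[:-2])/(2*dx)
    if rank == 0:
        df[0] = (f_loc[1] - f_loc[0])/dx
    if rank == size - 1:
        df[-1] = (f_loc[-1] - f_loc[-2])/dx
    return df                      # derivative at the owned points

The script must be started under MPI, e.g. mpirun -np 3 python script.py,
and each process then returns the derivative at its own points only.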
Numerical integration
Integrals of a mathematical function \( f(x) \) can be approximated
by the trapezoidal rule:
$$ I = \int_a^b f(x)dx \approx \Delta x\left(\frac{1}{2}f(a) +
\frac{1}{2}f(b) + \sum_{j=1}^{N_x-1} f(a+j\Delta x)\right),$$
where \( \Delta x \) is the spacing between the \( N_x+1 \) evaluation (mesh) points:
\( \Delta x = (b-a)/N_x \).
Serial implementations
A straightforward scalar implementation with explicit loops may look like
import numpy as np

def trapezoidal_scalar(f, a, b, n):
    """
    Compute the integral of f from a to b with n intervals,
    using the Trapezoidal rule.
    """
    h = (b-a)/float(n)
    I = 0.5*(f(a) + f(b))
    for i in range(1, n):
        x = a + i*h
        I += f(x)
    I = h*I
    return I
A corresponding vectorized implementation takes the form
def trapezoidal_vec(f, a, b, n):
    """
    Compute the integral of f from a to b with n intervals,
    using the Trapezoidal rule. Vectorized version.
    """
    x = np.linspace(a, b, n+1)
    f_vec = f(x)
    f_vec[0]  /= 2.0
    f_vec[-1] /= 2.0
    h = (b-a)/float(n)
    I = h*np.sum(f_vec)
    return I
Parallelization
We now divide the mesh points among the processors.
Each processor must sum its function values. The processors holding
the first and last mesh points need to halve the function values
at these points. Finally, one or all processors
must collect the partial sums and form the final sum.
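A minimal sketch of this strategy, again assuming mpi4py and at least
as many mesh points as processes; the name trapezoidal_parallel is
hypothetical, and allreduce gives all processes the final sum:

import numpy as np
from mpi4py import MPI

def trapezoidal_parallel(f, a, b, n):
    """
    Trapezoidal rule where each process sums the function values
    at its own slice of the n+1 mesh points.
    """
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    h = (b - a)/float(n)
    lo = rank*(n+1)//size            # first owned index
    hi = (rank+1)*(n+1)//size        # one past the last owned index
    f_loc = f(a + h*np.arange(lo, hi))
    # Halve the contributions from the global end points
    if lo == 0:
        f_loc[0] /= 2.0
    if hi == n+1:
        f_loc[-1] /= 2.0
    I_loc = h*np.sum(f_loc)
    # All processes collect the partial sums into the final sum
    return comm.allreduce(I_loc, op=MPI.SUM)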
Random walk
Perfectly parallelizable model (no communication is needed before
statistics of the concentration of walkers are computed).
Demo programs from INF1100 can be used, but in 2D/3D it is easier
to vectorize and compute in general if the walkers move SE, SW, NE, NW
rather than N, S, E, W: the former moves can be computed by drawing
two (in 2D) or three (in 3D) independent random integers, each -1 or 1,
one in each space direction.
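As an illustration, here is a minimal vectorized 2D sketch of the
diagonal-step idea; the function random_walk_2d and its parameters
are invented for this example:

import numpy as np

def random_walk_2d(num_walkers, num_steps, seed=None):
    """
    Move all walkers num_steps times; each step draws an
    independent -1 or 1 in each space direction, so every
    walker moves NE, NW, SE, or SW.
    """
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, 1], size=(num_steps, num_walkers, 2))
    return steps.sum(axis=0)     # final (x, y) of each walker

pos = random_walk_2d(100000, 1000)
print((pos**2).sum(axis=1).mean())  # mean-square displacement, approx 2*1000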
Key model problems
One of the most common time-consuming computing kernels when solving
partial differential equations or running image processing algorithms
is to move a (finite difference) stencil through a mesh.
Achieving high computational efficiency for this kernel is what we want to study.
A real physical problem where this kernel is basically the whole
computation is the finite difference solution of the wave equation
$$ u_{tt} = c^2\nabla^2 u + f,$$
where \( u \) is a function of space \( x \) and time \( t \), \( c \) is a constant,
and \( f \) is a function of \( x \) and \( t \).
The demo programs are therefore complete codes for solving such wave
equations.
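To show what this kernel looks like, here is a hedged sketch of one
vectorized time step of the standard 1D scheme
\( u^{n+1}_i = 2u^n_i - u^{n-1}_i + C^2(u^n_{i+1} - 2u^n_i + u^n_{i-1}) \),
where \( C = c\Delta t/\Delta x \), with \( f=0 \) and \( u=0 \) at the
boundaries; the function name and two-array calling convention are
choices made for this sketch:

import numpy as np

def wave_step(u_1, u_2, C2):
    """
    Advance the solution one time level: u_1 holds u at the
    current level, u_2 at the previous one, and C2 = (c*dt/dx)**2
    (stability requires C2 <= 1). Boundaries are kept at u = 0.
    """
    u = np.zeros_like(u_1)
    u[1:-1] = 2*u_1[1:-1] - u_2[1:-1] + \
              C2*(u_1[2:] - 2*u_1[1:-1] + u_1[:-2])
    return u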
It can be wise to use a progressive set of test problems:

 * Simplest possible 1D wave equation code, wave1D_u0.py
 * Simplest possible 2D wave equation code, wave2D_u0.py
 * Variable coefficient 3D code for \( u_{tt}=\nabla\cdot (c^2(x)\nabla u) + f \)
   for real performance tests
PETSc is a linear algebra library in C (with Python bindings via petsc4py), which can
be used for parallel computing with vectors and matrices.
We started a project on exploring petsc4py to solve PDEs.
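To give a flavor of petsc4py, the following hedged lines create a
distributed vector and form a global reduction (the script must run
under MPI); this is only a sketch, not the project code:

from petsc4py import PETSc

x = PETSc.Vec().createMPI(100)   # vector distributed over all processes
x.set(1.0)                       # fill every entry with 1
print(x.sum())                   # global sum across processes: 100.0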
Disco is a framework for parallel implementation of
MapReduce algorithms. Once upon a time we started a tutorial on MapReduce in numerical computing.
Read parts of Finite difference methods for wave motion to get a background on the algorithms for the wave equation. That document also explains how to migrate parts of the Python code to Fortran or C, as well as the Cython technology. These are complementary tools to what is studied in the present project.
Read about Numba and try it on differentiation and integration.
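As a hedged starting point for such experiments, Numba's njit
decorator can compile the explicit integration loop; here the function
values are precomputed into an array, since passing an arbitrary
Python callable into compiled code is more involved:

import numpy as np
from numba import njit

@njit
def trapezoidal_sum(f_vec, h):
    """
    Trapezoidal rule over precomputed function values f_vec with
    uniform spacing h; the loop is compiled to machine code the
    first time the function is called.
    """
    I = 0.5*(f_vec[0] + f_vec[-1])
    for i in range(1, len(f_vec) - 1):
        I += f_vec[i]
    return h*I

x = np.linspace(0, 2, 1001)
print(trapezoidal_sum(np.sin(x), x[1] - x[0]))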