Analysis of finite difference equations

We address the ODE for exponential decay,

\[u'(t) = -au(t),\quad u(0)=I,\]

where \(a\) and \(I\) are given constants. This problem is solved by the \(\theta\)-rule finite difference scheme, resulting in the recursive equations

(1)\[ u^{n+1} = \frac{1 - (1-\theta) a\Delta t}{1 + \theta a\Delta t}u^n\]

for the numerical solution \(u^{n+1}\), which approximates the exact solution \({u_{\small\mbox{e}}}\) at time point \(t_{n+1}\). For constant mesh spacing, which we assume here, \(t_{n+1}=(n+1)\Delta t\).
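
The recursion (1) is turned into code in an obvious way. Here is a minimal sketch of a solver; the name and call signature, solver(I, a, T, dt, theta), are chosen to match the function imported from decay_mod later in this section:

import numpy as np

def solver(I, a, T, dt, theta):
    """Solve u'=-a*u, u(0)=I, by the theta-rule with steps dt on (0,T]."""
    Nt = int(round(T/dt))            # number of time steps
    u = np.zeros(Nt+1)               # u[n] approximates u at t[n]
    t = np.linspace(0, Nt*dt, Nt+1)  # uniform time mesh
    u[0] = I
    for n in range(Nt):
        u[n+1] = (1 - (1-theta)*a*dt)/(1 + theta*a*dt)*u[n]
    return u, t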

Discouraging numerical solutions. Choosing \(I=1\) and \(a=2\), and running experiments with \(\theta =1, 0.5, 0\) for \(\Delta t=1.25, 0.75, 0.5, 0.1\), gives the results in Figures Backward Euler, Crank-Nicolson, and Forward Euler.

[Figure: Backward Euler (_images/BE4c.png)]

[Figure: Crank-Nicolson (_images/CN4c.png)]

[Figure: Forward Euler (_images/FE4c.png)]

The characteristics of the displayed curves can be summarized as follows:

  • The Backward Euler scheme always gives a monotone solution, lying above the exact curve.
  • The Crank-Nicolson scheme gives the most accurate results, but for \(\Delta t=1.25\) the solution oscillates.
  • The Forward Euler scheme gives a growing, oscillating solution for \(\Delta t=1.25\); a decaying, oscillating solution for \(\Delta t=0.75\); a strange solution \(u^n=0\) for \(n\geq 1\) when \(\Delta t=0.5\); and a solution seemingly as accurate as the one by the Backward Euler scheme for \(\Delta t = 0.1\), but the curve lies below the exact solution.

Since the exact solution of our model problem is a monotone function, \(u(t)=Ie^{-at}\), some of these qualitatively wrong results are indeed alarming!

Goal

We ask the question

  • Under what circumstances, i.e., for what values of the input data \(I\), \(a\), and \(\Delta t\), will the Forward Euler and Crank-Nicolson schemes result in undesired oscillatory solutions?

The question will be investigated both by numerical experiments and by precise mathematical theory. The latter will help establish general criteria on \(\Delta t\) for avoiding non-physical oscillatory or growing solutions.

Another question to be raised is

  • How does \(\Delta t\) impact the error in the numerical solution?

For our simple model problem we can answer this question very precisely, but we will also look at simplified formulas for small \(\Delta t\) and touch upon important concepts such as convergence rate and the order of a scheme. Other fundamental concepts mentioned are stability, consistency, and convergence.

Experimental investigation of oscillatory solutions

To address the first question above, we may set up an experiment where we loop over values of \(I\), \(a\), and \(\Delta t\). For each experiment, we flag the solution as oscillatory if

\[\begin{split}u^{n} > u^{n-1},\end{split}\]

for some value of \(n\), since we expect \(u^n\) to decay with \(n\), but oscillations make \(u\) increase over a time step. We will quickly see that oscillations are independent of \(I\), but do depend on \(a\) and \(\Delta t\). Therefore, we introduce a two-dimensional function \(B(a,\Delta t)\) which is 1 if oscillations occur and 0 otherwise. We can visualize \(B\) as a contour plot (lines for which \(B=\hbox{const}\)). The contour \(B=0.5\) corresponds to the borderline between oscillatory regions with \(B=1\) and monotone regions with \(B=0\) in the \(a,\Delta t\) plane.

The \(B\) function is defined at discrete \(a\) and \(\Delta t\) values. Say we have given \(P\) values of \(a\), \(a_0,\ldots,a_{P-1}\), and \(Q\) values of \(\Delta t\), \(\Delta t_0,\ldots,\Delta t_{Q-1}\). These \(a_i\) and \(\Delta t_j\) values, \(i=0,\ldots,P-1\), \(j=0,\ldots,Q-1\), form a rectangular mesh of \(P\times Q\) points in the plane. At each point \((a_i, \Delta t_j)\), we associate the corresponding value of \(B(a_i,\Delta t_j)\), denoted \(B_{ij}\). The \(B_{ij}\) values are naturally stored in a two-dimensional array. We can thereafter create a plot of the contour line \(B_{ij}=0.5\) dividing the oscillatory and monotone regions. The file decay_osc_regions.py (osc_regions stands for “oscillatory regions”) contains all the nuts and bolts needed to produce the \(B=0.5\) line in Figures Forward Euler scheme: oscillatory solutions occur for points above the curve and Crank-Nicolson scheme: oscillatory solutions occur for points above the curve. The oscillatory region is above this line.

from decay_mod import solver
import numpy as np
import scitools.std as st

def non_physical_behavior(I, a, T, dt, theta):
    """
    Given lists/arrays a and dt, and numbers I, T, and theta,
    make a two-dimensional contour line B=0.5, where B=1>0.5
    means oscillatory (unstable) solution, and B=0<0.5 means
    monotone solution of u'=-au.
    """
    a = np.asarray(a); dt = np.asarray(dt)  # must be arrays
    B = np.zeros((len(a), len(dt)))         # results
    for i in range(len(a)):
        for j in range(len(dt)):
            u, t = solver(I, a[i], T, dt[j], theta)
            # Does u have the right monotone decay properties?
            correct_qualitative_behavior = True
            for n in range(1, len(u)):
                if u[n] > u[n-1]:  # Not decaying?
                    correct_qualitative_behavior = False
                    break  # Jump out of loop
            B[i,j] = float(not correct_qualitative_behavior)  # B=1: oscillatory
    a_, dt_ = st.ndgrid(a, dt)  # make mesh of a and dt values
    st.contour(a_, dt_, B, 1)
    st.grid('on')
    st.title('theta=%g' % theta)
    st.xlabel('a'); st.ylabel('dt')
    st.savefig('osc_region_theta_%s.png' % theta)
    st.savefig('osc_region_theta_%s.pdf' % theta)

non_physical_behavior(
    I=1,
    a=np.linspace(0.01, 4, 22),
    dt=np.linspace(0.01, 4, 22),
    T=6,
    theta=0.5)

[Figure: Forward Euler scheme: oscillatory solutions occur for points above the curve (_images/osc_region_FE.png)]

[Figure: Crank-Nicolson scheme: oscillatory solutions occur for points above the curve (_images/osc_region_CN.png)]

By looking at the curves in the figures one may guess that \(a\Delta t\) must be less than a critical limit to avoid the undesired oscillations. This limit seems to be about 2 for Crank-Nicolson and 1 for Forward Euler. We shall now establish a precise mathematical analysis of the discrete model that can explain the observations in our numerical experiments.

Exact numerical solution

Starting with \(u^0=I\), the simple recursion (1) can be applied repeatedly \(n\) times, with the result that

(2)\[ u^{n} = IA^n,\quad A = \frac{1 - (1-\theta) a\Delta t}{1 + \theta a\Delta t}{\thinspace .}\]

Solving difference equations

Difference equations where all terms are linear in \(u^{n+1}\), \(u^n\), and maybe \(u^{n-1}\), \(u^{n-2}\), etc., are called homogeneous, linear difference equations, and their solutions are generally of the form \(u^n=A^n\). Inserting this expression and dividing by \(A^n\) gives a polynomial equation in \(A\). In the present case we get

\[A = \frac{1 - (1-\theta) a\Delta t}{1 + \theta a\Delta t}{\thinspace .}\]

This is a solution technique of wider applicability than repeated use of the recursion (1).

Regardless of the solution approach, we have obtained a formula for \(u^n\). This formula can explain everything we see in the figures above, and it also gives us a more general insight into the accuracy and stability properties of the three schemes.
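
As a quick sanity check, formula (2) can be evaluated directly and compared with the output of the solver function; the sketch below assumes the solver from decay_mod (or the sketch earlier in this section) is available, and the parameter values are chosen arbitrarily.

import numpy as np
from decay_mod import solver

def discrete_solution(I, a, T, dt, theta):
    """Evaluate (2): u^n = I*A**n at the mesh points."""
    Nt = int(round(T/dt))
    A = (1 - (1-theta)*a*dt)/(1 + theta*a*dt)
    return I*A**np.arange(Nt+1)

u, t = solver(1, 2, 6, 0.75, 0.5)           # I, a, T, dt, theta
u_d = discrete_solution(1, 2, 6, 0.75, 0.5)
print(abs(u - u_d).max())  # should be of the order of machine precision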

Stability

Since \(u^n\) is the factor \(A\) raised to an integer power \(n\), we realize that \(A<0\) will imply \(u^n<0\) for odd powers and \(u^n>0\) for even powers. That is, the solution oscillates between the mesh points. We have oscillations due to \(A<0\) when

(3)\[\begin{split} (1-\theta)a\Delta t > 1 {\thinspace .}\end{split}\]

Since \(A>0\) is a requirement for having a numerical solution with the same basic property (monotonicity) as the exact solution, we may say that \(A>0\) is a stability criterion. Expressed in terms of \(\Delta t\) the stability criterion reads

\[\begin{split}\Delta t < \frac{1}{(1-\theta)a}{\thinspace .}\end{split}\]

The Backward Euler scheme is always stable since \(A<0\) is impossible for \(\theta=1\), while non-oscillating solutions for Forward Euler and Crank-Nicolson demand \(\Delta t\leq 1/a\) and \(\Delta t\leq 2/a\), respectively. The relation between \(\Delta t\) and \(a\) looks reasonable: a larger \(a\) means faster decay and hence a need for smaller time steps.

Looking at Figure Forward Euler, we see that with \(a\Delta t= 2\cdot 1.25=2.5\), \(A=-1.5\), and the solution \(u^n=(-1.5)^n\) oscillates and grows. With \(a\Delta t = 2\cdot 0.75=1.5\), \(A=-0.5\), \(u^n=(-0.5)^n\) decays but oscillates. The peculiar case \(\Delta t = 0.5\), where the Forward Euler scheme produces a solution that is stuck on the \(t\) axis, corresponds to \(A=0\) and therefore \(u^0=I=1\) and \(u^n=0\) for \(n\geq 1\). The decaying oscillations in the Crank-Nicolson scheme for \(\Delta t=1.25\) are easily explained by the fact that \(A\approx -0.11<0\).

The factor \(A\) is called the amplification factor since the solution at a new time level is \(A\) times the solution at the previous time level. For a decay process, we must obviously have \(|A|\leq 1\), which is fulfilled for all \(\Delta t\) if \(\theta \geq 1/2\). Arbitrarily large values of \(u\) can be generated when \(|A|>1\) and \(n\) is large enough. The numerical solution is in such cases totally irrelevant to an ODE modeling decay processes! To avoid this situation, we must for \(\theta < 1/2\) have

\[\Delta t \leq \frac{2}{(1-2\theta)a},\]

which means \(\Delta t < 2/a\) for the Forward Euler scheme.
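
The two criteria are easily encoded. Here is a small helper function, a sketch with names of our own choosing, that returns the largest time steps avoiding growing and oscillatory solutions for a given \(\theta\):

def dt_limits(a, theta):
    """Largest dt giving non-growing (|A| <= 1) and non-oscillatory
    (A >= 0) solutions of the theta-rule applied to u'=-a*u."""
    dt_growth = 2.0/((1 - 2*theta)*a) if theta < 0.5 else float('inf')
    dt_osc = 1.0/((1 - theta)*a) if theta < 1 else float('inf')
    return dt_growth, dt_osc

print(dt_limits(2, 0))    # Forward Euler, a=2: (1.0, 0.5)
print(dt_limits(2, 0.5))  # Crank-Nicolson, a=2: (inf, 1.0)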

Stability properties

We may summarize the stability investigations as follows:

  1. The Forward Euler method is a conditionally stable scheme because it requires \(\Delta t < 2/a\) for avoiding growing solutions and \(\Delta t < 1/a\) for avoiding oscillatory solutions.
  2. The Crank-Nicolson is unconditionally stable with respect to growing solutions, while it is conditionally stable with the criterion \(\Delta t < 2/a\) for avoiding oscillatory solutions.
  3. The Backward Euler method is unconditionally stable with respect to growing and oscillatory solutions - any \(\Delta t\) will work.

Much literature on ODEs speaks about L-stable and A-stable methods. In our case, A-stable methods ensure non-growing solutions, while L-stable methods also avoid oscillatory solutions.

Comparing amplification factors

After establishing how \(A\) impacts the qualitative features of the solution, we shall now look more into how well the numerical amplification factor approximates the exact one. The exact solution reads \(u(t)=Ie^{-at}\), which can be rewritten as

\[{{u_{\small\mbox{e}}}}(t_n) = Ie^{-a n\Delta t} = I(e^{-a\Delta t})^n {\thinspace .}\]

From this formula we see that the exact amplification factor is

\[{A_{\small\mbox{e}}} = e^{-a\Delta t} {\thinspace .}\]

We realize that the exact and numerical amplification factors depend on \(a\) and \(\Delta t\) through the product \(a\Delta t\). Therefore, it is convenient to introduce a symbol for this product, \(p=a\Delta t\), and view \(A\) and \({A_{\small\mbox{e}}}\) as functions of \(p\). Figure Comparison of amplification factors shows these functions. Crank-Nicolson is clearly closest to the exact amplification factor, but that method has the unfortunate oscillatory behavior when \(p>2\).
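
A plot like the one in the figure below is straightforward to produce; the following matplotlib sketch is one possibility (the original figure may have been generated with other tools):

import numpy as np
import matplotlib.pyplot as plt

p = np.linspace(0, 3, 101)
for theta, name in [(0, 'Forward Euler'), (0.5, 'Crank-Nicolson'),
                    (1, 'Backward Euler')]:
    A = (1 - (1 - theta)*p)/(1 + theta*p)
    plt.plot(p, A, label=name)
plt.plot(p, np.exp(-p), 'k--', label='exact')
plt.xlabel('p = a*dt'); plt.ylabel('amplification factor')
plt.legend(); plt.savefig('A_factors.png')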

[Figure: Comparison of amplification factors (_images/A_factors.png)]

Series expansion of amplification factors

As an alternative to the visual understanding inherent in Figure Comparison of amplification factors, there is a strong tradition in numerical analysis of establishing formulas for the approximation errors when the discretization parameter, here \(\Delta t\), becomes small. In the present case we let \(p\) be our small discretization parameter, and it makes sense to simplify the expressions for \(A\) and \({A_{\small\mbox{e}}}\) by using Taylor polynomials around \(p=0\). The Taylor polynomials are accurate for small \(p\) and greatly simplify the comparison of the analytical expressions, since we can then compare polynomials term by term.

Calculating the Taylor series for \({A_{\small\mbox{e}}}\) is easily done by hand, but the three versions of \(A\) for \(\theta=0,1,{\frac{1}{2}}\) lead to more cumbersome calculations. Nowadays, analytical computations can benefit greatly from symbolic computer algebra software. The Python package sympy represents a powerful computer algebra system, not yet as sophisticated as the famous Maple and Mathematica systems, but free and very easy to integrate with our numerical computations in Python.

When using sympy, it is convenient to enter the interactive Python mode, where we can write expressions and statements and immediately see the results. We strongly recommend using isympy (or ipython) for such interactive sessions.

Let us illustrate sympy with a standard Python shell syntax (>>> prompt) to compute a Taylor polynomial approximation to \(e^{-p}\):

>>> from sympy import *
>>> # Create p as a mathematical symbol with name 'p'
>>> p = Symbol('p')
>>> # Create a mathematical expression with p
>>> A_e = exp(-p)
>>>
>>> # Find the first 6 terms of the Taylor series of A_e
>>> A_e.series(p, 0, 6)
1 + (1/2)*p**2 - p - 1/6*p**3 - 1/120*p**5 + (1/24)*p**4 + O(p**6)

Lines with >>> represent input lines, and lines without this prompt represent the results of computations (note that isympy and ipython apply other prompts, but in this text we always apply >>> for interactive Python computing). Apart from the ordering of the powers, the computed formula is easily recognized as the beginning of the Taylor series for \(e^{-p}\).

Let us define the numerical amplification factor where \(p\) and \(\theta\) enter the formula as symbols:

>>> theta = Symbol('theta')
>>> A = (1-(1-theta)*p)/(1+theta*p)

To work with the factor for the Backward Euler scheme we can substitute the value 1 for theta:

>>> A.subs(theta, 1)
1/(1 + p)

Similarly, we can replace theta by 1/2 for Crank-Nicolson, preferably using an exact rational representation of 1/2 in sympy:

>>> half = Rational(1,2)
>>> A.subs(theta, half)
1/(1 + (1/2)*p)*(1 - 1/2*p)

The Taylor series of the amplification factor for the Crank-Nicolson scheme can be computed as

>>> A.subs(theta, half).series(p, 0, 4)
1 + (1/2)*p**2 - p - 1/4*p**3 + O(p**4)

We are now in a position to compare Taylor series:

>>> FE = A_e.series(p, 0, 4) - A.subs(theta, 0).series(p, 0, 4)
>>> BE = A_e.series(p, 0, 4) - A.subs(theta, 1).series(p, 0, 4)
>>> CN = A_e.series(p, 0, 4) - A.subs(theta, half).series(p, 0, 4 )
>>> FE
(1/2)*p**2 - 1/6*p**3 + O(p**4)
>>> BE
-1/2*p**2 + (5/6)*p**3 + O(p**4)
>>> CN
(1/12)*p**3 + O(p**4)

From these expressions we see that the error \(A-{A_{\small\mbox{e}}}\sim {\mathcal{O}(p^2)}\) for the Forward and Backward Euler schemes, while \(A-{A_{\small\mbox{e}}}\sim {\mathcal{O}(p^3)}\) for the Crank-Nicolson scheme. It is the leading order term, i.e., the term of the lowest order (polynomial degree), that is of interest, because as \(p\rightarrow 0\), this term is (much) bigger than the higher-order terms (think of \(p=0.01\): \(p\) is a hundred times larger than \(p^2\)).

Now, \(a\) is a given parameter in the problem, while \(\Delta t\) is what we can vary. One therefore usually writes the error expressions in terms of \(\Delta t\). We then have

\[\begin{split}A-{A_{\small\mbox{e}}} = \left\lbrace\begin{array}{ll} {\mathcal{O}(\Delta t^2)}, & \hbox{Forward and Backward Euler},\\ {\mathcal{O}(\Delta t^3)}, & \hbox{Crank-Nicolson} \end{array}\right.\end{split}\]

We say that the Crank-Nicolson scheme has an error in the amplification factor of order \(\Delta t^3\), while the two other schemes are of order \(\Delta t^2\) in the same quantity. What is the significance of the order expression? If we halve \(\Delta t\), the error in amplification factor at a time level will be reduced by a factor of 4 in the Forward and Backward Euler schemes, and by a factor of 8 in the Crank-Nicolson scheme. That is, as we reduce \(\Delta t\) to obtain more accurate results, the Crank-Nicolson scheme reduces the error more efficiently than the other schemes.
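
A quick numerical illustration of this effect, with an arbitrarily chosen \(a\):

from math import exp

a = 2.0
for dt in 0.1, 0.05:  # halve the time step
    p = a*dt
    err_FE = abs((1 - p) - exp(-p))              # theta = 0
    err_CN = abs((1 - p/2)/(1 + p/2) - exp(-p))  # theta = 1/2
    print('dt=%g: FE error %.2e, CN error %.2e' % (dt, err_FE, err_CN))
# Halving dt reduces the FE error by a factor of about 4
# and the CN error by a factor of about 8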

The fraction of numerical and exact amplification factors

An alternative comparison of the schemes is to look at the ratio \(A/{A_{\small\mbox{e}}}\), or the error \(1-A/{A_{\small\mbox{e}}}\) in this ratio:

>>> FE = 1 - (A.subs(theta, 0)/A_e).series(p, 0, 4)
>>> BE = 1 - (A.subs(theta, 1)/A_e).series(p, 0, 4)
>>> CN = 1 - (A.subs(theta, half)/A_e).series(p, 0, 4)
>>> FE
(1/2)*p**2 + (1/3)*p**3 + O(p**4)
>>> BE
-1/2*p**2 + (1/3)*p**3 + O(p**4)
>>> CN
(1/12)*p**3 + O(p**4)

The leading-order terms have the same powers as in the analysis of \(A-{A_{\small\mbox{e}}}\).

The global error at a point

The error in the amplification factor reflects the error made when progressing from time level \(t_n\) to \(t_{n+1}\). To investigate the real error at a point, known as the global error, we look at \(e^n = u^n-{u_{\small\mbox{e}}}(t_n)\) for some \(n\) and Taylor expand the mathematical expressions as functions of \(p=a\Delta t\):

>>> n = Symbol('n')
>>> u_e = exp(-p*n)
>>> u_n = A**n
>>> FE = u_e.series(p, 0, 4) - u_n.subs(theta, 0).series(p, 0, 4)
>>> BE = u_e.series(p, 0, 4) - u_n.subs(theta, 1).series(p, 0, 4)
>>> CN = u_e.series(p, 0, 4) - u_n.subs(theta, half).series(p, 0, 4)
>>> FE
(1/2)*n*p**2 - 1/2*n**2*p**3 + (1/3)*n*p**3 + O(p**4)
>>> BE
(1/2)*n**2*p**3 - 1/2*n*p**2 + (1/3)*n*p**3 + O(p**4)
>>> CN
(1/12)*n*p**3 + O(p**4)

For a fixed time \(t\), the parameter \(n\) in these expressions increases as \(p\rightarrow 0\) since \(t=n\Delta t =\mbox{const}\) and hence \(n\) must increase like \(\Delta t^{-1}\). With \(n\) substituted by \(t/\Delta t\) in the leading-order error terms, these become \(\frac{1}{2} na^2\Delta t^2 = {\frac{1}{2}}ta^2\Delta t\) for the Forward and Backward Euler scheme, and \(\frac{1}{12}na^3\Delta t^3 = \frac{1}{12}ta^3\Delta t^2\) for the Crank-Nicolson scheme. The global error is therefore of second order (in \(\Delta t\)) for the latter scheme and of first order for the former schemes.

When the global error \(e^n\rightarrow 0\) as \(\Delta t\rightarrow 0\), we say that the scheme is convergent. It means that the numerical solution approaches the exact solution as the mesh is refined, and this is a much desired property of a numerical method.

Integrated errors

It is common to study the norm of the numerical error, as explained in detail in the section Computing the norm of the numerical error. The \(L^2\) norm can be computed by treating \(e^n\) as a function of \(t\) in sympy and performing symbolic integration. For the Forward Euler scheme we have

from sympy import *
p, n, a, dt, t, T, theta = symbols('p n a dt t T theta')
A = (1-(1-theta)*p)/(1+theta*p)
u_e = exp(-p*n)
u_n = A**n
error = u_e.series(p, 0, 4) - u_n.subs(theta, 0).series(p, 0, 4)
# Introduce t and dt instead of n and p
error = error.subs('n', 't/dt').subs(p, 'a*dt')
error = error.as_leading_term(dt) # study only the first term
print error
error_L2 = sqrt(integrate(error**2, (t, 0, T)))
print error_L2

The output reads

sqrt(30)*sqrt(T**3*a**4*dt**2*(6*T**2*a**2 - 15*T*a + 10))/60

which means that the \(L^2\) error behaves like \(a^2\Delta t\).

Strictly speaking, the numerical error is only defined at the mesh points so it makes most sense to compute the \(\ell^2\) error

\[||e^n||_{\ell^2} = \sqrt{\Delta t\sum_{n=0}^{N_t} ({{u_{\small\mbox{e}}}}(t_n) - u^n)^2} {\thinspace .}\]

We have obtained exact analytical expressions for the error at \(t=t_n\), but here we use the leading-order error term only, since we are mostly interested in how the error behaves as a polynomial in \(\Delta t\), and then the leading-order term will dominate. For the Forward Euler scheme, \({u_{\small\mbox{e}}}(t_n) - u^n \approx {\frac{1}{2}}np^2\), and we have

\[||e^n||_{\ell^2}^2 = \Delta t\sum_{n=0}^{N_t} \frac{1}{4}n^2p^4 =\Delta t\frac{1}{4}p^4 \sum_{n=0}^{N_t} n^2{\thinspace .}\]

Now, \(\sum_{n=0}^{N_t} n^2\approx \frac{1}{3}N_t^3\). Using this approximation, setting \(N_t =T/\Delta t\), and taking the square root gives the expression

\[||e^n||_{\ell^2} = \frac{1}{2}\sqrt{\frac{T^3}{3}} a^2\Delta t{\thinspace .}\]

Calculations for the Backward Euler scheme are very similar and provide the same result, while the Crank-Nicolson scheme leads to

\[||e^n||_{\ell^2} = \frac{1}{12}\sqrt{\frac{T^3}{3}}a^3\Delta t^2{\thinspace .}\]
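
These predictions are easy to test empirically. The sketch below reuses the solver from decay_mod, computes the \(\ell^2\) error, and reports its reduction when \(\Delta t\) is halved; we expect factors of about 2 for the Forward and Backward Euler schemes and about 4 for Crank-Nicolson.

import numpy as np
from decay_mod import solver

def l2_error(I, a, T, dt, theta):
    """Compute the l2 norm of the error in the theta-rule solution."""
    u, t = solver(I, a, T, dt, theta)
    e = I*np.exp(-a*t) - u
    return np.sqrt(dt*np.sum(e**2))

for theta in 0, 0.5, 1:
    E1 = l2_error(1, 2.0, 6, 0.01, theta)
    E2 = l2_error(1, 2.0, 6, 0.005, theta)
    print('theta=%g: error reduction %.2f' % (theta, E1/E2))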

Summary of errors

Both the point-wise and the time-integrated true errors are of second order in \(\Delta t\) for the Crank-Nicolson scheme and of first order in \(\Delta t\) for the Forward Euler and Backward Euler schemes.

Truncation error

The truncation error is a very frequently used error measure for finite difference methods. It is defined as the error in the difference equation that arises when inserting the exact solution. Contrary to many other error measures, e.g., the true error \(e^n={u_{\small\mbox{e}}}(t_n)-u^n\), the truncation error is a quantity that is easily computable.

Let us illustrate the calculation of the truncation error for the Forward Euler scheme. We start with the difference equation on operator form,

\[\lbrack D_t u = -au\rbrack^n,\]

i.e.,

\[\frac{u^{n+1}-u^n}{\Delta t} = -au^n{\thinspace .}\]

The idea is to see how well the exact solution \({u_{\small\mbox{e}}}(t)\) fulfills this equation. Since \({u_{\small\mbox{e}}}(t)\) in general will not obey the discrete equation, we get an error in the discrete equation, called a residual, denoted here by \(R^n\):

(4)\[ R^n = \frac{{u_{\small\mbox{e}}}(t_{n+1})-{u_{\small\mbox{e}}}(t_n)}{\Delta t} + a{u_{\small\mbox{e}}}(t_n) {\thinspace .}\]

The residual is defined at each mesh point and is therefore a mesh function with a superscript \(n\).

The interesting feature of \(R^n\) is to see how it depends on the discretization parameter \(\Delta t\). The tool for reaching this goal is to Taylor expand \({u_{\small\mbox{e}}}\) around the point where the difference equation is supposed to hold, here \(t=t_n\). We have that

\[{u_{\small\mbox{e}}}(t_{n+1}) = {u_{\small\mbox{e}}}(t_n) + {u_{\small\mbox{e}}}'(t_n)\Delta t + \frac{1}{2}{u_{\small\mbox{e}}}''(t_n) \Delta t^2 + \cdots\]

Inserting this Taylor series in (4) gives

\[R^n = {u_{\small\mbox{e}}}'(t_n) + \frac{1}{2}{u_{\small\mbox{e}}}''(t_n)\Delta t + \ldots + a{u_{\small\mbox{e}}}(t_n){\thinspace .}\]

Now, \({u_{\small\mbox{e}}}\) fulfills the ODE \({u_{\small\mbox{e}}}'=-a{u_{\small\mbox{e}}}\), so the first and last terms cancel and we have

\[R^n \approx \frac{1}{2}{u_{\small\mbox{e}}}''(t_n)\Delta t {\thinspace .}\]

This \(R^n\) is the truncation error, which for the Forward Euler scheme is seen to be of first order in \(\Delta t\).
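
This calculation can also be left to sympy. Here is a minimal sketch, taking \(I=1\) so that \({u_{\small\mbox{e}}}=e^{-at}\):

from sympy import symbols, exp

t, dt, a = symbols('t dt a', positive=True)
u_e = exp(-a*t)                             # exact solution (I=1)
R = (u_e.subs(t, t + dt) - u_e)/dt + a*u_e  # the residual (4)
print(R.series(dt, 0, 2))
# leading term: a**2*exp(-a*t)*dt/2, i.e., (1/2)*u_e''(t_n)*dt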

The above procedure can be repeated for the Backward Euler and the Crank-Nicolson schemes. We start with the scheme in operator notation, write it out in detail, Taylor expand \({u_{\small\mbox{e}}}\) around the point \(\tilde t\) at which the difference equation is defined, collect terms that correspond to the ODE (here \({u_{\small\mbox{e}}}' + a{u_{\small\mbox{e}}}\)), and identify the remaining terms as the residual \(R\), which is the truncation error. The Backward Euler scheme leads to

\[R^n \approx -\frac{1}{2}{u_{\small\mbox{e}}}''(t_n)\Delta t,\]

while the Crank-Nicolson scheme gives

\[R^{n+\frac{1}{2}} \approx \frac{1}{24}{u_{\small\mbox{e}}}'''(t_{n+\frac{1}{2}})\Delta t^2{\thinspace .}\]

The order \(r\) of a finite difference scheme is often defined through the leading term \(\Delta t^r\) in the truncation error. The above expressions point out that the Forward and Backward Euler schemes are of first order, while Crank-Nicolson is of second order. We have looked at other error measures in other sections, like the error in amplification factor and the error \(e^n={u_{\small\mbox{e}}}(t_n)-u^n\), and expressed these error measures in terms of \(\Delta t\) to see the order of the method. Normally, calculating the truncation error is more straightforward than deriving the expressions for other error measures and therefore the easiest way to establish the order of a scheme.

Consistency, stability, and convergence

Three fundamental concepts when solving differential equations by numerical methods are consistency, stability, and convergence. We shall briefly touch these concepts below in the context of the present model problem.

Consistency means that the error in the difference equation, measured through the truncation error, goes to zero as \(\Delta t\rightarrow 0\). Since the truncation error tells how well the exact solution fulfills the difference equation, and the exact solution fulfills the differential equation, consistency ensures that the difference equation approaches the differential equation in the limit. The expressions for the truncation errors in the previous section are all proportional to \(\Delta t\) or \(\Delta t^2\), hence they vanish as \(\Delta t\rightarrow 0\), and all the schemes are consistent. Lack of consistency implies that we actually solve a different differential equation in the limit \(\Delta t\rightarrow 0\) than we aim at.

Stability means that the numerical solution exhibits the same qualitative properties as the exact solution. This is obviously a feature we want the numerical solution to have. In the present exponential decay model, the exact solution is monotone and decaying. An increasing numerical solution is not in accordance with the decaying nature of the exact solution and hence unstable. We can also say that an oscillating numerical solution lacks the monotonicity of the exact solution and is also unstable. We have seen that the Backward Euler scheme always leads to monotone and decaying solutions, regardless of \(\Delta t\), and is hence stable. The Forward Euler scheme can lead to increasing and oscillating solutions if \(\Delta t\) is too large and is therefore unstable unless \(\Delta t\) is sufficiently small. The Crank-Nicolson scheme can never lead to increasing solutions and has no problem fulfilling that stability property, but it can produce oscillating solutions and is unstable in that sense, unless \(\Delta t\) is sufficiently small.

Convergence implies that the global (true) error mesh function \(e^n = {u_{\small\mbox{e}}}(t_n)-u^n\rightarrow 0\) as \(\Delta t\rightarrow 0\). This is really what we want: the numerical solution gets as close to the exact solution as we request by having a sufficiently fine mesh.

Convergence is hard to establish theoretically, except in quite simple problems like the present one. Stability and consistency are much easier to calculate. A major breakthrough in the understanding of numerical methods for differential equations came in 1956 when Lax and Richtmyer established equivalence between convergence on the one hand and consistency and stability on the other (the Lax equivalence theorem). In practice this means that one first establishes that a method is stable and consistent; convergence, which is much harder to show directly, then follows automatically. The result holds for linear problems only, and in the world of nonlinear differential equations the relations between consistency, stability, and convergence are much more complicated.

We have seen in the previous analysis that the Forward Euler, Backward Euler, and Crank-Nicolson schemes are convergent (\(e^n\rightarrow 0\)), that they are consistent (\(R^n\rightarrow 0\)), and that they are stable under certain conditions on the size of \(\Delta t\). We have also derived explicit mathematical expressions for \(e^n\), the truncation error, and the stability criteria.

Exercises (1)

Exercise 15: Visualize the accuracy of finite differences for \(u=e^{-at}\)

The purpose of this exercise is to visualize the accuracy of finite difference approximations of the derivative of a given function. For any finite difference approximation, take the Forward Euler difference as an example, and any specific function, take \(u=e^{-at}\), we may introduce an error fraction

\[E = \frac{[D_t^+ u]^n}{u'(t_n)} = \frac{\exp{(-a(t_n+\Delta t))} - \exp{(-at_n)}}{-a\exp{(-at_n)}\Delta t} = -\frac{1}{a\Delta t}\left(\exp{(-a\Delta t)} - 1\right),\]

and view \(E\) as a function of \(\Delta t\). We expect that \(\lim_{\Delta t\rightarrow 0}E=1\), while \(E\) may deviate significantly from unity for large \(\Delta t\). How the error depends on \(\Delta t\) is best visualized in a graph where we use a logarithmic scale for \(\Delta t\), so we can cover many orders of magnitude of that quantity. Here is a code segment creating an array of 100 intervals on the logarithmic scale, ranging from \(10^{-6}\) to \(1\), and then plotting \(E\) versus \(p=a\Delta t\) with logarithmic scale on the \(\Delta t\) axis:

from numpy import logspace, exp
from matplotlib.pyplot import semilogx, show

p = logspace(-6, 0, 101)
y = -(exp(-p)-1)/p
semilogx(p, y)
show()

Illustrate such errors for the finite difference operators \([D_t^+u]^n\) (forward), \([D_t^-u]^n\) (backward), and \([D_t u]^n\) (centered).

Perform Taylor series expansions of the error fractions and find the leading order \(r\) in expressions of the type \(1 + C\Delta t^r + {\mathcal{O}(\Delta t^{r+1})}\), where \(C\) is some constant. Filename: decay_plot_fd_exp_error.py.

Exercise 16: Explore the \(\theta\)-rule for exponential growth

This exercise asks you to solve the ODE \(u'=-au\) with \(a<0\) such that the ODE models exponential growth instead of exponential decay. A central theme is to investigate numerical artifacts and non-physical solution behavior.

a) Run experiments with \(\theta\) and \(\Delta t\) to uncover numerical artifacts (the exact solution is a monotone, growing function). Use the insight to design a set of experiments that aims to demonstrate all types of numerical artifacts for different choices of \(\Delta t\) while \(a\) is fixed.

Hint. Modify the decay_exper1.py code to suit your needs.

Filename: growth_exper.py.

b) Write a scientific report about the findings.

Hint. Use examples from the section Making a report to see how scientific reports can be written.

Filenames: growth_exper.pdf, growth_exper.html.

c) Plot the amplification factors for the various schemes together with the exact one for \(a<0\) and use the plot to explain the observations made in the experiments.

Hint. Modify the decay_ampf_plot.py code.

Filename: growth_ampf.py.