Exponential decay ODEs

We shall now compute the truncation error of a finite difference scheme for a differential equation. Our first problem involves the following the linear ODE modeling exponential decay,

u(t)=au(t).

Forward Euler scheme

We begin with the Forward Euler scheme for discretizing (680):

\begin{align}\begin{aligned}\tag{681} \lbrack D_t^+ u = -au \rbrack^n\\    {\thinspace .}\end{aligned}\end{align}

The idea behind the truncation error computation is to insert the exact solution ue of the differential equation problem (680) in the discrete equations (681) and find the residual that arises because ue does not solve the discrete equations. Instead, ue solves the discrete equations with a residual Rn:

\begin{align}\begin{aligned}\tag{682} [D_t^+ {u_{\small\mbox{e}}} + a{u_{\small\mbox{e}}} = R]^n\\    {\thinspace .}\end{aligned}\end{align}

From (664)-(665) it follows that

[D+tue]n=ue(tn)+12ue(tn)Δt+O(Δt2),

which inserted in (682) results in

ue(tn)+12ue(tn)Δt+O(Δt2)+aue(tn)=Rn.

Now, ue(tn)+auen=0 since ue solves the differential equation. The remaining terms constitute the residual:

\begin{align}\begin{aligned}\tag{683} R^n = \frac{1}{2}{u_{\small\mbox{e}}}''(t_n)\Delta t + {\mathcal{O}(\Delta t^2)}\\    {\thinspace .}\end{aligned}\end{align}

This is the truncation error Rn of the Forward Euler scheme.

Because Rn is proportional to Δt, we say that the Forward Euler scheme is of first order in Δt. However, the truncation error is just one error measure, and it is not equal to the true error uenun. For this simple model problem we can compute a range of different error measures for the Forward Euler scheme, including the true error uenun, and all of them have dominating terms proportional to Δt.

Crank-Nicolson scheme

For the Crank-Nicolson scheme,

[Dtu=au]n+12,

we compute the truncation error by inserting the exact solution of the ODE and adding a residual R,

[Dtue+a¯uet=R]n+12.

The term [Dtue]n+12 is easily computed from (658)-(659) by replacing n with n+12 in the formula,

[Dtue]n+12=ue(tn+12)+124ue(tn+12)Δt2+O(Δt4).

The arithmetic mean is related to u(tn+12) by (674)-(675) so

[a¯uet]n+12=ue(tn+12)+18ue(tn)Δt2++O(Δt4).

Inserting these expressions in (685) and observing that ue(tn+12)+auen+12=0, because ue(t) solves the ODE u(t)=au(t) at any point t, we find that

Rn+12=(124ue(tn+12)+18ue(tn))Δt2+O(Δt4)

Here, the truncation error is of second order because the leading term in R is proportional to Δt2.

At this point it is wise to redo some of the computations above to establish the truncation error of the Backward Euler scheme, see Problem B.4: Truncation error of the Backward Euler scheme.

The θ-rule

We may also compute the truncation error of the θ-rule,

[ˉDtu=a¯ut,θ]n+θ.

Our computational task is to find Rn+θ in

[ˉDtue+a¯uet,θ=R]n+θ.

From (666)-(667) and (672)-(673) we get expressions for the terms with ue. Using that ue(tn+θ)+aue(tn+θ)=0, we end up with

Rn+θ=(12θ)ue(tn+θ)Δt+12θ(1θ)ue(tn+θ)Δt2+
12(θ2θ+3)ue(tn+θ)Δt2+O(Δt3)

For θ=12 the first-order term vanishes and the scheme is of second order, while for θ12 we only have a first-order scheme.

Using symbolic software

The previously mentioned truncation_error module can be used to automate the Taylor series expansions and the process of collecting terms. Here is an example on possible use:

from truncation_error import DiffOp
from sympy import *

def decay():
    u, a = symbols('u a')
    diffop = DiffOp(u, independent_variable='t',
                    num_terms_Taylor_series=3)
    D1u = diffop.D(1)   # symbol for du/dt
    ODE = D1u + a*u     # define ODE

    # Define schemes
    FE = diffop['Dtp'] + a*u
    CN = diffop['Dt' ] + a*u
    BE = diffop['Dtm'] + a*u
    theta = diffop['barDt'] + a*diffop['weighted_arithmetic_mean']
    theta = sm.simplify(sm.expand(theta))
    # Residuals (truncation errors)
    R = {'FE': FE-ODE, 'BE': BE-ODE, 'CN': CN-ODE,
         'theta': theta-ODE}
    return R

The returned dictionary becomes

decay: {
 'BE': D2u*dt/2 + D3u*dt**2/6,
 'FE': -D2u*dt/2 + D3u*dt**2/6,
 'CN': D3u*dt**2/24,
 'theta': -D2u*a*dt**2*theta**2/2 + D2u*a*dt**2*theta/2 -
           D2u*dt*theta + D2u*dt/2 + D3u*a*dt**3*theta**3/3 -
           D3u*a*dt**3*theta**2/2 + D3u*a*dt**3*theta/6 +
           D3u*dt**2*theta**2/2 - D3u*dt**2*theta/2 + D3u*dt**2/6,
}

The results are in correspondence with our hand-derived expressions.

Empirical verification of the truncation error

The task of this section is to demonstrate how we can compute the truncation error R numerically. For example, the truncation error of the Forward Euler scheme applied to the decay ODE u=ua is

\begin{align}\begin{aligned}\tag{688} R^n = [D_t^+{u_{\small\mbox{e}}} + a{u_{\small\mbox{e}}}]^n\\    {\thinspace .}\end{aligned}\end{align}

If we happen to know the exact solution ue(t), we can easily evaluate Rn from the above formula.

To estimate how R varies with the discretization parameter Δt, which has been our focus in the previous mathematical derivations, we first make the assumption that R=CΔtr for appropriate constants C and r and small enough Δt. The rate r can be estimated from a series of experiments where Δt is varied. Suppose we have m experiments (Δti,Ri), i=0,,m1. For two consecutive experiments (Δti1,Ri1) and (Δti,Ri), a corresponding ri1 can be estimated by

ri1=ln(Ri1/Ri)ln(Δti1/Δti),

for i=1,,m1. Note that the truncation error Ri varies through the mesh, so (689) is to be applied pointwise. A complicating issue is that Ri and Ri1 refer to different meshes. Pointwise comparisons of the truncation error at a certain point in all meshes therefore requires any computed R to be restricted to the coarsest mesh and that all finer meshes contain all the points in the coarsest mesh. Suppose we have N0 intervals in the coarsest mesh. Inserting a superscript n in (689), where n counts mesh points in the coarsest mesh, n=0,,N0, leads to the formula

rni1=ln(Rni1/Rni)ln(Δti1/Δti).

Experiments are most conveniently defined by N0 and a number of refinements m. Suppose each mesh has twice as many cells Ni as the previous one:

Ni=2iN0,Δti=TN1i,

where [0,T] is the total time interval for the computations. Suppose the computed Ri values on the mesh with Ni intervals are stored in an array R[i] (R being a list of arrays, one for each mesh). Restricting this Ri function to the coarsest mesh means extracting every Ni/N0 point and is done as follows:

stride = N[i]/N_0
R[i] = R[i][::stride]

The quantity R[i][n] now corresponds to Rni.

In addition to estimating r for the pointwise values of R=CΔtr, we may also consider an integrated quantity on mesh i,

RI,i=(ΔtiNin=0(Rni)2)12T0Ri(t)dt.

The sequence RI,i, i=0,,m1, is also expected to behave as CΔtr, with the same r as for the pointwise quantity R, as Δt0.

The function below computes the Ri and RI,i quantities, plots them and compares with the theoretically derived truncation error (R_a) if available.

import numpy as np
import scitools.std as plt

def estimate(truncation_error, T, N_0, m, makeplot=True):
    """
    Compute the truncation error in a problem with one independent
    variable, using m meshes, and estimate the convergence
    rate of the truncation error.

    The user-supplied function truncation_error(dt, N) computes
    the truncation error on a uniform mesh with N intervals of
    length dt::

      R, t, R_a = truncation_error(dt, N)

    where R holds the truncation error at points in the array t,
    and R_a are the corresponding theoretical truncation error
    values (None if not available).

    The truncation_error function is run on a series of meshes
    with 2**i*N_0 intervals, i=0,1,...,m-1.
    The values of R and R_a are restricted to the coarsest mesh.
    and based on these data, the convergence rate of R (pointwise)
    and time-integrated R can be estimated empirically.
    """
    N = [2**i*N_0 for i in range(m)]

    R_I = np.zeros(m) # time-integrated R values on various meshes
    R   = [None]*m    # time series of R restricted to coarsest mesh
    R_a = [None]*m    # time series of R_a restricted to coarsest mesh
    dt = np.zeros(m)
    legends_R = [];  legends_R_a = []  # all legends of curves

    for i in range(m):
        dt[i] = T/float(N[i])
        R[i], t, R_a[i] = truncation_error(dt[i], N[i])

        R_I[i] = np.sqrt(dt[i]*np.sum(R[i]**2))

        if i == 0:
            t_coarse = t           # the coarsest mesh

        stride = N[i]/N_0
        R[i] = R[i][::stride]      # restrict to coarsest mesh
        R_a[i] = R_a[i][::stride]

        if makeplot:
            plt.figure(1)
            plt.plot(t_coarse, R[i], log='y')
            legends_R.append('N=%d' % N[i])
            plt.hold('on')

            plt.figure(2)
            plt.plot(t_coarse, R_a[i] - R[i], log='y')
            plt.hold('on')
            legends_R_a.append('N=%d' % N[i])

    if makeplot:
        plt.figure(1)
        plt.xlabel('time')
        plt.ylabel('pointwise truncation error')
        plt.legend(legends_R)
        plt.savefig('R_series.png')
        plt.savefig('R_series.pdf')
        plt.figure(2)
        plt.xlabel('time')
        plt.ylabel('pointwise error in estimated truncation error')
        plt.legend(legends_R_a)
        plt.savefig('R_error.png')
        plt.savefig('R_error.pdf')

    # Convergence rates
    r_R_I = convergence_rates(dt, R_I)
    print 'R integrated in time; r:',
    print ' '.join(['%.1f' % r for r in r_R_I])
    R = np.array(R)  # two-dim. numpy array
    r_R = [convergence_rates(dt, R[:,n])[-1]
           for n in range(len(t_coarse))]

The first makeplot block demonstrates how to build up two figures in parallel, using plt.figure(i) to create and switch to figure number i. Figure numbers start at 1. A logarithmic scale is used on the y axis since we expect that R as a function of time (or mesh points) is exponential. The reason is that the theoretical estimate (683) contains ue, which for the present model goes like eat. Taking the logarithm makes a straight line.

The code follows closely the previously stated mathematical formulas, but the statements for computing the convergence rates might deserve an explanation. The generic help function convergence_rate(h, E) computes and returns ri1, i=1,,m1 from (690), given Δti in h and Rni in E:

def convergence_rates(h, E):
    from math import log
    r = [log(E[i]/E[i-1])/log(h[i]/h[i-1])
         for i in range(1, len(h))]
    return r

Calling r_R_I = convergence_rates(dt, R_I) computes the sequence of rates r0,r1,,rm2 for the model RIΔtr, while the statements

R = np.array(R)  # two-dim. numpy array
r_R = [convergence_rates(dt, R[:,n])[-1]
       for n in range(len(t_coarse))]

compute the final rate rm2 for RnΔtr at each mesh point tn in the coarsest mesh. This latter computation deserves more explanation. Since R[i][n] holds the estimated truncation error Rni on mesh i, at point tn in the coarsest mesh, R[:,n] picks out the sequence Rni for i=0,,m1. The convergence_rate function computes the rates at tn, and by indexing [-1] on the returned array from convergence_rate, we pick the rate rm2, which we believe is the best estimation since it is based on the two finest meshes.

The estimate function is available in a module trunc_empir.py. Let us apply this function to estimate the truncation error of the Forward Euler scheme. We need a function decay_FE(dt, N) that can compute (688) at the points in a mesh with time step dt and N intervals:

import numpy as np
import trunc_empir

def decay_FE(dt, N):
    dt = float(dt)
    t = np.linspace(0, N*dt, N+1)
    u_e = I*np.exp(-a*t)  # exact solution, I and a are global
    u = u_e  # naming convention when writing up the scheme
    R = np.zeros(N)

    for n in range(0, N):
        R[n] = (u[n+1] - u[n])/dt + a*u[n]

    # Theoretical expression for the trunction error
    R_a = 0.5*I*(-a)**2*np.exp(-a*t)*dt

    return R, t[:-1], R_a[:-1]

if __name__ == '__main__':
    I = 1; a = 2  # global variables needed in decay_FE
    trunc_empir.estimate(decay_FE, T=2.5, N_0=6, m=4, makeplot=True)

The estimated rates for the integrated truncation error RI become 1.1, 1.0, and 1.0 for this sequence of four meshes. All the rates for Rn, computed as r_R, are also very close to 1 at all mesh points. The agreement between the theoretical formula (683) and the computed quantity (ref:ref:(688) <Eq:trunc:decay:FE:R:comp>) is very good, as illustrated in Figures Estimated truncation error at mesh points for different meshes and Difference between theoretical and estimated truncation error at mesh points for different meshes. The program trunc_decay_FE.py was used to perform the simulations and it can easily be modified to test other schemes (see also Exercise B.5: Empirical estimation of truncation errors).

_images/R_series.png

Estimated truncation error at mesh points for different meshes

_images/R_error.png

Difference between theoretical and estimated truncation error at mesh points for different meshes

Increasing the accuracy by adding correction terms

Now we ask the question: can we add terms in the differential equation that can help increase the order of the truncation error? To be precise, let us revisit the Forward Euler scheme for u=au, insert the exact solution ue, include a residual R, but also include new terms C:

[D+tue+aue=C+R]n.

Inserting the Taylor expansions for [D+tue]n and keeping terms up to 3rd order in Δt gives the equation

12ue(tn)Δt16ue(tn)Δt2+124ue(tn)Δt3+O(Δt4)=Cn+Rn.

Can we find Cn such that Rn is O(Δt2)? Yes, by setting

Cn=12ue(tn)Δt,

we manage to cancel the first-order term and

Rn=16ue(tn)Δt2+O(Δt3).

The correction term Cn introduces 12Δtu in the discrete equation, and we have to get rid of the derivative u. One idea is to approximate u by a second-order accurate finite difference formula, u(un+12un+un1)/Δt2, but this introduces an additional time level with un1. Another approach is to rewrite u in terms of u or u using the ODE:

u=auu=au=a(au)=a2u.

This means that we can simply set Cn=12a2Δtun. We can then either solve the discrete equation

[D+tu=au+12a2Δtu]n,

or we can equivalently discretize the perturbed ODE

u=ˆau,ˆa=a(112aΔt),

by a Forward Euler method. That is, we replace the original coefficient a by the perturbed coefficient ˆa. Observe that ˆaa as Δt0.

The Forward Euler method applied to (694) results in

[D+tu=a(112aΔt)u]n.

We can control our computations and verify that the truncation error of the scheme above is indeed O(Δt2).

Another way of revealing the fact that the perturbed ODE leads to a more accurate solution is to look at the amplification factor. Our scheme can be written as

un+1=Aun,A=1ˆaΔt=1p+12p2,p=aΔt,

The amplification factor A as a function of p=aΔt is seen to be the first three terms of the Taylor series for the exact amplification factor ep. The Forward Euler scheme for u=au gives only the first two terms 1p of the Taylor series for ep. That is, using ˆa increases the order of the accuracy in the amplification factor.

Instead of replacing u by a2u, we use the relation u=au and add a term 12aΔtu in the ODE:

u=au12aΔtu(1+12aΔt)u=au.

Using a Forward Euler method results in

(1+12aΔt)un+1unΔt=aun,

which after some algebra can be written as

un+1=112aΔt1+12aΔtun.

This is the same formula as the one arising from a Crank-Nicolson scheme applied to u=au! It now recommended to do Exercise B.6: Correction term for a Backward Euler scheme and repeat the above steps to see what kind of correction term is needed in the Backward Euler scheme to make it second order.

The Crank-Nicolson scheme is a bit more challenging to analyze, but the ideas and techniques are the same. The discrete equation reads

[Dtu=au]n+12,

and the truncation error is defined through

[Dtue+a¯uet=C+R]n+12,

where we have added a correction term. We need to Taylor expand both the discrete derivative and the arithmetic mean with aid of (658)-(659) and (674)-(675), respectively. The result is

124ue(tn+12)Δt2+O(Δt4)+a8ue(tn+12)Δt2+O(Δt4)=Cn+12+Rn+12.

The goal now is to make Cn+12 cancel the Δt2 terms:

Cn+12=124ue(tn+12)Δt2+a8ue(tn)Δt2.

Using u=au, we have that u=a2u, and we find that u=a3u. We can therefore solve the perturbed ODE problem

u=ˆau,ˆa=a(1112a2Δt2),

by the Crank-Nicolson scheme and obtain a method that is of fourth order in Δt. Problem B.7: Verify the effect of correction terms encourages you to implement these correction terms and calculate empirical convergence rates to verify that higher-order accuracy is indeed obtained in real computations.

Extension to variable coefficients

Let us address the decay ODE with variable coefficients,

u(t)=a(t)u(t)+b(t),

discretized by the Forward Euler scheme,

[D+tu=au+b]n.

The truncation error R is as always found by inserting the exact solution ue(t) in the discrete scheme:

[D+tue+aueb=R]n.

Using (664)-(665),

ue(tn)12ue(tn)Δt+O(Δt2)+a(tn)ue(tn)b(tn)=Rn.

Because of the ODE,

ue(tn)+a(tn)ue(tn)b(tn)=0,

so we are left with the result

Rn=12ue(tn)Δt+O(Δt2) .

We see that the variable coefficients do not pose any additional difficulties in this case. Problem B.8: Truncation error of the Crank-Nicolson scheme takes the analysis above one step further to the Crank-Nicolson scheme.

Exact solutions of the finite difference equations

Having a mathematical expression for the numerical solution is very valuable in program verification since we then know the exact numbers that the program should produce. Looking at the various formulas for the truncation errors in (658)-(659) and (678)-(679) in the section Overview of leading-order error terms in finite difference formulas, we see that all but two of the R expressions contain a second or higher order derivative of ue. The exceptions are the geometric and harmonic means where the truncation error involves ue and even ue in case of the harmonic mean. So, apart from these two means, choosing ue to be a linear function of t, ue=ct+d for constants c and d, will make the truncation error vanish since ue=0. Consequently, the truncation error of a finite difference scheme will be zero since the various approximations used will all be exact. This means that the linear solution is an exact solution of the discrete equations.

In a particular differential equation problem, the reasoning above can be used to determine if we expect a linear ue to fulfill the discrete equations. To actually prove that this is true, we can either compute the truncation error and see that it vanishes, or we can simply insert ue(t)=ct+d in the scheme and see that it fulfills the equations. The latter method is usually the simplest. It will often be necessary to add some source term to the ODE in order to allow a linear solution.

Many ODEs are discretized by centered differences. From the section Overview of leading-order error terms in finite difference formulas we see that all the centered difference formulas have truncation errors involving ue or higher-order derivatives. A quadratic solution, e.g., ue(t)=t2+ct+d, will then make the truncation errors vanish. This observation can be used to test if a quadratic solution will fulfill the discrete equations. Note that a quadratic solution will not obey the equations for a Crank-Nicolson scheme for u=au+b because the approximation applies an arithmetic mean, which involves a truncation error with ue.

Computing truncation errors in nonlinear problems

The general nonlinear ODE

u=f(u,t),

can be solved by a Crank-Nicolson scheme

[Dtu=¯ft]n+12.

The truncation error is as always defined as the residual arising when inserting the exact solution ue in the scheme:

[Dtue¯ft=R]n+12.

Using (674)-(675) for ¯ft results in

[¯ft]n+12=12(f(uen,tn)+f(uen+1,tn+1))=f(uen+12,tn+12)+18ue(tn+12)Δt2+O(Δt4).

With (658)-(659) the discrete equations (700) lead to

ue(tn+12)+124ue(tn+12)Δt2f(uen+12,tn+12)18ue(tn+12)Δt2+O(Δt4)=Rn+12.

Since ue(tn+12)f(uen+12,tn+12)=0, the truncation error becomes

Rn+12=(124ue(tn+12)18ue(tn+12))Δt2.

The computational techniques worked well even for this nonlinear ODE.