Linearization at the differential equation level

The attention is now turned to nonlinear partial differential equations (PDEs) and application of the techniques explained above for ODEs. The model problem is a nonlinear diffusion equation for \(u(\boldsymbol{x},t)\):

\[\tag{549} \frac{\partial u}{\partial t} = \nabla\cdot ({\alpha}(u)\nabla u) + f(u),\quad \boldsymbol{x}\in\Omega,\ t\in (0,T],\]
\[\tag{550} -{\alpha}(u)\frac{\partial u}{\partial n} = g,\quad \boldsymbol{x}\in\partial\Omega_N,\ t\in (0,T],\]
\[\tag{551} u = u_0,\quad \boldsymbol{x}\in\partial\Omega_D,\ t\in (0,T]{\thinspace .}\]

In the present section, our aim is to discretize this problem in time and then present techniques for linearizing the time-discrete PDE problem “at the PDE level” such that we transform the nonlinear stationary PDE problem at each time level into a sequence of linear PDE problems, which can be solved using any method for linear PDEs. This strategy avoids the solution of systems of nonlinear algebraic equations. In the section 1D stationary nonlinear differential equations we shall take the opposite (and more common) approach: discretize the nonlinear problem in time and space first, and then solve the resulting nonlinear algebraic equations at each time level by the methods of the section Systems of nonlinear algebraic equations. Very often, the two approaches are mathematically identical, so there is no preference from a computational efficiency point of view. The details of the ideas sketched above will hopefully become clear through the forthcoming examples.

Explicit time integration

The nonlinearities in the PDE are trivial to deal with if we choose an explicit time integration method for (549), such as the Forward Euler method:

\[[D_t^+ u = \nabla\cdot ({\alpha}(u)\nabla u) + f(u)]^n,\]

or written out,

\[\frac{u^{n+1} - u^n}{\Delta t} = \nabla\cdot ({\alpha}(u^n)\nabla u^n) + f(u^n),\]

which is a linear equation in the unknown \(u^{n+1}\) with solution

\[u^{n+1} = u^n + \Delta t\nabla\cdot ({\alpha}(u^n)\nabla u^n) + \Delta t f(u^n){\thinspace .}\]

The disadvantage with this discretization is the strict stability criterion \(\Delta t \leq h^2/(6\max\alpha)\) for the case \(f=0\) and a standard 2nd-order finite difference discretization in 3D space with mesh cell sizes \(h=\Delta x=\Delta y=\Delta z\).

Backward Euler scheme and Picard iteration

A Backward Euler scheme for (549) reads

\[[D_t^- u = \nabla\cdot ({\alpha}(u)\nabla u) + f(u)]^n{\thinspace .}\]

Written out,

\[\tag{552} \frac{u^{n} - u^{n-1}}{\Delta t} = \nabla\cdot ({\alpha}(u^n)\nabla u^n) + f(u^n){\thinspace .}\]

This is a nonlinear PDE for the unknown function \(u^n(\boldsymbol{x})\). Such a PDE can be viewed as a time-independent PDE where \(u^{n-1}(\boldsymbol{x})\) is a known function.

We introduce a Picard iteration with \(k\) as iteration counter. A typical linearization of the \(\nabla\cdot({\alpha}(u^n)\nabla u^n)\) term in iteration \(k+1\) is to use the previously computed \(u^{n,k}\) approximation in the diffusion coefficient: \({\alpha}(u^{n,k})\). The nonlinear source term is treated similarly: \(f(u^{n,k})\). The unknown function \(u^{n,k+1}\) then fulfills the linear PDE

\[\tag{553} \frac{u^{n,k+1} - u^{n-1}}{\Delta t} = \nabla\cdot ({\alpha}(u^{n,k}) \nabla u^{n,k+1}) + f(u^{n,k}){\thinspace .}\]

The initial guess for the Picard iteration at this time level can be taken as the solution at the previous time level: \(u^{n,0}=u^{n-1}\).

We can alternatively apply the implementation-friendly notation where \(u\) corresponds to the unknown we want to solve for, i.e., \(u^{n,k+1}\) above, and \(u^{-}\) is the most recently computed value, \(u^{n,k}\) above. Moreover, \(u^{(1)}\) denotes the unknown function at the previous time level, \(u^{n-1}\) above. The PDE to be solved in a Picard iteration then looks like

\[\tag{554} \frac{u - u^{(1)}}{\Delta t} = \nabla\cdot ({\alpha}(u^{-}) \nabla u) + f(u^{-}){\thinspace .}\]

At the beginning of the iteration we start with the value from the previous time level: \(u^{-}=u^{(1)}\), and after each iteration, \(u^{-}\) is updated to \(u\).

Remark on notation

The previous derivations of the numerical scheme for time discretizations of PDEs have, strictly speaking, a somewhat sloppy notation, but it is much used and convenient to read. A more precise notation must distinguish clearly between the exact solution of the PDE problem, here denoted \({u_{\small\mbox{e}}}(\boldsymbol{x},t)\), and the exact solution of the spatial problem, arising after time discretization at each time level, where (552) is an example. The latter is here represented as \(u^n(\boldsymbol{x})\) and is an approximation to \({u_{\small\mbox{e}}}(\boldsymbol{x},t_n)\). Then we have another approximation \(u^{n,k}(\boldsymbol{x})\) to \(u^n(\boldsymbol{x})\) when solving the nonlinear PDE problem for \(u^n\) by iteration methods, as in (553).

In our notation, \(u\) is a synonym for \(u^{n,k+1}\) and \(u^{(1)}\) is a synonym for \(u^{n-1}\), inspired by what are natural variable names in a code. We will usually state the PDE problem in terms of \(u\) and quickly redefine the symbol \(u\) to mean the numerical approximation, while \({u_{\small\mbox{e}}}\) is not explicitly introduced unless we need to talk about the exact solution and the approximate solution at the same time.

Backward Euler scheme and Newton’s method

At time level \(n\), we have to solve the stationary PDE (552). In the previous section, we saw how this can be done with Picard iterations. Another alternative is to apply the idea of Newton’s method in a clever way. Normally, Newton’s method is defined for systems of algebraic equations, but the idea of the method can be applied at the PDE level too.

Linearization via Taylor expansions

Let \(u^{n,k}\) be an approximation to the unknown \(u^n\). We seek a better approximation on the form

\[\tag{555} u^{n} = u^{n,k} + \delta u{\thinspace .}\]

The idea is to insert (555) in (552), Taylor expand the nonlinearities and keep only the terms that are linear in \(\delta u\) (which makes (555) an approximation for \(u^{n}\)). Then we can solve a linear PDE for the correction \(\delta u\) and use (555) to find a new approximation

\[u^{n,k+1}=u^{n,k}+\delta u\]

to \(u^{n}\). Repeating this procedure gives a sequence \(u^{n,k+1}\), \(k=0,1,\ldots\) that hopefully converges to the goal \(u^n\).

Let us carry out all the mathematical details for the nonlinear diffusion PDE discretized by the Backward Euler method. Inserting (555) in (552) gives

\[\tag{556} \frac{u^{n,k} +\delta u - u^{n-1}}{\Delta t} = \nabla\cdot ({\alpha}(u^{n,k} + \delta u)\nabla (u^{n,k}+\delta u)) + f(u^{n,k}+\delta u){\thinspace .}\]

We can Taylor expand \({\alpha}(u^{n,k} + \delta u)\) and \(f(u^{n,k}+\delta u)\):

\[\begin{split}\begin{align*} {\alpha}(u^{n,k} + \delta u) & = {\alpha}(u^{n,k}) + \frac{d{\alpha}}{du}(u^{n,k}) \delta u + {\mathcal{O}(\delta u^2)}\approx {\alpha}(u^{n,k}) + {\alpha}^{\prime}(u^{n,k})\delta u,\\ f(u^{n,k}+\delta u) &= f(u^{n,k}) + \frac{df}{du}(u^{n,k})\delta u + {\mathcal{O}(\delta u^2)}\approx f(u^{n,k}) + f^{\prime}(u^{n,k})\delta u{\thinspace .} \end{align*}\end{split}\]

Inserting the linear approximations of \({\alpha}\) and \(f\) in (556) results in

\[\frac{u^{n,k} +\delta u - u^{n-1}}{\Delta t} = \nabla\cdot ({\alpha}(u^{n,k})\nabla u^{n,k}) + f(u^{n,k}) + \nonumber\]
\[\qquad \nabla\cdot ({\alpha}(u^{n,k})\nabla \delta u) + \nabla\cdot ({\alpha}^{\prime}(u^{n,k})\delta u\nabla u^{n,k}) + \nonumber\]
\[\tag{557} \qquad \nabla\cdot ({\alpha}^{\prime}(u^{n,k})\delta u\nabla \delta u) + f^{\prime}(u^{n,k})\delta u{\thinspace .}\]

The term \({\alpha}^{\prime}(u^{n,k})\delta u\nabla \delta u\) is of order \(\delta u^2\) and therefore omitted since we expect the correction \(\delta u\) to be small (\(\delta u \gg \delta u^2\)). Reorganizing the equation gives a PDE for \(\delta u\) that we can write in short form as

\[\delta F(\delta u; u^{n,k}) = -F(u^{n,k}),\]

where

\[\tag{558} F(u^{n,k}) = \frac{u^{n,k} - u^{n-1}}{\Delta t} - \nabla\cdot ({\alpha}(u^{n,k})\nabla u^{n,k}) + f(u^{n,k}),\]
\[\delta F(\delta u; u^{n,k}) = - \frac{1}{\Delta t}\delta u + \nabla\cdot ({\alpha}(u^{n,k})\nabla \delta u) + \nonumber\]
\[\tag{559} \quad \nabla\cdot ({\alpha}^{\prime}(u^{n,k})\delta u\nabla u^{n,k}) + f^{\prime}(u^{n,k})\delta u{\thinspace .}\]

Note that \(\delta F\) is a linear function of \(\delta u\), and \(F\) contains only terms that are known, such that the PDE for \(\delta u\) is indeed linear.

Observations

The notational form \(\delta F = -F\) resembles the Newton system \(J\delta u =-F\) for systems of algebraic equations, with \(\delta F\) as \(J\delta u\). The unknown vector in a linear system of algebraic equations enters the system as a linear operator in terms of a matrix-vector product (\(J\delta u\)), while at the PDE level we have a linear differential operator instead (\(\delta F\)).

Similarity with Picard iteration

We can rewrite the PDE for \(\delta u\) in a slightly different way too if we define \(u^{n,k} + \delta u\) as \(u^{n,k+1}\).

\[ \frac{u^{n,k+1} - u^{n-1}}{\Delta t} = \nabla\cdot ({\alpha}(u^{n,k})\nabla u^{n,k+1}) + f(u^{n,k})\nonumber\]
\[\tag{560} \qquad + \nabla\cdot ({\alpha}^{\prime}(u^{n,k})\delta u\nabla u^{n,k}) + f^{\prime}(u^{n,k})\delta u{\thinspace .}\]

Note that the first line is the same PDE as arises in the Picard iteration, while the remaining terms arise from the differentiations that are an inherent ingredient in Newton’s method.

Implementation

For coding we want to introduce \(u\) for \(u^n\), \(u^{-}\) for \(u^{n,k}\) and \(u^{(1)}\) for \(u^{n-1}\). The formulas for \(F\) and \(\delta F\) are then more clearly written as

\[\tag{561} F(u^{-}) = \frac{u^{-} - u^{(1)}}{\Delta t} - \nabla\cdot ({\alpha}(u^{-})\nabla u^{-}) + f(u^{-}),\]
\[\delta F(\delta u; u^{-}) = - \frac{1}{\Delta t}\delta u + \nabla\cdot ({\alpha}(u^{-})\nabla \delta u) + \nonumber\]
\[\tag{562} \quad \nabla\cdot ({\alpha}^{\prime}(u^{-})\delta u\nabla u^{-}) + f^{\prime}(u^{-})\delta u{\thinspace .}\]

The form that orders the PDE as the Picard iteration terms plus the Newton method’s derivative terms becomes

\[ \frac{u - u^{(1)}}{\Delta t} = \nabla\cdot ({\alpha}(u^{-})\nabla u) + f(u^{-}) + \nonumber\]
\[\tag{563} \qquad \gamma(\nabla\cdot ({\alpha}^{\prime}(u^{-})(u - u^{-})\nabla u^{-}) + f^{\prime}(u^{-})(u - u^{-})){\thinspace .}\]

The Picard and full Newton versions correspond to \(\gamma=0\) and \(\gamma=1\), respectively.

Derivation with alternative notation

Some may prefer to derive the linearized PDE for \(\delta u\) using the more compact notation. We start with inserting \(u^n=u^{-}+\delta u\) to get

\[\frac{u^{-} +\delta u - u^{n-1}}{\Delta t} = \nabla\cdot ({\alpha}(u^{-} + \delta u)\nabla (u^{-}+\delta u)) + f(u^{-}+\delta u){\thinspace .}\]

Taylor expanding,

\[\begin{split}\begin{align*} {\alpha}(u^{-} + \delta u) & \approx {\alpha}(u^{-}) + {\alpha}^{\prime}(u^{-})\delta u,\\ f(u^{-}+\delta u) & \approx f(u^{-}) + f^{\prime}(u^{-})\delta u, \end{align*}\end{split}\]

and inserting these expressions gives a less cluttered PDE for \(\delta u\):

\[\begin{split}\begin{align*} \frac{u^{-} +\delta u - u^{n-1}}{\Delta t} &= \nabla\cdot ({\alpha}(u^{-})\nabla u^{-}) + f(u^{-}) + \\ &\qquad \nabla\cdot ({\alpha}(u^{-})\nabla \delta u) + \nabla\cdot ({\alpha}^{\prime}(u^{-})\delta u\nabla u^{-}) + \\ &\qquad \nabla\cdot ({\alpha}^{\prime}(u^{-})\delta u\nabla \delta u) + f^{\prime}(u^{-})\delta u{\thinspace .} \end{align*}\end{split}\]

Crank-Nicolson discretization

A Crank-Nicolson discretization of (549) applies a centered difference at \(t_{n+\frac{1}{2}}\):

\[[D_t u = \nabla\cdot ({\alpha}(u)\nabla u) + f(u)]^{n+\frac{1}{2}}{\thinspace .}\]

The standard technique is to apply an arithmetic average for quantities defined between two mesh points, e.g.,

\[u^{n+\frac{1}{2}}\approx \frac{1}{2}(u^n + u^{n+1}){\thinspace .}\]

However, with nonlinear terms we have many choices of formulating an arithmetic mean:

\[\tag{564} [f(u)]^{n+\frac{1}{2}} \approx f(\frac{1}{2}(u^n + u^{n+1})) = [f(\overline{u}^t)]^{n+\frac{1}{2}},\]
\[\tag{565} [f(u)]^{n+\frac{1}{2}} \approx \frac{1}{2}(f(u^n) + f(u^{n+1})) =[\overline{f(u)}^t]^{n+\frac{1}{2}},\]
\[\tag{566} [{\alpha}(u)\nabla u]^{n+\frac{1}{2}} \approx {\alpha}(\frac{1}{2}(u^n + u^{n+1}))\nabla (\frac{1}{2}(u^n + u^{n+1})) = [{\alpha}(\overline{u}^t)\nabla \overline{u}^t]^{n+\frac{1}{2}},\]
\[\tag{567} [{\alpha}(u)\nabla u]^{n+\frac{1}{2}} \approx \frac{1}{2}({\alpha}(u^n) + {\alpha}(u^{n+1}))\nabla (\frac{1}{2}(u^n + u^{n+1})) = [\overline{{\alpha}(u)}^t\nabla\overline{u}^t]^{n+\frac{1}{2}},\]
\[\tag{568} [{\alpha}(u)\nabla u]^{n+\frac{1}{2}} \approx \frac{1}{2}({\alpha}(u^n)\nabla u^n + {\alpha}(u^{n+1})\nabla u^{n+1}) = [\overline{{\alpha}(u)\nabla u}^t]^{n+\frac{1}{2}}{\thinspace .}\]

A big question is whether there are significant differences in accuracy between taking the products of arithmetic means or taking the arithmetic mean of products. Exercise 5.6: Find the truncation error of arithmetic mean of products investigates this question, and the answer is that the approximation is \({\mathcal{O}(\Delta t^2)}\) in both cases.