$$ \newcommand{\uex}{{u_{\small\mbox{e}}}} \newcommand{\uexd}[1]{{u_{\small\mbox{e}, #1}}} \newcommand{\vex}{{v_{\small\mbox{e}}}} \newcommand{\vexd}[1]{{v_{\small\mbox{e}, #1}}} \newcommand{\Aex}{{A_{\small\mbox{e}}}} \newcommand{\half}{\frac{1}{2}} \newcommand{\halfi}{{1/2}} \newcommand{\tp}{\thinspace .} \newcommand{\Ddt}[1]{\frac{D #1}{dt}} \newcommand{\E}[1]{\hbox{E}\lbrack #1 \rbrack} \newcommand{\Var}[1]{\hbox{Var}\lbrack #1 \rbrack} \newcommand{\Std}[1]{\hbox{Std}\lbrack #1 \rbrack} \newcommand{\xpoint}{\boldsymbol{x}} \newcommand{\normalvec}{\boldsymbol{n}} \newcommand{\Oof}[1]{\mathcal{O}(#1)} \newcommand{\x}{\boldsymbol{x}} \newcommand{\X}{\boldsymbol{X}} \renewcommand{\u}{\boldsymbol{u}} \renewcommand{\v}{\boldsymbol{v}} \newcommand{\w}{\boldsymbol{w}} \newcommand{\V}{\boldsymbol{V}} \newcommand{\e}{\boldsymbol{e}} \newcommand{\f}{\boldsymbol{f}} \newcommand{\F}{\boldsymbol{F}} \newcommand{\stress}{\boldsymbol{\sigma}} \newcommand{\strain}{\boldsymbol{\varepsilon}} \newcommand{\stressc}{{\sigma}} \newcommand{\strainc}{{\varepsilon}} \newcommand{\I}{\boldsymbol{I}} \newcommand{\T}{\boldsymbol{T}} \newcommand{\dfc}{\alpha} % diffusion coefficient
\newcommand{\ii}{\boldsymbol{i}} \newcommand{\jj}{\boldsymbol{j}} \newcommand{\kk}{\boldsymbol{k}} \newcommand{\ir}{\boldsymbol{i}_r} \newcommand{\ith}{\boldsymbol{i}_{\theta}} \newcommand{\iz}{\boldsymbol{i}_z} \newcommand{\Ix}{\mathcal{I}_x} \newcommand{\Iy}{\mathcal{I}_y} \newcommand{\Iz}{\mathcal{I}_z} \newcommand{\It}{\mathcal{I}_t} \newcommand{\If}{\mathcal{I}_s} % for FEM
\newcommand{\Ifd}{{I_d}} % for FEM
\newcommand{\Ifb}{{I_b}} % for FEM
\newcommand{\setb}[1]{#1^0} % set begin
\newcommand{\sete}[1]{#1^{-1}} % set end
\newcommand{\setl}[1]{#1^-} \newcommand{\setr}[1]{#1^+} \newcommand{\seti}[1]{#1^i} \newcommand{\sequencei}[1]{\left\{ {#1}_i \right\}_{i\in\If}} \newcommand{\basphi}{\varphi} \newcommand{\baspsi}{\psi} \newcommand{\refphi}{\tilde\basphi} \newcommand{\psib}{\boldsymbol{\psi}} \newcommand{\sinL}[1]{\sin\left((#1+1)\pi\frac{x}{L}\right)} \newcommand{\xno}[1]{x_{#1}} \newcommand{\Xno}[1]{X_{(#1)}} \newcommand{\yno}[1]{y_{#1}} \newcommand{\Yno}[1]{Y_{(#1)}} \newcommand{\xdno}[1]{\boldsymbol{x}_{#1}} \newcommand{\dX}{\, \mathrm{d}X} \newcommand{\dx}{\, \mathrm{d}x} \newcommand{\ds}{\, \mathrm{d}s} \newcommand{\Real}{\mathbb{R}} \newcommand{\Integerp}{\mathbb{N}} \newcommand{\Integer}{\mathbb{Z}} $$

Multi-dimensional PDE problems

Finite element discretization

The derivation of \( F_i \) and \( J_{i,j} \) in the 1D model problem is easily generalized to multi-dimensional problems. For example, Backward Euler discretization of the PDE

$$ \begin{equation} u_t = \nabla\cdot(\dfc(u)\nabla u) + f(u), \tag{40} \end{equation} $$ gives the nonlinear time-discrete PDEs

$$ u^n - \Delta t\nabla\cdot(\dfc(u^n)\nabla u^n) - \Delta t f(u^n) = u^{n-1},$$ or, writing \( u^n \) simply as \( u \) and \( u^{n-1} \) as \( u_1 \),

$$ u - \Delta t\nabla\cdot(\dfc(u)\nabla u) - \Delta t f(u) = u_1\tp$$ The variational form, assuming homogeneous Neumann conditions for simplicity, becomes

$$ \begin{equation} \int_\Omega (uv + \Delta t\dfc(u)\nabla u\cdot\nabla v - \Delta t f(u)v - u_1v)\dx = 0\tp \tag{41} \end{equation} $$ The nonlinear algebraic equations follow from setting \( v=\baspsi_i \) and using the representation \( u=\sum_kc_k\baspsi_k \), which we just write as

$$ \begin{equation} F_i = \int_\Omega (u\baspsi_i + \Delta t\dfc(u)\nabla u\cdot\nabla \baspsi_i - \Delta t f(u)\baspsi_i - u_1\baspsi_i)\dx\tp \tag{42} \end{equation} $$ Picard iteration needs a linearization where we use the most recent approximation \( u_{-} \) to \( u \) in \( \dfc \) and \( f \):

$$ \begin{equation} F_i \approx \hat F_i = \int_\Omega (u\baspsi_i + \Delta t\dfc(u_{-})\nabla u\cdot\nabla \baspsi_i - \Delta t f(u_{-})\baspsi_i - u_1\baspsi_i)\dx\tp \tag{43} \end{equation} $$ The equations \( \hat F_i=0 \) are now linear and we can easily derive a linear system for the unknown coefficients \( \sequencei{c} \) by inserting \( u=\sum_jc_j\baspsi_j \).
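Written out, inserting \( u=\sum_jc_j\baspsi_j \) in \( \hat F_i=0 \) gives the linear system

$$ \sum_{j\in\If}\left(\int_\Omega (\baspsi_i\baspsi_j + \Delta t\dfc(u_{-})\nabla\baspsi_j\cdot\nabla\baspsi_i)\dx\right)c_j = \int_\Omega (\Delta t f(u_{-}) + u_1)\baspsi_i\dx,\quad i\in\If\tp $$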

In Newton's method we need to evaluate \( F_i \) with the known value \( u_{-} \) for \( u \):

$$ \begin{equation} F_i \approx \hat F_i = \int_\Omega (u_{-}\baspsi_i + \Delta t\dfc(u_{-}) \nabla u_{-}\cdot\nabla \baspsi_i - \Delta t f(u_{-})\baspsi_i - u_1\baspsi_i)\dx\tp \tag{44} \end{equation} $$ The Jacobian is obtained by differentiating (42) and using \( \partial u/\partial c_j=\baspsi_j \):

$$ \begin{align} J_{i,j} = \frac{\partial F_i}{\partial c_j} = \int_\Omega & (\baspsi_j\baspsi_i + \Delta t\dfc'(u)\baspsi_j \nabla u\cdot\nabla \baspsi_i + \Delta t\dfc(u)\nabla\baspsi_j\cdot\nabla\baspsi_i - \nonumber\\ &\ \Delta t f'(u)\baspsi_j\baspsi_i)\dx\tp \tag{45} \end{align} $$ The evaluation of \( J_{i,j} \) as the coefficient matrix in the linear system in Newton's method applies the known approximation \( u_{-} \) for \( u \):

$$ \begin{align} J_{i,j} = \int_\Omega & (\baspsi_j\baspsi_i + \Delta t\dfc'(u_{-})\baspsi_j \nabla u_{-}\cdot\nabla \baspsi_i + \Delta t\dfc(u_{-})\nabla\baspsi_j\cdot\nabla\baspsi_i - \nonumber\\ &\ \Delta t f'(u_{-})\baspsi_j\baspsi_i)\dx\tp \tag{46} \end{align} $$ Hopefully, these examples also show how convenient the notation with \( u \) and \( u_{-} \) is: the unknown to be computed is always \( u \), and linearization by inserting known (previously computed) values is a matter of adding an underscore. One can take great advantage of this quick notation in software [2].
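To illustrate how convenient this is in software, here is a minimal sketch of one time step with Picard iteration for the linearized form (43), assuming a legacy FEniCS/DOLFIN environment; the mesh, \( \dfc \), \( f \), the time step, and the stopping criterion are illustrative choices, not taken from the text.

    # Minimal Picard iteration sketch for (43) in legacy FEniCS/DOLFIN.
    # The mesh, alpha, f, dt and the tolerance are illustrative assumptions.
    from fenics import *

    mesh = UnitSquareMesh(16, 16)
    V = FunctionSpace(mesh, 'P', 1)

    def alpha(u):                  # assumed nonlinear diffusion coefficient
        return 1 + u**2

    def f(u):                      # assumed nonlinear source term
        return u*(1 - u)

    dt  = Constant(0.01)
    u   = TrialFunction(V)         # the unknown at the new time level
    v   = TestFunction(V)
    u_  = Function(V)              # most recently computed approximation
    u_1 = Function(V)              # solution at the previous time level

    # Picard-linearized variational form (43): alpha and f are evaluated at u_
    F = (u*v + dt*alpha(u_)*dot(grad(u), grad(v)) - dt*f(u_)*v - u_1*v)*dx
    a, L = lhs(F), rhs(F)

    u_new = Function(V)
    for k in range(25):            # Picard iterations within one time step
        solve(a == L, u_new)       # homogeneous Neumann conditions by default
        if errornorm(u_new, u_, 'L2') < 1e-6:
            break
        u_.assign(u_new)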

Non-homogeneous Neumann conditions

A natural physical flux condition for the PDE (40) takes the form of a non-homogeneous Neumann condition

$$ \begin{equation} -\dfc(u)\frac{\partial u}{\partial n} = g,\quad\x\in\partial\Omega_N, \tag{47} \end{equation} $$ where \( g \) is a prescribed function and \( \partial\Omega_N \) is a part of the boundary of the domain \( \Omega \). From integrating \( \int_\Omega v\nabla\cdot(\dfc(u)\nabla u)\dx \) by parts, we get a boundary term

$$ \begin{equation} \int_{\partial\Omega_N}\dfc(u)\frac{\partial u}{\partial n}v\ds\tp \tag{48} \end{equation} $$ Inserting the condition (47) into this term results in an integral over prescribed values: \( -\int_{\partial\Omega_N}gv\ds \). The nonlinearity in the \( \dfc(u) \) coefficient in condition (47) therefore does not contribute with a nonlinearity in the variational form.
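Carrying this boundary term through to the time-discrete variational form, where it enters multiplied by \( -\Delta t \), the statement (41) is simply augmented by the known term \( \Delta t\int_{\partial\Omega_N}gv\ds \):

$$ \int_\Omega (uv + \Delta t\dfc(u)\nabla u\cdot\nabla v - \Delta t f(u)v - u_1v)\dx + \Delta t\int_{\partial\Omega_N} gv\ds = 0\tp $$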

Robin conditions

Heat conduction problems often apply a kind of Newton's cooling law, also known as a Robin condition, at the boundary:

$$ \begin{equation} -\dfc(u)\frac{\partial u}{\partial n} = h_T(u)(u-T_s(t)),\quad\x\in\partial\Omega_R, \tag{49} \end{equation} $$ where \( h_T(u) \) is a heat transfer coefficient between the body (\( \Omega \)) and its surroundings, \( T_s \) is the temperature of the surroundings, and \( \partial\Omega_R \) is a part of the boundary where this Robin condition applies. The boundary integral (48) now becomes

$$ -\int_{\partial\Omega_R}h_T(u)(u-T_s(t))v\ds,$$ by replacing \( \dfc(u)\partial u/\partial n \) by \( -h_T(u)(u-T_s) \). Often, \( h_T(u) \) can be taken as constant, and then the boundary term is linear in \( u \); otherwise it is nonlinear and contributes to the Jacobian in a Newton method. Linearization in a Picard method will typically use a known value in \( h_T \), but keep the \( u \) in \( u-T_s \) as unknown: \( h_T(u_{-})(u-T_s(t)) \).
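Written out (a sketch following the same steps as for the Neumann condition, where the boundary integral enters the variational form multiplied by \( -\Delta t \)), the Robin condition augments \( F_i \) by

$$ \Delta t\int_{\partial\Omega_R} h_T(u)(u-T_s(t))\baspsi_i\ds, $$ and differentiation with respect to \( c_j \) adds

$$ \Delta t\int_{\partial\Omega_R} \left(h_T'(u)\baspsi_j(u-T_s(t)) + h_T(u)\baspsi_j\right)\baspsi_i\ds $$ to \( J_{i,j} \), to be evaluated with \( u_{-} \) for \( u \) when the Newton system is formed.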

Finite difference discretization

A typical diffusion equation

$$ u_t = \nabla\cdot(\dfc(u)\nabla u) + f(u),$$ can be discretized by (e.g.) a Backward Euler scheme, which in 2D can be written

$$ [D_t^- u = D_x\overline{\dfc}^xD_x u + D_y\overline{\dfc}^yD_y u + f(u)]_{i,j}^n\tp $$ We do not dive into details of boundary conditions now. Dirichlet and Neumann conditions are handled as in linear diffusion problems.

Writing the scheme out, putting the unknown values on the left-hand side and known values on the right-hand side, and introducing \( \Delta x=\Delta y=h \) to save some writing, one gets

$$ \begin{align*} u^n_{i,j} &- \frac{\Delta t}{h^2}( \half(\dfc(u_{i,j}^n) + \dfc(u_{i+1,j}^n))(u_{i+1,j}^n-u_{i,j}^n)\\ &\quad - \half(\dfc(u_{i-1,j}^n) + \dfc(u_{i,j}^n))(u_{i,j}^n-u_{i-1,j}^n) \\ &\quad + \half(\dfc(u_{i,j}^n) + \dfc(u_{i,j+1}^n))(u_{i,j+1}^n-u_{i,j}^n)\\ &\quad - \half(\dfc(u_{i,j-1}^n) + \dfc(u_{i,j}^n))(u_{i,j}^n-u_{i,j-1}^n)) - \Delta tf(u_{i,j}^n) = u^{n-1}_{i,j}\tp \end{align*} $$ This defines a nonlinear algebraic system \( A(u)u=b(u) \). A Picard iteration applies old values \( u_{-} \) in \( \dfc \) and \( f \), or equivalently, old values for \( u \) in \( A(u) \) and \( b(u) \). The result is a linear system of the same type as those arising from \( u_t = \nabla\cdot (\dfc(\x)\nabla u) + f(\x,t) \).
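As a concrete illustration, here is a minimal Python sketch of one time step with Picard iteration for this 2D scheme; the grid, \( \dfc \), \( f \), the time step, the homogeneous Dirichlet boundary condition, and the stopping criterion are illustrative assumptions, not taken from the text.

    # Picard iteration for one Backward Euler step of the 2D scheme above.
    # Grid, alpha, f, dt, boundary condition and tolerance are assumptions.
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    Nx = Ny = 20
    h = 1.0/Nx
    dt = 0.01

    def alpha(u):                        # assumed nonlinear diffusion coefficient
        return 1 + u**2

    def f(u):                            # assumed nonlinear source term
        return u*(1 - u)

    def m(i, j):                         # map grid point (i,j) to unknown number
        return i*(Ny + 1) + j

    def picard_step(u_, u_1):
        """Assemble and solve the linear system that arises when alpha and f
        are evaluated at the previous iterate u_ (Picard linearization)."""
        N = (Nx + 1)*(Ny + 1)
        A = sp.lil_matrix((N, N))
        b = np.zeros(N)
        C = dt/h**2
        for i in range(Nx + 1):
            for j in range(Ny + 1):
                p = m(i, j)
                if i == 0 or i == Nx or j == 0 or j == Ny:
                    A[p, p] = 1.0        # u = 0 on the boundary (illustrative)
                    continue
                a_e = 0.5*(alpha(u_[i, j]) + alpha(u_[i+1, j]))
                a_w = 0.5*(alpha(u_[i-1, j]) + alpha(u_[i, j]))
                a_n = 0.5*(alpha(u_[i, j]) + alpha(u_[i, j+1]))
                a_s = 0.5*(alpha(u_[i, j-1]) + alpha(u_[i, j]))
                A[p, p] = 1 + C*(a_e + a_w + a_n + a_s)
                A[p, m(i+1, j)] = -C*a_e
                A[p, m(i-1, j)] = -C*a_w
                A[p, m(i, j+1)] = -C*a_n
                A[p, m(i, j-1)] = -C*a_s
                b[p] = u_1[i, j] + dt*f(u_[i, j])
        u_new = spla.spsolve(A.tocsr(), b)
        return u_new.reshape((Nx + 1, Ny + 1))

    u_1 = np.zeros((Nx + 1, Ny + 1))     # previous time level (u_1 in the text)
    u_1[1:-1, 1:-1] = 1.0                # some initial condition
    u_ = u_1.copy()                      # start the Picard iteration from u_1
    for k in range(20):
        u = picard_step(u_, u_1)
        if np.abs(u - u_).max() < 1e-8:  # simple convergence test
            break
        u_ = u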

Newton's method is as usual more involved. We first define the nonlinear algebraic equations to be solved, drop the superscript \( n \), and introduce \( u_1 \) for \( u^{n-1} \):

$$ \begin{align*} F_{i,j} &= u_{i,j} - \frac{\Delta t}{h^2}(\\ &\quad \half(\dfc(u_{i,j}) + \dfc(u_{i+1,j}))(u_{i+1,j}-u_{i,j}) - \half(\dfc(u_{i-1,j}) + \dfc(u_{i,j}))(u_{i,j}-u_{i-1,j}) + \\ &\quad \half(\dfc(u_{i,j}) + \dfc(u_{i,j+1}))(u_{i,j+1}-u_{i,j}) - \half(\dfc(u_{i,j-1}) + \dfc(u_{i,j}))(u_{i,j}-u_{i,j-1})) -\\ &\quad \Delta tf(u_{i,j}) - u_{1,i,j} = 0\tp \end{align*} $$ It is convenient to work with two indices \( i \) and \( j \) in 2D finite difference discretizations, but it complicates the derivation of the Jacobian, which then gets four indices. The left-hand expression of an equation \( F_{i,j}=0 \) is to be differentiated with respect to each of the unknowns \( u_{r,s} \) (short for \( u_{r,s}^n \)), \( r\in\Ix \), \( s\in\Iy \),

$$ J_{i,j,r,s} = \frac{\partial F_{i,j}}{\partial u_{r,s}}\tp $$ Given \( i \) and \( j \), only a few \( r \) and \( s \) indices give a nonzero contribution since \( F_{i,j} \) contains \( u_{i\pm 1,j} \), \( u_{i,j\pm 1} \), and \( u_{i,j} \). Therefore, \( J_{i,j,r,s} \) is very sparse and we can set up the left-hand side of the Newton system as

$$ \begin{align*} \sum_{r\in\Ix}\sum_{s\in\Iy}J_{i,j,r,s}\delta u_{r,s} = J_{i,j,i,j}\delta u_{i,j} & + J_{i,j,i-1,j}\delta u_{i-1,j} + J_{i,j,i+1,j}\delta u_{i+1,j} + J_{i,j,i,j-1}\delta u_{i,j-1}\\ & + J_{i,j,i,j+1}\delta u_{i,j+1}\tp \end{align*} $$ The specific derivatives become

$$ \begin{align*} J_{i,j,i-1,j} &= \frac{\partial F_{i,j}}{\partial u_{i-1,j}}\\ &= \frac{\Delta t}{h^2}(\half\dfc'(u_{i-1,j})(u_{i,j}-u_{i-1,j}) + \half(\dfc(u_{i-1,j}) + \dfc(u_{i,j}))(-1))\\ J_{i,j,i+1,j} &= \frac{\partial F_{i,j}}{\partial u_{i+1,j}}\\ &= \frac{\Delta t}{h^2}(-\half\dfc'(u_{i+1,j})(u_{i+1,j}-u_{i,j}) - \half(\dfc(u_{i,j}) + \dfc(u_{i+1,j})))\\ J_{i,j,i,j-1} &= \frac{\partial F_{i,j}}{\partial u_{i,j-1}}\\ &= \frac{\Delta t}{h^2}(\half\dfc'(u_{i,j-1})(u_{i,j}-u_{i,j-1}) + \half(\dfc(u_{i,j-1}) + \dfc(u_{i,j}))(-1))\\ J_{i,j,i,j+1} &= \frac{\partial F_{i,j}}{\partial u_{i,j+1}}\\ &= \frac{\Delta t}{h^2}(-\half\dfc'(u_{i,j+1})(u_{i,j+1}-u_{i,j}) - \half(\dfc(u_{i,j}) + \dfc(u_{i,j+1}))) \end{align*} $$ The \( J_{i,j,i,j} \) entry has a few more terms and is written out below. Inserting \( u_{-} \) for \( u \) in the \( J \) formulas and then forming \( J\delta u=-F \) gives the linear system to be solved in each Newton iteration.
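Differentiating \( F_{i,j} \) with respect to \( u_{i,j} \), all four flux terms and the source term contribute:

$$ \begin{align*} J_{i,j,i,j} = \frac{\partial F_{i,j}}{\partial u_{i,j}} = 1 - \frac{\Delta t}{h^2}\big( & \half\dfc'(u_{i,j})(u_{i+1,j}-u_{i,j}) - \half(\dfc(u_{i,j}) + \dfc(u_{i+1,j}))\\ & - \half\dfc'(u_{i,j})(u_{i,j}-u_{i-1,j}) - \half(\dfc(u_{i-1,j}) + \dfc(u_{i,j}))\\ & + \half\dfc'(u_{i,j})(u_{i,j+1}-u_{i,j}) - \half(\dfc(u_{i,j}) + \dfc(u_{i,j+1}))\\ & - \half\dfc'(u_{i,j})(u_{i,j}-u_{i,j-1}) - \half(\dfc(u_{i,j-1}) + \dfc(u_{i,j}))\big) - \Delta t f'(u_{i,j})\tp \end{align*} $$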

Continuation methods

Picard iteration or Newton's method may diverge when solving PDEs with severe nonlinearities. Relaxation with \( \omega <1 \) may help, but in highly nonlinear problems it can be necessary to introduce a continuation parameter \( \Lambda \) in the problem: \( \Lambda =0 \) gives a version of the problem that is easy to solve, while \( \Lambda =1 \) is the target problem. The idea is then to increase \( \Lambda \) in steps, \( 0=\Lambda_0 <\Lambda_1 <\cdots <\Lambda_n=1 \), and use the solution from the problem with \( \Lambda_{i-1} \) as the initial guess for the iterations in the problem corresponding to \( \Lambda_i \).

The continuation method is easiest to understand through an example. Suppose we intend to solve

$$ -\nabla\cdot\left( ||\nabla u||^q\nabla u\right) = f, $$ which is an equation modeling the flow of a non-Newtonian fluid through a channel or pipe. For \( q=0 \) we have the Poisson equation (corresponding to a Newtonian fluid) and the problem is linear. A typical value for pseudo-plastic fluids may be \( q_n=-0.8 \). We can introduce the continuation parameter \( \Lambda\in [0,1] \) such that \( q=q_n\Lambda \). Let \( \{\Lambda_\ell\}_{\ell=0}^n \) be the sequence of \( \Lambda \) values in \( [0,1] \), with corresponding \( q \) values \( \{q_\ell\}_{\ell=0}^n \). We can then solve a sequence of problems

$$ -\nabla\cdot\left( ||\nabla u^\ell||^{q_\ell}\nabla u^\ell\right) = f,\quad \ell = 0,\ldots,n,$$ where the initial guess for iterating on \( u^{\ell} \) is the previously computed solution \( u^{\ell-1} \). If a particular \( \Lambda_\ell \) leads to convergence problems, one may try a smaller increase in \( \Lambda \): \( \Lambda_* = \half (\Lambda_{\ell-1}+\Lambda_\ell) \), and repeat halving the step in \( \Lambda \) until convergence is reestablished.
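A minimal sketch of such a continuation loop with step halving is shown below. Here, solve(q, u_init) is a hypothetical user-supplied routine (not defined in the text) that runs the Picard or Newton iterations for a given exponent \( q \), starting from the initial guess u_init, and reports whether they converged.

    def continuation_solve(solve, q_n, n_steps=5, max_halvings=8):
        """Step Lambda from 0 to 1 with q = q_n*Lambda, using the previous
        solution as the initial guess for the next problem and halving the
        step in Lambda if the nonlinear solver fails to converge.

        solve(q, u_init) is an assumed user-supplied function returning
        (u, converged) for the problem with exponent q."""
        Lambda = 0.0
        dLambda = 1.0/n_steps
        u, _ = solve(0.0, None)              # Lambda = 0: the linear (Poisson) problem
        while Lambda < 1.0:
            Lambda_next = min(Lambda + dLambda, 1.0)
            u_new, converged = solve(q_n*Lambda_next, u)
            halvings = 0
            while not converged and halvings < max_halvings:
                dLambda /= 2.0               # reduce the step in Lambda
                Lambda_next = Lambda + dLambda
                u_new, converged = solve(q_n*Lambda_next, u)
                halvings += 1
            if not converged:
                raise RuntimeError('Continuation failed at Lambda=%g' % Lambda_next)
            u, Lambda = u_new, Lambda_next
        return u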
