The derivation of \(F_i\) and \(J_{i,j}\) in the 1D model problem is easily generalized to multi-dimensional problems. For example, Backward Euler discretization of the PDE
\[\frac{\partial u}{\partial t} = \nabla\cdot\left({\alpha}(u)\nabla u\right) + f(u)\]
gives the nonlinear time-discrete PDEs
\[u^n - \Delta t\,\nabla\cdot\left({\alpha}(u^n)\nabla u^n\right) - \Delta t\, f(u^n) = u^{n-1},\]
or with \(u^n\) simply as \(u\) and \(u^{n-1}\) as \(u_1\),
\[u - \Delta t\,\nabla\cdot\left({\alpha}(u)\nabla u\right) - \Delta t\, f(u) = u_1.\]
The variational form, assuming homogeneous Neumann conditions for simplicity, becomes
\[\int_\Omega \left(uv + \Delta t\,{\alpha}(u)\nabla u\cdot\nabla v - \Delta t\, f(u)v - u_1 v\right){\, \mathrm{d}x} = 0,\quad\forall v\in V.\]
The nonlinear algebraic equations follow from setting \(v={\psi}_i\) and using the representation \(u=\sum_kc_k{\psi}_k\), which we just write as
\[F_i = \int_\Omega \left(u{\psi}_i + \Delta t\,{\alpha}(u)\nabla u\cdot\nabla{\psi}_i - \Delta t\, f(u){\psi}_i - u_1{\psi}_i\right){\, \mathrm{d}x} = 0.\]
Picard iteration needs a linearization where we use the most recent approximation \(u_{-}\) to \(u\) in \({\alpha}\) and \(f\):
\[F_i\approx\hat F_i = \int_\Omega \left(u{\psi}_i + \Delta t\,{\alpha}(u_{-})\nabla u\cdot\nabla{\psi}_i - \Delta t\, f(u_{-}){\psi}_i - u_1{\psi}_i\right){\, \mathrm{d}x} = 0.\]
The equations \(\hat F_i=0\) are now linear and we can easily derive a linear system for the unknown coefficients \(\left\{ {c}_i \right\}_{i\in{\mathcal{I}_s}}\) by inserting \(u=\sum_jc_j{\psi}_j\).
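As a concrete illustration, the Picard loop for one time step can be sketched as follows. For brevity, a standard finite difference discretization stands in for the Galerkin system, and the choices \({\alpha}(u)=1+u^2\), \(f(u)=\sin u\), homogeneous Dirichlet conditions, and all parameter values are hypothetical.

```python
import numpy as np

# Hypothetical model data (not from the text): alpha(u) = 1 + u^2, f(u) = sin(u),
# homogeneous Dirichlet conditions u(0) = u(1) = 0.
alpha = lambda u: 1 + u**2
f     = lambda u: np.sin(u)

def picard_step(u1, dt, h, tol=1e-12, max_iter=100):
    """One Backward Euler step of u_t = (alpha(u)*u_x)_x + f(u) by Picard
    iteration: freeze alpha and f at the latest iterate u_ and solve the
    resulting linear tridiagonal system for u."""
    N = len(u1) - 1
    C = dt/h**2
    u_ = u1.copy()                 # most recent approximation, start from u1
    for _ in range(max_iter):
        am = 0.5*(alpha(u_[:-1]) + alpha(u_[1:]))  # alpha(u_) at midpoints
        A = (np.diag(1 + C*(am[:-1] + am[1:]))     # interior equations only
             + np.diag(-C*am[1:-1], -1) + np.diag(-C*am[1:-1], 1))
        b = u1[1:-1] + dt*f(u_[1:-1])
        u = np.zeros_like(u1)
        u[1:-1] = np.linalg.solve(A, b)
        if np.max(np.abs(u - u_)) < tol:           # iterate until u is close to u_
            return u
        u_ = u
    return u_
```

A Newton version of the same step would, in addition, differentiate \({\alpha}\) and \(f\) to build the Jacobian.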
In Newton’s method we need to evaluate \(F_i\) with the known value \(u_{-}\) for \(u\):
\[F_i\approx\hat F_i = \int_\Omega \left(u_{-}{\psi}_i + \Delta t\,{\alpha}(u_{-})\nabla u_{-}\cdot\nabla{\psi}_i - \Delta t\, f(u_{-}){\psi}_i - u_1{\psi}_i\right){\, \mathrm{d}x}.\]
The Jacobian is obtained by differentiating the expression for \(F_i\) and using \(\partial u/\partial c_j={\psi}_j\):
\[J_{i,j} = \frac{\partial F_i}{\partial c_j} = \int_\Omega \left({\psi}_j{\psi}_i + \Delta t\,{\alpha}^{\prime}(u){\psi}_j\nabla u\cdot\nabla{\psi}_i + \Delta t\,{\alpha}(u)\nabla{\psi}_j\cdot\nabla{\psi}_i - \Delta t\, f^{\prime}(u){\psi}_j{\psi}_i\right){\, \mathrm{d}x}.\]
The evaluation of \(J_{i,j}\) as the coefficient matrix in the linear system in Newton’s method applies the known approximation \(u_{-}\) for \(u\):
\[J_{i,j} = \int_\Omega \left({\psi}_j{\psi}_i + \Delta t\,{\alpha}^{\prime}(u_{-}){\psi}_j\nabla u_{-}\cdot\nabla{\psi}_i + \Delta t\,{\alpha}(u_{-})\nabla{\psi}_j\cdot\nabla{\psi}_i - \Delta t\, f^{\prime}(u_{-}){\psi}_j{\psi}_i\right){\, \mathrm{d}x}.\]
Hopefully, these examples also show how convenient the notation with \(u\) and \(u_{-}\) is: the unknown to be computed is always \(u\), and linearization by inserting known (previously computed) values is a matter of adding an underscore. One can take great advantage of this quick notation in software [Ref2].
A natural physical flux condition for the PDE takes the form of a non-homogeneous Neumann condition
\[-{\alpha}(u)\frac{\partial u}{\partial n} = g,\quad\boldsymbol{x}\in\partial\Omega_N,\]
where \(g\) is a prescribed function and \(\partial\Omega_N\) is a part of the boundary of the domain \(\Omega\). From integrating \(\int_\Omega\nabla\cdot({\alpha}\nabla u)v{\, \mathrm{d}x}\) by parts, we get a boundary term
\[\int_{\partial\Omega_N}{\alpha}(u)\frac{\partial u}{\partial n}v{\, \mathrm{d}s}.\]
Inserting the Neumann condition into this term results in an integral over prescribed values: \(-\int_{\partial\Omega_N}gv{\, \mathrm{d}s}\). The nonlinearity in the \({\alpha}(u)\) coefficient in this condition therefore does not contribute a nonlinearity to the variational form.
Heat conduction problems often apply a kind of Newton’s cooling law, also known as a Robin condition, at the boundary:
\[-{\alpha}(u)\frac{\partial u}{\partial n} = h_T(u)(u-T_s),\quad\boldsymbol{x}\in\partial\Omega_R,\]
where \(h_T(u)\) is a heat transfer coefficient between the body (\(\Omega\)) and its surroundings, \(T_s\) is the temperature of the surroundings, and \(\partial\Omega_R\) is a part of the boundary where this Robin condition applies. The boundary integral above now becomes
\[-\int_{\partial\Omega_R}h_T(u)(u-T_s)v{\, \mathrm{d}s},\]
by replacing \({\alpha}(u)\partial u/\partial n\) by \(-h_T(u)(u-T_s)\). Often, \(h_T(u)\) can be taken as constant, and then the boundary term is linear in \(u\); otherwise it is nonlinear and contributes to the Jacobian in a Newton method. Linearization in a Picard method will typically use a known value in \(h_T\), but keep the \(u\) in \(u-T_s\) as unknown: \(h_T(u_{-})(u-T_s(t))\).
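The Picard treatment of a nonlinear Robin condition can be sketched for a steady 1D problem \(-({\alpha}(u)u^{\prime})^{\prime}=0\) with \(u(0)=u_0\) fixed and the cooling law at \(x=1\). The specific \({\alpha}\), \(h_T\), \(T_s\), and grid below are hypothetical choices for illustration; note how \(h_T\) is frozen at \(u_{-}\) while the \(u\) in \(u-T_s\) stays unknown.

```python
import numpy as np

def robin_picard(alphaf, hTf, u0=1.0, Ts=0.3, N=20, tol=1e-12, max_iter=100):
    """Steady heat conduction -(alpha(u)*u')' = 0 on (0,1) with u(0) = u0 and
    the cooling law -alpha(u)*u' = h_T(u)*(u - Ts) at x = 1, solved by Picard
    iteration: alpha and h_T are frozen at the last iterate u_, while the u
    in (u - Ts) is kept as unknown."""
    h = 1.0/N
    u_ = np.full(N + 1, u0)                          # initial guess
    for _ in range(max_iter):
        am = 0.5*(alphaf(u_[:-1]) + alphaf(u_[1:]))  # alpha(u_) at midpoints
        hT = hTf(u_[-1])                             # h_T(u_) at the boundary
        A = np.zeros((N, N)); b = np.zeros(N)        # unknowns: nodes 1..N
        for k in range(N - 1):                       # interior node i = k + 1
            A[k, k] = am[k] + am[k+1]
            A[k, k+1] = -am[k+1]
            if k > 0:
                A[k, k-1] = -am[k]
            else:
                b[k] = am[0]*u0                      # known Dirichlet value
        # Robin row: alpha_{N-1/2}*(u_N - u_{N-1})/h + h_T*u_N = h_T*Ts,
        # with u_N unknown but h_T evaluated at u_
        A[N-1, N-2] = -am[-1]/h
        A[N-1, N-1] = am[-1]/h + hT
        b[N-1] = hT*Ts
        u = np.concatenate([[u0], np.linalg.solve(A, b)])
        if np.max(np.abs(u - u_)) < tol:
            return u
        u_ = u
    return u_
```

With constant \({\alpha}\) and \(h_T\) the first solve is already exact; with \(h_T(u)\) varying, the frozen coefficient is updated each pass, whereas a Newton method would instead differentiate it into the Jacobian.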
A typical diffusion equation
\[u_t = \nabla\cdot({\alpha}(u)\nabla u) + f(u)\]
can be discretized by (e.g.) a Backward Euler scheme, where the spatial operator in 2D is approximated by centered differences with \({\alpha}\) evaluated as an arithmetic mean between neighboring grid points.
We do not dive into details of boundary conditions now. Dirichlet and Neumann conditions are handled as in linear diffusion problems.
Writing the scheme out, putting the unknown values on the left-hand side and known values on the right-hand side, and introducing \(\Delta x=\Delta y=h\) to save some writing, one gets
\[\begin{aligned}
u^n_{i,j} &- \frac{\Delta t}{h^2}\Big(
\frac{1}{2}({\alpha}(u^n_{i,j}) + {\alpha}(u^n_{i+1,j}))(u^n_{i+1,j}-u^n_{i,j})
- \frac{1}{2}({\alpha}(u^n_{i-1,j}) + {\alpha}(u^n_{i,j}))(u^n_{i,j}-u^n_{i-1,j})\\
&\quad
+ \frac{1}{2}({\alpha}(u^n_{i,j}) + {\alpha}(u^n_{i,j+1}))(u^n_{i,j+1}-u^n_{i,j})
- \frac{1}{2}({\alpha}(u^n_{i,j-1}) + {\alpha}(u^n_{i,j}))(u^n_{i,j}-u^n_{i,j-1})\Big)\\
&\quad - \Delta t\, f(u^n_{i,j}) = u^{n-1}_{i,j}.
\end{aligned}\]
This defines a nonlinear algebraic system \(A(u)u=b(u)\). A Picard iteration applies old values \(u_{-}\) in \({\alpha}\) and \(f\), or equivalently, old values for \(u\) in \(A(u)\) and \(b(u)\). The result is a linear system of the same type as those arising from \(u_t = \nabla\cdot ({\alpha}(\boldsymbol{x})\nabla u) + f(\boldsymbol{x},t)\).
Newton’s method is as usual more involved. We first define the nonlinear algebraic equations to be solved, drop the superscript \(n\), and introduce \(u_1\) for \(u^{n-1}\):
\[\begin{aligned}
F_{i,j} = u_{i,j} &- \frac{\Delta t}{h^2}\Big(
\frac{1}{2}({\alpha}(u_{i,j}) + {\alpha}(u_{i+1,j}))(u_{i+1,j}-u_{i,j})
- \frac{1}{2}({\alpha}(u_{i-1,j}) + {\alpha}(u_{i,j}))(u_{i,j}-u_{i-1,j})\\
&\quad
+ \frac{1}{2}({\alpha}(u_{i,j}) + {\alpha}(u_{i,j+1}))(u_{i,j+1}-u_{i,j})
- \frac{1}{2}({\alpha}(u_{i,j-1}) + {\alpha}(u_{i,j}))(u_{i,j}-u_{i,j-1})\Big)\\
&\quad - \Delta t\, f(u_{i,j}) - u_{1,i,j} = 0.
\end{aligned}\]
It is convenient to work with two indices \(i\) and \(j\) in 2D finite difference discretizations, but it complicates the derivation of the Jacobian, which then gets four indices. The left-hand expression of an equation \(F_{i,j}=0\) is to be differentiated with respect to each of the unknowns \(u_{r,s}\) (short for \(u_{r,s}^n\)), \(r\in{\mathcal{I}_x}\), \(s\in{\mathcal{I}_y}\),
\[J_{i,j,r,s} = \frac{\partial F_{i,j}}{\partial u_{r,s}}.\]
Given \(i\) and \(j\), only a few \(r\) and \(s\) indices give nonzero contribution since \(F_{i,j}\) contains \(u_{i\pm 1,j}\), \(u_{i,j\pm 1}\), and \(u_{i,j}\). Therefore, \(J_{i,j,r,s}\) is very sparse and we can set up the left-hand side of the Newton system as
\[\begin{aligned}
\sum_{r\in{\mathcal{I}_x}}\sum_{s\in{\mathcal{I}_y}} J_{i,j,r,s}\delta u_{r,s} =
J_{i,j,i,j}\delta u_{i,j}
&+ J_{i,j,i-1,j}\delta u_{i-1,j}
+ J_{i,j,i+1,j}\delta u_{i+1,j}\\
&+ J_{i,j,i,j-1}\delta u_{i,j-1}
+ J_{i,j,i,j+1}\delta u_{i,j+1}.
\end{aligned}\]
The specific derivatives become
\[\begin{aligned}
J_{i,j,i-1,j} &= \frac{\partial F_{i,j}}{\partial u_{i-1,j}}
= \frac{\Delta t}{h^2}\left(\frac{1}{2}{\alpha}^{\prime}(u_{i-1,j})(u_{i,j}-u_{i-1,j}) - \frac{1}{2}({\alpha}(u_{i-1,j}) + {\alpha}(u_{i,j}))\right),\\
J_{i,j,i+1,j} &= \frac{\partial F_{i,j}}{\partial u_{i+1,j}}
= \frac{\Delta t}{h^2}\left(-\frac{1}{2}{\alpha}^{\prime}(u_{i+1,j})(u_{i+1,j}-u_{i,j}) - \frac{1}{2}({\alpha}(u_{i,j}) + {\alpha}(u_{i+1,j}))\right),\\
J_{i,j,i,j-1} &= \frac{\partial F_{i,j}}{\partial u_{i,j-1}}
= \frac{\Delta t}{h^2}\left(\frac{1}{2}{\alpha}^{\prime}(u_{i,j-1})(u_{i,j}-u_{i,j-1}) - \frac{1}{2}({\alpha}(u_{i,j-1}) + {\alpha}(u_{i,j}))\right),\\
J_{i,j,i,j+1} &= \frac{\partial F_{i,j}}{\partial u_{i,j+1}}
= \frac{\Delta t}{h^2}\left(-\frac{1}{2}{\alpha}^{\prime}(u_{i,j+1})(u_{i,j+1}-u_{i,j}) - \frac{1}{2}({\alpha}(u_{i,j}) + {\alpha}(u_{i,j+1}))\right).
\end{aligned}\]
The \(J_{i,j,i,j}\) entry has a few more terms. Inserting \(u_{-}\) for \(u\) in the \(J\) formula and then forming \(J\delta u=-F\) gives the linear system to be solved in each Newton iteration.
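The Newton loop for this 2D scheme can be sketched as below. The model data \({\alpha}(u)=1+u^2\), \(f(u)=-u\), zero Dirichlet boundary values, and the grid and time-step sizes are all hypothetical choices; \(J\) is pentadiagonal but stored dense here for brevity (a sparse format would be used in practice).

```python
import numpy as np

# Hypothetical model data (not from the text)
alpha, dalpha = lambda u: 1 + u**2, lambda u: 2*u
f, df = lambda u: -u, lambda u: -1.0

def assemble(u, u1, dt, h):
    """Residual F and Jacobian J for one Backward Euler step of
    u_t = div(alpha(u) grad u) + f(u) on an (N+1)x(N+1) grid with
    zero Dirichlet values on the boundary."""
    N = u.shape[0] - 1
    m = lambda i, j: (i - 1)*(N - 1) + (j - 1)   # interior point -> unknown number
    a = lambda p, q: 0.5*(alpha(p) + alpha(q))   # arithmetic mean of alpha
    C = dt/h**2
    n = (N - 1)**2
    F, J = np.zeros(n), np.zeros((n, n))
    for i in range(1, N):
        for j in range(1, N):
            S = (a(u[i,j], u[i+1,j])*(u[i+1,j] - u[i,j])
                 - a(u[i-1,j], u[i,j])*(u[i,j] - u[i-1,j])
                 + a(u[i,j], u[i,j+1])*(u[i,j+1] - u[i,j])
                 - a(u[i,j-1], u[i,j])*(u[i,j] - u[i,j-1]))
            F[m(i,j)] = u[i,j] - C*S - dt*f(u[i,j]) - u1[i,j]
            # diagonal entry J_{i,j,i,j} (has the extra alpha'(u_{i,j}) terms)
            dS = (0.5*dalpha(u[i,j])*(u[i+1,j] - 2*u[i,j] + u[i-1,j]
                                      + u[i,j+1] - 2*u[i,j] + u[i,j-1])
                  - a(u[i,j], u[i+1,j]) - a(u[i-1,j], u[i,j])
                  - a(u[i,j], u[i,j+1]) - a(u[i,j-1], u[i,j]))
            J[m(i,j), m(i,j)] = 1 - C*dS - dt*df(u[i,j])
            # neighbour entries, only when the neighbour is an unknown
            if i > 1:      # J_{i,j,i-1,j}
                J[m(i,j), m(i-1,j)] = -C*(a(u[i-1,j], u[i,j])
                    - 0.5*dalpha(u[i-1,j])*(u[i,j] - u[i-1,j]))
            if i < N - 1:  # J_{i,j,i+1,j}
                J[m(i,j), m(i+1,j)] = -C*(a(u[i,j], u[i+1,j])
                    + 0.5*dalpha(u[i+1,j])*(u[i+1,j] - u[i,j]))
            if j > 1:      # J_{i,j,i,j-1}
                J[m(i,j), m(i,j-1)] = -C*(a(u[i,j-1], u[i,j])
                    - 0.5*dalpha(u[i,j-1])*(u[i,j] - u[i,j-1]))
            if j < N - 1:  # J_{i,j,i,j+1}
                J[m(i,j), m(i,j+1)] = -C*(a(u[i,j], u[i,j+1])
                    + 0.5*dalpha(u[i,j+1])*(u[i,j+1] - u[i,j]))
    return F, J

def newton(u1, dt, h, tol=1e-10, max_iter=30):
    u = u1.copy()                                # initial guess u_ = u1
    for _ in range(max_iter):
        F, J = assemble(u, u1, dt, h)
        if np.max(np.abs(F)) < tol:
            break
        du = np.linalg.solve(J, -F)              # solve J*delta_u = -F
        u[1:-1, 1:-1] += du.reshape(u.shape[0] - 2, -1)
    return u
```

A Picard version of the same loop would freeze \({\alpha}\) and \(f\) at \(u_{-}\) and drop all the \({\alpha}^{\prime}\) and \(f^{\prime}\) terms from \(J\).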
Picard iteration or Newton’s method may diverge when solving PDEs with severe nonlinearities. Relaxation with \(\omega <1\) may help, but in highly nonlinear problems it can be necessary to introduce a continuation parameter \(\Lambda\) in the problem: \(\Lambda =0\) gives a version of the problem that is easy to solve, while \(\Lambda =1\) is the target problem. The idea is then to increase \(\Lambda\) in steps, \(0=\Lambda_0<\Lambda_1 <\cdots <\Lambda_n=1\), and use the solution from the problem with \(\Lambda_{i-1}\) as initial guess for the iterations in the problem corresponding to \(\Lambda_i\).
The continuation method is easiest to understand through an example. Suppose we intend to solve
\[\nabla\cdot\left(||\nabla u||^{q}\nabla u\right) = 0,\]
which is an equation modeling the flow of a non-Newtonian fluid through a channel or pipe. For \(q=0\) we have the Poisson equation (corresponding to a Newtonian fluid) and the problem is linear. A typical value for pseudo-plastic fluids may be \(q_n=-0.8\). We can introduce the continuation parameter \(\Lambda\in [0,1]\) such that \(q=q_n\Lambda\). Let \(\{\Lambda_\ell\}_{\ell=0}^n\) be the sequence of \(\Lambda\) values in \([0,1]\), with corresponding \(q\) values \(\{q_\ell\}_{\ell=0}^n\). We can then solve a sequence of problems
\[\nabla\cdot\left(||\nabla u^{\ell}||^{q_\ell}\nabla u^{\ell}\right) = 0,\quad \ell = 0,\ldots,n,\]
where the initial guess for iterating on \(u^{\ell}\) is the previously computed solution \(u^{\ell-1}\). If a particular \(\Lambda_\ell\) leads to convergence problems, one may try a smaller increase in \(\Lambda\): \(\Lambda_* = \frac{1}{2} (\Lambda_{\ell-1}+\Lambda_\ell)\), and repeat halving the step in \(\Lambda\) until convergence is reestablished.
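The continuation loop can be sketched for a 1D analog of the channel flow problem with a unit source term (so that the \(\Lambda=0\) case is a genuine Poisson problem); the regularization \(\epsilon\) in \(||\nabla u||\approx\sqrt{(u^{\prime})^2+\epsilon}\), the grid, and the number of \(\Lambda\) levels are hypothetical choices, with Picard (lagged diffusivity) as the inner solver for each \(\Lambda_\ell\).

```python
import numpy as np

def solve_picard(q, u0, N=50, eps=1e-6, tol=1e-10, max_iter=300):
    """Solve -(|u'|^q u')' = 1 on (0,1), u(0) = u(1) = 0, by Picard iteration:
    the coefficient |u'|^q is evaluated at the previous iterate (lagged
    diffusivity), regularized as (u'^2 + eps)^(q/2) to avoid blow-up
    where u' = 0."""
    h = 1.0/N
    u = u0.copy()
    for _ in range(max_iter):
        du = np.diff(u)/h                        # u' at the midpoints
        am = (du**2 + eps)**(q/2)                # frozen coefficient at midpoints
        A = (np.diag(am[:-1] + am[1:]) + np.diag(-am[1:-1], -1)
             + np.diag(-am[1:-1], 1))/h**2
        u_new = np.zeros_like(u)
        u_new[1:-1] = np.linalg.solve(A, np.ones(N - 1))
        if np.max(np.abs(u_new - u)) < tol:
            return u_new
        u = u_new
    return u

def continuation(q_target=-0.8, levels=9, N=50):
    """Walk Lambda from 0 (linear Poisson problem) to 1 (target q), reusing
    each solution as the initial guess for the next level."""
    u = np.zeros(N + 1)
    for Lam in np.linspace(0, 1, levels):
        u = solve_picard(Lam*q_target, u, N=N)
    return np.linspace(0, 1, N + 1), u
```

If a level fails to converge, the step-halving strategy above fits directly into the loop: insert \(\Lambda_* = \frac{1}{2}(\Lambda_{\ell-1}+\Lambda_\ell)\) and retry from the last converged solution.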