The section Linearization at the differential equation level presents methods for linearizing time-discrete PDEs directly prior to discretization in space. We can alternatively carry out the discretization in space and of the time-discrete nonlinear PDE problem and get a system of nonlinear algebraic equations, which can be solved by Picard iteration or Newton's method as presented in the section Systems of nonlinear algebraic equations. This latter approach will now be described in detail.
We shall work with the 1D problem $$ \begin{equation} -(\dfc(u)u^{\prime})^{\prime} + au = f(u),\quad x\in (0,L), \quad \dfc(u(0))u^{\prime}(0) = C,\ u(L)=D \tp \tag{28} \end{equation} $$
The problem (28) arises from the stationary limit of a diffusion equation, $$ \begin{equation} \frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left( \alpha(u)\frac{\partial u}{\partial x}\right) + au + f(u), \tag{29} \end{equation} $$ as \( t\rightarrow\infty \) and \( \partial u/\partial t\rightarrow 0 \). Alternatively, the problem (28) arises at each time level from implicit time discretization of (29). For example, a Backward Euler scheme for (29) with \( a=0 \) leads to $$ \begin{equation} \frac{u^{n}-u^{n-1}}{\Delta t} = \frac{\partial}{\partial x}\left( \alpha(u^n)\frac{\partial u^n}{\partial x}\right) - f(u^n)\tp \tag{30} \end{equation} $$ Introducing \( u(x) \) for \( u^n(x) \), \( u^{(1)} \) for \( u^{n-1} \), and letting \( f(u) \) in (28) be \( f(u) + u^{n-1}/\Delta t \) in (30), gives (28) with \( a=1/\Delta t \).
The nonlinearity in the differential equation (28) poses no more difficulty than a variable coefficient, as in \( (\dfc(x)u^{\prime})^{\prime} \). We can therefore use a standard approach to discretizing the Laplace term with a variable coefficient: $$ [-D_x\dfc D_x u +au = f]_i\tp$$ Writing this out for a uniform mesh with points \( x_i=i\Delta x \), \( i=0,\ldots,N_x \), leads to $$ \begin{align} -\frac{1}{\Delta x^2} \left(\dfc_{i+\half}(u_{i+1}-u_i) - \dfc_{i-\half}(u_{i}-u_{i-1})\right) + au_i &= f(u_i)\tp \tag{31} \end{align} $$ This equation is valid at all the mesh points \( i=0,1,\ldots,N_x-1 \). At \( i=N_x \) we have the Dirichlet condition \( u_i=0 \). The only difference from the case with \( (\dfc(x)u^{\prime})^{\prime} \) and \( f(x) \) is that now \( \dfc \) and \( f \) are functions of \( u \) and not only on \( x \): \( (\dfc(u(x))u^{\prime})^{\prime} \) and \( f(u(x)) \).
The quantity \( \dfc_{i+\half} \), evaluated between two mesh points, needs a comment. Since \( \dfc \) depends on \( u \) and \( u \) is only known at the mesh points, we need to express \( \dfc_{i+\half} \) in terms of \( u_i \) and \( u_{i+1} \). For this purpose we use an arithmetic mean, although a harmonic mean is also common in this context if \( \dfc \) features large jumps. There are two choices of arithmetic means: $$ \begin{align} \dfc_{i+\half} &\approx \dfc(\half(u_i + u_{i+1}) = [\dfc(\overline{u}^x)]^{i+\half}, \tag{32} \\ \dfc_{i+\half} &\approx \half(\dfc(u_i) + \dfc(u_{i+1})) = [\overline{\dfc(u)}^x]^{i+\half} \tag{33} \end{align} $$ Equation (31) with the latter approximation then looks like $$ \begin{align} &-\frac{1}{2\Delta x^2} \left((\dfc(u_i)+\dfc(u_{i+1}))(u_{i+1}-u_i) - (\dfc(u_{i-1})+\dfc(u_{i}))(u_{i}-u_{i-1})\right)\nonumber\\ &\qquad\qquad + au_i = f(u_i), \tag{34} \end{align} $$ or written more compactly, $$ [-D_x\overline{\dfc}^x D_x u +au = f]_i\tp$$
At mesh point \( i=0 \) we have the boundary condition \( \dfc(u)u^{\prime}=C \), which is discretized by $$ [\dfc(u)D_{2x}u = C]_0,$$ meaning $$ \begin{equation} \dfc(u_0)\frac{u_{1} - u_{-1}}{2\Delta x} = C\tp \tag{35} \end{equation} $$ The fictitious value \( u_{-1} \) can be eliminated with the aid of (34) for \( i=0 \). Formally, (34) should be solved with respect to \( u_{i-1} \) and that value (for \( i=0 \)) should be inserted in (35), but it is algebraically much easier to do it the other way around. Alternatively, one can use a ghost cell \( [-\Delta x,0] \) and update the \( u_{-1} \) value in the ghost cell according to (35) after every Picard or Newton iteration. Such an approach means that we use a known \( u_{-1} \) value in (34) from the previous iteration.
The nonlinear algebraic equations (34) are of the form \( A(u)u = b(u) \) with $$ \begin{align*} A_{i,i} &= \frac{1}{2\Delta x^2}(\dfc(u_{i-1}) + 2\dfc(u_{i}) \dfc(u_{i+1})) + a,\\ A_{i,i-1} &= -\frac{1}{2\Delta x^2}(\dfc(u_{i-1}) + \dfc(u_{i})),\\ A_{i,i+1} &= -\frac{1}{2\Delta x^2}(\dfc(u_{i}) + \dfc(u_{i+1})),\\ b_i &= f(u_i)\tp \end{align*} $$ The matrix \( A(u) \) is tridiagonal: \( A_{i,j}=0 \) for \( j > 1+1 \) and \( j < i-1 \).
The above expressions are valid for internal mesh points \( 1\leq i\leq N_x-1 \). For \( i=0 \) we need to express \( u_{i-1}=u_{-1} \) in terms of \( u_1 \) using (35): $$ \begin{equation} u_{-1} = u_1 -\frac{2\Delta x}{\dfc(u_0)}C\tp \tag{36} \end{equation} $$ This value must be inserted in \( A_{0,0} \). The expression for \( A_{i,i+1} \) applies for \( i=0 \), and \( A_{i,i-1} \) does not enter the system when \( i=0 \).
Regarding the last equation, its form depends on whether we include the Dirichlet condition \( u(L)=D \), meaning \( u_{N_x}=D \), in the nonlinear algebraic equation system or not. Suppose we choose \( (u_0,u_1,\ldots,u_{N_x-1}) \) as unknowns, later referred to as systems without Dirichlet conditions. The last equation corresponds to \( i=N_x-1 \). It involves the boundary value \( u_{N_x} \), which is substituted by \( D \). If the unknown vector includes the boundary value, \( (u_0,u_1,\ldots,u_{N_x}) \), later referred to as system including Dirichlet conditions, the equation for \( i=N_x-1 \) just involves the unknown \( u_{N_x} \), and the final equation becomes \( u_{N_x}=D \), corresponding to \( A_{i,i}=1 \) and \( b_i=D \) for \( i=N_x \).
The obvious Picard iteration scheme is to use previously computed values of \( u_i \) in \( A(u) \) and \( b(u) \), as described more in detail in the section Systems of nonlinear algebraic equations. With the notation \( u^{-} \) for the most recently computed value of \( u \), we have the system \( F(u)\approx \hat F(u) = A(u^{-})u - b(u^{-}) \), with \( F=(F_0,F_1,\ldots,F_m) \), \( u=(u_0,u_1,\ldots,u_m) \). The index \( m \) is \( N_x \) if the system includes the Dirichlet condition as a separate equation and \( N_x-1 \) otherwise. The matrix \( A(u^{-}) \) is tridiagonal, so the solution procedure is to fill a tridiagonal matrix data structure and the right-hand side vector with the right numbers and call a Gaussian elimination routine for tridiagonal linear systems.
To write out all the mathematical details in a specific case, let us look at the case \( N_x=2 \). We use \( u^{-}_i \) for the \( i \)-th component in \( u^{-} \). In case we omit the Dirichlet condition from the system we get the following \( 2\times 2 \) system, $$ \left(\begin{array}{cc} A_{0,0}& A_{0,1}\\ A_{1,0} & A_{1,1} \end{array}\right) \left(\begin{array}{c} u_0\\ u_1 \end{array}\right) = \left(\begin{array}{c} b_0\\ b_1 \end{array}\right) $$ The matrix and right-hand side entries are given by $$ \begin{align} A_{0,0} &= \frac{1}{2\Delta x^2}(\dfc(u_{-1}^{-}) + 2\dfc(u_{0}^{-}) + \dfc(u_{1}^{-})) + a\\ A_{0,1} &= -\frac{1}{2\Delta x^2}(\dfc(u_{0}^{-}) + \dfc(u_{1}^{-})),\\ A_{1,0} &= -\frac{1}{2\Delta x^2}(\dfc(u_{0}^{-}) + \dfc(u_{1}^{-})),\\ A_{1,1} &= \frac{1}{2\Delta x^2}(\dfc(u_{0}^{-}) + 2\dfc(u_{1}^{-}) +\dfc(u_{2})) + a,\\ b_0 &= f(u_0^{-}),\\ b_1 &= f(u_1^{-}), \end{align} $$ where \( u_{-1} \) must be substituted by (36), and \( u_2 \) by \( D \).
The system with the Dirichlet condition becomes $$ \left(\begin{array}{ccc} A_{0,0}& A_{0,1} & A_{0,2}\\ A_{1,0} & A_{1,1} & A_{1,2}\\ A_{2,0} & A_{2,1} & A_{2,2} \end{array}\right) \left(\begin{array}{c} u_0\\ u_1\\ u_2 \end{array}\right) = \left(\begin{array}{c} b_0\\ b_1\\ b_2 \end{array}\right), $$ with entries for \( A_{i,j} \) and \( b_i \) as above for \( i,j=1,2 \), keeping \( u_2 \) as unknown in \( A_{1,1} \), and $$ \begin{equation} A_{0,2}=A_{2,0}=A_{2,1}=0,\ A_{1,2}= -\frac{1}{2\Delta x^2}(\dfc(u_{1}) + \dfc(u_{2})),\ A_{2,2}=1,\ b_2=D\tp \end{equation} $$
The Jacobian must be derived in order to use Newton's method. Here it means that we need to differentiate \( F(u)=A(u)u - b(u) \) with respect to the unknown parameters \( u_0,u_1,\ldots,u_m \) (\( m=N_x \) or \( m=N_x-1 \), depending on whether the Dirichlet condition is included in the nonlinear system \( F(u)=0 \) or not). Nonlinear equation number \( i \) has the structure $$ F_i = A_{i,i-1}(u_{i-1},u_i)u_{i-1} + A_{i,i}(u_{i-1},u_i,u_{i+1})u_i + A_{i,i+1}(u_i, u_{i+1})u_{i+1} - b_i(u_i)\tp$$ Computing the Jacobian requires careful differentiation. For example, $$ \begin{align*} \frac{\partial}{\partial u_i}(A_{i,i}(u_{i-1},u_i,u_{i+1})u_i) &= \frac{\partial A_{i,i}}{\partial u_i}u_i + A_{i,i} \frac{\partial u_i}{\partial u_i}\\ &= \frac{\partial}{\partial u_i}( \frac{1}{2\Delta x^2}(\dfc(u_{i-1}) + 2\dfc(u_{i}) +\dfc(u_{i+1})) + a)u_i +\\ &\quad\frac{1}{2\Delta x^2}(\dfc(u_{i-1}) + 2\dfc(u_{i}) +\dfc(u_{i+1})) + a\\ &= \frac{1}{2\Delta x^2}(2\dfc^\prime (u_i)u_i +\dfc(u_{i-1}) + 2\dfc(u_{i}) +\dfc(u_{i+1})) + a\tp \end{align*} $$ The complete Jacobian becomes $$ \begin{align*} J_{i,i} &= \frac{\partial F_i}{\partial u_i} = \frac{\partial A_{i,i-1}}{\partial u_i}u_{i-1} + \frac{\partial A_{i,i}}{\partial u_i}u_i + A_{i,i} + \frac{\partial A_{i,i+1}}{\partial u_i}u_{i+1} - \frac{\partial b_i}{\partial u_{i}}\\ &= \frac{1}{2\Delta x^2}( -\dfc^{\prime}(u_i)u_{i-1} +2\dfc^{\prime}(u_i)u_{i} +\dfc(u_{i-1}) + 2\dfc(u_i) + \dfc(u_{i+1})) +\\ &\quad a -\frac{1}{2\Delta x^2}\dfc^{\prime}(u_{i})u_{i+1} - b^{\prime}(u_i),\\ J_{i,i-1} &= \frac{\partial F_i}{\partial u_{i-1}} = \frac{\partial A_{i,i-1}}{\partial u_{i-1}}u_{i-1} + A_{i-1,i} + \frac{\partial A_{i,i}}{\partial u_{i-1}}u_i - \frac{\partial b_i}{\partial u_{i-1}}\\ &= \frac{1}{2\Delta x^2}( -\dfc^{\prime}(u_{i-1})u_{i-1} - (\dfc(u_{i-1}) + \dfc(u_i)) + \dfc^{\prime}(u_{i-1})u_i),\\ J_{i,i+1} &= \frac{\partial A_{i,i+1}}{\partial u_{i-1}}u_{i+1} + A_{i+1,i} + \frac{\partial A_{i,i}}{\partial u_{i+1}}u_i - \frac{\partial b_i}{\partial u_{i+1}}\\ &=\frac{1}{2\Delta x^2}( -\dfc^{\prime}(u_{i+1})u_{i+1} - (\dfc(u_{i}) + \dfc(u_{i+1})) + \dfc^{\prime}(u_{i+1})u_i) \tp \end{align*} $$ The explicit expression for nonlinear equation number \( i \), \( F_i(u_0,u_1,\ldots) \), arises from moving the \( (u_i) \) term in (34) to the left-hand side: $$ \begin{align} F_i &= -\frac{1}{2\Delta x^2} \left((\dfc(u_i)+\dfc(u_{i+1}))(u_{i+1}-u_i) - (\dfc(u_{i-1})+\dfc(u_{i}))(u_{i}-u_{i-1})\right)\nonumber\\ &\qquad\qquad + au_i - f(u_i) = 0\tp \tag{37} \end{align} $$
At the boundary point \( i=0 \), \( u_{-1} \) must be replaced using the formula (36). When the Dirichlet condition at \( i=N_x \) is not a part of the equation system, the last equation \( F_m=0 \) for \( m=N_x-1 \) involves the quantity \( u_{N_x-1} \) which must be replaced by \( D \). If \( u_{N_x} \) is treated as an unknown in the system, the last equation \( F_m=0 \) has \( m=N_x \) and reads $$ F_{N_x}(u_0,\ldots,u_{N_x}) = u_{N_x} - D = 0\tp$$ Similar replacement of \( u_{-1} \) and \( u_{N_x} \) must be done in the Jacobian for the first and last row. When \( u_{N_x} \) is included as an unknown, the last row in the Jacobian must help implement the condition \( \delta u_{N_x}=0 \), since we assume that \( u \) contains the right Dirichlet value at the beginning of the iteration (\( u_{N_x}=D \)), and then the Newton update should be zero for \( i=0 \), i.e., \( \delta u_{N_x}=0 \). This also forces the right-hand side to be \( b_i=0 \), \( i=N_x \).
We have seen, and can see from the present example, that the linear system in Newton's method contains all the terms present in the system that arises in the Picard iteration method. The extra terms in Newton's method can be multiplied by a factor such that it is easy to program one linear system and set this factor to 0 or 1 to generate the Picard or Newton system.
For a Galerkin-type discretization, which may be developed into a finite element method, we first need to derive the variational problem. Let \( V \) be an appropriate function space with basis functions \( \sequencei{\baspsi} \). Because of the Dirichlet condition at \( x=L \) we require \( \baspsi_i(L)=0 \), \( i\in\If \). The approximate solution is written as \( u = D + \sum_{j\in\If}c_j\baspsi_j \), where the term \( D \) can be viewed as a boundary function needed to implement the Dirichlet condition \( u(L)=D \).
Using Galerkin's method, we multiply the differential equation by any \( v\in V \) and integrate terms with second-order derivatives by parts: $$ \int_0^L \dfc(u)u^{\prime}v^{\prime}\dx + \int_0^L auv\dx = \int_0^L f(u)v\dx + [\dfc(u)u^{\prime}v]_0^L,\quad \forall v\in V\tp $$ The Neumann condition at the boundary \( x=0 \) is inserted in the boundary term: $$ [\dfc(u)u^{\prime}v]_0^L = \dfc(u(L))u^{\prime}(L)v(L) - \dfc(u(0))u^{\prime}(0)v(0) = 0 - Cv(0)=-Cv(0)\tp $$ (Recall that since \( \baspsi_i(L)=0 \), any linear combination \( v \) of the basis functions also vanishes at \( x=L \): \( v(L)=0 \).) The variational problem is then: find \( u\in V \) such that $$ \begin{equation} \int_0^L \dfc(u)u^{\prime}v^{\prime}\dx + \int_0^L auv\dx = \int_0^L f(u)v\dx - Cv(0),\quad \forall v\in V\tp \tag{38} \end{equation} $$
To derive the algebraic equations, we note that \( \forall v\in V \) is equivalent with \( v=\baspsi_i \) for \( i\in\If \). Setting \( u=D+\sum_jc_j\baspsi_j \) and sorting terms results in the linear system $$ \begin{equation} \sum_{j\in\If}\left( \int_0^L \dfc(D+\sum_{k\in\If}c_k\baspsi_k) \baspsi_j^{\prime}\baspsi_i^{\prime}\dx\right)c_j = \int_0^L f(D+\sum_{k\in\If}c_k\baspsi_k)\baspsi_i\dx - C\baspsi_i(0),\quad i\in\If\tp \end{equation} $$
Methods that use the Galerkin or weighted residual principle face a fundamental difficulty in nonlinear problems: how can we integrate a terms like \( \int_0^L \dfc(\sum_{k}c_k\baspsi_k)\baspsi_i^{\prime}\baspsi_j^{\prime}\dx \) and \( \int_0^L f(\sum_{k}c_k\baspsi_k)\baspsi_i\dx \) when we do not know the \( c_k \) coefficients in the argument of the \( \dfc \) function? We can resort to numerical integration, provided an approximate \( \sum_kc_k\baspsi_k \) can be used for the argument \( u \) in \( f \) and \( \dfc \). This is the approach used in computer programs.
However, if we want to look more mathematically into the structure of the algebraic equations generated by the finite element method in nonlinear problems, and compare such equations with those arising in the finite difference method, we need techniques that enable integration of expressions like \( \int_0^L f(\sum_{k}c_k\baspsi_k)\baspsi_i\dx \) by hand. Two such techniques will be shown: the group finite element and numerical integration based on the nodes only. Both techniques are approximate, but they allow us to see the difference equations in the finite element method.
Introduction of finite element basis functions \( \basphi_i \) means setting $$ \baspsi_i = \basphi_{\nu(i)},\quad i\in\If,$$ where degree of freedom number \( \nu(i) \) in the mesh corresponds to unknown number \( i \) (\( c_i \)). In the present example, we use all the basis functions except the last at \( i=N_n-1 \), i.e., \( \If = \{0,\ldots,N_n-2\} \), and \( \nu(j)=j \). The expansion of \( u \) can be taken as $$ u = D + \sum_{j\in\If} c_j\basphi_{\nu(j)},$$ but it is more common in a finite element context to use a boundary function \( B=\sum_{j\in\Ifb}U_j\basphi_j \), where \( U_j \) are prescribed Dirichlet conditions for degree of freedom number \( j \) and \( U_j \) is the corresponding value. $$ u = D\basphi_{N_n-1} + \sum_{j\in\If} c_j\basphi_{\nu(j)}\tp $$ In the general case with \( u \) prescribed as \( U_j \) at some nodes \( j\in\Ifb \), we set $$ u = \sum_{j\in\Ifb} U_j\basphi_j + \sum_{j\in\If}c_j\basphi_{\nu(j)},$$ where \( c_j = u(x^{\nu(j)}) \). That is, \( \nu(j) \) maps unknown number \( j \) to the corresponding node number \( \nu(j) \) such that \( c_j = u(x^{\nu(j)}) \).
Since we already expand \( u \) as \( \sum_j\basphi_j u(\xno{j}) \), we may use the same approximation for other functions as well. For example, $$ f(u)\approx \sum_{j} f(\xno{j})\basphi_j, $$ where \( f(\xno{j}) \) is the value of \( f \) at node \( j \). Since \( f \) is a function of \( u \), \( f(\xno{j})=f(u(\xno{j})) \). Introducing \( u_j \) as a short form for \( u(\xno{j}) \), we can write $$ f(u)\approx \sum_{j} f(u_{j})\basphi_j\tp $$ This approximation is known as the group finite element method or the product approximation technique. The index \( j \) runs over all node numbers in the mesh.
The principal advantages of the group finite element method are two-fold:
Note, however, that an expression like \( \int f(u)\basphi_i\dx \) causes no problems in a computer program as the integral is calculated by numerical integration using an existing approximation of \( u \) in \( f(u) \) such that the integrand can be sampled at any spatial point.
Our aim now is the derive symbolic expressions for the difference equations arising from the finite element method in nonlinear problems and compare the expressions with those arising in the finite difference method. To this, let us simplify the model problem and set \( a=0 \), \( \dfc=1 \), \( f(u)=u^2 \), and have Neumann conditions at both ends such that we get a very simple nonlinear problem \( -u^{\prime\prime}=u^2 \), \( u'(0)=1 \), \( u'(L)=0 \). The variational form is then $$ \int_0^L u^{\prime}v^{\prime}\dx = \int_0^L u^2v\dx - v(0),\quad\forall v\in V\tp$$ The term with \( u^{\prime}v^{\prime} \) is well known so the only new feature is the term \( \int u^2v\dx \).
To make the distance from finite element equations to finite difference equations as short as possible, we shall substitute \( c_j \) in the sum \( u=\sum_jc_j\basphi_j \) by \( u_j=u(\xno{j}) \) since \( c_j \) is the value of \( u \) at node \( j \). (In the more general case with Dirichlet conditions as well, we have a sum \( \sum_jc_j\basphi_{\nu(j)} \) where \( c_j \) is replaced by \( u(\xno{\nu(j)}) \). We can then introduce some other counter \( k \) such that it is meaningful to write \( u=\sum_k u_k\basphi_k \), where \( k \) runs over appropriate node numbers.) The quantity \( u_j \) in \( \sum_ju_j\basphi_j \) is the same as \( u \) at mesh point number \( j \) in the finite difference method, which is commonly denoted \( u_j \).
Consider the term \( \int u^2v\dx \) in the variational formulation with \( v=\basphi_i \) and \( u=\sum_k\basphi_ku_k \): $$ \int_0^L (\sum_ku_k\basphi_k)^2\basphi_i\dx\tp$$ Evaluating this integral for P1 elements (see Problem 10: Integrate functions of finite element expansions) results in the expression $$ \frac{h}{12}(u_{i-1}^2 + 2u_i(u_{i-1} + u_{i+1}) + 6u_i^2 + u_{i+1}^2),$$ to be compared with the simple value \( u_i^2 \) that would arise in a finite difference discretization when \( u^2 \) is sampled at mesh point \( x_i \). More complicated \( f(u) \) functions give rise to much more lengthy expressions, if it is possible to carry out the integral symbolically at all.
Let use the group finite element method to derive the terms in the difference equation corresponding to \( f(u) \) in the differential equation. We have $$ \int_0^L f(u)\basphi_i\dx \approx \int_0^L (\sum_j \basphi_jf(u_j))\basphi_i\dx = \sum_j \left(\int_0^L \basphi_i\basphi_j\dx\right) f(u_j)\tp$$ We recognize this expression as the mass matrix \( M \), arising from \( \int\basphi_i\basphi_j\dx \), times the vector \( f=(f(u_0),f(u_1),\ldots,) \): \( Mf \). The associated terms in the difference equations are, for P1 elements, $$ \frac{h}{6}(f(u_{i-1}) + 4f(u_i) + f(u_{i+1}))\tp$$ Occasionally, we want to interpret this expression in terms of finite differences, and to this end a rewrite of this expression is convenient: $$ \frac{h}{6}(f(u_{i-1}) + 4f(u_i) + f(u_{i+1})) = h[f(u) - \frac{h^2}{6}D_xD_x f(u)]_i\tp$$ That is, the finite element treatment of \( f(u) \) (when using a group finite element method) gives the same term as in a finite difference approach, \( f(u_i) \), minus a diffusion term which is the 2nd-order discretization of \( \frac{1}{6}h^2f''(x_i) \).
We may lump the mass matrix through integration with the Trapezoidal rule so that \( M \) becomes diagonal in the finite element method. In that case the \( f(u) \) term in the differential equation gives rise to a single term \( hf(u_i) \), just as in the finite difference method.
Let us reconsider a term \( \int f(u)v\dx \) as treated in the previous section, but now we want to integrate this term numerically. Such an approach can lead to easy-to-interpret formulas if we apply a numerical integration rule that samples the integrand at the node points \( \xno{i} \) only, because at such points, \( \basphi_j(\xno{i})=0 \) if \( j\neq i \), which leads to great simplifications.
The term in question takes the form $$ \int_0^L f(\sum_k u_k\basphi_k)\basphi_i\dx\tp$$ Evaluation of the integrand at a node \( \xno{\ell} \) leads to a collapse of the sum \( \sum_k u_k\basphi_k \) to one term because $$ \sum_k u_k\basphi_k(\xno{\ell}) = u_\ell\tp$$ $$ f(\sum_k u_k\underbrace{\basphi_k(\xno{\ell})}_{\delta_{k\ell}}) \underbrace{\basphi_i(\xno{\ell})}_{\delta_{i\ell}} = f(u_\ell)\delta_{i\ell},$$ where we have used the Kronecker delta: \( \delta_{ij}=0 \) if \( i\neq j \) and \( \delta_{ij}=1 \) if \( i=j \).
Considering the Trapezoidal rule for integration, where the integration points are the nodes, we have $$ \int_0^L f(\sum_k u_k\basphi_k)(x)\basphi_i(x)\dx \approx h\sum_{\ell=0}^{N_n} f(u_\ell)\delta_{i\ell} - \mathcal{C} = hf(u_i)\tp $$ This is the same representation of the \( f \) term as in the finite difference method. The term \( \mathcal{C} \) contains the evaluations of the integrand at the ends with weight \( \half \), needed to make a true Trapezoidal rule: $$ \mathcal{C} = \frac{h}{2}f(u_0)\basphi_i(0) + \frac{h}{2}f(u_{N_n-1})\basphi_i(L)\tp$$ The answer \( hf(u_i) \) must therefore be multiplied by \( \half \) if \( i=0 \) or \( i=N_n-1 \). Note that \( \mathcal{C}=0 \) for \( i=1,\ldots,N_n-2 \).
One can alternatively use the Trapezoidal rule on the reference cell and assemble the contributions. It is a bit more labor in this context, but working on the reference cell is safer as that approach is guaranteed to handle discontinuous derivatives of finite element functions correctly (not important in this particular example), while the rule above was derived with the assumption that \( f \) is continuous at the integration points.
The conclusion is that it suffices to use the Trapezoidal rule if one wants to derive the difference equations in the finite element method and make them similar to those arising in the finite difference method. The Trapezoidal rule has sufficient accuracy for P1 elements, but for P2 elements one should turn to Simpson's rule.
Turning back to the model problem (28), it remains to calculate the contribution of the \( (\dfc u^{\prime})^{\prime} \) and boundary terms to the difference equations. The integral in the variational form corresponding to \( (\dfc u^{\prime})^{\prime} \) is $$ \int_0^L \dfc(\sum_k c_k\baspsi_k)\baspsi_i^{\prime}\baspsi_j^{\prime}\dx\tp$$ Numerical integration utilizing a value of \( \sum_k c_k\baspsi_k \) from a previous iteration must in general be used to compute the integral. Now our aim is to integrate symbolically, as much as we can, to obtain some insight into how the finite element method approximates this term. To be able to derive symbolic expressions, we must either turn to the group finite element method or numerical integration in the node points. Finite element basis functions \( \basphi_i \) are now used.
We set \( \dfc(u)\approx \sum_k\alpha(u_k)\basphi_k \), and then we write $$ \int_0^L \dfc(\sum_k c_k\basphi_k)\basphi_i^{\prime}\basphi_j^{\prime}\dx \approx \sum_k (\underbrace{\int_0^L \basphi_k\basphi_i^{\prime}\basphi_j^{\prime}\dx}_{L_{i,j,k}}) \dfc(u_k) = \sum_k L_{i,j,k}\dfc(u_k)\tp $$ Further calculations are now easiest to carry out in the reference cell. With P1 elements we can compute \( L_{i,j,k} \) for the two \( k \) values that are relevant on the reference cell. Turning to local indices, one gets $$ L_{r,s,t}^{(e)} = \frac{1}{2h}\left(\begin{array}{rr} 1 & -1\\ -1 & 1 \end{array}\right),\quad t=0, 1, $$ where \( r,s,t=0,1 \) are indices over local degrees of freedom in the reference cell (\( i=q(e,r) \), \( j=q(e,s) \), and \( k=q(e,t) \)). The sum \( \sum_k L_{i,j,k}\dfc(u_k) \) at the cell level becomes \( \sum_{t=0}^1 L_{r,s,t}^{(e)}\dfc(\tilde u_t) \), where \( \tilde u_t \) is \( u(\xno{q(e,t)}) \), i.e., the value of \( u \) at local node number \( t \) in cell number \( e \). The element matrix becomes $$ \begin{equation} \half (\dfc(\tilde u_0) + \dfc(\tilde u^{(1)})) \frac{1}{h}\left(\begin{array}{rr} 1 & -1\\ -1 & 1 \end{array}\right)\tp \tag{39} \end{equation} $$ As usual, we employ a left-to-right numbering of cells and nodes. Row number \( i \) in the global matrix gets contributions from the first row of the element matrix in cell \( i \) and the last row of the element matrix in cell \( i-1 \). In cell number \( i-1 \) the sum \( \dfc(\tilde u_0) + \dfc(\tilde u^{(1)}) \) corresponds to \( \dfc(u_{i-1}) + \dfc(u_i) \). The same sum becomes \( \dfc(u_{i}) + \dfc(u_{i+1}) \) in cell number \( i \). We can with this insight assemble the contributions to row number \( i \) in the global matrix: $$ \frac{1}{2h}(-(\dfc(u_{i-1}) + \dfc(u_i)),\quad \dfc(u_{i-1}) + 2\dfc(u_i) + \dfc(u_{i+1}),\quad \dfc(u_{i}) + \dfc(u_{i+1}))\tp $$ Multiplying by the vector of unknowns \( u_i \) results in a formula that can be arranged to $$ \begin{equation} -\frac{1}{h}(\half(\dfc(u_i) + \dfc(u_{i+1}))(u_{i+1}-u_i) - \half(\dfc(u_{i-1}) + \dfc(u_{i}))(u_{i}-u_{i-1})), \tag{40} \end{equation} $$ which is nothing but the standard finite difference discretization of \( -(\dfc(u)u^{\prime})^{\prime} \) with an arithmetic mean of \( \dfc(u) \) (and the usual factor \( h \) because of the integration in the finite element method).
Instead of using the group finite element method and exact integration we can turn to the Trapezoidal rule for computing \( \int_0^L \dfc(\sum_k u_k\basphi_k)\basphi_i^{\prime}\basphi_j^{\prime}\dx \), again at the cell level since that is most convenient when we deal with discontinuous functions \( \basphi_i' \): $$ \begin{align} \int_{-1}^1 \alpha(\sum_t\tilde u_t\refphi_t)\refphi_r'\refphi_s'\frac{h}{2}dX &= \int_{-1}^1 \dfc(\sum_{t=0}^1 \tilde u_t\refphi_t)\frac{2}{h}\frac{d\refphi_r}{dX} \frac{2}{h}\frac{d\refphi_s}{dX}\frac{h}{2}dX\nonumber\\ & = \frac{1}{2h}(-1)^r(-1)^s \int_{-1}^1 \dfc(\sum_{t=0}^1 u_t\refphi_t(X))dX \nonumber\\ & \approx \frac{1}{2h}(-1)^r(-1)^s\dfc ( \sum_{t=0}^1\refphi_t(-1)\tilde u_t) + \dfc(\sum_{t=0}^1\refphi_t(1)\tilde u_t) \nonumber\\ & = \frac{1}{2h}(-1)^r(-1)^s(\dfc(\tilde u_0) + \dfc(\tilde u^{(1)}))\tp \tag{41} \end{align} $$ The element matrix in (41) is identical to the one in (39), showing that the group finite element method and Trapezoidal integration are equivalent with a standard finite discretization of a nonlinear Laplace term \( (\dfc(u)u^{\prime})^{\prime} \) using an arithmetic mean for \( \dfc \): \( [D_x\overline{x}D_xu]_i \).
We might comment on integration in the physical coordinate system too. The common Trapezoidal rule in the section Numerical integration of nonlinear terms cannot be used to integrate derivatives like \( \basphi_i^{\prime} \), because the formula is derived under the assumption of a continuous integrand. One must instead use the more basic version of the Trapezoidal rule where all the trapezoids are summed up. This is straightforward, but I think it is even more straightforward to apply the Trapezoidal rule on the reference cell and assemble the contributions.
The term \( \int auv\dx \) in the variational form is linear and gives these terms in the algebraic equations: $$ \frac{ah}{6}(u_{i-1} + 4u_i + u_{i+1}) = ah[u - \frac{h^2}{6}D_xD_x u]_i\tp$$ The final term in the variational form is the Neumann condition at the boundary: \( Cv(0)=C\basphi_i(0) \). With a left-to-right numbering only \( i=0 \) will give a contribution \( Cv(0)=C\delta_{i0} \) (since \( \basphi_i(0)\neq 0 \) only for \( i=0 \)).
For the equation $$ -(\dfc(u)u^{\prime})^{\prime} +au = f(u),$$ P1 finite elements results in difference equations where
As we now have explicit expressions for the nonlinear difference equations also in the finite element method, a Picard or Newton method can be defined as shown for the finite difference method. However, our efforts in deriving symbolic forms of the difference equations in the finite element method was motivated by a desire to see how nonlinear terms in differential equations make the finite element and difference method different. For practical calculations in computer programs we apply numerical integration, normally the more accurate Gauss-Legendre quadrature rules, to the integrals directly. This allows us to easily evaluate the nonlinear algebraic equations for a given numerical approximation of \( u \) (here denoted \( u^{-} \)). To solve the nonlinear algebraic equations we need to apply the Picard iteration method or Newton's method to the variational form directly, as shown next.
We address again the problem (28) with the corresponding variational form (38). Our aim is to define a Picard iteration based on this variational form without any attempt to compute integrals symbolically as in the previous three sections. The idea in Picard iteration is to use a previously computed \( u \) value in the nonlinear functions \( \dfc(u) \) and \( f(u) \). Let \( u^{-} \) be the available approximation to \( u \) from the previous iteration. The linearized variational form for Picard iteration is then $$ \begin{equation} \int_0^L (\dfc(u^{-})u^{\prime}v^{\prime} + auv)\dx = \int_0^L f(u^{-})v\dx - Cv(0),\quad \forall v\in V\tp \tag{42} \end{equation} $$ This is a linear problem \( a(u,v)=L(v) \) with bilinear and linear forms $$ a(u,v) = \int_0^L (\dfc(u^{-})u^{\prime}v^{\prime} + auv)\dx,\quad L(v) = \int_0^L f(u^{-})v\dx - Cv(0)\tp$$ Make sure to distinguish the coefficient \( a \) in \( auv \) from the differential equation from the \( a \) in the abstract bilinear form notation \( a(\cdot,\cdot) \).
The linear system associated with (42) is computed the standard way. Technically, we are back to solving \( -(\dfc(x)u^{\prime})^{\prime} + au=f(x) \). The unknown \( u \) is sought on the form \( u = B(x) + \sum_{j\in\If}c_j\baspsi_j \), with \( B(x)=D \) and \( \baspsi_i = \basphi_{\nu(i)} \), \( \nu(i)=i+1 \), and \( \If = \{0,1,\ldots,N=N_n-2\} \).
Application of Newton's method to the nonlinear variational form (38) arising from the problem (28) requires identification of the nonlinear algebraic equations \( F_i=0 \). Although we originally denoted the unknowns in nonlinear algebraic equations by \( u_0,\ldots,u_N \), it is in the present context most natural to have the unknowns as \( c_0,\ldots,c_N \) and write $$ F_i(c_0,\ldots,c_N)=0, \quad i\in\If,$$ and define the Jacobian as \( J_{i,j}=\partial F_i/\partial c_j \) for \( i,j\in\If \).
The specific form of the equations \( F_i=0 \) follows from the variational form $$ \int_0^L (\dfc(u)u^{\prime}v^{\prime} + auv)\dx = \int_0^L f(u)v\dx - Cv(0),\quad \forall v\in V, $$ by choosing \( v=\baspsi_i \), \( i\in\If \), and setting \( u=\sum_{j\in\If}c_j\baspsi_j \), maybe with a boundary function to incorporate Dirichlet conditions.
With \( v=\baspsi_i \) we get $$ \begin{equation} F_i = \int_0^L (\dfc(u)u^{\prime}\baspsi_i^{\prime} + au\baspsi_i - f(u)\baspsi_i)\dx + C\baspsi_i(0)=0,\quad i\in\If\tp \tag{43} \end{equation} $$ In the differentiations leading to the Jacobian we will frequently use the results $$ \frac{\partial u}{\partial c_j} = \frac{\partial}{\partial c_j} \sum_kc_k\baspsi_k = \baspsi_j,\quad \frac{\partial u^{\prime}}{\partial c_j} = \frac{\partial}{\partial c_j} \sum_kc_k\baspsi_k^{\prime} = \baspsi_j^{\prime}\tp$$ The derivation of the Jacobian of (43) goes as $$ \begin{align} J_{i,j} = \frac{\partial F_i}{\partial c_j} & = \int_0^L \frac{\partial}{\partial c_j} (\dfc(u)u^{\prime}\baspsi_i^{\prime} + au\baspsi_i - f(u)\baspsi_i)\dx\nonumber\\ &= \int_0^L ((\dfc^{\prime}(u)\frac{\partial u}{\partial c_j}u^{\prime} + \dfc(u)\frac{\partial u^{\prime}}{\partial c_j})\baspsi_i^{\prime} + a\frac{\partial u}{\partial c_j}\baspsi_i - f^{\prime}(u)\frac{\partial u}{\partial c_j}\baspsi_i)\dx\nonumber\\ &= \int_0^L ((\dfc^{\prime}(u)\baspsi_ju^{\prime} + \dfc(u)\baspsi_j^{\prime})\baspsi_i^{\prime} + a\baspsi_j\baspsi_i - f^{\prime}(u)\baspsi_j\baspsi_i)\dx\nonumber\\ &= \int_0^L (\dfc^{\prime}(u)u^{\prime}\baspsi_i^{\prime}\baspsi_j + \dfc(u)\baspsi_i^{\prime}\baspsi_j^{\prime} + (a - f(u))\baspsi_i\baspsi_j)\dx \tag{44} \end{align} $$
When calculating the right-hand side vector \( F_i \) and the coefficient matrix \( J_{i,j} \) in the linear system to be solved in each Newton iteration, one must use a previously computed \( u \), denoted by \( u^{-} \), for the symbol \( u \) in (43) and (44). With this notation we have $$ \begin{align} F_i &= \int_0^L\left( \dfc(u^{-})u^{-\prime}\baspsi_i^{\prime} + (a-f(u^{-}))\baspsi_i\right)\dx - C\baspsi_i(0),\quad i\in\If, \tag{45}\\ J_{i,j} &= \int_0^L (\dfc^{\prime}(u^{-})u^{-\prime}\baspsi_i^{\prime}\baspsi_j + \dfc(u^{-})\baspsi_i^{\prime}\baspsi_j^{\prime} + (a - f(u^{-}))\baspsi_i\baspsi_j)\dx, \quad i,j\in\If\tp \tag{46} \end{align} $$ These expressions can be used for any basis \( \sequencei{\baspsi} \). Choosing finite element functions for \( \baspsi_i \), one will normally want to compute the integral contribution cell by cell, working in a reference cell. To this end, we restrict the integration to one cell and transform the cell to \( [-1,1] \). The most recently computed approximation \( u^{-} \) to \( u \) becomes \( \tilde u^{-}=\sum_t\tilde u^{-1}_t\refphi_t(X) \) over the reference element, where \( \tilde u^{-1}_t \) is the value of \( u^{-} \) at global node (or degree of freedom) \( q(e,t) \) corresponding to the local node \( t \) (or degree of freedom). The formulas (45) and (46) then change to $$ \begin{align} \tilde F_r^{(e)} &= \int_{-1}^1\left( \dfc(\tilde u^{-})\tilde u^{-\prime}\refphi_r^{\prime} + (a-f(\tilde u^{-}))\refphi_r\right)\det J\dX - C\refphi_r(0), \tag{47}\\ \tilde J_{r,s}^{(e)} &= \int_{-1}^1 (\dfc^{\prime}(\tilde u^{-})\tilde u^{-\prime}\refphi_r^{\prime}\refphi_s + \dfc(\tilde u^{-})\refphi_r^{\prime}\refphi_s^{\prime} + (a - f(\tilde u^{-}))\refphi_r\refphi_s)\det J\dX, \tag{48} \end{align} $$ with \( r,s\in\Ifd \) runs over the local degrees of freedom.
Many finite element programs require the user to provide \( F_i \) and \( J_{i,j} \). Some programs, like FEniCS, are capable of automatically deriving \( J_{i,j} \) if \( F_i \) is specified.
Incorporation of the Dirichlet values by assembling contributions from all degrees of freedom and then modifying the linear system can be obviously be applied to Picard iteration as that method involves a standard linear system. In the Newton system, however, the unknown is a correction \( \delta u \) to the solution. Dirichlet conditions are implemented by inserting them in the initial guess \( u^{-} \) for the Newton iteration and implementing \( \delta u_i =0 \) for all known degrees of freedom. The manipulation of the linear system follows exactly the algorithm in the linear problems, the only difference being that the known values are zero.