Variational formulations in 2D and 3D

The major difference between deriving variational formulations in 2D and 3D compared to 1D is the rule for integrating by parts. The cells have shapes different from an interval, so basis functions look a bit different, and there is a technical difference in actually calculating the integrals over cells. Otherwise, going to 2D and 3D is not a big step from 1D. All the fundamental ideas still apply.

Integration by parts

A typical second-order term in a PDE may be written in dimension-independent notation as

\[\nabla^2 u \quad\hbox{or}\quad \nabla\cdot\left( {\alpha}(\boldsymbol{x})\nabla u\right) {\thinspace .}\]

The explicit forms in a 2D problem become

\[\nabla^2 u = \nabla\cdot\nabla u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2},\]

and

\[\nabla\cdot\left( {\alpha}(\boldsymbol{x})\nabla u\right) = \frac{\partial}{\partial x}\left( {\alpha}(x,y)\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left( {\alpha}(x,y)\frac{\partial u}{\partial y}\right) {\thinspace .}\]

We shall continue with the latter operator as the former arises from just setting \({\alpha} =1\).

The integration by parts formula for \(\int\nabla\cdot({\alpha}\nabla u)\, v{\, \mathrm{d}x}\)

The general rule for integrating by parts is often referred to as Green’s first identity:

\[ \tag{82} -\int_{\Omega} \nabla\cdot ({\alpha}(\boldsymbol{x})\nabla u) v{\, \mathrm{d}x} = \int_{\Omega} {\alpha}(\boldsymbol{x})\nabla u\cdot\nabla v {\, \mathrm{d}x} - \int_{\partial\Omega} {\alpha}\frac{\partial u}{\partial n} v {\, \mathrm{d}s},\]

where \(\partial\Omega\) is the boundary of \(\Omega\) and \(\partial u/\partial n = \boldsymbol{n}\cdot\nabla u\) is the derivative of \(u\) in the outward normal direction, \(\boldsymbol{n}\) being an outward unit normal to \(\partial\Omega\). The integrals \(\int_\Omega (){\, \mathrm{d}x}\) are area integrals in 2D and volume integrals in 3D, while \(\int_{\partial\Omega} (){\, \mathrm{d}s}\) is a line integral in 2D and a surface integral in 3D.
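Identity (82) is straightforward to verify symbolically on a simple domain. Here is a small sympy check on the unit square; the concrete choices of \(u\), \(v\), and \({\alpha}\) are arbitrary and only for illustration:

import sympy as sp

x, y = sp.symbols('x y')

# Arbitrary smooth choices on the unit square (illustration only)
u = sp.sin(sp.pi*x)*y**2
v = x*(1 - x)*y
alpha = 1 + x*y

grad = lambda f: (sp.diff(f, x), sp.diff(f, y))
agu = tuple(alpha*c for c in grad(u))              # alpha*grad(u)
div_agu = sp.diff(agu[0], x) + sp.diff(agu[1], y)  # div(alpha*grad(u))

lhs = sp.integrate(-div_agu*v, (x, 0, 1), (y, 0, 1))

volume = sp.integrate(sum(a*b for a, b in zip(agu, grad(v))),
                      (x, 0, 1), (y, 0, 1))

# Boundary term: alpha du/dn v over the four sides (n = outward normal)
boundary = (sp.integrate((-agu[1]*v).subs(y, 0), (x, 0, 1)) +   # n=(0,-1)
            sp.integrate(( agu[1]*v).subs(y, 1), (x, 0, 1)) +   # n=(0, 1)
            sp.integrate((-agu[0]*v).subs(x, 0), (y, 0, 1)) +   # n=(-1,0)
            sp.integrate(( agu[0]*v).subs(x, 1), (y, 0, 1)))    # n=( 1,0)

print(sp.simplify(lhs - (volume - boundary)))  # prints 0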

It will be convenient to divide the boundary into two parts:

  • \(\partial\Omega_N\), where we have Neumann conditions \(-{\alpha}\frac{\partial u}{\partial n} = g\), and
  • \(\partial\Omega_D\), where we have Dirichlet conditions \(u = u_0\).

The test functions \(v\) are (as usual) required to vanish on \(\partial\Omega_D\).

Example on a multi-dimensional variational problem

Here is a quite general, stationary, linear PDE arising in many problems:

\[\tag{83} \boldsymbol{v}\cdot\nabla u + \beta u = \nabla\cdot\left( {\alpha}\nabla u\right) + f, \quad\boldsymbol{x}\in\Omega,\]
\[\tag{84} u = u_0,\quad\boldsymbol{x}\in\partial\Omega_D,\]
\[\tag{85} -{\alpha}\frac{\partial u}{\partial n} = g,\quad\boldsymbol{x}\in\partial\Omega_N {\thinspace .}\]

The vector field \(\boldsymbol{v}\) and the scalar functions \(\alpha\), \(\beta\), \(f\), \(u_0\), and \(g\) may vary with the spatial coordinate \(\boldsymbol{x}\) and must be known.

Such a second-order PDE needs exactly one boundary condition at each point of the boundary, so \(\partial\Omega_N\cup\partial\Omega_D\) must be the complete boundary \(\partial\Omega\).

Assume that the boundary function \(u_0(\boldsymbol{x})\) is defined for all \(\boldsymbol{x}\in\Omega\). The unknown function can then be expanded as

\[u = B + \sum_{j\in{\mathcal{I}_s}} c_j{\psi}_j,\quad B = u_0 {\thinspace .}\]

As long as every \({\psi}_j\) vanishes on \(\partial\Omega_D\), we realize that \(u=u_0\) on \(\partial\Omega_D\).

The variational formula is obtained from Galerkin’s method, which technically means multiplying the PDE by a test function \(v\) and integrating over \(\Omega\):

\[\int_{\Omega} (\boldsymbol{v}\cdot\nabla u + \beta u)v{\, \mathrm{d}x} = \int_{\Omega} \nabla\cdot\left( {\alpha}\nabla u\right)v{\, \mathrm{d}x} + \int_{\Omega}fv {\, \mathrm{d}x} {\thinspace .}\]

The second-order term is integrated by parts, according to the formula (82):

\[\int_{\Omega} \nabla\cdot\left( {\alpha}\nabla u\right)v {\, \mathrm{d}x} = -\int_{\Omega} {\alpha}\nabla u\cdot\nabla v{\, \mathrm{d}x} + \int_{\partial\Omega} {\alpha}\frac{\partial u}{\partial n} v{\, \mathrm{d}s} {\thinspace .}\]

Galerkin’s method therefore leads to

\[\int_{\Omega} (\boldsymbol{v}\cdot\nabla u + \beta u)v{\, \mathrm{d}x} = -\int_{\Omega} {\alpha}\nabla u\cdot\nabla v{\, \mathrm{d}x} + \int_{\partial\Omega} {\alpha}\frac{\partial u}{\partial n} v{\, \mathrm{d}s} + \int_{\Omega} fv {\, \mathrm{d}x} {\thinspace .}\]

The boundary term can be developed further by noticing that \(v\neq 0\) only on \(\partial\Omega_N\),

\[\int_{\partial\Omega} {\alpha}\frac{\partial u}{\partial n} v{\, \mathrm{d}s} = \int_{\partial\Omega_N} {\alpha}\frac{\partial u}{\partial n} v{\, \mathrm{d}s},\]

and that on \(\partial\Omega_N\), we have the condition \({\alpha}\frac{\partial u}{\partial n}=-g\), so the term becomes

\[-\int_{\partial\Omega_N} gv{\, \mathrm{d}s}{\thinspace .}\]

The final variational form is then

\[\int_{\Omega} (\boldsymbol{v}\cdot\nabla u + \beta u)v{\, \mathrm{d}x} = -\int_{\Omega} {\alpha}\nabla u\cdot\nabla v {\, \mathrm{d}x} - \int_{\partial\Omega_N} g v{\, \mathrm{d}s} + \int_{\Omega} fv {\, \mathrm{d}x} {\thinspace .}\]

Instead of using the integral signs, we may use the inner product notation:

\[(\boldsymbol{v}\cdot\nabla u, v) + (\beta u,v) = - ({\alpha}\nabla u,\nabla v) - (g,v)_{N} + (f,v) {\thinspace .}\]

The subscript \(\,{}_N\) in \((g,v)_{N}\) is a notation for a line or surface integral over \(\partial\Omega_N\), while \((\cdot,\cdot)\) is the area/volume integral over \(\Omega\).

We can derive explicit expressions for the linear system for \(\left\{ {c}_j \right\}_{j\in{\mathcal{I}_s}}\) that arises from the variational formulation. Inserting the \(u\) expansion results in

\[\begin{split}\sum_{j\in{\mathcal{I}_s}} ((\boldsymbol{v}\cdot\nabla {\psi}_j, {\psi}_i) &+ (\beta {\psi}_j ,{\psi}_i) + ({\alpha}\nabla {\psi}_j,\nabla {\psi}_i))c_j = \\ & -(g,{\psi}_i)_{N} + (f,{\psi}_i) - (\boldsymbol{v}\cdot\nabla u_0, {\psi}_i) - (\beta u_0 ,{\psi}_i) - ({\alpha}\nabla u_0,\nabla {\psi}_i) {\thinspace .}\end{split}\]

This is a linear system with matrix entries

\[A_{i,j} = (\boldsymbol{v}\cdot\nabla {\psi}_j, {\psi}_i) + (\beta {\psi}_j ,{\psi}_i) + ({\alpha}\nabla {\psi}_j,\nabla {\psi}_i)\]

and right-hand side entries

\[b_i = -(g,{\psi}_i)_{N} + (f,{\psi}_i) - (\boldsymbol{v}\cdot\nabla u_0, {\psi}_i) - (\beta u_0 ,{\psi}_i) - ({\alpha}\nabla u_0,\nabla {\psi}_i),\]

for \(i,j\in{\mathcal{I}_s}\).

In the finite element method, we usually express \(u_0\) in terms of basis functions and restrict \(i\) and \(j\) to run over the degrees of freedom that are not prescribed as Dirichlet conditions. However, we can also keep all the \(\left\{ {c}_j \right\}_{j\in{\mathcal{I}_s}}\) as unknowns, drop the \(u_0\) in the expansion for \(u\), and incorporate all the known \(c_j\) values in the linear system. This has been explained in detail in the 1D case, and the technique is the same for 2D and 3D problems.
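For readers who want to see the formulation in code: the variational problem (83)-(85) maps almost verbatim to software based on variational forms. The sketch below uses the legacy FEniCS/DOLFIN syntax; the mesh, the coefficient values, and the choice of Dirichlet boundary (\(x=0\)) are hypothetical.

from fenics import *

mesh = UnitSquareMesh(8, 8)                  # hypothetical domain and mesh
V = FunctionSpace(mesh, 'P', 1)
u = TrialFunction(V)
v = TestFunction(V)

w = Constant((1.0, 0.0))                     # the vector field v in the PDE
beta, alpha = Constant(0.1), Constant(1.0)   # hypothetical coefficients
f, g = Constant(1.0), Constant(0.0)
u0 = Constant(0.0)

# Dirichlet condition on x=0 (hypothetical choice of boundary part)
bc = DirichletBC(V, u0, 'on_boundary && near(x[0], 0)')

# a(u,v) and L(v) mirror A_{i,j} and b_i above; integrating g over all
# of ds is harmless here since the Dirichlet rows are overwritten by bc
a = (dot(w, grad(u))*v + beta*u*v + alpha*dot(grad(u), grad(v)))*dx
L = f*v*dx - g*v*ds

u_sol = Function(V)
solve(a == L, u_sol, bc)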

Transformation to a reference cell in 2D and 3D

The real power of the finite element method first becomes evident when we want to solve partial differential equations posed on two- and three-dimensional domains of non-trivial geometric shape. As in 1D, the domain \(\Omega\) is divided into \(N_e\) non-overlapping cells. The elements have simple shapes: triangles and quadrilaterals are popular in 2D, while tetrahedra and box-shaped elements dominate in 3D. The finite element basis functions \({\varphi}_i\) are, as in 1D, polynomials over each cell. The integrals in the variational formulation are, as in 1D, split into contributions from each cell, and these contributions are calculated by mapping a physical cell, expressed in physical coordinates \(\boldsymbol{x}\), to a reference cell in a local coordinate system \(\boldsymbol{X}\). This mapping will now be explained in detail.

We consider an integral of the type

\[\tag{86} \int_{{\Omega}^{(e)}} {\alpha}(\boldsymbol{x})\nabla{\varphi}_i\cdot\nabla{\varphi}_j{\, \mathrm{d}x},\]

where the \({\varphi}_i\) functions are finite element basis functions in 2D or 3D, defined in the physical domain. Suppose we want to calculate this integral over a reference cell, denoted by \(\tilde\Omega^r\), in a coordinate system with coordinates \(\boldsymbol{X} = (X_0, X_1)\) (2D) or \(\boldsymbol{X} = (X_0, X_1, X_2)\) (3D). The mapping between a point \(\boldsymbol{X}\) in the reference coordinate system and the corresponding point \(\boldsymbol{x}\) in the physical coordinate system is given by a vector relation \(\boldsymbol{x}(\boldsymbol{X})\). The corresponding Jacobian, \(J\), of this mapping has entries

\[J_{i,j}=\frac{\partial x_j}{\partial X_i}{\thinspace .}\]

The change of variables requires \({\, \mathrm{d}x}\) to be replaced by \(\det J{\, \mathrm{d}X}\). The derivatives in the \(\nabla\) operator in the variational form are with respect to \(\boldsymbol{x}\), which we may denote by \(\nabla_{\boldsymbol{x}}\). The \({\varphi}_i(\boldsymbol{x})\) functions in the integral are replaced by local basis functions \({\tilde{\varphi}}_r(\boldsymbol{X})\) so the integral features \(\nabla_{\boldsymbol{x}}{\tilde{\varphi}}_r(\boldsymbol{X})\). We readily have \(\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r(\boldsymbol{X})\) from formulas for the basis functions in the reference cell, but the desired quantity \(\nabla_{\boldsymbol{x}}{\tilde{\varphi}}_r(\boldsymbol{X})\) requires some effort to compute. All the details are provided below.

Let \(i=q(e,r)\) and consider two space dimensions. By the chain rule,

\[\frac{\partial {\tilde{\varphi}}_r}{\partial X} = \frac{\partial {\varphi}_i}{\partial X} = \frac{\partial {\varphi}_i}{\partial x}\frac{\partial x}{\partial X} + \frac{\partial {\varphi}_i}{\partial y}\frac{\partial y}{\partial X},\]

and

\[\frac{\partial {\tilde{\varphi}}_r}{\partial Y} = \frac{\partial {\varphi}_i}{\partial Y} = \frac{\partial {\varphi}_i}{\partial x}\frac{\partial x}{\partial Y} + \frac{\partial {\varphi}_i}{\partial y}\frac{\partial y}{\partial Y} {\thinspace .}\]

We can write these two equations as a vector equation

\[\begin{split}\left[\begin{array}{c} \frac{\partial {\tilde{\varphi}}_r}{\partial X}\\ \frac{\partial {\tilde{\varphi}}_r}{\partial Y} \end{array}\right] = \left[\begin{array}{cc} \frac{\partial x}{\partial X} & \frac{\partial y}{\partial X}\\ \frac{\partial x}{\partial Y} & \frac{\partial y}{\partial Y} \end{array}\right] \left[\begin{array}{c} \frac{\partial {\varphi}_i}{\partial x}\\ \frac{\partial {\varphi}_i}{\partial y} \end{array}\right]\end{split}\]

Identifying

\[\begin{split}\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r = \left[\begin{array}{c} \frac{\partial {\tilde{\varphi}}_r}{\partial X}\\ \frac{\partial {\tilde{\varphi}}_r}{\partial Y} \end{array}\right], \quad J = \left[\begin{array}{cc} \frac{\partial x}{\partial X} & \frac{\partial y}{\partial X}\\ \frac{\partial x}{\partial Y} & \frac{\partial y}{\partial Y} \end{array}\right], \quad \nabla_{\boldsymbol{x}}{\varphi}_i = \left[\begin{array}{c} \frac{\partial {\varphi}_i}{\partial x}\\ \frac{\partial {\varphi}_i}{\partial y} \end{array}\right],\end{split}\]

we have the relation

\[\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r = J\cdot\nabla_{\boldsymbol{x}}{\varphi}_i,\]

which we can solve with respect to \(\nabla_{\boldsymbol{x}}{\varphi}_i\):

\[\tag{87} \nabla_{\boldsymbol{x}}{\varphi}_i = J^{-1}\cdot\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r{\thinspace .}\]

On the reference cell, \({\varphi}_i(\boldsymbol{x}) = {\tilde{\varphi}}_r(\boldsymbol{X})\), so

\[\tag{88} \nabla_{\boldsymbol{x}}{\tilde{\varphi}}_r(\boldsymbol{X}) = J^{-1}(\boldsymbol{X})\cdot\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r(\boldsymbol{X}){\thinspace .}\]

This means that we have the following transformation of the integral in the physical domain to its counterpart over the reference cell:

\[\tag{89} \int_{\Omega^{(e)}} {\alpha}(\boldsymbol{x})\nabla_{\boldsymbol{x}}{\varphi}_i\cdot\nabla_{\boldsymbol{x}}{\varphi}_j{\, \mathrm{d}x} = \int_{\tilde\Omega^r} {\alpha}(\boldsymbol{x}(\boldsymbol{X}))(J^{-1}\cdot\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r)\cdot (J^{-1}\cdot\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_s)\det J{\, \mathrm{d}X}{\thinspace .}\]
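For P1 elements the mapping \(\boldsymbol{x}(\boldsymbol{X})\) is affine, so \(J\) is constant over the cell and (89) can be evaluated exactly. Here is a minimal numpy sketch for a hypothetical triangle, following the convention \(J_{i,j}=\partial x_j/\partial X_i\) above:

import numpy as np

# Vertices of a hypothetical physical triangle (rows: x_0, x_1, x_2)
x = np.array([[0.0, 0.0],
              [2.0, 0.5],
              [0.5, 1.5]])

# Reference P1 basis: phit_0 = 1-X-Y, phit_1 = X, phit_2 = Y;
# row r holds grad_X(phit_r)
dphit = np.array([[-1.0, -1.0],
                  [ 1.0,  0.0],
                  [ 0.0,  1.0]])

# Affine mapping x(X) = x_0 + X*(x_1-x_0) + Y*(x_2-x_0) gives a
# constant Jacobian with J[i,j] = dx_j/dX_i
J = np.array([x[1] - x[0],
              x[2] - x[0]])
detJ = np.linalg.det(J)

# (88): grad_x(phit_r) = J^{-1} grad_X(phit_r), applied row-wise
dphi = dphit @ np.linalg.inv(J).T

# (89) with alpha = 1: a constant integrand times the reference cell
# measure, int_ref dX = 1/2
A_e = 0.5*detJ*dphi @ dphi.T
print(A_e)          # element matrix; each row sums to zero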

Numerical integration

Integrals are normally computed by numerical integration rules. For multi-dimensional cells, various families of rules exist. All of them are similar to what is shown in 1D: \(\int f {\, \mathrm{d}x}\approx \sum_jw_jf(\boldsymbol{x}_j)\), where \(w_j\) are weights and \(\boldsymbol{x}_j\) are corresponding points.

The file numint.py contains the functions quadrature_for_triangles(n) and quadrature_for_tetrahedra(n), which return lists of points and weights corresponding to integration rules with n points over the reference triangle with vertices \((0,0)\), \((1,0)\), \((0,1)\), and the reference tetrahedron with vertices \((0,0,0)\), \((1,0,0)\), \((0,1,0)\), \((0,0,1)\), respectively. For example, the first two rules for integration over a triangle have 1 and 3 points:

>>> import numint
>>> x, w = numint.quadrature_for_triangles(num_points=1)
>>> x
[(0.3333333333333333, 0.3333333333333333)]
>>> w
[0.5]
>>> x, w = numint.quadrature_for_triangles(num_points=3)
>>> x
[(0.16666666666666666, 0.16666666666666666),
 (0.66666666666666666, 0.16666666666666666),
 (0.16666666666666666, 0.66666666666666666)]
>>> w
[0.16666666666666666, 0.16666666666666666, 0.16666666666666666]

Rules with 1, 3, 4, and 7 points over the triangle will exactly integrate polynomials of degree 1, 2, 3, and 4, respectively. In 3D, rules with 1, 4, 5, and 11 points over the tetrahedron will exactly integrate polynomials of degree 1, 2, 3, and 4, respectively.
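To integrate over a physical cell, such a rule is combined with the mapping from the previous section: \(\int_{\Omega^{(e)}} f{\, \mathrm{d}x}\approx \det J\sum_j w_jf(\boldsymbol{x}(\boldsymbol{X}_j))\), where the weights already include the reference-cell measure. A small sketch with the 3-point rule above and a hypothetical triangle:

import numpy as np

# The 3-point rule listed above (weights include the cell area 1/2)
points = np.array([[1/6., 1/6.], [2/3., 1/6.], [1/6., 2/3.]])
weights = np.array([1/6., 1/6., 1/6.])

# Hypothetical physical triangle and integrand (degree 1: exact)
xv = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
f = lambda x, y: x + y

J = np.array([xv[1] - xv[0], xv[2] - xv[0]])
detJ = abs(np.linalg.det(J))

# Map each reference point X to physical coordinates x(X) and sum
I = detJ*sum(w*f(*(xv[0] + X @ J)) for X, w in zip(points, weights))
print(I)   # exact: the integral of x+y over this triangle equals 1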

Convenient formulas for P1 elements in 2D

We shall now provide some formulas for piecewise linear \({\varphi}_i\) functions and their integrals in the physical coordinate system. These formulas make it convenient to compute with P1 elements without the need to work in the reference coordinate system and deal with mappings and Jacobians. A lot of computational and algorithmic details are hidden by this approach.

Let \(\Omega^{(e)}\) be cell number \(e\), and let the three vertices have global vertex numbers \(I\), \(J\), and \(K\). The corresponding coordinates are \((x_{I},y_{I})\), \((x_{J},y_{J})\), and \((x_{K},y_{K})\). The basis function \({\varphi}_I\) over \(\Omega^{(e)}\) has the explicit formula

\[\tag{90} {\varphi}_I (x,y) = \frac{1}{2\Delta} \left( \alpha_I + \beta_Ix + \gamma_Iy\right),\]

where

\[\tag{91} \alpha_I = x_{J}y_{K} - x_{K}y_{J},\]
\[\tag{92} \beta_I = y_{J} - y_{K},\]
\[\tag{93} \gamma_I = x_{K} - x_{J},\]

and

\[\begin{split}\tag{94} 2\Delta = \det\left(\begin{array}{rrr} 1 & x_{I} & y_{I} \\ 1 & x_{J} & y_{J} \\ 1 & x_{K} & y_{K} \end{array}\right) {\thinspace .}\end{split}\]

The quantity \(\Delta\) is the area of the cell.

The following formula is often convenient when computing element matrices and vectors:

\[\tag{95} \int_{\Omega^{(e)}} {\varphi}_I^{p}{\varphi}_J^{q}{\varphi}_K^{r} {\, \mathrm{d}x} = \frac{p!\,q!\,r!}{(p+q+r+2)!}2\Delta {\thinspace .}\]

(Note that the \(q\) in this formula is not to be mixed with the \(q(e,r)\) mapping of degrees of freedom.)

As an example, the element matrix entry \(\int_{\Omega^{(e)}} {\varphi}_I{\varphi}_J{\, \mathrm{d}x}\) can be computed by setting \(p=q=1\) and \(r=0\), when \(I\neq J\), yielding \(\Delta/12\), and \(p=2\) and \(q=r=0\), when \(I=J\), resulting in \(\Delta/6\). We collect these numbers in a local element matrix:

\[\begin{split}\frac{\Delta}{12} \left[\begin{array}{ccc} 2 & 1 & 1\\ 1 & 2 & 1\\ 1 & 1 & 2 \end{array}\right]\end{split}\]

The common element matrix entry \(\int_{\Omega^{(e)}} \nabla{\varphi}_I\cdot\nabla{\varphi}_J{\, \mathrm{d}x}\), arising from a Laplace term \(\nabla^2u\), can also easily be computed by the formulas above. We have

\[\nabla{\varphi}_I\cdot\nabla{\varphi}_J = \frac{1}{4\Delta^2}(\beta_I\beta_J + \gamma_I\gamma_J) = \hbox{const},\]

so that the element matrix entry becomes \(\frac{1}{4\Delta}(\beta_I\beta_J + \gamma_I\gamma_J)\).

From an implementational point of view, one will work with local vertex numbers \(r=0,1,2\), parameterize the coefficients in the basis functions by \(r\), and look up vertex coordinates through \(q(e,r)\).
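A minimal numpy sketch of this strategy for a single hypothetical triangle, with vertex coordinates assumed already looked up through \(q(e,r)\):

import numpy as np

# Hypothetical vertex coordinates for local vertices r = 0, 1, 2
xv = np.array([0.0, 2.0, 0.5])
yv = np.array([0.0, 0.5, 1.5])

# (92)-(93) with (I,J,K) taken cyclically as (r, r+1, r+2)
beta  = np.array([yv[(r + 1) % 3] - yv[(r + 2) % 3] for r in range(3)])
gamma = np.array([xv[(r + 2) % 3] - xv[(r + 1) % 3] for r in range(3)])

# (94): Delta is the (signed) cell area
Delta = 0.5*np.linalg.det(np.column_stack([np.ones(3), xv, yv]))

# Element mass matrix from (95) and element stiffness matrix from the
# constant-gradient formula (beta_I beta_J + gamma_I gamma_J)/(4*Delta)
M_e = Delta/12*(np.eye(3) + np.ones((3, 3)))
A_e = (np.outer(beta, beta) + np.outer(gamma, gamma))/(4*Delta)
print(M_e)
print(A_e)

For the same vertex coordinates, A_e coincides with the element matrix computed via the reference-cell mapping in the previous section, as it should.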

Similar formulas exist for integration of P1 elements in 3D.

A glimpse of the mathematical theory of the finite element method

Almost all books on the finite element method that introduce the abstract variational problem \(a(u,v)=L(v)\) spend considerable pages on deriving error estimates and other properties of the approximate solution. The machinery of function spaces and bilinear and linear forms has the great advantage that a very large class of PDE problems can be analyzed in a unified way. This feature is often taken as an advantage of finite element methods over finite difference and finite volume methods. Since there are so many excellent textbooks on the mathematical properties of finite element methods [Ref2] [Ref3] [Ref4] [Ref5] [Ref6] [Ref7], this text will not repeat the theory, but give a glimpse of typical assumptions and general results for elliptic PDEs.

Remark. The mathematical theory of finite element methods is primarily developed for stationary PDE problems of elliptic nature whose solutions are smooth. However, such problems can be solved with the desired accuracy by most numerical methods and pose no difficulties. Time-dependent problems, on the other hand, easily lead to non-physical features in the numerical solutions and therefore require more care and knowledge from the user. Our focus on the accuracy of the finite element method will for this reason be centered around time-dependent problems, but then we need a different set of tools for the analysis. These tools are based on converting finite element equations to finite difference form and studying Fourier wave components.

Abstract variational forms

To list the main results from the mathematical theory of finite elements, we consider linear PDEs with an abstract variational form

\[a(u,v) = L(v)\quad\forall v\in V{\thinspace .}\]

This is the discretized problem (as usual in this book) where we seek \(u\in V\). The weak formulation of the corresponding continuous problem, fulfilled by the exact solution \({u_{\small\mbox{e}}}\in\Vex\) is here written as

\[a({u_{\small\mbox{e}}}, v) = L(v)\quad\forall v\in\Vex{\thinspace .}\]

The space \(V\) is finite dimensional (with dimension \(N+1\)), while \(\Vex\) is infinite dimensional. The hope is that \(u\rightarrow{u_{\small\mbox{e}}}\) as \(N\rightarrow\infty\) and \(V\rightarrow\Vex\).

Example on an abstract variational form and associated spaces

Consider the problem \(-u''(x)=f(x)\) on \(\Omega=[0,1]\), with \(u(0)=0\) and \(u'(1)=\beta\). The weak form is

\[a(u,v) = \int_0^1 u'v'dx,\quad L(v)=\int_0^1fvdx + \beta v(1){\thinspace .}\]

The space \(V\) for the approximate solution \(u\) can be chosen in many ways as previously described. The exact solution \({u_{\small\mbox{e}}}\) fulfills \(a({u_{\small\mbox{e}}},v)=L(v)\) for all \(v\) in \(\Vex\), and to specify what \(\Vex\) is, we need to introduce Hilbert spaces. The Hilbert space \(L^2(\Omega)\) consists of all functions that are square-integrable on \(\Omega\):

\[\begin{split}L^2(\Omega) = \left\lbrace v\,\vert\,\int_\Omega v^2dx < \infty\right\rbrace{\thinspace .}\end{split}\]

The space \(\Vex\) is the space of all functions whose first-order derivative is also square-integrable:

\[\Vex = H^1_0(\Omega) = \left\lbrace v\in L^2(\Omega)\,\vert\, \frac{dv}{dx}\in L^2(\Omega),\hbox{ and }v(0)=0\right\rbrace{\thinspace .}\]

The requirements of square-integrable zeroth- and first-order derivatives are motivated from the formula for \(a(u,v)\) where products of the first-order derivatives are to be integrated on \(\Omega\).

The Sobolev space \(H^1_0(\Omega)\) has an inner product

\[(u,v)_{H^1} = \int_\Omega (uv + \frac{du}{dx}\frac{dv}{dx})dx,\]

and associated norm

\[||v||_{H^1} = \sqrt{(v,v)_{H^1}}{\thinspace .}\]
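For instance, a quick sympy computation of the \(H^1\) norm of the arbitrarily chosen example function \(v=\sin(\pi x)\) on \(\Omega=[0,1]\):

import sympy as sp

x = sp.symbols('x')
v = sp.sin(sp.pi*x)   # arbitrary example function

norm_H1 = sp.sqrt(sp.integrate(v**2 + sp.diff(v, x)**2, (x, 0, 1)))
print(sp.simplify(norm_H1))   # sqrt((1 + pi**2)/2)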

Assumptions

A set of general results builds on the following assumptions. Let \(\Vex\) be an infinite-dimensional inner-product space such that \({u_{\small\mbox{e}}}\in\Vex\). The space has an associated norm \(||v||\) (e.g., \(||v||_{H^1}\) in the example above with \(\Vex=H^1_0(\Omega)\)).

  1. \(L(v)\) is linear in its argument.
  2. \(a(u,v)\) is bilinear in its arguments.
  3. \(L(v)\) is bounded (also called continuous) if there exists a positive constant \(c_0\) such that \(|L(v)|\leq c_0||v||\ \forall v\in\Vex\).
  4. \(a(u,v)\) is bounded (or continuous) if there exists a positive constant \(c_1\) such that \(|a(u,v)|\leq c_1||u||\,||v||\ \forall u,v\in\Vex\).
  5. \(a(u,v)\) is elliptic (or coercive) if there exists a positive constant \(c_2\) such that \(a(v,v)\geq c_2||v||^2\ \forall v\in\Vex\).
  6. \(a(u,v)\) is symmetric: \(a(u,v)=a(v,u)\).

Based on the above assumptions, which must be verified in each specific problem, one can derive some general results that are listed below.

Existence and uniqueness

There exists a unique solution of the problem: find \({u_{\small\mbox{e}}}\in\Vex\) such that

\[a({u_{\small\mbox{e}}},v)=L(v)\quad\forall v\in\Vex{\thinspace .}\]

(This result is known as the Lax-Milgram Theorem.)

Stability

The solution \({u_{\small\mbox{e}}}\in\Vex\) obeys the stability estimate

\[||{u_{\small\mbox{e}}}||\leq \frac{c_0}{c_2}{\thinspace .}\]
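The estimate follows in one line from ellipticity and boundedness of \(L\):

\[c_2||{u_{\small\mbox{e}}}||^2 \leq a({u_{\small\mbox{e}}},{u_{\small\mbox{e}}}) = L({u_{\small\mbox{e}}}) \leq c_0||{u_{\small\mbox{e}}}||,\]

and division by \(c_2||{u_{\small\mbox{e}}}||\) gives the bound.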

Equivalent minimization problem

The solution \({u_{\small\mbox{e}}}\in\Vex\) also fulfills the minimization problem

\[\min_{v\in\Vex} F(v),\quad F(v)=\frac{1}{2}a(v,v) - L(v){\thinspace .}\]

Best approximation principle

The energy norm is defined as

\[||v||_a = \sqrt{a(v,v)}{\thinspace .}\]

The discrete solution \(u\in V\) is the best approximation in energy norm,

\[||{u_{\small\mbox{e}}} - u||_a \leq ||{u_{\small\mbox{e}}} - v||_a\quad\forall v\in V{\thinspace .}\]

This is quite remarkable: once we have \(V\) (i.e., a mesh), the Galerkin method finds the best approximation in this space. The key step in the proof is Galerkin orthogonality, \(a({u_{\small\mbox{e}}}-u,v)=0\ \forall v\in V\), which follows from subtracting the discrete from the continuous problem. In the example above, we have \(||v||_a=\sqrt{\int_0^1 (v')^2dx}\), so the derivative \(u'\) is closer to \({u_{\small\mbox{e}}}'\) than the derivative of any other function in \(V\):

\[\int_0^1 ({u_{\small\mbox{e}}}' - u')^2dx \leq \int_0^1({u_{\small\mbox{e}}}' - v')^2dx\quad\forall v\in V{\thinspace .}\]

Best approximation property in the norm of the space

If \(||v||\) is the norm associated with \(\Vex\), we have another best approximation property:

\[||{u_{\small\mbox{e}}} - u||\leq\left(\frac{c_1}{c_2}\right)^{\frac{1}{2}}||{u_{\small\mbox{e}}} - v||\quad\forall v\in V{\thinspace .}\]

Symmetric, positive definite coefficient matrix

The discrete problem \(a(u,v)=L(v)\) \(\forall v\in V\) leads to a linear system \(Ac=b\), where the coefficient matrix \(A\) is symmetric (\(A^T=A\)) and positive definite (\(x^TAx > 0\) for all vectors \(x\neq 0\)). One can then use solution methods that demand less storage and that are faster and more reliable than solvers for general linear systems. One is also guaranteed the existence and uniqueness of the discrete solution \(u\).

Equivalent matrix minimization problem

The solution \(c\) of the linear system \(Ac=b\) also solves the minimization problem \(\min_{w}\left(\frac{1}{2} w^TAw - b^Tw\right)\) in the vector space \(\mathbb{R}^{N+1}\).
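A small numpy check of this equivalence, with a randomly generated symmetric positive definite matrix (illustration only):

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4*np.eye(4)        # random SPD matrix
b = rng.standard_normal(4)

c = np.linalg.solve(A, b)
F = lambda w: 0.5*w @ A @ w - b @ w

# F(w) > F(c) for any perturbation of the solution, since
# F(c + e) - F(c) = 0.5*e^T A e > 0 for SPD A
for _ in range(5):
    w = c + 0.1*rng.standard_normal(4)
    assert F(w) > F(c)
print('F is minimal at the solution of Ac = b')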

A priori error estimate for the derivative

In our sample problem, \(-u''=f\) on \(\Omega=[0,1]\), \(u(0)=0\), \(u'(1)=\beta\), one can derive the following error estimate for Lagrange finite element approximations of degree \(s\):

\[\begin{split}\left(\int_0^1 ({u_{\small\mbox{e}}}' - u')^2dx\right)^{\frac{1}{2}}&\leq Ch^s||{u_{\small\mbox{e}}}||_{H^{s+1}},\\ ||{u_{\small\mbox{e}}} - u|| &\leq Ch^2 g({u_{\small\mbox{e}}}),\end{split}\]

where \(||u||_{H^{s+1}}\) is a norm that integrates the sum of the square of all derivatives up to order \(s+1\), \(g({u_{\small\mbox{e}}})\) is the integral of \((u'')^2\), \(C\) is a constant, and \(h\) is the maximum cell length. The estimate shows that choosing elements with higher-degree polynomials (large \(s\)) requires more smoothness in \({u_{\small\mbox{e}}}\) since higher-order derivatives need to be square-integrable.

A consequence of the error estimate is that \(u'\rightarrow {u_{\small\mbox{e}}}'\) as \(h\rightarrow 0\), i.e., the approximate solution converges to the exact one.

The constant \(C\) in this and the next estimate depends on the shape of triangles in 2D and tetrahedra in 3D: squeezed elements with a small angle lead to a large \(C\), and such deformed elements are not favorable for the accuracy.

One can generalize the above estimate to the general problem class \(a(u,v)=L(v)\): the error in the derivative is proportional to \(h^s\). Note that the expression \(||{u_{\small\mbox{e}}} - u||\) in the example is \(||{u_{\small\mbox{e}}} - u||_{H^1}\), so it involves the sum of the zeroth and first derivative. The appearance of the derivative makes the error proportional to \(h^s\); if we only look at the solution itself, it converges as \(h^{s+1}\) (see below).

The above estimate is called an a priori estimate because the bound contains the exact solution, which is not computable. There are also a posteriori estimates where the bound involves the approximation \(u\), which is available in computations.

A priori error estimate for the solution

The finite element solution of our sample problem, using P1 elements, fulfills

\[\left(\int_0^1 ({u_{\small\mbox{e}}} - u)^2dx\right)^{\frac{1}{2}} \leq Ch^2 g({u_{\small\mbox{e}}}){\thinspace .}\]

This estimate shows that the error converges as \(h^2\) for P1 elements. An equivalent finite difference method, see the section Comparison with a finite difference discretization, is known to have an error proportional to \(h^2\), so the above estimate is expected. In general, the convergence is \(h^{s+1}\) for elements with polynomials of degree \(s\). Note that the estimate for \(u'\) is proportional to \(h\) raised to one power less.
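In practice, such estimates are checked empirically: compute the error on a sequence of meshes and estimate the rate \(r\) in \(E\sim Ch^r\) from \(r = \ln(E_{k-1}/E_k)/\ln(h_{k-1}/h_k)\). A sketch with hypothetical error data:

import numpy as np

# Hypothetical errors consistent with E = 0.4*h**2 (P1 elements)
h = np.array([0.1, 0.05, 0.025])
E = 0.4*h**2

r = np.log(E[:-1]/E[1:])/np.log(h[:-1]/h[1:])
print(r)   # [2. 2.], the expected rate for P1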