Variational formulations in 2D and 3D¶
The major difference between deriving variational formulations in 2D and 3D compared to 1D is the rule for integrating by parts. The cells have shapes different from an interval, so basis functions look a bit different, and there is a technical difference in actually calculating the integrals over cells. Otherwise, going to 2D and 3D is not a big step from 1D. All the fundamental ideas still apply.
Integration by parts¶
A typical second-order term in a PDE may be written in dimension-independent notation as
\[ \nabla^2 u \quad\hbox{or}\quad \nabla\cdot\left({\alpha}(\boldsymbol{x})\nabla u\right). \]
The explicit forms in a 2D problem become
\[ \nabla^2 u = \nabla\cdot\nabla u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}, \]
and
\[ \nabla\cdot\left({\alpha}(x,y)\nabla u\right) = \frac{\partial}{\partial x}\left({\alpha}(x,y)\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left({\alpha}(x,y)\frac{\partial u}{\partial y}\right). \]
We shall continue with the latter operator, as the former arises from just setting \({\alpha} =1\).
The integration by parts formula for \(\int\nabla\cdot({\alpha}\nabla u)\, v{\, \mathrm{d}x}\)
The general rule for integrating by parts is often referred to as Green’s first identity:
\[ -\int_\Omega \nabla\cdot\left({\alpha}(\boldsymbol{x})\nabla u\right) v{\, \mathrm{d}x} = \int_\Omega {\alpha}(\boldsymbol{x})\nabla u\cdot\nabla v{\, \mathrm{d}x} - \int_{\partial\Omega} {\alpha}\frac{\partial u}{\partial n} v{\, \mathrm{d}s}, \]
where \(\partial u/\partial n = \boldsymbol{n}\cdot\nabla u\) is the derivative of \(u\) in the outward normal direction, \(\boldsymbol{n}\) being the outward unit normal to the boundary \(\partial\Omega\). The integrals \(\int_\Omega(){\, \mathrm{d}x}\) are area integrals in 2D and volume integrals in 3D, while \(\int_{\partial\Omega}(){\, \mathrm{d}s}\) is a line integral in 2D and a surface integral in 3D.
It will be convenient to divide the boundary into two parts:
- \(\partial\Omega_N\), where we have Neumann conditions \(-{\alpha}\frac{\partial u}{\partial n} = g\), and
- \(\partial\Omega_D\), where we have Dirichlet conditions \(u = u_0\).
The test functions \(v\) are (as usual) required to vanish on \(\partial\Omega_D\).
Example on a multi-dimensional variational problem¶
Here is a quite general, stationary, linear PDE arising in many problems:
\[ \boldsymbol{v}\cdot\nabla u + \alpha u = \nabla\cdot\left(a\nabla u\right) + f, \quad \boldsymbol{x}\in\Omega, \]
\[ u = u_0, \quad \boldsymbol{x}\in\partial\Omega_D, \]
\[ -a\frac{\partial u}{\partial n} = g, \quad \boldsymbol{x}\in\partial\Omega_N. \]
The vector field \(\boldsymbol{v}\) and the scalar functions \(a\), \(\alpha\), \(f\), \(u_0\), and \(g\) may vary with the spatial coordinate \(\boldsymbol{x}\) and must be known.
Such a second-order PDE needs exactly one boundary condition at each point of the boundary, so \(\partial\Omega_N\cup\partial\Omega_D\) must be the complete boundary \(\partial\Omega\).
Assume that the boundary function \(u_0(\boldsymbol{x})\) is defined for all \(\boldsymbol{x}\in\Omega\). The unknown function can then be expanded as
\[ u = u_0 + \sum_{j\in{\mathcal{I}_s}} c_j{\psi}_j. \]
As long as all \({\psi}_j=0\) on \(\partial\Omega_D\), we realize that \(u=u_0\) on \(\partial\Omega_D\).
The variational formula is obtained from Galerkin’s method, which technically means multiplying the PDE by a test function \(v\) and integrating over \(\Omega\):
\[ \int_\Omega (\boldsymbol{v}\cdot\nabla u + \alpha u) v{\, \mathrm{d}x} = \int_\Omega \nabla\cdot\left(a\nabla u\right) v{\, \mathrm{d}x} + \int_\Omega fv{\, \mathrm{d}x}. \]
The second-order term is integrated by parts, according to Green’s first identity:
\[ \int_\Omega \nabla\cdot\left(a\nabla u\right)\, v{\, \mathrm{d}x} = -\int_\Omega a\nabla u\cdot\nabla v{\, \mathrm{d}x} + \int_{\partial\Omega} a\frac{\partial u}{\partial n} v{\, \mathrm{d}s}. \]
Galerkin’s method therefore leads to
\[ \int_\Omega (\boldsymbol{v}\cdot\nabla u + \alpha u) v{\, \mathrm{d}x} = -\int_\Omega a\nabla u\cdot\nabla v{\, \mathrm{d}x} + \int_{\partial\Omega} a\frac{\partial u}{\partial n} v{\, \mathrm{d}s} + \int_\Omega fv{\, \mathrm{d}x}. \]
The boundary term can be developed further by noticing that \(v\neq 0\) only on \(\partial\Omega_N\),
\[ \int_{\partial\Omega} a\frac{\partial u}{\partial n} v{\, \mathrm{d}s} = \int_{\partial\Omega_N} a\frac{\partial u}{\partial n} v{\, \mathrm{d}s}, \]
and that on \(\partial\Omega_N\), we have the condition \(a\frac{\partial u}{\partial n}=-g\), so the term becomes
\[ -\int_{\partial\Omega_N} gv{\, \mathrm{d}s}. \]
The final variational form is then
\[ \int_\Omega (\boldsymbol{v}\cdot\nabla u + \alpha u) v{\, \mathrm{d}x} = -\int_\Omega a\nabla u\cdot\nabla v{\, \mathrm{d}x} - \int_{\partial\Omega_N} gv{\, \mathrm{d}s} + \int_\Omega fv{\, \mathrm{d}x}. \]
Instead of using the integral signs, we may use the inner product notation:
\[ (\boldsymbol{v}\cdot\nabla u, v) + (\alpha u, v) = -(a\nabla u, \nabla v) - (g, v)_{N} + (f, v). \]
The subscript \(\,{}_N\) in \((g,v)_{N}\) is a notation for a line or surface integral over \(\partial\Omega_N\), while \((\cdot,\cdot)\) is the area/volume integral over \(\Omega\).
We can derive explicit expressions for the linear system for \(\left\{ {c}_j \right\}_{j\in{\mathcal{I}_s}}\) that arises from the variational formulation. Inserting the \(u\) expansion and choosing \(v={\psi}_i\) results in
\[ \sum_{j\in{\mathcal{I}_s}}\left( (\boldsymbol{v}\cdot\nabla {\psi}_j, {\psi}_i) + (\alpha {\psi}_j, {\psi}_i) + (a\nabla {\psi}_j, \nabla {\psi}_i)\right) c_j = (f, {\psi}_i) - (g, {\psi}_i)_{N} - (\boldsymbol{v}\cdot\nabla u_0, {\psi}_i) - (\alpha u_0, {\psi}_i) - (a\nabla u_0, \nabla {\psi}_i). \]
This is a linear system with matrix entries
\[ A_{i,j} = (\boldsymbol{v}\cdot\nabla {\psi}_j, {\psi}_i) + (\alpha {\psi}_j, {\psi}_i) + (a\nabla {\psi}_j, \nabla {\psi}_i), \]
and right-hand side entries
\[ b_i = (f, {\psi}_i) - (g, {\psi}_i)_{N} - (\boldsymbol{v}\cdot\nabla u_0, {\psi}_i) - (\alpha u_0, {\psi}_i) - (a\nabla u_0, \nabla {\psi}_i), \]
for \(i,j\in{\mathcal{I}_s}\).
In the finite element method, we usually express \(u_0\) in terms of basis functions and restrict \(i\) and \(j\) to run over the degrees of freedom that are not prescribed as Dirichlet conditions. However, we can also keep all the \(\left\{ {c}_j \right\}_{j\in{\mathcal{I}_s}}\) as unknowns, drop the \(u_0\) in the expansion for \(u\), and incorporate all the known \(c_j\) values in the linear system. This has been explained in detail in the 1D case, and the technique is the same for 2D and 3D problems.
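The bookkeeping of inserting known Dirichlet values directly in the linear system can be sketched with NumPy for a tiny 1D Poisson problem; the helper name impose_dirichlet and the example mesh are our own illustrations, not part of any particular finite element library.

```python
import numpy as np

def impose_dirichlet(A, b, dofs, values):
    """Overwrite the equations for prescribed degrees of freedom so
    that the solution of the modified system keeps the known values."""
    for dof, val in zip(dofs, values):
        A[dof, :] = 0.0
        A[dof, dof] = 1.0
        b[dof] = val
    return A, b

# Tiny illustration: P1 stiffness matrix for -u'' = 0 on [0,1] with
# u(0) = 0 and u(1) = 1; the exact solution is u(x) = x.
n = 4                          # number of cells
h = 1.0 / n
A = np.zeros((n + 1, n + 1))
b = np.zeros(n + 1)
for e in range(n):             # cell-by-cell assembly
    A[e:e+2, e:e+2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
A, b = impose_dirichlet(A, b, dofs=[0, n], values=[0.0, 1.0])
c = np.linalg.solve(A, b)      # nodal values, equal to x_i = i*h
```

Note that overwriting rows this way destroys the symmetry of the coefficient matrix; a symmetry-preserving variant also eliminates the corresponding columns, as discussed for the 1D case.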
Transformation to a reference cell in 2D and 3D¶
The real power of the finite element method first becomes evident when we want to solve partial differential equations posed on two- and three-dimensional domains of non-trivial geometric shape. As in 1D, the domain \(\Omega\) is divided into \(N_e\) non-overlapping cells. The elements have simple shapes: triangles and quadrilaterals are popular in 2D, while tetrahedra and box-shaped elements dominate in 3D. The finite element basis functions \({\varphi}_i\) are, as in 1D, polynomials over each cell. The integrals in the variational formulation are, as in 1D, split into contributions from each cell, and these contributions are calculated by mapping a physical cell, expressed in physical coordinates \(\boldsymbol{x}\), to a reference cell in a local coordinate system \(\boldsymbol{X}\). This mapping will now be explained in detail.
We consider an integral of the type
\[ \int_{\Omega^{(e)}} a(\boldsymbol{x})\nabla{\varphi}_i\cdot\nabla{\varphi}_j{\, \mathrm{d}x}, \]
where the \({\varphi}_i\) functions are finite element basis functions in 2D or 3D, defined in the physical domain. Suppose we want to calculate this integral over a reference cell, denoted by \(\tilde\Omega^r\), in a coordinate system with coordinates \(\boldsymbol{X} = (X_0, X_1)\) (2D) or \(\boldsymbol{X} = (X_0, X_1, X_2)\) (3D). The mapping between a point \(\boldsymbol{X}\) in the reference coordinate system and the corresponding point \(\boldsymbol{x}\) in the physical coordinate system is given by a vector relation \(\boldsymbol{x}(\boldsymbol{X})\). The corresponding Jacobian, \(J\), of this mapping has entries
\[ J_{i,j} = \frac{\partial x_j}{\partial X_i}. \]
The change of variables requires \({\, \mathrm{d}x}\) to be replaced by \(\det J{\, \mathrm{d}X}\). The derivatives in the \(\nabla\) operator in the variational form are taken with respect to \(\boldsymbol{x}\), which we may denote by \(\nabla_{\boldsymbol{x}}\). The \({\varphi}_i(\boldsymbol{x})\) functions in the integral are replaced by local basis functions \({\tilde{\varphi}}_r(\boldsymbol{X})\), so the integral features \(\nabla_{\boldsymbol{x}}{\tilde{\varphi}}_r(\boldsymbol{X})\). We readily have \(\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r(\boldsymbol{X})\) from formulas for the basis functions on the reference cell, but the desired quantity \(\nabla_{\boldsymbol{x}}{\tilde{\varphi}}_r(\boldsymbol{X})\) requires some effort to compute. All the details are provided below.
Let \(i=q(e,r)\) and consider two space dimensions. By the chain rule,
\[ \frac{\partial {\tilde{\varphi}}_r}{\partial X} = \frac{\partial {\varphi}_i}{\partial x}\frac{\partial x}{\partial X} + \frac{\partial {\varphi}_i}{\partial y}\frac{\partial y}{\partial X}, \]
and
\[ \frac{\partial {\tilde{\varphi}}_r}{\partial Y} = \frac{\partial {\varphi}_i}{\partial x}\frac{\partial x}{\partial Y} + \frac{\partial {\varphi}_i}{\partial y}\frac{\partial y}{\partial Y}. \]
We can write these two equations as a vector equation
\[ \begin{bmatrix} \frac{\partial {\tilde{\varphi}}_r}{\partial X} \\ \frac{\partial {\tilde{\varphi}}_r}{\partial Y} \end{bmatrix} = \begin{bmatrix} \frac{\partial x}{\partial X} & \frac{\partial y}{\partial X} \\ \frac{\partial x}{\partial Y} & \frac{\partial y}{\partial Y} \end{bmatrix} \begin{bmatrix} \frac{\partial {\varphi}_i}{\partial x} \\ \frac{\partial {\varphi}_i}{\partial y} \end{bmatrix}. \]
Identifying
\[ \nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r = \begin{bmatrix} \frac{\partial {\tilde{\varphi}}_r}{\partial X} \\ \frac{\partial {\tilde{\varphi}}_r}{\partial Y} \end{bmatrix}, \quad J = \begin{bmatrix} \frac{\partial x}{\partial X} & \frac{\partial y}{\partial X} \\ \frac{\partial x}{\partial Y} & \frac{\partial y}{\partial Y} \end{bmatrix}, \quad \nabla_{\boldsymbol{x}}{\varphi}_i = \begin{bmatrix} \frac{\partial {\varphi}_i}{\partial x} \\ \frac{\partial {\varphi}_i}{\partial y} \end{bmatrix}, \]
we have the relation
\[ \nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r = J\cdot\nabla_{\boldsymbol{x}}{\varphi}_i, \]
which we can solve with respect to \(\nabla_{\boldsymbol{x}}{\varphi}_i\):
\[ \nabla_{\boldsymbol{x}}{\varphi}_i = J^{-1}\cdot\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r. \]
This means that we have the following transformation of the integral in the physical domain to its counterpart over the reference cell:
\[ \int_{\Omega^{(e)}} a(\boldsymbol{x})\nabla_{\boldsymbol{x}}{\varphi}_i\cdot\nabla_{\boldsymbol{x}}{\varphi}_j{\, \mathrm{d}x} = \int_{\tilde\Omega^r} a(\boldsymbol{x}(\boldsymbol{X}))\left(J^{-1}\cdot\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r\right)\cdot\left(J^{-1}\cdot\nabla_{\boldsymbol{X}}{\tilde{\varphi}}_s\right)\det J{\, \mathrm{d}X}, \]
with \(i=q(e,r)\) and \(j=q(e,s)\).
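For a single P1 triangle the whole mapping fits in a few NumPy lines. This is a sketch with made-up vertex coordinates; here J is built with the two edge vectors as its columns, so the physical gradients are obtained from the chain-rule relation \(J^T\nabla_{\boldsymbol{x}}{\tilde{\varphi}}_r = \nabla_{\boldsymbol{X}}{\tilde{\varphi}}_r\).

```python
import numpy as np

# Vertices of one physical triangle (example numbers)
verts = np.array([[1.0, 0.0], [3.0, 1.0], [1.5, 2.0]])

# Affine map x(X) = x0 + J X with reference vertices (0,0), (1,0), (0,1);
# the columns of J are the two edge vectors of the physical cell.
J = np.column_stack([verts[1] - verts[0], verts[2] - verts[0]])
detJ = np.linalg.det(J)        # dx = det(J) dX; equals twice the cell area

# Gradients of the reference P1 basis functions 1-X0-X1, X0, X1
grad_X = np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])

# Physical gradients from the chain rule J^T grad_x = grad_X;
# row r holds grad_x of local basis function r
grad_x = grad_X @ np.linalg.inv(J)
```

A quick consistency check: the three basis functions sum to one, so their physical gradients sum to zero, and the gradient of local function 1 changes that function by exactly 1 along the edge from vertex 0 to vertex 1.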
Numerical integration¶
Integrals are normally computed by numerical integration rules. For multi-dimensional cells, various families of rules exist. All of them are similar to what is shown in 1D: \(\int f {\, \mathrm{d}x}\approx \sum_jw_jf(\boldsymbol{x}_j)\), where \(w_j\) are weights and \(\boldsymbol{x}_j\) are corresponding points.
The file numint.py contains the functions quadrature_for_triangles(n) and quadrature_for_tetrahedra(n), which return lists of points and weights corresponding to integration rules with n points over the reference triangle with vertices \((0,0)\), \((1,0)\), \((0,1)\), and the reference tetrahedron with vertices \((0,0,0)\), \((1,0,0)\), \((0,1,0)\), \((0,0,1)\), respectively. For example, the first two rules for integration over a triangle have 1 and 3 points:
>>> import numint
>>> x, w = numint.quadrature_for_triangles(num_points=1)
>>> x
[(0.3333333333333333, 0.3333333333333333)]
>>> w
[0.5]
>>> x, w = numint.quadrature_for_triangles(num_points=3)
>>> x
[(0.16666666666666666, 0.16666666666666666),
(0.6666666666666666, 0.16666666666666666),
(0.16666666666666666, 0.6666666666666666)]
>>> w
[0.16666666666666666, 0.16666666666666666, 0.16666666666666666]
Rules with 1, 3, 4, and 7 points over the triangle will exactly integrate polynomials of degree 1, 2, 3, and 4, respectively. In 3D, rules with 1, 4, 5, and 11 points over the tetrahedron will exactly integrate polynomials of degree 1, 2, 3, and 4, respectively.
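The 3-point rule listed above is simple enough to implement and verify directly; the sketch below checks that it integrates polynomials up to degree 2 exactly over the reference triangle (the exact integrals are computed by hand).

```python
# 3-point rule over the reference triangle with vertices (0,0), (1,0),
# (0,1); the weights sum to the triangle area 1/2.
points = [(1/6, 1/6), (2/3, 1/6), (1/6, 2/3)]
weights = [1/6, 1/6, 1/6]

def integrate(f):
    """Apply the quadrature rule to f(x, y)."""
    return sum(w * f(x, y) for w, (x, y) in zip(weights, points))

# Exact integrals over the reference triangle: 1/2 for the constant 1,
# 1/6 for x, 1/12 for x^2, and 1/24 for x*y; the rule reproduces all.
print(integrate(lambda x, y: x**2))
```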
Convenient formulas for P1 elements in 2D¶
We shall now provide some formulas for piecewise linear \({\varphi}_i\) functions and their integrals in the physical coordinate system. These formulas make it convenient to compute with P1 elements without the need to work in the reference coordinate system and deal with mappings and Jacobians. A lot of computational and algorithmic details are hidden by this approach.
Let \(\Omega^{(e)}\) be cell number \(e\), and let the three vertices have global vertex numbers \(I\), \(J\), and \(K\). The corresponding coordinates are \((x_{I},y_{I})\), \((x_{J},y_{J})\), and \((x_{K},y_{K})\). The basis function \({\varphi}_I\) over \(\Omega^{(e)}\) has the explicit formula
\[ {\varphi}_I(x,y) = \frac{1}{2\Delta}\left(\alpha_I + \beta_I x + \gamma_I y\right), \]
where
\[ \alpha_I = x_{J}y_{K} - x_{K}y_{J}, \qquad \beta_I = y_{J} - y_{K}, \qquad \gamma_I = x_{K} - x_{J}, \]
and
\[ 2\Delta = \det\begin{pmatrix} 1 & x_{I} & y_{I} \\ 1 & x_{J} & y_{J} \\ 1 & x_{K} & y_{K} \end{pmatrix}. \]
The quantity \(\Delta\) is the area of the cell. The formulas for \({\varphi}_J\) and \({\varphi}_K\) follow from cyclic permutation of the indices \(I\), \(J\), \(K\).
The following formula is often convenient when computing element matrices and vectors:
\[ \int_{\Omega^{(e)}} {\varphi}_I^{p}{\varphi}_J^{q}{\varphi}_K^{r}{\, \mathrm{d}x} = \frac{p!\,q!\,r!}{(p+q+r+2)!}\,2\Delta. \]
(Note that the \(q\) in this formula is not to be mixed with the \(q(e,r)\) mapping of degrees of freedom.)
As an example, the element matrix entry \(\int_{\Omega^{(e)}} {\varphi}_I{\varphi}_J{\, \mathrm{d}x}\) can be computed by setting \(p=q=1\) and \(r=0\), when \(I\neq J\), yielding \(\Delta/12\), and \(p=2\) and \(q=r=0\), when \(I=J\), resulting in \(\Delta/6\). We collect these numbers in a local element matrix:
\[ \frac{\Delta}{12}\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}. \]
The common element matrix entry \(\int_{\Omega^{(e)}} \nabla{\varphi}_I\cdot\nabla{\varphi}_J{\, \mathrm{d}x}\), arising from a Laplace term \(\nabla^2u\), can also easily be computed by the formulas above. We have
\[ \nabla{\varphi}_I\cdot\nabla{\varphi}_J = \frac{1}{4\Delta^2}\left(\beta_I\beta_J + \gamma_I\gamma_J\right) = \hbox{const}, \]
so that the element matrix entry becomes \(\frac{1}{4\Delta}\left(\beta_I\beta_J + \gamma_I\gamma_J\right)\).
From an implementational point of view, one will work with local vertex numbers \(r=0,1,2\), parameterize the coefficients in the basis functions by \(r\), and look up vertex coordinates through \(q(e,r)\).
Similar formulas exist for integration of P1 elements in 3D.
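These P1 formulas translate directly into NumPy for one cell. The function name p1_coefficients is our own, and we assume the common convention \({\varphi}_I = (\alpha_I + \beta_I x + \gamma_I y)/(2\Delta)\), which gives the Laplace-term entry \((\beta_I\beta_J + \gamma_I\gamma_J)/(4\Delta)\).

```python
import numpy as np

def p1_coefficients(xI, yI, xJ, yJ, xK, yK):
    """beta and gamma coefficients of the three P1 basis functions and
    the cell area Delta (counterclockwise vertex ordering assumed)."""
    beta = np.array([yJ - yK, yK - yI, yI - yJ])
    gamma = np.array([xK - xJ, xI - xK, xJ - xI])
    Delta = 0.5 * ((xJ - xI) * (yK - yI) - (xK - xI) * (yJ - yI))
    return beta, gamma, Delta

# Unit right triangle as a test cell
beta, gamma, Delta = p1_coefficients(0, 0, 1, 0, 0, 1)

# Element matrix from the Laplace term:
# entry (I, J) = (beta_I beta_J + gamma_I gamma_J)/(4 Delta)
K = (np.outer(beta, beta) + np.outer(gamma, gamma)) / (4 * Delta)

# Element matrix from a u*v term, via the factorial formula:
# Delta/6 on the diagonal (p=2), Delta/12 off the diagonal (p=q=1)
M = Delta / 12 * (np.ones((3, 3)) + np.eye(3))
```

The rows of K sum to zero because the three basis functions sum to one, and the entries of M sum to the cell area; both are quick sanity checks in an implementation.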
A glimpse of the mathematical theory of the finite element method¶
Almost all books on the finite element method that introduce the abstract variational problem \(a(u,v)=L(v)\) spend considerable pages on deriving error estimates and other properties of the approximate solution. The machinery with function spaces and bilinear and linear forms has the great advantage that a very large class of PDE problems can be analyzed in a unified way. This feature is often taken as an advantage of finite element methods over finite difference and volume methods. Since there are so many excellent textbooks on the mathematical properties of finite element methods [Ref2] [Ref3] [Ref4] [Ref5] [Ref6] [Ref7], this text will not repeat the theory, but give a glimpse of typical assumptions and general results for elliptic PDEs.
Remark. The mathematical theory of finite element methods is primarily developed for stationary PDE problems of elliptic nature whose solutions are smooth. However, such problems can be solved with the desired accuracy by most numerical methods and pose no difficulties. Time-dependent problems, on the other hand, easily lead to non-physical features in the numerical solutions and therefore require more care and knowledge by the user. Our focus on the accuracy of the finite element method will for this reason be centered around time-dependent problems, but then we need a different set of tools for the analysis. These tools are based on converting finite element equations to finite difference form and studying Fourier wave components.
Abstract variational forms¶
To list the main results from the mathematical theory of finite elements, we consider linear PDEs with an abstract variational form
\[ a(u,v) = L(v)\quad\forall v\in V. \]
This is the discretized problem (as usual in this book) where we seek \(u\in V\). The weak formulation of the corresponding continuous problem, fulfilled by the exact solution \({u_{\small\mbox{e}}}\in\Vex\), is here written as
\[ a({u_{\small\mbox{e}}},v) = L(v)\quad\forall v\in\Vex. \]
The space \(V\) is finite dimensional (with dimension \(N+1\)), while \(\Vex\) is infinite dimensional. The hope is that \(u\rightarrow{u_{\small\mbox{e}}}\) as \(N\rightarrow\infty\) and \(V\rightarrow\Vex\).
Example on an abstract variational form and associated spaces¶
Consider the problem \(-u''(x)=f(x)\) on \(\Omega=[0,1]\), with \(u(0)=0\) and \(u'(1)=\beta\). The weak form is
\[ \int_0^1 u'v'{\, \mathrm{d}x} = \int_0^1 fv{\, \mathrm{d}x} + \beta v(1), \]
so that \(a(u,v)=\int_0^1 u'v'{\, \mathrm{d}x}\) and \(L(v)=\int_0^1 fv{\, \mathrm{d}x} + \beta v(1)\).
The space \(V\) for the approximate solution \(u\) can be chosen in many ways as previously described. The exact solution \({u_{\small\mbox{e}}}\) fulfills \(a(u,v)=L(v)\) for all \(v\) in \(\Vex\), and to specify what \(\Vex\) is, we need to introduce Hilbert spaces. The Hilbert space \(L^2(\Omega)\) consists of all functions that are square-integrable on \(\Omega\):
\[ L^2(\Omega) = \left\{ v : \int_\Omega v^2{\, \mathrm{d}x} < \infty\right\}. \]
The space \(\Vex\) is the space of all functions whose first-order derivative is also square-integrable:
\[ \Vex = H^1_0(\Omega) = \left\{ v\in L^2(\Omega)\, :\, v'\in L^2(\Omega),\ v(0)=0\right\}. \]
The requirement of square-integrable zeroth- and first-order derivatives is motivated by the formula for \(a(u,v)\), where products of the first-order derivatives are to be integrated over \(\Omega\).
The Sobolev space \(H^1_0(\Omega)\) has an inner product
\[ (u,v)_{H^1} = \int_\Omega (uv + u'v'){\, \mathrm{d}x}, \]
and associated norm
\[ ||v||_{H^1} = \sqrt{(v,v)_{H^1}}. \]
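As a concrete illustration (our own example choice, not from the text), the \(H^1\) norm of \(v(x)=\sin(\pi x)\), which satisfies \(v(0)=0\), can be computed by numerical integration and compared with the exact value \(\sqrt{(1+\pi^2)/2}\).

```python
import numpy as np

x = np.linspace(0, 1, 10001)
v = np.sin(np.pi * x)
dv = np.pi * np.cos(np.pi * x)       # exact derivative v'
f = v**2 + dv**2                     # integrand of ||v||_{H^1}^2

# Trapezoidal rule for the integral over [0, 1]
norm_H1 = np.sqrt(np.sum(0.5 * (f[:-1] + f[1:]) * np.diff(x)))

# Exact value: int v^2 dx = 1/2 and int (v')^2 dx = pi^2/2
exact = np.sqrt((1 + np.pi**2) / 2)
```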
Assumptions¶
A set of general results builds on the following assumptions. Let \(\Vex\) be an infinite-dimensional inner-product space such that \({u_{\small\mbox{e}}}\in\Vex\). The space has an associated norm \(||v||\) (e.g., \(||v||_{H^1}\) in the example above with \(\Vex=H^1_0(\Omega)\)).
- \(L(v)\) is linear in its argument.
- \(a(u,v)\) is bilinear in its arguments.
- \(L(v)\) is bounded (also called continuous): there exists a positive constant \(c_0\) such that \(|L(v)|\leq c_0||v||\ \forall v\in\Vex\).
- \(a(u,v)\) is bounded (or continuous): there exists a positive constant \(c_1\) such that \(|a(u,v)|\leq c_1||u||\,||v||\ \forall u,v\in\Vex\).
- \(a(u,v)\) is elliptic (or coercive): there exists a positive constant \(c_2\) such that \(a(v,v)\geq c_2||v||^2\ \forall v\in\Vex\).
- \(a(u,v)\) is symmetric: \(a(u,v)=a(v,u)\).
Based on the above assumptions, which must be verified in each specific problem, one can derive some general results that are listed below.
Existence and uniqueness¶
There exists a unique solution of the problem: find \({u_{\small\mbox{e}}}\in\Vex\) such that
\[ a({u_{\small\mbox{e}}},v)=L(v)\quad\forall v\in\Vex. \]
(This result is known as the Lax-Milgram Theorem.)
Stability¶
The solution \({u_{\small\mbox{e}}}\in\Vex\) obeys the stability estimate
\[ ||{u_{\small\mbox{e}}}|| \leq \frac{c_0}{c_2}. \]
Equivalent minimization problem¶
The solution \({u_{\small\mbox{e}}}\in\Vex\) also fulfills the minimization problem
\[ F({u_{\small\mbox{e}}}) = \min_{v\in\Vex}F(v), \quad\hbox{where}\quad F(v)=\frac{1}{2}a(v,v) - L(v). \]
Best approximation principle¶
The energy norm is defined as
\[ ||v||_a = \sqrt{a(v,v)}. \]
The discrete solution \(u\in V\) is the best approximation in energy norm,
\[ ||{u_{\small\mbox{e}}} - u||_a \leq ||{u_{\small\mbox{e}}} - v||_a\quad\forall v\in V. \]
This is quite remarkable: once we have \(V\) (i.e., a mesh), the Galerkin method finds the best approximation in this space. In the example above, we have \(||v||_a^2=\int_0^1 (v')^2{\, \mathrm{d}x}\), so the derivative \(u'\) is closer to \({u_{\small\mbox{e}}}'\) than the derivative of any other possible function in \(V\):
\[ \int_0^1 ({u_{\small\mbox{e}}}' - u')^2{\, \mathrm{d}x} \leq \int_0^1 ({u_{\small\mbox{e}}}' - v')^2{\, \mathrm{d}x}\quad\forall v\in V. \]
Best approximation property in the norm of the space¶
If \(||v||\) is the norm associated with \(\Vex\), we have another best approximation property:
\[ ||{u_{\small\mbox{e}}} - u|| \leq \frac{c_1}{c_2}\,||{u_{\small\mbox{e}}} - v||\quad\forall v\in V. \]
Symmetric, positive definite coefficient matrix¶
The discrete problem \(a(u,v)=L(v)\ \forall v\in V\) leads to a linear system \(Ac=b\), where the coefficient matrix \(A\) is symmetric (\(A^T=A\)) and positive definite (\(x^TAx > 0\) for all vectors \(x\neq 0\)). One can then use solution methods that demand less storage and that are faster and more reliable than solvers for general linear systems. One is also guaranteed the existence and uniqueness of the discrete solution \(u\).
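A sketch for the 1D model problem \(-u''=f\), \(u(0)=u(1)=0\), with P1 elements (our example discretization): the stiffness matrix over the interior unknowns is symmetric positive definite, so a Cholesky factorization exists, and for \(f=1\) the nodal values even reproduce the exact solution \(u(x)=x(1-x)/2\).

```python
import numpy as np

n = 5                                   # number of interior nodes
h = 1.0 / (n + 1)

# Tridiagonal P1 stiffness matrix: 2/h on the diagonal, -1/h off it
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

L = np.linalg.cholesky(A)               # succeeds only for SPD matrices

# Load vector for f = 1: b_i = (f, phi_i) = h
b = h * np.ones(n)
c = np.linalg.solve(A, b)               # nodal values of the FE solution
xi = np.linspace(h, 1 - h, n)           # interior node coordinates
```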
Equivalent matrix minimization problem¶
The solution \(c\) of the linear system \(Ac=b\) also solves the minimization problem \(\min_w\left(\frac{1}{2} w^TAw - b^Tw\right)\) in the vector space \(\mathbb{R}^{N+1}\).
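The equivalence is easy to observe numerically; the SPD matrix below is manufactured purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B @ B.T + np.eye(4)                 # a symmetric positive definite matrix
b = rng.standard_normal(4)

def F(w):
    """The quadratic functional (1/2) w^T A w - b^T w."""
    return 0.5 * w @ A @ w - b @ w

c = np.linalg.solve(A, b)               # solves A c = b

# Since F(c + d) - F(c) = (1/2) d^T A d >= 0 for SPD A, any
# perturbation of the solution increases the functional value
for _ in range(100):
    d = rng.standard_normal(4)
    assert F(c + 0.1 * d) >= F(c)
```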
A priori error estimate for the derivative¶
In our sample problem, \(-u''=f\) on \(\Omega=[0,1]\), \(u(0)=0\), \(u'(1)=\beta\), one can derive the following error estimate for Lagrange finite element approximations of degree \(s\):
where \(||u||_{H^{s+1}}\) is a norm that integrates the sum of the square of all derivatives up to order \(s+1\), \(g({u_{\small\mbox{e}}})\) is the integral of \((u'')^2\), \(C\) is a constant, and \(h\) is the maximum cell length. The estimate shows that choosing elements with higher-degree polynomials (large \(s\)) requires more smoothness in \({u_{\small\mbox{e}}}\) since higher-order derivatives need to be square-integrable.
A consequence of the error estimate is that \(u'\rightarrow {u_{\small\mbox{e}}}'\) as \(h\rightarrow 0\), i.e., the approximate solution converges to the exact one.
The constant \(C\) in this and the next estimate depends on the shape of triangles in 2D and tetrahedra in 3D: squeezed elements with a small angle lead to a large \(C\), and such deformed elements are not favorable for the accuracy.
One can generalize the above estimate to the general problem class \(a(u,v)=L(v)\): the error in the derivative is proportional to \(h^s\). Note that the expression \(||{u_{\small\mbox{e}}} - u||\) in the example is \(||{u_{\small\mbox{e}}} - u||_{H^1}\), so it involves the sum of the zeroth and first derivative. The appearance of the derivative makes the error proportional to \(h^s\); if we only look at the solution itself, it converges as \(h^{s+1}\) (see below).
The above estimate is called an a priori estimate because the bound contains the exact solution, which is not computable. There are also a posteriori estimates where the bound involves the approximation \(u\), which is available in computations.
A priori error estimate for the solution¶
The finite element solution of our sample problem, using P1 elements, fulfills
\[ ||{u_{\small\mbox{e}}} - u|| \leq Ch^2||{u_{\small\mbox{e}}}||_{H^{2}}. \]
This estimate shows that the error converges as \(h^2\) for P1 elements. An equivalent finite difference method, see the section Comparison with a finite difference discretization, is known to have an error proportional to \(h^2\), so the above estimate is expected. In general, the convergence is \(h^{s+1}\) for elements with polynomials of degree \(s\). Note that the estimate for \(u'\) is proportional to \(h\) raised to one power less.
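The \(h^2\) behavior is easy to observe numerically. As a proxy for a full finite element solve, the sketch below measures the \(L^2\) error of the piecewise linear interpolant of a smooth function on two meshes (interpolation rather than a Galerkin solve, but it exhibits the same \(h^{s+1}\) rate for P1 elements).

```python
import numpy as np

def l2_interp_error(n, u=lambda x: np.sin(np.pi * x)):
    """L2 error of the piecewise linear interpolant of u on n cells,
    estimated with the trapezoidal rule on a fine sampling grid."""
    nodes = np.linspace(0, 1, n + 1)
    fine = np.linspace(0, 1, 20001)
    uh = np.interp(fine, nodes, u(nodes))    # P1 interpolant on the fine grid
    e2 = (u(fine) - uh) ** 2
    return np.sqrt(np.sum(0.5 * (e2[:-1] + e2[1:]) * np.diff(fine)))

E1, E2 = l2_interp_error(8), l2_interp_error(16)
rate = np.log(E1 / E2) / np.log(2.0)         # observed convergence rate
print(rate)                                  # close to 2 for P1 elements
```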