Many mathematical models involve \(m+1\) unknown functions governed by a system of \(m+1\) differential equations. In abstract form we may denote the unknowns by \(u^{(0)},\ldots, u^{(m)}\) and write the governing equations as

\[\begin{split}\mathcal{L}_0(u^{(0)},\ldots,u^{(m)}) &= 0,\\ &\vdots\\ \mathcal{L}_{m}(u^{(0)},\ldots,u^{(m)}) &= 0,\end{split}\]

where \(\mathcal{L}_i\) is some differential operator defining differential equation number \(i\).

Variational forms

There are basically two ways of formulating a variational form for a system of differential equations. The first method treats each equation independently as a scalar equation, while the other method views the total system as a vector equation with a vector function as unknown.

Sequence of scalar PDEs formulation

Let us start with the approach that treats one equation at a time. We multiply equation number \(i\) by some test function \(v^{(i)}\in V^{(i)}\) and integrate over the domain:

\[\tag{1} \int_\Omega \mathcal{L}^{(0)}(u^{(0)},\ldots,u^{(m)}) v^{(0)}{\, \mathrm{d}x} = 0,\]
\[\tag{2} \vdots\]
\[\tag{3} \int_\Omega \mathcal{L}^{(m)}(u^{(0)},\ldots,u^{(m)}) v^{(m)}{\, \mathrm{d}x} = 0\]\[ {\thinspace .}\]

Terms with second-order derivatives may be integrated by parts, with Neumann conditions inserted in boundary integrals. Let

\[V^{(i)} = \hbox{span}\{{\psi}_0^{(i)},\ldots,{\psi}_{N_i}^{(i)}\},\]

such that

\[u^{(i)} = B^{(i)}(\boldsymbol{x}) + \sum_{j=0}^{N_i} c_j^{(i)} {\psi}_j^{(i)}(\boldsymbol{x}),\]

where \(B^{(i)}\) is a boundary function to handle nonzero Dirichlet conditions. Observe that different unknowns live in different spaces with different basis functions and numbers of degrees of freedom.

From the \(m\) equations in the variational forms we can derive \(m\) coupled systems of algebraic equations for the \(\Pi_{i=0}^{m} N_i\) unknown coefficients \(c_j^{(i)}\), \(j=0,\ldots,N_i\), \(i=0,\ldots,m\).

Vector PDE formulation

The alternative method for deriving a variational form for a system of differential equations introduces a vector of unknown functions

\[\boldsymbol{u} = (u^{(0)},\ldots,u^{(m)}),\]

a vector of test functions

\[\boldsymbol{v} = (u^{(0)},\ldots,u^{(m)}),\]

with

\[\boldsymbol{u}, \boldsymbol{v} \in \boldsymbol{V} = V^{(0)}\times \cdots \times V^{(m)} {\thinspace .}\]

With nonzero Dirichlet conditions, we have a vector \(\boldsymbol{B} = (B^{(0)},\ldots,B^{(m)})\) with boundary functions and then it is \(\boldsymbol{u} - \boldsymbol{B}\) that lies in \(\boldsymbol{V}\), not \(\boldsymbol{u}\) itself.

The governing system of differential equations is written

\[\boldsymbol{\mathcal{L}}(\boldsymbol{u} ) = 0,\]

where

\[\boldsymbol{\mathcal{L}}(\boldsymbol{u} ) = (\mathcal{L}^{(0)}(\boldsymbol{u}),\ldots, \mathcal{L}^{(m)}(\boldsymbol{u})) {\thinspace .}\]

The variational form is derived by taking the inner product of the vector of equations and the test function vector:

\[\tag{4} \int_\Omega \boldsymbol{\mathcal{L}}(\boldsymbol{u} )\cdot\boldsymbol{v} = 0\quad\forall\boldsymbol{v}\in\boldsymbol{V}{\thinspace .}\]

Observe that (4) is one scalar equation. To derive systems of algebraic equations for the unknown coefficients in the expansions of the unknown functions, one chooses \(m\) linearly independent \(\boldsymbol{v}\) vectors to generate \(m\) independent variational forms from (4). The particular choice \(\boldsymbol{v} = (v^{(0)},0,\ldots,0)\) recovers (1), \(\boldsymbol{v} = (0,\ldots,0,v^{(m)}\) recovers (3), and \(\boldsymbol{v} = (0,\ldots,0,v^{(i)},0,\ldots,0)\) recovers the variational form number \(i\), \(\int_\Omega \mathcal{L}^{(i)} v^{(i)}{\, \mathrm{d}x} =0\), in (1)-(3).

A worked example

We now consider a specific system of two partial differential equations in two space dimensions:

\[\tag{5} \mu \nabla^2 w = -\beta,\]
\[\tag{6} \kappa\nabla^2 T = - \mu ||\nabla w||^2 {\thinspace .}\]

The unknown functions \(w(x,y)\) and \(T(x,y)\) are defined in a domain \(\Omega\), while \(\mu\), \(\beta\), and \(\kappa\) are given constants. The norm in (6) is the standard Euclidean norm:

\[||\nabla w||^2 = \nabla w\cdot\nabla w = w_x^2 + w_y^2 {\thinspace .}\]

The boundary conditions associated with (5)-(6) are \(w=0\) on \(\partial\Omega\) and \(T=T_0\) on \(\partial\Omega\). Each of the equations (5) and (6) needs one condition at each point on the boundary.

The system (5)-(6) arises from fluid flow in a straight pipe, with the \(z\) axis in the direction of the pipe. The domain \(\Omega\) is a cross section of the pipe, \(w\) is the velocity in the \(z\) direction, \(\mu\) is the viscosity of the fluid, \(\beta\) is the pressure gradient along the pipe, \(T\) is the temperature, and \(\kappa\) is the heat conduction coefficient of the fluid. The equation (5) comes from the Navier-Stokes equations, and (6) follows from the energy equation. The term \(- \mu ||\nabla w||^2\) models heating of the fluid due to internal friction.

Observe that the system (5)-(6) has only a one-way coupling: \(T\) depends on \(w\), but \(w\) does not depend on \(T\), because we can solve (5) with respect to \(w\) and then (6) with respect to \(T\). Some may argue that this is not a real system of PDEs, but just two scalar PDEs. Nevertheless, the one-way coupling is convenient when comparing different variational forms and different implementations.

Identical function spaces for the unknowns

Let us first apply the same function space \(V\) for \(w\) and \(T\) (or more precisely, \(w\in V\) and \(T-T_0 \in V\)). With

\[V = \hbox{span}\{{\psi}_0(x,y),\ldots,{\psi}_N(x,y)\},\]

we write

\[\tag{7} w = \sum_{j=0}^N c^{(w)}_j {\psi}_j,\quad T = T_0 + \sum_{j=0}^N c^{(T)}_j {\psi}_j{\thinspace .}\]

Note that \(w\) and \(T\) in (5)-(6) denote the exact solution of the PDEs, while \(w\) and \(T\) (7) are the discrete functions that approximate the exact solution. It should be clear from the context whether a symbol means the exact or approximate solution, but when we need both at the same time, we use a subscript e to denote the exact solution.

Variational form of each individual PDE

Inserting the expansions (7) in the governing PDEs, results in a residual in each equation,

\[\tag{8} R_w = \mu \nabla^2 w + \beta,\]
\[\tag{9} R_T = \kappa\nabla^2 T + \mu ||\nabla w||^2 {\thinspace .}\]

A Galerkin method demands \(R_w\) and \(R_T\) do be orthogonal to \(V\):

\[\begin{split}\int_\Omega R_w v {\, \mathrm{d}x} &=0\quad\forall v\in V,\\ \int_\Omega R_T v {\, \mathrm{d}x} &=0\quad\forall v\in V {\thinspace .}\end{split}\]

Because of the Dirichlet conditions, \(v=0\) on \(\partial\Omega\). We integrate the Laplace terms by parts and note that the boundary terms vanish since \(v=0\) on \(\partial\Omega\):

\[\tag{10} \int_\Omega \mu \nabla w\cdot\nabla v {\, \mathrm{d}x} = \int_\Omega \beta v{\, \mathrm{d}x} \quad\forall v\in V,\]
\[\tag{11} \int_\Omega \kappa \nabla T\cdot\nabla v {\, \mathrm{d}x} = \int_\Omega \mu \nabla w\cdot\nabla w\, v{\, \mathrm{d}x} \quad\forall v\in V\]\[ {\thinspace .}\]

Compound scalar variational form

The alternative way of deriving the variational from is to introduce a test vector function \(\boldsymbol{v}\in\boldsymbol{V} = V\times V\) and take the inner product of \(\boldsymbol{v}\) and the residuals, integrated over the domain:

\[\int_{\Omega} (R_w, R_T)\cdot\boldsymbol{v} {\, \mathrm{d}x} = 0\quad\forall\boldsymbol{v}\in\boldsymbol{V} {\thinspace .}\]

With \(\boldsymbol{v} = (v_0,v_1)\) we get

\[\int_{\Omega} (R_w v_0 + R_T v_1) {\, \mathrm{d}x} = 0\quad\forall\boldsymbol{v}\in\boldsymbol{V} {\thinspace .}\]

Integrating the Laplace terms by parts results in

\[\tag{12} \int_\Omega (\mu\nabla w\cdot\nabla v_0 + \kappa\nabla T\cdot\nabla v_1){\, \mathrm{d}x} = \int_\Omega (\beta v_0 + \mu\nabla w\cdot\nabla w\, v_1){\, \mathrm{d}x}, \quad\forall \boldsymbol{v}\in\boldsymbol{V} {\thinspace .}\]

Choosing \(v_0=v\) and \(v_1=0\) gives the variational form (10), while \(v_0=0\) and \(v_1=v\) gives (11).

With the inner product notation, \((p,q) = \int_\Omega pq{\, \mathrm{d}x}\), we can alternatively write (10) and (11) as

\[\begin{split} (\mu\nabla w,\nabla v) &= (\beta, v) \quad\forall v\in V,\\ (\kappa \nabla T,\nabla v) &= (\mu\nabla w\cdot\nabla w, v)\quad\forall v\in V,\end{split}\]

or since \(\mu\) and \(\kappa\) are considered constant,

\[\tag{13} \mu (\nabla w,\nabla v) = (\beta, v) \quad\forall v\in V,\]
\[\tag{14} \kappa(\nabla T,\nabla v) = \mu(\nabla w\cdot\nabla w, v)\quad\forall v\in V\]\[ {\thinspace .}\]

Decoupled linear systems

The linear systems governing the coefficients \(c_j^{(w)}\) and \(c_j^{(T)}\), \(j=0,\ldots,N\), are derived by inserting the expansions (7) in (10) and (11), and choosing \(v={\psi}_i\) for \(i=0,\ldots,N\). The result becomes

\[\tag{15} \sum_{j=0}^N A^{(w)}_{i,j} c^{(w)}_j = b_i^{(w)},\quad i=0,\ldots,N,\]
\[\tag{16} \sum_{j=0}^N A^{(T)}_{i,j} c^{(T)}_j = b_i^{(T)},\quad i=0,\ldots,N,\]
\[\tag{17} A^{(w)}_{i,j} = \mu(\nabla {\psi}_j,\nabla {\psi}_i),\]
\[\tag{18} b_i^{(w)} = (\beta, {\psi}_i),\]
\[\tag{19} A^{(T)}_{i,j} = \kappa(\nabla {\psi}_j,\nabla {\psi}_i),\]
\[\tag{20} b_i^{(T)} = \mu((\sum_j c^{(w)}_j\nabla{\psi}_j)\cdot (\sum_k c^{(w)}_k\nabla{\psi}_k), {\psi}_i) {\thinspace .}\]

It can also be instructive to write the linear systems using matrices and vectors. Define \(K\) as the matrix corresponding to the Laplace operator \(\nabla^2\). That is, \(K_{i,j} = (\nabla {\psi}_j,\nabla {\psi}_i)\). Let us introduce the vectors

\[\begin{split}b^{(w)} &= (b_0^{(w)},\ldots,b_{N}^{(w)}),\\ b^{(T)} &= (b_0^{(T)},\ldots,b_{N}^{(T)}),\\ c^{(w)} &= (c_0^{(w)},\ldots,c_{N}^{(w)}),\\ c^{(T)} &= (c_0^{(T)},\ldots,c_{N}^{(T)}){\thinspace .}\end{split}\]

The system (15)-(16) can now be expressed in matrix-vector form as

\[\tag{21} \mu K c^{(w)} = b^{(w)},\]
\[\tag{22} \kappa K c^{(T)} = b^{(T)}{\thinspace .}\]

We can solve the first system for \(c^{(w)}\), and then the right-hand side \(b^{(T)}\) is known such that we can solve the second system for \(c^{(T)}\).

Coupled linear systems

Despite the fact that \(w\) can be computed first, without knowing \(T\), we shall now pretend that \(w\) and \(T\) enter a two-way coupling such that we need to derive the algebraic equations as one system for all the unknowns \(c_j^{(w)}\) and \(c_j^{(T)}\), \(j=0,\ldots,N\). This system is nonlinear in \(c_j^{(w)}\) because of the \(\nabla w\cdot\nabla w\) product. To remove this nonlinearity, imagine that we introduce an iteration method where we replace \(\nabla w\cdot\nabla w\) by \(\nabla w_{-}\cdot\nabla w\), \(w_{-}\) being the \(w\) computed in the previous iteration. Then the term \(\nabla w_{-}\cdot\nabla w\) is linear in \(w\) since \(w_{-}\) is known. The total linear system becomes

\[\tag{23} \sum_{j=0}^N A^{(w,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^N A^{(w,T)}_{i,j} c^{(T)}_j = b_i^{(w)},\quad i=0,\ldots,N,\]
\[\tag{24} \sum_{j=0}^N A^{(T,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^N A^{(T,T)}_{i,j} c^{(T)}_j = b_i^{(T)},\quad i=0,\ldots,N,\]
\[\tag{25} A^{(w,w)}_{i,j} = \mu(\nabla {\psi}_j,{\psi}_i),\]
\[\tag{26} A^{(w,T)}_{i,j} = 0,\]
\[\tag{27} b_i^{(w)} = (\beta, {\psi}_i),\]
\[\tag{28} A^{(w,T)}_{i,j} = \mu((\nabla{\psi} w_{-})\cdot\nabla{\psi}_j), {\psi}_i),\]
\[\tag{29} A^{(T,T)}_{i,j} = \kappa(\nabla {\psi}_j,{\psi}_i),\]
\[\tag{30} b_i^{(T)} = 0 {\thinspace .}\]

This system can alternatively be written in matrix-vector form as

\[\tag{31} \mu K c^{(w)} = b^{(w)},\]
\[\tag{32} L c^{(w)} + \kappa K c^{(T)} =0,\]

with \(L\) as the matrix from the \(\nabla w_{-}\cdot\nabla\) operator: \(L_{i,j} = A^{(w,T)}_{i,j}\).

The matrix-vector equations are often conveniently written in block form:

\[\begin{split}\left(\begin{array}{cc} \mu K & 0\\ L & \kappa K \end{array}\right) \left(\begin{array}{c} c^{(w)}\\ c^{(T)} \end{array}\right) = \left(\begin{array}{c} b^{(w)}\\ 0 \end{array}\right),\end{split}\]

Note that in the general case where all unknowns enter all equations, we have to solve the compound system (23)-(24) since then we cannot utilize the special property that (15) does not involve \(T\) and can be solved first.

When the viscosity depends on the temperature, the \(\mu\nabla^2w\) term must be replaced by \(\nabla\cdot (\mu(T)\nabla w)\), and then \(T\) enters the equation for \(w\). Now we have a two-way coupling since both equations contain \(w\) and \(T\) and therefore must be solved simultaneously Th equation \(\nabla\cdot (\mu(T)\nabla w)=-\beta\) is nonlinear, and if some iteration procedure is invoked, where we use a previously computed \(T_{-}\) in the viscosity (\(\mu(T_{-})\)), the coefficient is known, and the equation involves only one unknown, \(w\). In that case we are back to the one-way coupled set of PDEs.

We may also formulate our PDE system as a vector equation. To this end, we introduce the vector of unknowns \(\boldsymbol{u} = (u^{(0)},u^{(1)})\), where \(u^{(0)}=w\) and \(u^{(1)}=T\). We then have

\[\begin{split}\nabla^2 \boldsymbol{u} = \left(\begin{array}{cc} -{\mu}^{-1}{\beta}\\ -{\kappa}^{-1}\mu \nabla u^{(0)}\cdot\nabla u^{(0)} \end{array}\right){\thinspace .}\end{split}\]

Different function spaces for the unknowns

It is easy to generalize the previous formulation to the case where \(w\in V^{(w)}\) and \(T\in V^{(T)}\), where \(V^{(w)}\) and \(V^{(T)}\) can be different spaces with different numbers of degrees of freedom. For example, we may use quadratic basis functions for \(w\) and linear for \(T\). Approximation of the unknowns by different finite element spaces is known as mixed finite element methods.

We write

\[\begin{split}V^{(w)} &= \hbox{span}\{{\psi}_0^{(w)},\ldots,{\psi}_{N_w}^{(w)}\},\\ V^{(T)} &= \hbox{span}\{{\psi}_0^{(T)},\ldots,{\psi}_{N_T}^{(T)}\} {\thinspace .}\end{split}\]

The next step is to multiply (5) by a test function \(v^{(w)}\in V^{(w)}\) and (6) by a \(v^{(T)}\in V^{(T)}\), integrate by parts and arrive at

\[\tag{33} \int_\Omega \mu \nabla w\cdot\nabla v^{(w)} {\, \mathrm{d}x} = \int_\Omega \beta v^{(w)}{\, \mathrm{d}x} \quad\forall v^{(w)}\in V^{(w)},\]
\[\tag{34} \int_\Omega \kappa \nabla T\cdot\nabla v^{(T)} {\, \mathrm{d}x} = \int_\Omega \mu \nabla w\cdot\nabla w\, v^{(T)}{\, \mathrm{d}x} \quad\forall v^{(T)}\in V^{(T)}\]\[ {\thinspace .}\]

The compound scalar variational formulation applies a test vector function \(\boldsymbol{v} = (v^{(w)}, v^{(T)})\) and reads

\[\tag{35} \int_\Omega (\mu\nabla w\cdot\nabla v^{(w)} + \kappa\nabla T\cdot\nabla v^{(T)}){\, \mathrm{d}x} = \int_\Omega (\beta v^{(w)} + \mu\nabla w\cdot\nabla w\, v^{(T)}){\, \mathrm{d}x},\]

valid \(\forall \boldsymbol{v}\in\boldsymbol{V} = V^{(w)}\times V^{(T)}\).

The associated linear system is similar to (15)-(16) or (23)-(24), except that we need to distinguish between \({\psi}_i^{(w)}\) and \({\psi}_i^{(T)}\), and the range in the sums over \(j\) must match the number of degrees of freedom in the spaces \(V^{(w)}\) and \(V^{(T)}\). The formulas become

\[\tag{36} \sum_{j=0}^{N_w} A^{(w)}_{i,j} c^{(w)}_j = b_i^{(w)},\quad i=0,\ldots,N_w,\]
\[\tag{37} \sum_{j=0}^{N_T} A^{(T)}_{i,j} c^{(T)}_j = b_i^{(T)},\quad i=0,\ldots,N_T,\]
\[\tag{38} A^{(w)}_{i,j} = \mu(\nabla {\psi}_j^{(w)},{\psi}_i^{(w)}),\]
\[\tag{39} b_i^{(w)} = (\beta, {\psi}_i^{(w)}),\]
\[\tag{40} A^{(T)}_{i,j} = \kappa(\nabla {\psi}_j^{(T)},{\psi}_i^{(T)}),\]
\[\tag{41} b_i^{(T)} = \mu(\nabla w_{-}, {\psi}_i^{(T)}) {\thinspace .}\]

In the case we formulate one compound linear system involving both \(c^{(w)}_j\), \(j=0,\ldots,N_w\), and \(c^{(T)}_j\), \(j=0,\ldots,N_T\), (23)-(24) becomes

\[\tag{42} \sum_{j=0}^{N_w} A^{(w,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^{N_T} A^{(w,T)}_{i,j} c^{(T)}_j = b_i^{(w)},\quad i=0,\ldots,N_w,\]
\[\tag{43} \sum_{j=0}^{N_w} A^{(T,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^{N_T} A^{(T,T)}_{i,j} c^{(T)}_j = b_i^{(T)},\quad i=0,\ldots,N_T,\]
\[\tag{44} A^{(w,w)}_{i,j} = \mu(\nabla {\psi}_j^{(w)},{\psi}_i^{(w)}),\]
\[\tag{45} A^{(w,T)}_{i,j} = 0,\]
\[\tag{46} b_i^{(w)} = (\beta, {\psi}_i^{(w)}),\]
\[\tag{47} A^{(w,T)}_{i,j} = \mu (\nabla w_{-}\cdot\nabla{\psi}_j^{(w)}), {\psi}_i^{(T)}),\]
\[\tag{48} A^{(T,T)}_{i,j} = \kappa(\nabla {\psi}_j^{(T)},{\psi}_i^{(T)}),\]
\[\tag{49} b_i^{(T)} = 0 {\thinspace .}\]

The corresponding block form

\[\begin{split}\left(\begin{array}{cc} \mu K^{(w)} & 0\\ L & \kappa K^{(T)} \end{array}\right) \left(\begin{array}{c} c^{(w)}\\ c^{(T)} \end{array}\right) = \left(\begin{array}{c} b^{(w)}\\ 0 \end{array}\right),\end{split}\]

has square and rectangular block matrices: \(K^{(w)}\) is \(N_w\times N_w\), \(K^{(T)}\) is \(N_T\times N_T\), while \(L\) is \(N_T\times N_w\),

Computations in 1D

We can reduce the system (5)-(6) to one space dimension, which corresponds to flow in a channel between two flat plates. Alternatively, one may consider flow in a circular pipe, introduce cylindrical coordinates, and utilize the radial symmetry to reduce the equations to a one-dimensional problem in the radial coordinate. The former model becomes

\[\tag{50} \mu w_{xx} = -\beta,\]
\[\tag{51} \kappa T_{xx} = - \mu w_x^2,\]

while the model in the radial coordinate \(r\) reads

\[\tag{52} \mu\frac{1}{r}\frac{d}{dr}\left( r\frac{dw}{dr}\right) = -\beta,\]
\[\tag{53} \kappa \frac{1}{r}\frac{d}{dr}\left( r\frac{dT}{dr}\right) = - \mu \left( \frac{dw}{dr}\right)^2 {\thinspace .}\]

The domain for (50)-(51) is \(\Omega = [0,H]\), with boundary conditions \(w(0)=w(H)=0\) and \(T(0)=T(H)=T_0\). For (52)-(53) the domain is \([0,R]\) (\(R\) being the radius of the pipe) and the boundary conditions are \(du/dr = dT/dr =0\) for \(r=0\), \(u(R)=0\), and \(T(R)=T_0\).

Calculations to be continued...