Many mathematical models involve $ m+1 $ unknown functions governed by a system of $ m+1 $ differential equations. In abstract form we may denote the unknowns by $ u^{(0)},\ldots, u^{(m)} $ and write the governing equations as $$ \begin{align*} \mathcal{L}_0(u^{(0)},\ldots,u^{(m)}) &= 0,\\ &\vdots\\ \mathcal{L}_{m}(u^{(0)},\ldots,u^{(m)}) &= 0, \end{align*} $$ where $ \mathcal{L}_i $ is some differential operator defining differential equation number $ i $.

Variational forms

There are basically two ways of formulating a variational form for a system of differential equations. The first method treats each equation independently as a scalar equation, while the other method views the total system as a vector equation with a vector function as unknown.

Sequence of scalar PDEs formulation

Let us start with the approach that treats one equation at a time. We multiply equation number $ i $ by some test function $ v^{(i)}\in V^{(i)} $ and integrate over the domain: $$ \begin{align} \int_\Omega \mathcal{L}^{(0)}(u^{(0)},\ldots,u^{(m)}) v^{(0)}\dx &= 0, \tag{1}\\ &\vdots \tag{2}\\ \int_\Omega \mathcal{L}^{(m)}(u^{(0)},\ldots,u^{(m)}) v^{(m)}\dx &= 0 \tag{3} \tp \end{align} $$ Terms with second-order derivatives may be integrated by parts, with Neumann conditions inserted in boundary integrals. Let $$ V^{(i)} = \hbox{span}\{\baspsi_0^{(i)},\ldots,\baspsi_{N_i}^{(i)}\},$$ such that $$ u^{(i)} = B^{(i)}(\x) + \sum_{j=0}^{N_i} c_j^{(i)} \baspsi_j^{(i)}(\x), $$ where $ B^{(i)} $ is a boundary function to handle nonzero Dirichlet conditions. Observe that different unknowns live in different spaces with different basis functions and numbers of degrees of freedom.

From the $ m $ equations in the variational forms we can derive $ m $ coupled systems of algebraic equations for the $ \Pi_{i=0}^{m} N_i $ unknown coefficients $ c_j^{(i)} $, $ j=0,\ldots,N_i $, $ i=0,\ldots,m $.

Vector PDE formulation

The alternative method for deriving a variational form for a system of differential equations introduces a vector of unknown functions $$ \u = (u^{(0)},\ldots,u^{(m)}),$$ a vector of test functions $$ \v = (u^{(0)},\ldots,u^{(m)}),$$ with $$ \u, \v \in \V = V^{(0)}\times \cdots \times V^{(m)} \tp $$ With nonzero Dirichlet conditions, we have a vector $ \boldsymbol{B} = (B^{(0)},\ldots,B^{(m)}) $ with boundary functions and then it is $ \u - \boldsymbol{B} $ that lies in $ \V $, not $ \u $ itself.

The governing system of differential equations is written $$ \boldsymbol{\mathcal{L}}(\u ) = 0,$$ where $$ \boldsymbol{\mathcal{L}}(\u ) = (\mathcal{L}^{(0)}(\u),\ldots, \mathcal{L}^{(m)}(\u)) \tp $$ The variational form is derived by taking the inner product of the vector of equations and the test function vector: $$ \begin{equation} \int_\Omega \boldsymbol{\mathcal{L}}(\u )\cdot\v = 0\quad\forall\v\in\V\tp \tag{4} \end{equation} $$

Observe that (4) is one scalar equation. To derive systems of algebraic equations for the unknown coefficients in the expansions of the unknown functions, one chooses $ m $ linearly independent $ \v $ vectors to generate $ m $ independent variational forms from (4). The particular choice $ \v = (v^{(0)},0,\ldots,0) $ recovers (1), $ \v = (0,\ldots,0,v^{(m)} $ recovers (3), and $ \v = (0,\ldots,0,v^{(i)},0,\ldots,0) $ recovers the variational form number $ i $, $ \int_\Omega \mathcal{L}^{(i)} v^{(i)}\dx =0 $, in (1)-(3).

A worked example

We now consider a specific system of two partial differential equations in two space dimensions: $$ \begin{align} \mu \nabla^2 w &= -\beta, \tag{5}\\ \kappa\nabla^2 T &= - \mu ||\nabla w||^2 \tp \tag{6} \end{align} $$ The unknown functions $ w(x,y) $ and $ T(x,y) $ are defined in a domain $ \Omega $, while $ \mu $, $ \beta $, and $ \kappa $ are given constants. The norm in (6) is the standard Euclidean norm: $$ ||\nabla w||^2 = \nabla w\cdot\nabla w = w_x^2 + w_y^2 \tp $$

The boundary conditions associated with (5)-(6) are $ w=0 $ on $ \partial\Omega $ and $ T=T_0 $ on $ \partial\Omega $. Each of the equations (5) and (6) needs one condition at each point on the boundary.

The system (5)-(6) arises from fluid flow in a straight pipe, with the $ z $ axis in the direction of the pipe. The domain $ \Omega $ is a cross section of the pipe, $ w $ is the velocity in the $ z $ direction, $ \mu $ is the viscosity of the fluid, $ \beta $ is the pressure gradient along the pipe, $ T $ is the temperature, and $ \kappa $ is the heat conduction coefficient of the fluid. The equation (5) comes from the Navier-Stokes equations, and (6) follows from the energy equation. The term $ - \mu ||\nabla w||^2 $ models heating of the fluid due to internal friction.

Observe that the system (5)-(6) has only a one-way coupling: $ T $ depends on $ w $, but $ w $ does not depend on $ T $, because we can solve (5) with respect to $ w $ and then (6) with respect to $ T $. Some may argue that this is not a real system of PDEs, but just two scalar PDEs. Nevertheless, the one-way coupling is convenient when comparing different variational forms and different implementations.

Identical function spaces for the unknowns

Let us first apply the same function space $ V $ for $ w $ and $ T $ (or more precisely, $ w\in V $ and $ T-T_0 \in V $). With $$ V = \hbox{span}\{\baspsi_0(x,y),\ldots,\baspsi_N(x,y)\}, $$ we write $$ \begin{equation} w = \sum_{j=0}^N c^{(w)}_j \baspsi_j,\quad T = T_0 + \sum_{j=0}^N c^{(T)}_j \baspsi_j\tp \tag{7} \end{equation} $$ Note that $ w $ and $ T $ in (5)-(6) denote the exact solution of the PDEs, while $ w $ and $ T $ (7) are the discrete functions that approximate the exact solution. It should be clear from the context whether a symbol means the exact or approximate solution, but when we need both at the same time, we use a subscript e to denote the exact solution.

Variational form of each individual PDE

Inserting the expansions (7) in the governing PDEs, results in a residual in each equation, $$ \begin{align} R_w &= \mu \nabla^2 w + \beta, \tag{8}\\ R_T &= \kappa\nabla^2 T + \mu ||\nabla w||^2 \tp \tag{9} \end{align} $$ A Galerkin method demands $ R_w $ and $ R_T $ do be orthogonal to $ V $: $$ \begin{align*} \int_\Omega R_w v \dx &=0\quad\forall v\in V,\\ \int_\Omega R_T v \dx &=0\quad\forall v\in V \tp \end{align*} $$ Because of the Dirichlet conditions, $ v=0 $ on $ \partial\Omega $. We integrate the Laplace terms by parts and note that the boundary terms vanish since $ v=0 $ on $ \partial\Omega $: $$ \begin{align} \int_\Omega \mu \nabla w\cdot\nabla v \dx &= \int_\Omega \beta v\dx \quad\forall v\in V, \tag{10}\\ \int_\Omega \kappa \nabla T\cdot\nabla v \dx &= \int_\Omega \mu \nabla w\cdot\nabla w\, v\dx \quad\forall v\in V \tag{11} \tp \end{align} $$

Compound scalar variational form

The alternative way of deriving the variational from is to introduce a test vector function $ \v\in\V = V\times V $ and take the inner product of $ \v $ and the residuals, integrated over the domain: $$ \int_{\Omega} (R_w, R_T)\cdot\v \dx = 0\quad\forall\v\in\V \tp $$ With $ \v = (v_0,v_1) $ we get $$ \int_{\Omega} (R_w v_0 + R_T v_1) \dx = 0\quad\forall\v\in\V \tp $$ Integrating the Laplace terms by parts results in $$ \begin{equation} \int_\Omega (\mu\nabla w\cdot\nabla v_0 + \kappa\nabla T\cdot\nabla v_1)\dx = \int_\Omega (\beta v_0 + \mu\nabla w\cdot\nabla w\, v_1)\dx, \quad\forall \v\in\V \tp \tag{12} \end{equation} $$ Choosing $ v_0=v $ and $ v_1=0 $ gives the variational form (10), while $ v_0=0 $ and $ v_1=v $ gives (11).

With the inner product notation, $ (p,q) = \int_\Omega pq\dx $, we can alternatively write (10) and (11) as $$ \begin{align*} (\mu\nabla w,\nabla v) &= (\beta, v) \quad\forall v\in V,\\ (\kappa \nabla T,\nabla v) &= (\mu\nabla w\cdot\nabla w, v)\quad\forall v\in V, \end{align*} $$ or since $ \mu $ and $ \kappa $ are considered constant, $$ \begin{align} \mu (\nabla w,\nabla v) &= (\beta, v) \quad\forall v\in V, \tag{13}\\ \kappa(\nabla T,\nabla v) &= \mu(\nabla w\cdot\nabla w, v)\quad\forall v\in V \tag{14} \tp \end{align} $$

Decoupled linear systems

The linear systems governing the coefficients $ c_j^{(w)} $ and $ c_j^{(T)} $, $ j=0,\ldots,N $, are derived by inserting the expansions (7) in (10) and (11), and choosing $ v=\baspsi_i $ for $ i=0,\ldots,N $. The result becomes $$ \begin{align} \sum_{j=0}^N A^{(w)}_{i,j} c^{(w)}_j &= b_i^{(w)},\quad i=0,\ldots,N, \tag{15}\\ \sum_{j=0}^N A^{(T)}_{i,j} c^{(T)}_j &= b_i^{(T)},\quad i=0,\ldots,N, \tag{16}\\ A^{(w)}_{i,j} &= \mu(\nabla \baspsi_j,\nabla \baspsi_i), \tag{17}\\ b_i^{(w)} &= (\beta, \baspsi_i), \tag{18}\\ A^{(T)}_{i,j} &= \kappa(\nabla \baspsi_j,\nabla \baspsi_i), \tag{19}\\ b_i^{(T)} &= \mu((\sum_j c^{(w)}_j\nabla\baspsi_j)\cdot (\sum_k c^{(w)}_k\nabla\baspsi_k), \baspsi_i) \tp \tag{20} \end{align} $$

It can also be instructive to write the linear systems using matrices and vectors. Define $ K $ as the matrix corresponding to the Laplace operator $ \nabla^2 $. That is, $ K_{i,j} = (\nabla \baspsi_j,\nabla \baspsi_i) $. Let us introduce the vectors $$ \begin{align*} b^{(w)} &= (b_0^{(w)},\ldots,b_{N}^{(w)}),\\ b^{(T)} &= (b_0^{(T)},\ldots,b_{N}^{(T)}),\\ c^{(w)} &= (c_0^{(w)},\ldots,c_{N}^{(w)}),\\ c^{(T)} &= (c_0^{(T)},\ldots,c_{N}^{(T)})\tp \end{align*} $$ The system (15)-(16) can now be expressed in matrix-vector form as $$ \begin{align} \mu K c^{(w)} &= b^{(w)}, \tag{21}\\ \kappa K c^{(T)} &= b^{(T)}\tp \tag{22} \end{align} $$

We can solve the first system for $ c^{(w)} $, and then the right-hand side $ b^{(T)} $ is known such that we can solve the second system for $ c^{(T)} $.

Coupled linear systems

Despite the fact that $ w $ can be computed first, without knowing $ T $, we shall now pretend that $ w $ and $ T $ enter a two-way coupling such that we need to derive the algebraic equations as one system for all the unknowns $ c_j^{(w)} $ and $ c_j^{(T)} $, $ j=0,\ldots,N $. This system is nonlinear in $ c_j^{(w)} $ because of the $ \nabla w\cdot\nabla w $ product. To remove this nonlinearity, imagine that we introduce an iteration method where we replace $ \nabla w\cdot\nabla w $ by $ \nabla w_{-}\cdot\nabla w $, $ w_{-} $ being the $ w $ computed in the previous iteration. Then the term $ \nabla w_{-}\cdot\nabla w $ is linear in $ w $ since $ w_{-} $ is known. The total linear system becomes $$ \begin{align} \sum_{j=0}^N A^{(w,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^N A^{(w,T)}_{i,j} c^{(T)}_j &= b_i^{(w)},\quad i=0,\ldots,N, \tag{23}\\ \sum_{j=0}^N A^{(T,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^N A^{(T,T)}_{i,j} c^{(T)}_j &= b_i^{(T)},\quad i=0,\ldots,N, \tag{24}\\ A^{(w,w)}_{i,j} &= \mu(\nabla \baspsi_j,\baspsi_i), \tag{25}\\ A^{(w,T)}_{i,j} &= 0, \tag{26}\\ b_i^{(w)} &= (\beta, \baspsi_i), \tag{27}\\ A^{(w,T)}_{i,j} &= \mu((\nabla\baspsi w_{-})\cdot\nabla\baspsi_j), \baspsi_i), \tag{28}\\ A^{(T,T)}_{i,j} &= \kappa(\nabla \baspsi_j,\baspsi_i), \tag{29}\\ b_i^{(T)} &= 0 \tp \tag{30} \end{align} $$ This system can alternatively be written in matrix-vector form as $$ \begin{align} \mu K c^{(w)} &= b^{(w)}, \tag{31}\\ L c^{(w)} + \kappa K c^{(T)} & =0, \tag{32} \end{align} $$ with $ L $ as the matrix from the $ \nabla w_{-}\cdot\nabla $ operator: $ L_{i,j} = A^{(w,T)}_{i,j} $.

The matrix-vector equations are often conveniently written in block form: $$ \left(\begin{array}{cc} \mu K & 0\\ L & \kappa K \end{array}\right) \left(\begin{array}{c} c^{(w)}\\ c^{(T)} \end{array}\right) = \left(\begin{array}{c} b^{(w)}\\ 0 \end{array}\right), $$

Note that in the general case where all unknowns enter all equations, we have to solve the compound system (23)-(24) since then we cannot utilize the special property that (15) does not involve $ T $ and can be solved first.

When the viscosity depends on the temperature, the $ \mu\nabla^2w $ term must be replaced by $ \nabla\cdot (\mu(T)\nabla w) $, and then $ T $ enters the equation for $ w $. Now we have a two-way coupling since both equations contain $ w $ and $ T $ and therefore must be solved simultaneously Th equation $ \nabla\cdot (\mu(T)\nabla w)=-\beta $ is nonlinear, and if some iteration procedure is invoked, where we use a previously computed $ T_{-} $ in the viscosity ($ \mu(T_{-}) $), the coefficient is known, and the equation involves only one unknown, $ w $. In that case we are back to the one-way coupled set of PDEs.

We may also formulate our PDE system as a vector equation. To this end, we introduce the vector of unknowns $ \u = (u^{(0)},u^{(1)}) $, where $ u^{(0)}=w $ and $ u^{(1)}=T $. We then have $$ \nabla^2 \u = \left(\begin{array}{cc} -{\mu}^{-1}{\beta}\\ -{\kappa}^{-1}\mu \nabla u^{(0)}\cdot\nabla u^{(0)} \end{array}\right)\tp $$

Different function spaces for the unknowns

It is easy to generalize the previous formulation to the case where $ w\in V^{(w)} $ and $ T\in V^{(T)} $, where $ V^{(w)} $ and $ V^{(T)} $ can be different spaces with different numbers of degrees of freedom. For example, we may use quadratic basis functions for $ w $ and linear for $ T $. Approximation of the unknowns by different finite element spaces is known as mixed finite element methods.

We write $$ \begin{align*} V^{(w)} &= \hbox{span}\{\baspsi_0^{(w)},\ldots,\baspsi_{N_w}^{(w)}\},\\ V^{(T)} &= \hbox{span}\{\baspsi_0^{(T)},\ldots,\baspsi_{N_T}^{(T)}\} \tp \end{align*} $$ The next step is to multiply (5) by a test function $ v^{(w)}\in V^{(w)} $ and (6) by a $ v^{(T)}\in V^{(T)} $, integrate by parts and arrive at $$ \begin{align} \int_\Omega \mu \nabla w\cdot\nabla v^{(w)} \dx &= \int_\Omega \beta v^{(w)}\dx \quad\forall v^{(w)}\in V^{(w)}, \tag{33}\\ \int_\Omega \kappa \nabla T\cdot\nabla v^{(T)} \dx &= \int_\Omega \mu \nabla w\cdot\nabla w\, v^{(T)}\dx \quad\forall v^{(T)}\in V^{(T)} \tag{34} \tp \end{align} $$

The compound scalar variational formulation applies a test vector function $ \v = (v^{(w)}, v^{(T)}) $ and reads $$ \begin{equation} \int_\Omega (\mu\nabla w\cdot\nabla v^{(w)} + \kappa\nabla T\cdot\nabla v^{(T)})\dx = \int_\Omega (\beta v^{(w)} + \mu\nabla w\cdot\nabla w\, v^{(T)})\dx, \tag{35} \end{equation} $$ valid $ \forall \v\in\V = V^{(w)}\times V^{(T)} $.

The associated linear system is similar to (15)-(16) or (23)-(24), except that we need to distinguish between $ \baspsi_i^{(w)} $ and $ \baspsi_i^{(T)} $, and the range in the sums over $ j $ must match the number of degrees of freedom in the spaces $ V^{(w)} $ and $ V^{(T)} $. The formulas become $$ \begin{align} \sum_{j=0}^{N_w} A^{(w)}_{i,j} c^{(w)}_j &= b_i^{(w)},\quad i=0,\ldots,N_w, \tag{36}\\ \sum_{j=0}^{N_T} A^{(T)}_{i,j} c^{(T)}_j &= b_i^{(T)},\quad i=0,\ldots,N_T, \tag{37}\\ A^{(w)}_{i,j} &= \mu(\nabla \baspsi_j^{(w)},\baspsi_i^{(w)}), \tag{38}\\ b_i^{(w)} &= (\beta, \baspsi_i^{(w)}), \tag{39}\\ A^{(T)}_{i,j} &= \kappa(\nabla \baspsi_j^{(T)},\baspsi_i^{(T)}), \tag{40}\\ b_i^{(T)} &= \mu(\nabla w_{-}, \baspsi_i^{(T)}) \tp \tag{41} \end{align} $$

In the case we formulate one compound linear system involving both $ c^{(w)}_j $, $ j=0,\ldots,N_w $, and $ c^{(T)}_j $, $ j=0,\ldots,N_T $, (23)-(24) becomes $$ \begin{align} \sum_{j=0}^{N_w} A^{(w,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^{N_T} A^{(w,T)}_{i,j} c^{(T)}_j &= b_i^{(w)},\quad i=0,\ldots,N_w, \tag{42}\\ \sum_{j=0}^{N_w} A^{(T,w)}_{i,j} c^{(w)}_j + \sum_{j=0}^{N_T} A^{(T,T)}_{i,j} c^{(T)}_j &= b_i^{(T)},\quad i=0,\ldots,N_T, \tag{43}\\ A^{(w,w)}_{i,j} &= \mu(\nabla \baspsi_j^{(w)},\baspsi_i^{(w)}), \tag{44}\\ A^{(w,T)}_{i,j} &= 0, \tag{45}\\ b_i^{(w)} &= (\beta, \baspsi_i^{(w)}), \tag{46}\\ A^{(w,T)}_{i,j} &= \mu (\nabla w_{-}\cdot\nabla\baspsi_j^{(w)}), \baspsi_i^{(T)}), \tag{47}\\ A^{(T,T)}_{i,j} &= \kappa(\nabla \baspsi_j^{(T)},\baspsi_i^{(T)}), \tag{48}\\ b_i^{(T)} &= 0 \tp \tag{49} \end{align} $$ The corresponding block form $$ \left(\begin{array}{cc} \mu K^{(w)} & 0\\ L & \kappa K^{(T)} \end{array}\right) \left(\begin{array}{c} c^{(w)}\\ c^{(T)} \end{array}\right) = \left(\begin{array}{c} b^{(w)}\\ 0 \end{array}\right), $$ has square and rectangular block matrices: $ K^{(w)} $ is $ N_w\times N_w $, $ K^{(T)} $ is $ N_T\times N_T $, while $ L $ is $ N_T\times N_w $,

Computations in 1D

We can reduce the system (5)-(6) to one space dimension, which corresponds to flow in a channel between two flat plates. Alternatively, one may consider flow in a circular pipe, introduce cylindrical coordinates, and utilize the radial symmetry to reduce the equations to a one-dimensional problem in the radial coordinate. The former model becomes $$ \begin{align} \mu w_{xx} &= -\beta, \tag{50}\\ \kappa T_{xx} &= - \mu w_x^2, \tag{51} \end{align} $$ while the model in the radial coordinate $ r $ reads $$ \begin{align} \mu\frac{1}{r}\frac{d}{dr}\left( r\frac{dw}{dr}\right) &= -\beta, \tag{52}\\ \kappa \frac{1}{r}\frac{d}{dr}\left( r\frac{dT}{dr}\right) &= - \mu \left( \frac{dw}{dr}\right)^2 \tp \tag{53} \end{align} $$

The domain for (50)-(51) is $ \Omega = [0,H] $, with boundary conditions $ w(0)=w(H)=0 $ and $ T(0)=T(H)=T_0 $. For (52)-(53) the domain is $ [0,R] $ ($ R $ being the radius of the pipe) and the boundary conditions are $ du/dr = dT/dr =0 $ for $ r=0 $, $ u(R)=0 $, and $ T(R)=T_0 $.

Calculations to be continued...