Solving multiple nonlinear algebraic equations

So far in this chapter, we have considered a single nonlinear algebraic equation. However, systems of such equations arise in a number of applications, foremost nonlinear ordinary and partial differential equations. Of the previous algorithms, only Newton's method is suitable for extension to systems of nonlinear equations.

Abstract notation

Suppose we have $n$ nonlinear equations, written in the following abstract form: $\begin{align} F_0(x_0,x_1,\ldots,x_n) &= 0, \tag{6.6}\\ F_1(x_0,x_1,\ldots,x_n) &= 0, \tag{6.7}\\ \vdots &= \vdots \tag{6.8}\\ F_n(x_0,x_1,\ldots,x_n) &= 0\thinspace . \tag{6.9}\\ \tag{6.10} \end{align}$ It will be convenient to introduce a vector notation $\F = (F_0,\ldots,F_1),\quad \x = (x_0,\ldots,x_n)\thinspace .$ The system can now be written as $\F (\x) = \boldsymbol{0}$ .

As a specific example on the notation above, the system $\begin{align} x^2 &= y - x\cos(\pi x) \tag{6.11}\\ yx + e^{-y} &= x^{-1} \tag{6.12} \end{align}$ can be written in our abstract form by introducing $x_0=x$ and $x_1=y$ . Then $\begin{align*} F_0(x_0,x_1) &= x^2 - y + x\cos(\pi x) = 0,\\ F_1(x_0,x_1) &= yx + e^{-y} - x^{-1} = 0\thinspace . \end{align*}$

Taylor expansions for multi-variable functions

We follow the ideas of Newton's method for one equation in one variable: approximate the nonlinear $f$ by a linear function and find the root of that function. When $n$ variables are involved, we need to approximate a vector function $\F(\x)$ by some linear function $\tilde\F = \J\x + \c$ , where $\J$ is an $n\times n$ matrix and $\c$ is some vector of length $n$ .

The technique for approximating $\F$ by a linear function is to use the first two terms in a Taylor series expansion. Given the value of $\F$ and its partial derivatives with respect to $\x$ at some point $\x_i$ , we can approximate the value at some point $\x_{i+1}$ by the two first term in a Taylor series expansion around $\x_i$ : $\F(\x_{i+1}) \approx \F(\x_i) + \nabla\F(\x_i)(\x_{i+1}-\x_i)\thinspace .$ The next terms in the expansions are omitted here and of size $||\x_{i+1}-\x_i||^2$ , which are assumed to be small compared with the two terms above.

The expression $\nabla\F$ is the matrix of all the partial derivatives of $\F$ . Component $(i,j)$ in $\nabla\F$ is $\frac{\partial F_i}{\partial x_j}\thinspace .$ For example, in our $2\times 2$ system (6.11)-(6.12) we can use SymPy to compute the Jacobian:

>>> from sympy import *
>>> x0, x1 = symbols('x0 x1')
>>> F0 = x0**2 - x1 + x0*cos(pi*x0)
>>> F1 = x0*x1 + exp(-x1) - x0**(-1)
>>> diff(F0, x0)
-pi*x0*sin(pi*x0) + 2*x0 + cos(pi*x0)
>>> diff(F0, x1)
-1
>>> diff(F1, x0)
x1 + x0**(-2)
>>> diff(F1, x1)
x0 - exp(-x1)

We can then write $\nabla\F = \left(\begin{array}{ll} \frac{\partial F_0}{\partial x_0} & \frac{\partial F_0}{\partial x_1}\\ \frac{\partial F_1}{\partial x_0} & \frac{\partial F_1}{\partial x_1} \end{array}\right) = \left(\begin{array}{ll} 2x_0 + \cos(\pi x_0) - \pi x_0\sin(\pi x_0) & -1 \\ x_1 + x_0^{-2} & x_0 - e^{-x_1} \end{array}\right)$

The matrix $\nabla\F$ is called the Jacobian of $\F$ and often denoted by $\J$ .

Newton's method

The idea of Newton's method is that we have some approximation $\x_i$ to the root and seek a new (and hopefully better) approximation $\x_{i+1}$ by approximating $\F(\x_{i+1})$ by a linear function and solve the corresponding linear system of algebraic equations. We approximate the nonlinear problem $\F(\x_{i+1})=0$ by the linear problem $\begin{equation} \F(\x_i) + \J(\x_i)(\x_{i+1}-\x_i) = \boldsymbol{0}, \tag{6.13} \end{equation}$ where $\J(\x_i)$ is just another notation for $\nabla\F(\x_i)$ . The equation (6.13) is a linear system with coefficient matrix $\J$ and right-hand side vector $\F(\x_i)$ . We therefore write this system in the more familiar form $\J(\x_i)\boldsymbol{\delta} = -\F(\x_i),$ where we have introduce a symbol $\boldsymbol{\delta}$ for the unknown vector $\x_{i+1}-\x_i$ that multiplies the Jacobian $\J$ .

The $i$ -th iteration of Newton's method for systems of algebraic equations consists of two steps:

Solve the linear system $\J(\x_i)\boldsymbol{\delta} = -\F(\x_i)$ with respect to $\boldsymbol{\delta}$ .
Set $\x_{i+1} = \x_i + \boldsymbol{\delta}$ .

Solving systems of linear equations must make use of appropriate software. Gaussian elimination is the most common, and in general the most robust, method for this purpose. Python's numpy package has a module linalg that interfaces the well-known LAPACK package with high-quality and very well tested subroutines for linear algebra. The statement x = numpy.linalg.solve(A, b) solves a system

$Ax=b$ with a LAPACK method based on Gaussian elimination.

When nonlinear systems of algebraic equations arise from discretization of partial differential equations, the Jacobian is very often sparse, i.e., most of its elements are zero. In such cases it is important to use algorithms that can take advantage of the many zeros. Gaussian elimination is then a slow method, and (much) faster methods are based on iterative techniques.

Implementation

Here is a very simple implementation of Newton's method for systems of nonlinear algebraic equations:

import numpy as np

def Newton_system(F, J, x, eps):
    """
    Solve nonlinear system F=0 by Newton's method.
    J is the Jacobian of F. Both F and J must be functions of x.
    At input, x holds the start value. The iteration continues
    until ||F|| < eps.
    """
    F_value = F(x)
    F_norm = np.linalg.norm(F_value, ord=2)  # l2 norm of vector
    iteration_counter = 0
    while abs(F_norm) > eps and iteration_counter < 100:
        delta = np.linalg.solve(J(x), -F_value)
        x = x + delta
        F_value = F(x)
        F_norm = np.linalg.norm(F_value, ord=2)
        iteration_counter += 1

    # Here, either a solution is found, or too many iterations
    if abs(F_norm) > eps:
        iteration_counter = -1
    return x, iteration_counter

We can test the function Newton_system with the $2\times 2$ system (6.11)-(6.12):

def test_Newton_system1():
    from numpy import cos, sin, pi, exp

    def F(x):
        return np.array(
            [x[0]**2 - x[1] + x[0]*cos(pi*x[0]),
             x[0]*x[1] + exp(-x[1]) - x[0]**(-1)])

    def J(x):
        return np.array(
            [[2*x[0] + cos(pi*x[0]) - pi*x[0]*sin(pi*x[0]), -1],
             [x[1] + x[0]**(-2), x[0] - exp(-x[1])]])

    expected = np.array([1, 0])
    tol = 1e-4
    x, n = Newton_system(F, J, x=np.array([2, -1]), eps=0.0001)
    print n, x
    error_norm = np.linalg.norm(expected - x, ord=2)
    assert error_norm < tol, 'norm of error =%g' % error_norm
    print 'norm of error =%g' % error_norm

Here, the testing is based on the L2 norm of the error vector. Alternatively, we could test against the values of x that the algorithm finds, with appropriate tolerances. For example, as chosen for the error norm, if eps=0.0001, a tolerance of $10^{-4}$ can be used for x[0] and x[1].