.. !split

.. _decay:impl1:

Implementation          (1)
===========================

We want to make a computer program for solving

.. math::
        
        u^{\prime}(t) = -au(t),\quad t\in (0,T], \quad u(0)=I,
        

by finite difference methods. The program should also display
the numerical solution as a curve on the
screen, preferably together with the
exact solution.

.. index:: directory

.. index:: folder

All programs referred to in this section are found in the
`src/alg <http://tinyurl.com/ofkw6kc/alg>`__ directory (we use the classical
Unix term *directory* for what many others nowadays call *folder*).

**Mathematical problem.**
We want to explore the Forward Euler scheme, the
Backward Euler, and the Crank-Nicolson schemes applied to our model problem.
From an implementational point of view, it is advantageous to
implement the :math:`\theta`-rule

.. math::
        
        u^{n+1} = \frac{1 - (1-\theta) a\Delta t}{1 + \theta a\Delta t}u^n,
        

since it can generate the three other schemes by various
choices of :math:`\theta`: :math:`\theta=0` for Forward Euler, :math:`\theta =1` for
Backward Euler, and :math:`\theta =1/2` for Crank-Nicolson.
Given :math:`a`, :math:`u^0=I`, :math:`T`, and :math:`\Delta t`,
our task is to use the :math:`\theta`-rule to
compute :math:`u^1, u^2,\ldots,u^{N_t}`, where :math:`t_{N_t}=N_t\Delta t`, and
:math:`N_t` the closest integer to :math:`T/\Delta t`.

Computer language: Python
-------------------------

Any programming language can be used to generate the :math:`u^{n+1}` values from
the formula above. However, in this document we shall mainly make use of
Python. There are several good reasons for this choice:

  * Python has a very clean, readable syntax (often known as
    "executable pseudo-code").

  * Python code is very similar to MATLAB code (and MATLAB has a
    particularly widespread use for scientific computing).

  * Python is a full-fledged, very powerful programming language.

  * Python is similar to C++, but is much simpler to work with and
    results in more reliable code.

  * Python has a rich set of modules for scientific computing, and its
    popularity in scientific computing is rapidly growing.

  * Python was made for being combined with compiled languages
    (C, C++, Fortran), so that existing numerical software can be reused,
    and thereby easing high computational performance with new implementations.

  * Python has extensive support for administrative tasks
    needed when doing large-scale computational investigations.

  * Python has extensive support for graphics (visualization,
    user interfaces, web applications).

Learning Python is easy. Many newcomers to the language will probably
learn enough from the forthcoming examples to perform their own computer
experiments. The examples start with simple Python code and gradually
make use of more powerful constructs as we proceed. Unless it is
inconvenient for the problem at hand, our Python code is made as
close as possible to MATLAB code for easy transition between the two
languages.

The coming programming examples assumes familiarity with
variables, for loops, lists, arrays,
functions, positional arguments, and keyword (named) arguments.
A background in basic MATLAB programming is often enough to understand
Python examples.
Readers who feel the Python examples are too hard to follow will
benefit from reading a tutorial, e.g.,

  * `The Official Python Tutorial <http://docs.python.org/2/tutorial/>`__

  * `Python Tutorial on tutorialspoint.com <http://www.tutorialspoint.com/python/>`__

  * `Interactive Python tutorial site <http://www.learnpython.org/>`__

  * `A Beginner's Python Tutorial <http://en.wikibooks.org/wiki/A_Beginner's_Python_Tutorial>`__

The author also has a comprehensive book [Ref02]_ that teaches
scientific programming with Python from the ground up.

.. bumpy list of refs?

.. _decay:py1:

Making a solver function
------------------------

We choose to have an array ``u`` for storing the :math:`u^n` values, :math:`n=0,1,\ldots,N_t`.
The algorithmic steps are

 1. initialize :math:`u^0`

 2. for :math:`t=t_n`, :math:`n=1,2,\ldots,N_t`: compute :math:`u_n` using
    the :math:`\theta`-rule formula

An implementation of a numerical algorithm is often referred to as
a *solver*. We shall now make a solver for our model problem and
realize the solver as a Python function. The function must take
the input data :math:`I`, :math:`a`, :math:`T`, :math:`\Delta t`, and :math:`\theta` of the problem
as arguments and return the solution as arrays ``u`` and ``t`` for
:math:`u^n` and :math:`t^n`, :math:`n=0,\ldots,N_t`. The solver function used as

.. code-block:: python

        u, t = solver(I, a, T, dt, theta)

One can now easily plot ``u`` versus ``t`` to visualize the solution.

The function ``solver`` may look as follows in Python:

.. code-block:: python

        from numpy import *
        
        def solver(I, a, T, dt, theta):
            """Solve u'=-a*u, u(0)=I, for t in (0,T] with steps of dt."""
            Nt = int(T/dt)            # no of time intervals
            T = Nt*dt                 # adjust T to fit time step dt
            u = zeros(Nt+1)           # array of u[n] values
            t = linspace(0, T, Nt+1)  # time mesh
        
            u[0] = I                  # assign initial condition
            for n in range(0, Nt):    # n=0,1,...,Nt-1
                u[n+1] = (1 - (1-theta)*a*dt)/(1 + theta*dt*a)*u[n]
            return u, t

The ``numpy`` library contains a lot of functions for array computing. Most
of the function names are similar to what is found
in the alternative scientific computing language MATLAB. Here
we make use of

 * ``zeros(Nt+1)`` for creating an array of size ``Nt+1``
   and initializing the elements to zero

 * ``linspace(0, T, Nt+1)`` for creating an array with ``Nt+1``
   coordinates uniformly distributed between ``0`` and ``T``

The ``for`` loop deserves a comment, especially for newcomers to Python.
The construction ``range(0, Nt, s)`` generates all integers from ``0`` to ``Nt``
in steps of ``s``, *but not including* ``Nt``. Omitting ``s`` means ``s=1``.
For example, ``range(0, 6, 3)``
gives ``0`` and ``3``, while ``range(0, 6)`` generates
the list ``[0, 1, 2, 3, 4, 5]``.
Our loop implies the following assignments to ``u[n+1]``: ``u[1]``, ``u[2]``, ...,
``u[Nt]``, which is what we want since ``u`` has length ``Nt+1``.
The first index in Python arrays or lists is *always* ``0`` and the
last is then ``len(u)-1`` (the length of an array ``u`` is obtained by
``len(u)`` or ``u.size``).

.. _decay:py2:

Integer division
----------------

The shown implementation of the ``solver`` may face problems and
wrong results if ``T``, ``a``, ``dt``, and ``theta`` are given as integers
(see :ref:`decay:exer:intdiv` and :ref:`decay:exer:decay1err`).
The problem is related to *integer division* in Python (as
in Fortran, C, C++, and many other computer languages!): ``1/2`` becomes ``0``,
while ``1.0/2``, ``1/2.0``, or ``1.0/2.0`` all become ``0.5``. So, it is enough
that at least the nominator or the denominator is a real number
(i.e., a ``float`` object)
to ensure a correct mathematical division. Inserting
a conversion ``dt = float(dt)``
guarantees that ``dt`` is
``float``.

Another problem with computing :math:`N_t=T/\Delta t` is that we should
round :math:`N_t` to the nearest integer. With ``Nt = int(T/dt)`` the ``int``
operation picks the largest integer smaller than ``T/dt``. Correct
mathematical rounding as known from school is obtained by

.. code-block:: python

        Nt = int(round(T/dt))

The complete version of our improved, safer ``solver`` function then becomes

.. code-block:: python

        from numpy import *
        
        def solver(I, a, T, dt, theta):
            """Solve u'=-a*u, u(0)=I, for t in (0,T] with steps of dt."""
            dt = float(dt)            # avoid integer division
            Nt = int(round(T/dt))     # no of time intervals
            T = Nt*dt                 # adjust T to fit time step dt
            u = zeros(Nt+1)           # array of u[n] values
            t = linspace(0, T, Nt+1)  # time mesh
        
            u[0] = I                  # assign initial condition
            for n in range(0, Nt):    # n=0,1,...,Nt-1
                u[n+1] = (1 - (1-theta)*a*dt)/(1 + theta*dt*a)*u[n]
            return u, t

Doc strings
-----------

.. index:: doc strings

Right below the header line in the ``solver`` function there is a
Python string enclosed in triple double quotes ``"""``.
The purpose of this string object is to document what the function
does and what the arguments are. In this case the necessary
documentation does not span more than one line, but with triple double
quoted strings the text may span several lines:

.. code-block:: python

        def solver(I, a, T, dt, theta):
            """
            Solve
        
                u'(t) = -a*u(t),
        
            with initial condition u(0)=I, for t in the time interval
            (0,T]. The time interval is divided into time steps of
            length dt.
        
            theta=1 corresponds to the Backward Euler scheme, theta=0
            to the Forward Euler scheme, and theta=0.5 to the Crank-
            Nicolson method.
            """
            ...

Such documentation strings appearing right after the header of
a function are called *doc strings*. There are tools that can automatically
produce nicely formatted documentation by extracting the definition of
functions and the contents of doc strings.

It is strongly recommended to equip any function with a doc string,
unless the purpose of the function
is not obvious. Nevertheless, the forthcoming
text deviates from this rule if the function is explained in the text.

Formatting numbers
------------------

Having computed the discrete solution ``u``, it is natural to look at
the numbers:

.. code-block:: python

        # Write out a table of t and u values:
        for i in range(len(t)):
            print t[i], u[i]

This compact ``print`` statement unfortunately gives less readable output
because the ``t`` and ``u`` values are not aligned in nicely formatted columns.
To fix this problem, we recommend to use the *printf format*, supported in most
programming languages inherited from C. Another choice is
Python's recent *format string syntax*. Both kinds of syntax are illustrated
below.

.. index:: printf format

Writing ``t[i]`` and ``u[i]`` in two nicely formatted columns is done like
this with the printf format:

.. code-block:: python

        print 't=%6.3f u=%g' % (t[i], u[i])

The percentage signs signify "slots" in the text where the variables
listed at the end of the statement are inserted. For each "slot" one
must specify a format for how the variable is going to appear in the
string: ``f`` for float (with 6 decimals),
``s`` for pure text, ``d`` for an integer, ``g`` for a real number
written as compactly as possible, ``9.3E`` for scientific notation with
three decimals in a field of width 9 characters (e.g., ``-1.351E-2``),
or ``.2f`` for standard decimal notation with two decimals
formatted with minimum width. The printf syntax provides a quick way
of formatting tabular output of numbers with full control of the
layout.

.. index:: format string syntax (Python)

The alternative *format string syntax* looks like

.. code-block:: python

        print 't={t:6.3f} u={u:g}'.format(t=t[i], u=u[i])

As seen, this format allows logical names in the "slots" where
``t[i]`` and ``u[i]`` are to be inserted. The "slots" are surrounded
by curly braces, and the logical name is followed by a colon and
then the printf-like specification of how to format real numbers,
integers, or strings.

Running the program
-------------------

The function and main program shown above must be placed in a file,
say with name `decay_v1.py <http://tinyurl.com/ofkw6kc/alg/decay_v1.py>`__ (``v1`` for 1st version of this program).  Make sure you
write the code with a suitable text editor (Gedit, Emacs, Vim,
Notepad++, or similar).  The program is run by executing the file this
way:

.. code-block:: text

        Terminal> python decay_v1.py

The text ``Terminal>`` just indicates a prompt in a
Unix/Linux or DOS terminal window. After this prompt, which may look
different in your terminal window (depending on the terminal application
and how it is set up), commands like ``python decay_v1.py`` can be issued.
These commands are interpreted by the operating system.

We strongly recommend to run Python programs within the IPython shell.
First start IPython by typing ``ipython`` in the terminal window.
Inside the IPython shell, our program ``decay_v1.py`` is run by the command
``run decay_v1.py``:

.. code-block:: text

        Terminal> ipython
        
        In [1]: run decay_v1.py
        t= 0.000 u=1
        t= 0.800 u=0.384615
        t= 1.600 u=0.147929
        t= 2.400 u=0.0568958
        t= 3.200 u=0.021883
        t= 4.000 u=0.00841653
        t= 4.800 u=0.00323713
        t= 5.600 u=0.00124505
        t= 6.400 u=0.000478865
        t= 7.200 u=0.000184179
        t= 8.000 u=7.0838e-05

The advantage of running programs in IPython are many, but here
we explicitly mention a few of the most
useful features:

 * previous commands are easily recalled with the up arrow,

 * ``%pdb`` turns on a debugger so that variables can be examined if the program
   aborts (due to a Python exception),

 * output of commands are stored in variables,

 * the computing time spent on a set of statements can be measured with
   the ``%timeit`` command,

 * any operating system command can be executed,

 * modules can be loaded automatically and other customizations can
   be performed when starting IPython

Although running programs in IPython is strongly recommended, most
execution examples in the forthcoming text use the standard
Python shell with prompt ``>>>`` and run programs through
a typesetting like

.. code-block:: text

        Terminal> python programname

The reason is that such typesetting
makes the text more compact in the vertical direction
than showing sessions with IPython syntax.

.. index:: plotting curves

.. index:: visualizing curves

Plotting the solution
---------------------

Having the ``t`` and ``u`` arrays, the approximate solution ``u`` is visualized
by the intuitive command ``plot(t, u)``:

.. code-block:: python

        from matplotlib.pyplot import *
        plot(t, u)
        show()

It will be illustrative to also plot the exact solution
:math:`{u_{\small\mbox{e}}}(t)=Ie^{-at}` for comparison. We first
need to make a Python function for computing the exact solution:

.. code-block:: python

        def u_exact(t, I, a):
            return I*exp(-a*t)

It is tempting to just do

.. code-block:: python

        u_e = u_exact(t, I, a)
        plot(t, u, t, u_e)

However, this is not exactly what we want: the ``plot`` function draws
straight lines between the discrete points ``(t[n], u_e[n])`` while
:math:`{u_{\small\mbox{e}}}(t)` varies as an exponential function between the mesh points.
The technique for showing the "exact" variation of :math:`{u_{\small\mbox{e}}}(t)` between
the mesh points is to introduce a very fine mesh for :math:`{u_{\small\mbox{e}}}(t)`:

.. code-block:: python

        t_e = linspace(0, T, 1001)      # fine mesh
        u_e = u_exact(t_e, I, a)

We can also plot the curves with different colors and styles, e.g.,

.. code-block:: python

        plot(t_e, u_e, 'b-',         # blue line for u_e
             t,   u,   'r--o')       # red dashes w/circles

With more than one curve in the plot we need to associate each curve
with a legend. We also want appropriate names on the axes, a title,
and a file containing the plot as an image for inclusion in reports.
The Matplotlib package (``matplotlib.pyplot``) contains functions for
this purpose. The names of the functions are similar to the plotting
functions known from MATLAB.  A complete function for creating
the comparison plot becomes

.. code-block:: python

        from matplotlib.pyplot import *
        
        def plot_numerical_and_exact(theta, I, a, T, dt):
            """Compare the numerical and exact solution in a plot."""
            u, t = solver(I=I, a=a, T=T, dt=dt, theta=theta)
        
            t_e = linspace(0, T, 1001)        # fine mesh for u_e
            u_e = u_exact(t_e, I, a)
        
            plot(t,   u,   'r--o',            # red dashes w/circles
                 t_e, u_e, 'b-')              # blue line for exact sol.
            legend(['numerical', 'exact'])
            xlabel('t')
            ylabel('u')
            title('theta=%g, dt=%g' % (theta, dt))
            savefig('plot_%s_%g.png' % (theta, dt))
        
        plot_numerical_and_exact(I=1, a=2, T=8, dt=0.8, theta=1)
        show()

Note that ``savefig`` here creates a PNG file whose name includes the
values of :math:`\theta` and :math:`\Delta t` so that we can easily distinguish
files from different runs with :math:`\theta` and :math:`\Delta t`.

The complete code is found in the file
`decay_v2.py <http://tinyurl.com/ofkw6kc/alg/decay_v2.py>`__. The resulting plot
is shown in Figure :ref:`decay:fig:v2`. As seen, there is quite some
discrepancy between the exact and the numerical solution.
Fortunately, the numerical solution approaches the exact one as
:math:`\Delta t` is reduced.

.. _decay:fig:v2:

.. figure:: decay_v2.png
   :width: 500

   *Comparison of numerical and exact solution*

Verifying the implementation
----------------------------

It is easy to make mistakes while deriving and implementing numerical
algorithms, so we should never believe in the solution before it has
been thoroughly verified.


.. admonition:: Verification and validation

   The purpose of *verifying* a program is to bring evidence for the
   property that there are no errors in the implementation. A related
   term, *validate* (and *validation*),
   addresses the question if the ODE model is a good
   representation of the phenomena we want to simulate. To remember the
   difference between verification and validation, verification is
   about *solving the equations right*, while validation is about *solving
   the right equations*. We must always perform a verification before
   it is meaningful to believe in the computations and perform validation
   (which compares the program results with physical experiments or observations).


The most obvious idea for verification
in our case is to compare the numerical solution with the exact
solution, when that exists. This is, however, not a particularly good
method. The reason is that there will always
be a discrepancy
between these two solutions, due to numerical
approximations, and we cannot precisely quantify the approximation
errors. The open question is therefore whether we have the
mathematically correct
discrepancy or if we have another, maybe small,
discrepancy due to both an approximation error *and* an error in the
implementation. It is thus
impossible to judge whether the program is correct or not by
just looking at the graphs in Figure :ref:`decay:fig:v2`.

To avoid
mixing the unavoidable numerical approximation errors and the
undesired implementation errors, we should try to make tests where
we have some exact
computation of the discrete solution or at least parts of it.
Examples will show how this can be done.

Running a few algorithmic steps by hand
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The simplest approach to produce a correct non-trivial reference
solution for the discrete solution :math:`u`, is to compute a few steps of
the algorithm by hand. Then we can compare the hand calculations with
numbers produced by the program.

A straightforward approach is to use a calculator and
compute :math:`u^1`, :math:`u^2`, and :math:`u^3`. With :math:`I=0.1`, :math:`\theta=0.8`,
and :math:`\Delta t =0.8` we get

.. math::
         A\equiv \frac{1 - (1-\theta) a\Delta t}{1 + \theta a \Delta t} = 0.298245614035

.. math::
        
        u^1 &= AI=0.0298245614035,\\ 
        u^2 &= Au^1= 0.00889504462912,\\ 
        u^3 &=Au^2= 0.00265290804728
        

.. index:: test function

.. index:: unit testing

Comparison of these manual calculations with the result of the
``solver`` function is carried out in the function

.. code-block:: python

        def test_solver_three_steps():
            """Compare three steps with known manual computations."""
            theta = 0.8; a = 2; I = 0.1; dt = 0.8
            u_by_hand = array([I,
                               0.0298245614035,
                               0.00889504462912,
                               0.00265290804728])
        
            Nt = 3  # number of time steps
            u, t = solver(I=I, a=a, T=Nt*dt, dt=dt, theta=theta)
        
            tol = 1E-15  # tolerance for comparing floats
            diff = abs(u - u_by_hand).max()
            success = diff < tol
            assert success

The ``test_solver_three_steps`` function follows widely used conventions
for *unit testing*. By following such conventions we can at a later
stage easily execute a big test suite for our software. That is, after
a small modification is made to the program, we can by typing just
a short command, run through a large number of tests to check that the
modifications do not break any computations.
The conventions boil down to three rules:

 * The test function name must start with ``test_`` and the function
   cannot take any arguments.

 * The test must end up in a boolean expression that is ``True`` if
   the test was passed and ``False`` if it failed.

 * The function must run ``assert`` on the boolean expression, resulting
   in program abortion (due to an ``AssertionError`` exception) if
   the test failed.

A typical ``assert`` statement is to check that a computed result ``c``
equals the expected value ``e``: ``assert c == e``. However, since real
numbers are stored in a computer using only 64 units, most numbers
will feature a small rounding error, typically of size :math:`10^{-16}`.
That is, real numbers on a computer have finite precision. When doing
arithmetics with finite precision numbers, the rounding errors may
accumulate or not, depending on the algorithm.  It does not make sense
to test ``c == e``, since a small rounding error will cause the test to
fail. Instead, we use an equality with *tolerance* ``tol``: ``abs(e - c)
< tol``. The ``test_solver_three_steps`` functions applies this type of
test with a tolerance :math:`01^{-15}`.

The main program can routinely run the verification test prior to
solving the real problem:

.. code-block:: python

        test_solver_three_steps()
        plot_numerical_and_exact(I=1, a=2, T=8, dt=0.8, theta=1)
        show()

(Rather than calling ``test_*()`` functions explicitly, one will
normally ask a testing framework like nose
or pytest to find and run such functions.)
The complete program including the verification above is
found in the file `decay_v3.py <http://tinyurl.com/ofkw6kc/alg/decay_v3.py>`__.

.. _decay:computing:error:

Computing the numerical error as a mesh function
------------------------------------------------

Now that we have some evidence for a correct implementation, we are in
position to compare the computed :math:`u^n` values in the ``u`` array with
the exact :math:`u` values at the mesh points, in order to study the error
in the numerical solution.

.. index:: representative (mesh function)

A natural way to compare the exact and discrete solutions is to
calculate their difference as a mesh function for the error:

.. _Eq:_auto8:

.. math::

    \tag{46}
    e^n = {u_{\small\mbox{e}}}(t_n) - u^n,\quad n=0,1,\ldots,N_t {\thinspace .}
        
        
We may view the mesh function
:math:`{u_{\small\mbox{e}}}^n = {u_{\small\mbox{e}}}(t_n)` as a representation of the continuous function :math:`{u_{\small\mbox{e}}}(t)`
defined for all :math:`t\in [0,T]`. In fact,
:math:`{u_{\small\mbox{e}}}^n` is often called the *representative* of
:math:`{u_{\small\mbox{e}}}` on the mesh. Then, :math:`e^n = {u_{\small\mbox{e}}}^n - u^n` is clearly
the difference of two mesh functions.

The error mesh function :math:`e^n` can be computed by

.. code-block:: python

        u, t = solver(I, a, T, dt, theta)  # Numerical sol.
        u_e = u_exact(t, I, a)             # Representative of exact sol.
        e = u_e - u

Note that the mesh functions ``u`` and ``u_e`` are represented by arrays
and associated with the points in the array ``t``.

.. index:: array arithmetics

.. index:: array computing

.. index:: vectorization


.. admonition:: Array arithmetics

   The last statements
   
   .. code-block:: python
   
           u_e = u_exact(t, I, a)
           e = u_e - u
   
   demonstrate some standard examples of array arithmetics: ``t`` is an
   array of mesh points that we pass to ``u_exact``. This function
   evaluates ``-a*t``, which is a scalar times an array, meaning that
   the scalar is multiplied with each array element.
   The result is an array, let us call it ``tmp1``. Then
   ``exp(tmp1)`` means applying the exponential function to each element in
   ``tmp1``, giving an array, say ``tmp2``. Finally, ``I*tmp2`` is computed
   (scalar times array) and ``u_e`` refers to this array returned from
   ``u_exact``. The expression ``u_e - u`` is the difference between
   two arrays, resulting in a new array referred to by ``e``.
   
   Replacement of array element computations inside a loop by array
   arithmetics is known as *vectorization*.


.. _decay:computing:error:norm:

Computing the norm of the error mesh function
---------------------------------------------

.. index:: continuous function norms

.. index::
   single: norm; continuous

Instead of working with the error :math:`e^n` on the entire mesh, we
often want a single number expressing the size of the error.
This is obtained by taking the norm of the error function.

Let us first define norms of a function :math:`f(t)`
defined for all :math:`t\in [0,T]`.
Three common norms are

.. _Eq:decay:norms:L2:

.. math::

    \tag{47}
    ||f||_{L^2} = \left( \int_0^T f(t)^2 dt\right)^{1/2},
        
        
.. _Eq:decay:norms:L1:

.. math::

    \tag{48}
    ||f||_{L^1} = \int_0^T |f(t)| dt,
        
        
.. _Eq:decay:norms:Linf:

.. math::

    \tag{49}
    ||f||_{L^\infty} = \max_{t\in [0,T]}|f(t)|{\thinspace .}
        
        
The :math:`L^2` norm :ref:`(47) <Eq:decay:norms:L2>` ("L-two norm")
has nice mathematical properties and
is the most popular norm. It is a generalization
of the well-known Eucledian norm of vectors to functions.
The :math:`L^1` norm looks simpler and more intuitive, but has less
nice mathematical properties compared to the two other norms, so
it is much less used in computations.
The :math:`L^\infty` is also called the max norm or the supremum norm
and is widely used. It focuses on a single point with the largest
value of :math:`|f|`, while the other norms measure average behavior of
the function.

In fact, there is a whole family of norms,

.. _Eq:_auto9:

.. math::

    \tag{50}
    ||f||_{L^p} = \left(\int_0^T f(t)^pdt\right)^{1/p},
        
        
with :math:`p` real. In particular,
:math:`p=1` corresponds to the :math:`L^1` norm above while :math:`p=\infty` is the
:math:`L^\infty` norm.

.. index:: discrete function norms

.. index:: mesh function norms

.. index::
   single: norm; discrete (mesh function)

Numerical computations involving mesh functions need corresponding norms.
Given a set of function values, :math:`f^n`, and some associated mesh points, :math:`t_n`,
a numerical integration rule can be used to calculate the :math:`L^2` and
:math:`L^1` norms defined above. Imagining that the mesh function is extended
to vary linearly between the mesh points, the Trapezoidal rule is
in fact an exact integration rule. A possible modification of the :math:`L^2`
norm for a mesh function :math:`f^n` on a uniform mesh with spacing :math:`\Delta t`
is therefore the well-known Trapezoidal integration formula

.. math::
         ||f^n|| = \left(\Delta t\left(\frac{1}{2}(f^0)^2 + \frac{1}{2}(f^{N_t})^2
        + \sum_{n=1}^{N_t-1} (f^n)^2\right)\right)^{1/2} 

A common approximation of this expression, motivated by the
convenience of having a simpler formula, is

.. math::
         ||f^n||_{\ell^2} = \left(\Delta t\sum_{n=0}^{N_t} (f^n)^2\right)^{1/2} {\thinspace .}

This is called the discrete :math:`L^2` norm and denoted by :math:`\ell^2`.
If :math:`||f||_{\ell^2}^2` (i.e., the square of the norm) is used
instead of the Trapezoidal integration formula,
the error
is :math:`\Delta t((f^0)^2 + (f^{N_t})^2)/2`. This means that the
weights at the end points of the mesh function are perturbed,
but as :math:`\Delta t\rightarrow 0`, the error from this perturbation goes
to zero. As long as we are consistent and
stick to one kind of integration
rule for the norm of a mesh function, the details and accuracy of this
rule is of no concern.

The three discrete norms for a mesh function :math:`f^n`, corresponding to
the :math:`L^2`, :math:`L^1`, and :math:`L^\infty` norms of :math:`f(t)` defined above, are
defined by

.. _Eq:decay:norms:l2:

.. math::

    \tag{51}
    ||f^n||_{\ell^2} = \left( \Delta t\sum_{n=0}^{N_t} (f^n)^2\right)^{1/2},
        
        
.. _Eq:decay:norms:l1:

.. math::

    \tag{52}
    ||f^n||_{\ell^1} = \Delta t\sum_{n=0}^{N_t} |f^n|,
        
        
.. _Eq:decay:norms:linf:

.. math::

    \tag{53}
    ||f^n||_{\ell^\infty} = \max_{0\leq n\leq N_t}|f^n|{\thinspace .}
        
        
Note that the :math:`L^2`, :math:`L^1`, :math:`\ell^2`, and :math:`\ell^1` norms depend on the
length of the interval of interest (think of :math:`f=1`, then the
norms are proportional to :math:`\sqrt{T}` or :math:`T`). In some applications it
is convenient to think of a mesh function as just a vector of function
values without any relation to the interval :math:`[0,T]`.
Then one can replace :math:`\Delta t` by :math:`T/N_t` and simply drop :math:`T` (which
is just a common scaling factor in the norm,
independent of the vector of function
values). Moreover, people prefer
to divide by the total length of the vector, :math:`N_t+1`, instead of :math:`N_t`.
This reasoning gives rise to the *vector norms* for a vector
:math:`f=(f_0,\ldots,f_{N})`:

.. _Eq:decay:norms:vl2:

.. math::

    \tag{54}
    ||f||_2 = \left( \frac{1}{N+1}\sum_{n=0}^{N} (f_n)^2\right)^{1/2},
        
        
.. _Eq:decay:norms:vl1:

.. math::

    \tag{55}
    ||f||_1 = \frac{1}{N+1}\sum_{n=0}^{N} |f_n|,
        
        
.. _Eq:decay:norms:vlinf:

.. math::

    \tag{56}
    ||f||_{\ell^\infty} = \max_{0\leq n\leq N}|f_n|{\thinspace .}
        
        
Here we have used the common vector component notation with subscripts
(:math:`f_n`) and :math:`N` as length. We will mostly work with mesh functions
and use the discrete :math:`\ell^2`
norm :ref:`(51) <Eq:decay:norms:l2>` or the max norm :math:`\ell^\infty`
:ref:`(53) <Eq:decay:norms:linf>`, but the corresponding vector norms
:ref:`(54) <Eq:decay:norms:vl2>`-:ref:`(56) <Eq:decay:norms:vlinf>` are also much used
in numerical computations, so it is important to know the different
norms and the relations between them.

.. index::
   single: error; norms

A single number that expresses the size of the numerical error
will be taken as :math:`||e^n||_{\ell^2}` and called :math:`E`:

.. _Eq:decay:E:

.. math::

    \tag{57}
    E = \sqrt{\Delta t\sum_{n=0}^{N_t} (e^n)^2}
        
        
The corresponding Python code, using array arithmetics, reads

.. code-block:: python

        E = sqrt(dt*sum(e**2))

The ``sum`` function comes from ``numpy`` and computes the sum of the elements
of an array. Also the ``sqrt`` function is from ``numpy`` and computes the
square root of each element in the array argument.

.. index:: scalar computing

Scalar computing
~~~~~~~~~~~~~~~~

Instead of doing array computing ``sqrt(dt*sum(e**2))`` we can compute with
one element at a time:

.. code-block:: python

        m = len(u)     # length of u array (alt: u.size)
        u_e = zeros(m)
        t = 0
        for i in range(m):
            u_e[i] = u_exact(t, a, I)
            t = t + dt
        e = zeros(m)
        for i in range(m):
            e[i] = u_e[i] - u[i]
        s = 0  # summation variable
        for i in range(m):
            s = s + e[i]**2
        error = sqrt(dt*s)

Such element-wise computing, often called *scalar* computing, takes
more code, is less readable, and runs much slower than what we
can achieve with array computing.

Experiments with computing and plotting
---------------------------------------

Let us write down a new function that wraps up the computation and all
the plotting statements used for comparing the exact and numerical
solutions. This function can be called with various :math:`\theta` and
:math:`\Delta t` values to see how the error depends on the method and mesh
resolution.

.. code-block:: python

        def explore(I, a, T, dt, theta=0.5, makeplot=True):
            """
            Run a case with the solver, compute error measure,
            and plot the numerical and exact solutions (if makeplot=True).
            """
            u, t = solver(I, a, T, dt, theta)    # Numerical solution
            u_e = u_exact(t, I, a)
            e = u_e - u
            E = sqrt(dt*sum(e**2))
            if makeplot:
                figure()                         # create new plot
                t_e = linspace(0, T, 1001)       # fine mesh for u_e
                u_e = u_exact(t_e, I, a)
                plot(t,   u,   'r--o')           # red dashes w/circles
                plot(t_e, u_e, 'b-')             # blue line for exact sol.
                legend(['numerical', 'exact'])
                xlabel('t')
                ylabel('u')
                title('theta=%g, dt=%g' % (theta, dt))
                theta2name = {0: 'FE', 1: 'BE', 0.5: 'CN'}
                savefig('%s_%g.png' % (theta2name[theta], dt))
                savefig('%s_%g.pdf' % (theta2name[theta], dt))
                show()
            return E

The ``figure()`` call is key: without it, a new ``plot`` command will
draw the new pair of curves in the same plot window, while we want
the different pairs to appear in separate windows and files.
Calling ``figure()`` ensures this.

Instead of including the :math:`\theta` value in the filename to implicitly
inform about the applied method, the code utilizes a little Python
dictionary that maps each relevant :math:`\theta` value to a corresponding
acronym for the method name (FE, BE, or CN):

.. code-block:: python

        theta2name = {0: 'FE', 1: 'BE', 0.5: 'CN'}
        savefig('%s_%g.png' % (theta2name[theta], dt))

.. index:: PNG plot

.. index:: PDF plot

.. index:: EPS plot

.. index:: viewing graphics files

The ``explore`` function stores the plot in two different image file formats:
PNG and PDF. The PNG format is suitable for
being included in HTML documents, while the PDF format provides
higher quality for LaTeX (i.e., pdfLaTeX) documents.
Frequently used viewers for these
image files on Unix systems are ``gv`` (comes with Ghostscript)
for the PDF format and
``display`` (from the ImageMagick software suite) for PNG files:

.. code-block:: text

        Terminal> gv BE_0.5.pdf
        Terminal> display BE_0.5.png

A main program may run a loop over the three methods (given by
their corresponding :math:`\theta` values)
and call ``explore`` to compute errors and make plots:

.. code-block:: python

        def main(I, a, T, dt_values, theta_values=(0, 0.5, 1)):
            print 'theta   dt       error'  # Column headings in table
            for theta in theta_values:
                for dt in dt_values:
                    E = explore(I, a, T, dt, theta, makeplot=True)
                    print '%4.1f %6.2f: %12.3E' % (theta, dt, E)
        
        main(I=1, a=2, T=5, dt_values=[0.4, 0.04])

The file `decay_plot_mpl.py <http://tinyurl.com/ofkw6kc/alg/decay_plot_mpl.py>`__
contains the complete code with the functions above.
Running this program results in

.. code-block:: text

        Terminal> python decay_plot_mpl.py
        theta   dt       error
         0.0   0.40:    2.105E-01
         0.0   0.04:    1.449E-02
         0.5   0.40:    3.362E-02
         0.5   0.04:    1.887E-04
         1.0   0.40:    1.030E-01
         1.0   0.04:    1.382E-02

We observe that reducing :math:`\Delta t` by a factor of 10 increases the
accuracy for all three methods. We also see that
the combination of :math:`\theta=0.5` and a small time step :math:`\Delta t =0.04`
gives a much more accurate solution, and that :math:`\theta=0` and :math:`\theta=1`
with :math:`\Delta t = 0.4` result in the least accurate solutions.

Figure :ref:`decay:fig:FE1` demonstrates that the numerical solution
produced by the Forward Euler method with
:math:`\Delta t=0.4` clearly lies below the exact curve, but that the
accuracy improves considerably by reducing the time step by a factor
of 10.

.. _decay:fig:FE1:

.. figure:: FE1.png
   :width: 600

   *The Forward Euler scheme for two values of the time step*

The behavior of the two other schemes is shown in Figures :ref:`decay:fig:BE1`
and :ref:`decay:fig:CN1`. Crank-Nicolson is obviously the most accurate
scheme from this visual point of view.

.. _decay:fig:BE1:

.. figure:: BE1.png
   :width: 600

   *The Backward Euler scheme for two values of the time step*

.. _decay:fig:CN1:

.. figure:: CN1.png
   :width: 600

   *The Crank-Nicolson scheme for two values of the time step*

.. index:: cropping images

.. index:: montage program

Combining plot files          (1)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mounting two PNG files beside each other, as done in Figures
:ref:`decay:fig:FE1`-:ref:`decay:fig:CN1`, is easily carried out by the
`montage <http://www.imagemagick.org/script/montage.php>`__ program
from the ImageMagick suite:

.. code-block:: text

        Terminal> montage -background white -geometry 100% -tile 2x1 \ 
                  FE_0.4.png FE_0.04.png FE1.png
        Terminal> convert -trim FE1.png FE1.png

The ``-geometry`` argument is used to specify the size of the image. Here,
we preserve the individual sizes of the images. The ``-tile HxV`` option
specifies ``H`` images in the horizontal direction and ``V`` images in
the vertical direction. A series of image files to be combined are then listed,
with the name of the resulting combined image, here ``FE1.png`` at the end.
The ``convert -trim`` command removes surrounding white areas in the figure
(an operation usually known as *cropping* in image manipulation programs).

.. index:: pdftk program

.. index:: pdfnup program

.. index:: pdfcrop program

For LaTeX reports it is not recommended to use ``montage`` and PNG files
as the result has too low resolution. Instead, plots should be made
in the PDF format and combined using the ``pdftk``, ``pdfnup``, and ``pdfcrop`` tools
(on Linux/Unix):

.. code-block:: text

        Terminal> pdftk FE_0.4.png FE_0.04.png output tmp.pdf
        Terminal> pdfnup --nup 2x1 --outfile tmp.pdf tmp.pdf
        Terminal> pdfcrop tmp.pdf FE1.png  # output in FE1.png

Here, ``pdftk`` combines images into a multi-page PDF file, ``pdfnup``
combines the images in individual pages to a table of images (pages),
and ``pdfcrop`` removes white margins in the resulting combined image file.

Plotting with SciTools
~~~~~~~~~~~~~~~~~~~~~~

The `SciTools package <https://github.com/hplgit/scitools>`__ provides a
unified plotting interface, called Easyviz, to many different plotting
packages, including Matplotlib, Gnuplot, Grace, MATLAB,
VTK, OpenDX, and VisIt. The syntax is very similar to that of
Matplotlib and MATLAB. In fact, the plotting commands shown above look
the same in SciTool's Easyviz interface, apart from the import
statement, which reads

.. code-block:: python

        from scitools.std import *

This statement performs a ``from numpy import *`` as well as an import
of the most common pieces of the Easyviz (``scitools.easyviz``) package,
along with some additional numerical functionality.

With Easyviz one can
merge several plotting commands into a single one
using keyword arguments:

.. code-block:: python

        plot(t,   u,   'r--o',           # red dashes w/circles
             t_e, u_e, 'b-',             # blue line for exact sol.
             legend=['numerical', 'exact'],
             xlabel='t',
             ylabel='u',
             title='theta=%g, dt=%g' % (theta, dt),
             savefig='%s_%g.png' % (theta2name[theta], dt),
             show=True)

The `decay_plot_st.py <http://tinyurl.com/ofkw6kc/alg/decay_plot_st.py>`__ file
contains such a demo.

By default, Easyviz employs Matplotlib for plotting, but `Gnuplot <http://www.gnuplot.info/>`__ and `Grace <http://plasma-gate.weizmann.ac.il/Grace/>`__ are viable alternatives:

.. code-block:: text

        Terminal> python decay_plot_st.py --SCITOOLS_easyviz_backend gnuplot
        Terminal> python decay_plot_st.py --SCITOOLS_easyviz_backend grace

The actual tool used for creating plots (called *backend*)
and numerous other options
can be permanently set in SciTool's configuration file.

All the Gnuplot windows are launched without any need to kill one before
the next one pops up (as is the case with Matplotlib) and one can
press the key 'q' anywhere in a plot window to kill it.
Another advantage of Gnuplot is the automatic choice of sensible
and distinguishable line types in black-and-white PDF and PostScript
files.

For more detailed information on syntax and plotting capabilities,
we refer to the Matplotlib [Ref03]_
and SciTools [Ref04]_ documentation.
The hope is that
the programming syntax explained so far suffices for understanding the
basic plotting functionality and being able to look up
the cited technical documentation.


.. admonition:: Test your understanding

   :ref:`decay:app:exer:cooling:py` asks you to implement
   a solver for a problem that is slightly different from the
   one above. You may use the ``solver`` and ``explore`` functions
   explained above as a starting point. Apply the new solver
   to solve :ref:`decay:app:exer:cooling:murder`.


Memory-saving implementation
----------------------------

The computer memory requirements of our implementations so far consist
mainly of the ``u`` and ``t`` arrays, both of length :math:`N_t+1`.  Also, for
the programs that involve array arithmetics, Python needs memory space
for storing temporary arrays. For example, computing ``I*exp(-a*t)``
requires storing the intermediate result ``a*t`` before the preceding
minus sign can be applied. The resulting array is temporarily stored
and provided as input to the ``exp`` function.  Regardless of how we
implement simple ODE problems, storage requirements are very modest
and put no restrictions on how we choose our data structures and
algorithms.  Nevertheless, when the presented methods are applied to
three-dimensional PDE problems, memory storage requirements suddenly
become a challenging issue.

Let us briefly elaborate on how large the storage requirements can
quickly be in three-dimensional problems.  The PDE counterpart to our
model problem :math:`u'=-a` is a diffusion equation :math:`u_t = a\nabla^2 u`
posed on a space-time domain. The discrete representation of this
domain may in 3D be a spatial mesh of :math:`M^3` points and a time mesh of
:math:`N_t` points.  In many applications, it is quite typical that :math:`M` is
at least 100, or even 1000.  Storing all the computed :math:`u` values, like
we have done in the programs so far, would demand storing arrays of
size up to :math:`M^3N_t`. This would give a factor of :math:`M^3` larger storage
demands compared to what was required by our ODE programs. Each real
number in the ``u`` array requires 8 bytes (b) of storage. With :math:`M=100`
and :math:`N_t=1000`, there is a storage demand of :math:`(10^3)^3\cdot 1000\cdot
8 = 8` Gb for the solution array.  Fortunately, we can usually get rid
of the :math:`N_t` factor, resulting in 8 Mb of storage.  Below we explain
how this is done (the technique is almost always applied in
implementations of PDE problems).

Let us critically evaluate how much we really need to store in the
computer's memory for our implementation of the :math:`\theta` method. To
compute a new :math:`u^{n+1}`, all we need is :math:`u^n`. This implies that the
previous :math:`u^{n-1},u^{n-2},\dots,u^0` values do not need to be stored,
although this is convenient for plotting and data analysis in the
program.  Instead of the ``u`` array we can work with two variables for
real numbers, ``u`` and ``u_1``, representing :math:`u^{n+1}` and :math:`u^n` in the
algorithm, respectively.  At each time level, we update ``u`` from ``u_1``
and then set ``u_1 = u``, so that the computed :math:`u^{n+1}` value becomes
the "previous" value :math:`u^n` at the next time level. The downside is
that we cannot plot the solution after the simulation is done since
only the last two numbers are available.  The remedy is to store
computed values in a file and use the file for visualizing the
solution later.

We have implemented this memory saving idea in the file
`decay_memsave.py <http://tinyurl.com/ofkw6kc/alg/decay_memsave.py>`__, which is a
slight modification of `decay_plot_mpl.py <http://tinyurl.com/ofkw6kc/alg/decay_plot_mpl.py>`__ program.

The following function demonstrates how we work with the two most
recent values of the unknown:

.. code-block:: python

        def solver_memsave(I, a, T, dt, theta, filename='sol.dat'):
            """
            Solve u'=-a*u, u(0)=I, for t in (0,T] with steps of dt.
            Minimum use of memory. The solution is stored in a file
            (with name filename) for later plotting.
            """
            dt = float(dt)         # avoid integer division
            Nt = int(round(T/dt))  # no of intervals
        
            outfile = open(filename, 'w')
            # u: time level n+1, u_1: time level n
            t = 0
            u_1 = I
            outfile.write('%.16E  %.16E\n' % (t, u_1))
            for n in range(1, Nt+1):
                u = (1 - (1-theta)*a*dt)/(1 + theta*dt*a)*u_1
                u_1 = u
                t += dt
                outfile.write('%.16E  %.16E\n' % (t, u))
            outfile.close()
            return u, t

This code snippet also serves as a quick introduction to file writing in Python.
Reading the data in the file into arrays ``t`` and ``u`` is done by the
function

.. code-block:: python

        def read_file(filename='sol.dat'):
            infile = open(filename, 'r')
            u = [];  t = []
            for line in infile:
                words = line.split()
                if len(words) != 2:
                    print 'Found more than two numbers on a line!', words
                    sys.exit(1)  # abort
                t.append(float(words[0]))
                u.append(float(words[1]))
            return np.array(t), np.array(u)

This type of file with numbers in rows and columns is very common, and
``numpy`` has a function ``loadtxt`` which loads such tabular data into a
two-dimensional array named by the user. Say the name is ``data``, the
number in row ``i`` and column ``j`` is then ``data[i,j]``.  The whole
column number ``j`` can be extracted by ``data[:,j]``.  A version of
``read_file`` using ``np.loadtxt`` reads

.. code-block:: python

        def read_file_numpy(filename='sol.dat'):
            data = np.loadtxt(filename)
            t = data[:,0]
            u = data[:,1]
            return t, u

The present counterpart to the ``explore`` function from
`decay_plot_mpl.py <http://tinyurl.com/ofkw6kc/alg/decay_plot_mpl.py>`__ must run
``solver_memsave`` and then load data from file before we can compute
the error measure and make the plot:

.. code-block:: python

        def explore(I, a, T, dt, theta=0.5, makeplot=True):
            filename = 'u.dat'
            u, t = solver_memsave(I, a, T, dt, theta, filename)
        
            t, u = read_file(filename)
            u_e = u_exact(t, I, a)
            e = u_e - u
            E = sqrt(dt*np.sum(e**2))
            if makeplot:
                figure()
                ...

Apart from the internal implementation, where :math:`u^n` values are stored
in a file rather than in an array, ``decay_memsave.py`` file works
exactly as the ``decay_plot_mpl.py`` file.