.. !split

Tests for verifying implementations
===================================

Any module with functions should have a set of tests that can
check the
correctness of the implementations.
There exists
well-established procedures and corresponding tools for automating
the execution of such tests. These tools allow large test sets to be
run with a one-line command, making it easy to check of the
still software works (as far as the
tests tell!). Here we shall illustrate two important
software testing techniques: *doctest* and *unit testing*.
The first one is Python specific, while unit testing is the dominating
test technique in the software industry today.

Doctests
--------

.. index:: doctests

.. index::
   single: software testing; doctests

A doc string, the first string after the function header, is used to
document the purpose of functions and their arguments
(see the section :ref:`softeng1:basic:func`). Very often it
is instructive to include an example in the doc string
on how to use the function.
Interactive examples in the Python shell are most illustrative as
we can see the output resulting from the statements and expressions.
For example,
in the ``solver`` function, we can include an example on calling
this function and printing the computed ``u`` and ``t`` arrays:

.. code-block:: python

        def solver(I, a, T, dt, theta):
            """
            Solve u'=-a*u, u(0)=I, for t in (0,T] with steps of dt.
        
        
            >>> u, t = solver(I=0.8, a=1.2, T=1.5, dt=0.5, theta=0.5)
            >>> for n in range(len(t)):
            ...     print 't=%.1f, u=%.14f' % (t[n], u[n])
            t=0.0, u=0.80000000000000
            t=0.5, u=0.43076923076923
            t=1.0, u=0.23195266272189
            t=1.5, u=0.12489758761948
            """
            ...

When such interactive demonstrations are inserted in doc strings,
Python's `doctest <http://docs.python.org/library/doctest.html>`__
module can be used to automate running all commands
in interactive sessions and compare new output with the output
appearing in the doc string.  All we have to do in the current example
is to run the module file ``decay.py`` with

.. code-block:: python

        Terminal> python -m doctest decay.py

This command imports the ``doctest`` module, which runs all
doctests found in the file and reports discrepancies between
expected and computed output.
Alternatively, the test block in a module may run all doctests
by

.. code-block:: python

        if __name__ == '__main__':
            import doctest
            doctest.testmod()

Doctests can also be embedded in nose/pytest unit tests
as explained in the next section.


.. admonition:: Doctests prevent command-line arguments

   No additional command-line argument is allowed when running doctests.
   If your program relies on command-line input, make sure the doctests
   can be run *without* such input on the command line.
   
   However, you can simulate command-line input by filling ``sys.argv``
   with values, e.g.,
   
   .. code-block:: python
   
           import sys; sys.argv = '--I 1.0 --a 5'.split()


The execution command above will report any problem if a test fails.
As an illustration, let us alter the ``u`` value at ``t=1.5`` in
the output of the doctest by replacing the last digit ``8`` by ``7``.
This edit triggers a report:

.. code-block:: text

        Terminal> python -m doctest decay.py
        ********************************************************
        File "decay.py", line ...
        Failed example:
            for n in range(len(t)):
                print 't=%.1f, u=%.14f' % (t[n], u[n])
        Expected:
            t=0.0, u=0.80000000000000
            t=0.5, u=0.43076923076923
            t=1.0, u=0.23195266272189
            t=1.5, u=0.12489758761948
        Got:
            t=0.0, u=0.80000000000000
            t=0.5, u=0.43076923076923
            t=1.0, u=0.23195266272189
            t=1.5, u=0.12489758761947


.. admonition:: Pay attention to the number of digits in doctest results

   Note that in the output of ``t`` and ``u`` we write ``u`` with 14 digits.
   Writing all 16 digits is not a good idea: if the tests are run on
   different hardware, round-off errors might be different, and
   the ``doctest`` module detects that the numbers are not precisely the same
   and reports failures. In the present application, where :math:`0 < u(t) \leq 0.8`,
   we expect round-off errors to be of size :math:`10^{-16}`, so comparing 15
   digits would probably be reliable, but we compare 14 to be on the
   safe side. On the other hand, comparing a small number of digits may
   hide software errors.


Doctests are highly encouraged as they do two things: 1) demonstrate
how a function is used and 2) test that the function works.

Unit tests and test functions
-----------------------------

.. index:: nose tests

.. index:: pytest tests

.. index:: unit testing

.. index::
   single: software testing; nose

.. index::
   single: software testing; pytest

The unit testing technique consists of identifying smaller units
of code and writing one or more tests for
each unit. One unit can typically be a function.
Each test should, ideally, not depend on the outcome of
other tests. The recommended practice is actually to
design and write the unit tests first and *then* implement the functions!

In scientific computing it is not always obvious how to best perform
unit testing. The units are naturally larger than in non-scientific
software. Very often the solution procedure of a mathematical problem
identifies a unit, such as our ``solver`` function.

.. index:: test function

.. index::
   single: software testing; test function

Two Python test frameworks: nose and pytest
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Python offers two very easy-to-use software frameworks for implementing
unit tests: nose and pytest. These work (almost) in the same way,
but our recommendation is to go for pytest.

Test function requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~

For a test to qualify as a *test function* in nose or pytest, three
rules must be followed:

 1. The function name must start with ``test_``.

 2. Function arguments are not allowed.

 3. An ``AssertionError`` exception must be raised if the test fails.

A specific example might be illustrative before proceeding.
We have the following function that we want to test:

.. code-block:: python

        def double(n):
            return 2*n

The corresponding test function could, in principle, have been written
as

.. code-block:: python

        def test_double():
            """Test that double(n) works for one specific n."""
            n = 4
            expected = 2*4
            computed = double(4)
            if expected != computed:
                raise AssertionError

The last two lines, however, are never written like this in test functions.
Instead, Python's ``assert`` statement is used: ``assert success, msg``, where
``success`` is a boolean variable, which is ``False`` if the test fails, and
``msg`` is *an optional* message string that is printed when the test fails.
A better version of the test function is therefore

.. code-block:: python

        def test_double():
            """Test that double(n) works for one specific n."""
            n = 4
            expected = 2*4
            computed = double(4)
            msg = 'expected %g, computed %g' % (expected, computed)
            success = expected == computed
            assert success, msg

Comparison of real numbers
~~~~~~~~~~~~~~~~~~~~~~~~~~

Because of the finite precision arithmetics on a computer, which gives
rise to round-off errors, the ``==`` operator is not suitable for
checking whether two real numbers are equal. Obviously, this principle
also applies to tests in test functions.
We must therefore replace ``a == b`` by a comparison
based on a tolerance ``tol``: ``abs(a-b) < tol``. The next example illustrates
the problem and its solution.

Here is a slightly different function that
we want to test:

.. code-block:: python

        def third(x):
            return x/3.

We write a test function where the expected result is computed as
:math:`\frac{1}{3}x` rather than :math:`x/3`:

.. code-block:: python

        def test_third():
            """Check that third(x) works for many x values."""
            for x in np.linspace(0, 1, 21):
                expected = (1/3.0)*x
                computed = third(x)
                success = expected == computed
                assert success

This ``test_third`` function executes silently, i.e., no failure,
until ``x`` becomes 0.15. Then round-off errors make the ``==`` comparison
``False``. In fact, seven of the ``x`` values above face this problem.
The solution is to compare ``expected`` and ``computed``
with a small tolerance:

.. code-block:: python

        def test_third():
            """Check that third(x) works for many x values."""
            for x in np.linspace(0, 1, 21):
                expected = (1/3.)*x
                computed = third(x)
                tol = 1E-15
                success = abs(expected - computed) < tol
                assert success


.. admonition:: Always compare real numbers with a tolerance

   Real numbers should never be compared with the ``==`` operator, but always
   with the absolute value of the difference and a tolerance.
   So, replace ``a == b``, if ``a`` and/or ``b`` is ``float``, by
   
   .. code-block:: python
   
           tol = 1E-14
           abs(a - b) < tol
   
   The suitable size of ``tol`` depends on the size of ``a`` and ``b``
   (see :ref:`softeng1:exer:tol`).


Special assert functions from nose
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test frameworks often contain more tailored
*assert functions* that can be called instead of using the ``assert``
statement. For example, comparing two objects within
a tolerance, as in the present
case, can be done by the ``assert_almost_equal`` from the nose
framework:

.. code-block:: python

        import nose.tools as nt
        
        def test_third():
            x = 0.15
            expected = (1/3.)*x
            computed = third(x)
            nt.assert_almost_equal(
                expected, computed, delta=1E-15,
                msg='diff=%.17E' % (expected - computed))

Whether to use the plain ``assert`` statement with a comparison based on
a tolerance or to use the ready-made function ``assert_almost_equal``
depends on the programmer's preference. The examples used in the
documentation of the pytest framework stick to the plain ``assert``
statement.

Locating test functions
~~~~~~~~~~~~~~~~~~~~~~~

Test functions can reside in a module together with the functions they
are supposed to verify, or the test functions can be collected in
separate files having names starting with ``test``. Actually,
nose and pytest can recursively run all test functions
in all ``test*.py``
files in the current directory, as well as in all subdirectories!

The `decay.py <http://tinyurl.com/nm5587k/softeng1/decay.py>`__ module file features
test functions in the module, but we could equally well have made
a subdirectory ``tests`` and put the test functions in
`tests/test_decay.py <http://tinyurl.com/nm5587k/softeng1/tests/test_decay.py>`__.

Running tests
~~~~~~~~~~~~~

To run all test functions in the file ``decay.py`` do

.. code-block:: text

        Terminal> nosetests -s -v decay.py
        Terminal> py.test -s -v decay.py

The ``-s`` option ensures that output from the test functions is printed
in the terminal window, while ``-v`` prints the outcome of each individual
test function.

Alternatively, if the test functions are located in some separate
``test*.py`` files,
we can just write

.. code-block:: text

        Terminal> py.test -s -v

to *recursively* run *all* test functions in the current
directory tree. The corresponding

.. code-block:: text

        Terminal> nosetests -s -v

command does the same, but requires subdirectory names to start
with ``test`` or end with ``_test`` or ``_tests`` (which is a good habit anyway).
An example of a ``tests`` directory with a ``test*.py``
file is found in `src/softeng1/tests <http://tinyurl.com/nm5587k/softeng1/tests>`__.

.. index:: doctest in test function

Embedding doctests in a test function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Doctests can also be executed from nose/pytest unit tests. Here
is an example of a file `test_decay_doctest.py <http://tinyurl.com/nm5587k/softeng1/tests/test_decay_doctest.py>`__ where we in the test block run all the doctests
in the imported module ``decay``, but we also include a local test function that
does the same:

.. code-block:: python

        import sys, os
        sys.path.insert(0, os.pardir)
        import decay
        import doctest
        
        def test_decay_module_with_doctest():
            """Doctest embedded in a nose/pytest unit test."""
            # Test all functions with doctest in module decay
            failure_count, test_count = doctest.testmod(m=decay)
            assert failure_count == 0
        
        if __name__ == '__main__':
            # Run all functions with doctests in this module
            failure_count, test_count = doctest.testmod(m=decay)

Running this file as a program from the command line
triggers the ``doctest.testmod`` call
in the test block, while applying ``py.test`` or ``nosetests`` to the file triggers
an import of the file and execution of the test function
``test_decay_modue_with_doctest``.

Installing nose and pytest
~~~~~~~~~~~~~~~~~~~~~~~~~~

With ``pip`` available, it is trivial to install nose and/or pytest:
``sudo pip install nose`` and ``sudo pip install pytest``.

Test function for the solver
----------------------------

Finding good test problems for verifying the implementation of numerical
methods is a topic on its own. The challenge is that we very seldom know
what the numerical errors are. For the present model problem
:ref:`(3.1) <Eq:softeng1:ode>`-:ref:`(3.2) <Eq:softeng1:u0>` solved by
:ref:`(3.3) <Eq:softeng1:utheta>` one can, fortunately, derive a formula for
the numerical approximation:

.. math::
         u^n = I\left(
        \frac{1 - (1-\theta) a\Delta t}{1 + \theta a \Delta t}
        \right)^n{\thinspace .}

Then we know that the implementation should
produce numbers that agree with this formula to machine precision.
The formula for :math:`u^n` is known as an *exact discrete solution* of the
problem and can be coded as

.. code-block:: python

        def exact_discrete_solution(n, I, a, theta, dt):
            """Return exact discrete solution of the numerical schemes."""
            dt = float(dt)  # avoid integer division
            A = (1 - (1-theta)*a*dt)/(1 + theta*dt*a)
            return I*A**n

A test function can evaluate this solution on a time mesh
and check that the ``u`` values produced by the ``solver`` function
do not deviate with more than a small tolerance:

.. code-block:: python

        def test_exact_discrete_solution():
            """Check that solver reproduces the exact discr. sol."""
            theta = 0.8; a = 2; I = 0.1; dt = 0.8
            Nt = int(8/dt)  # no of steps
            u, t = solver(I=I, a=a, T=Nt*dt, dt=dt, theta=theta)
        
            # Evaluate exact discrete solution on the mesh
            u_de = np.array([exact_discrete_solution(n, I, a, theta, dt)
                             for n in range(Nt+1)])
        
            # Find largest deviation
            diff = np.abs(u_de - u).max()
            tol = 1E-14
            success = diff < tol
            assert success

Among important things to consider when constructing test functions
is testing the effect of wrong input to the function being tested.
In our ``solver`` function, for example, integer values of :math:`a`, :math:`\Delta t`, and
:math:`\theta` may cause unintended integer
division. We should therefore add a test to make sure our ``solver``
function does not fall into this potential trap:

.. code-block:: python

        def test_potential_integer_division():
            """Choose variables that can trigger integer division."""
            theta = 1; a = 1; I = 1; dt = 2
            Nt = 4
            u, t = solver(I=I, a=a, T=Nt*dt, dt=dt, theta=theta)
            u_de = np.array([exact_discrete_solution(n, I, a, theta, dt)
                             for n in range(Nt+1)])
            diff = np.abs(u_de - u).max()
            assert diff < 1E-14

Test function for reading positional command-line arguments
-----------------------------------------------------------

The function ``read_command_line_positional`` extracts numbers from the
command line. To test it, we must decide on a set of values for
the input data, fill ``sys.argv``
accordingly, and check that we get the expected values:

.. code-block:: python

        def test_read_command_line_positional():
            # Decide on a data set of input parameters
            I = 1.6;  a = 1.8;  T = 2.2;  theta = 0.5
            dt_values = [0.1, 0.2, 0.05]
            # Expected return from read_command_line_positional
            expected = [I, a, T, theta, dt_values]
            # Construct corresponding sys.argv array
            sys.argv = [sys.argv[0], str(I), str(a), str(T), 'CN'] + \ 
                       [str(dt) for dt in dt_values]
            computed = read_command_line_positional()
            for expected_arg, computed_arg in zip(expected, computed):
                assert expected_arg == computed_arg

Note that ``sys.argv[0]`` is always the program name and that we have to
copy that string from the original ``sys.argv`` array to the new one we
construct in the test function. (Actually, this test function destroys
the original ``sys.argv`` that Python fetched from the command line.)

Any numerical code writer should always be skeptical to the use of the exact
equality operator ``==`` in test functions, since round-off errors often
come into play. Here, however, we set some real values, convert them
to strings and convert back again to real numbers (of the same precision).
This string-number conversion does not involve any finite precision
arithmetics effects so we
can safely use ``==`` in tests. Note also that the last element in
``expected`` and ``computed`` is the list ``dt_values``, and ``==`` works
for comparing two lists as well.

Test function for reading option-value pairs
--------------------------------------------

The function ``read_command_line_argparse`` can be verified with a
test function that has the same setup as ``test_read_command_line_positional``
above.
However, the construction of the command line is a bit more complicated.
We find it convenient to construct the line as a string and then
split the line into words to get the desired list ``sys.argv``:

.. code-block:: python

        def test_read_command_line_argparse():
            I = 1.6;  a = 1.8;  T = 2.2;  theta = 0.5
            dt_values = [0.1, 0.2, 0.05]
            # Expected return from read_command_line_argparse
            expected = [I, a, T, theta, dt_values]
            # Construct corresponding sys.argv array
            command_line = '%s --a %s --I %s --T %s --scheme CN --dt ' % \ 
                           (sys.argv[0], a, I, T)
            command_line += ' '.join([str(dt) for dt in dt_values])
            sys.argv = command_line.split()
            computed = read_command_line_argparse()
            for expected_arg, computed_arg in zip(expected, computed):
                assert expected_arg == computed_arg

Recall that the Python function ``zip`` enables iteration over
several lists, tuples, or arrays at the same time.


.. admonition:: Let silent test functions speak up during development

   When you develop test functions in a module, it is common to use IPython
   for interactive experimentation:
   
   .. code-block:: ipy
   
           In[1]: import decay
           
           In[2]: decay.test_read_command_line_argparse()
   
   Note that a working test function is completely silent! Many
   find it psychologically annoying to convince themselves that a
   completely silent function is doing the right things. It can therefore,
   during development of a test function, be convenient to insert
   print statements in the function to monitor that the function body
   is indeed executed. For example, one can print the expected and
   computed values in the terminal window:
   
   .. code-block:: python
   
           def test_read_command_line_argparse():
               ...
               for expected_arg, computed_arg in zip(expected, computed):
                   print expected_arg, computed_arg
                   assert expected_arg == computed_arg
   
   After performing this edit, we want to run the test again, but
   in IPython the module must first be reloaded (reimported):
   
   .. code-block:: ipy
   
           In[3]: reload(decay)  # force new import
           
           In[2]: decay.test_read_command_line_argparse()
           1.6 1.6
           1.8 1.8
           2.2 2.2
           0.5 0.5
           [0.1, 0.2, 0.05] [0.1, 0.2, 0.05]
   
   Now we clearly see the objects that are compared.


.. _softeng1:basic:unittest:

Classical class-based unit testing
----------------------------------

.. index:: unit testing

.. index:: unittest

.. index::
   single: software testing; unit testing (class-based)

The test functions written for the nose and pytest frameworks are
very straightforward and to the point, with no framework-required boilerplate
code. We just write the statements we need to get the computations and
comparisons done, before applying the required ``assert``.

The classical way of implementing unit tests (which derives from the
JUnit object-oriented tool in Java) leads to much more comprehensive
implementations with a lot of boilerplate code.  Python comes with a
built-in module ``unittest`` for doing this type of classical unit
tests. Although nose or pytest are much more convenient to use than
``unittest``, class-based unit testing in the style of ``unittest`` has a
very strong position in computer science and is so widespread in
the software industry that
even computational scientists should have an idea how such unit test
code is written. A short demo of ``unittest`` is therefore included
next. (Readers who are not familiar with object-oriented programming
in Python may find the text hard to understand, but one can safely
jump to the next section.)

.. index:: unittest

.. index:: TestCase (class in unittest)

Suppose we have a function ``double(x)`` in a module file ``mymod.py``:

.. code-block:: python

        def double(x):
            return 2*x

Unit testing with the aid of the ``unittest`` module
consists of writing a file ``test_mymod.py`` for testing the functions
in ``mymod.py``. The individual tests must be methods with names
starting with ``test_`` in a class derived from class ``TestCase`` in
``unittest``. With one test method for the function ``double``, the
``test_mymod.py`` file becomes

.. code-block:: python

        import unittest
        import mymod
        
        class TestMyCode(unittest.TestCase):
            def test_double(self):
                x = 4
                expected = 2*x
                computed = mymod.double(x)
                self.assertEqual(expected, computed)
        
        if __name__ == '__main__':
            unittest.main()

The test is run by executing the test file ``test_mymod.py`` as a standard
Python program. There is no support in ``unittest`` for automatically
locating and running all tests in all test files in a directory tree.

We could use the basic ``assert`` statement as we did with nose and pytest
functions, but those who write code based on ``unittest`` almost
exclusively use the wide range of built-in assert functions such
as ``assertEqual``, ``assertNotEqual``, ``assertAlmostEqual``, to mention
some of them.

Translation of the test functions from the previous sections
to ``unittest`` means making a new file ``test_decay.py`` file with a
test class ``TestDecay`` where the stand-alone functions for
nose/pytest now become methods in this class.

.. code-block:: python

        import unittest
        import decay
        import numpy as np
        
        def exact_discrete_solution(n, I, a, theta, dt):
            ...
        
        class TestDecay(unittest.TestCase):
        
            def test_exact_discrete_solution(self):
                theta = 0.8; a = 2; I = 0.1; dt = 0.8
                Nt = int(8/dt)  # no of steps
                u, t = decay.solver(I=I, a=a, T=Nt*dt, dt=dt, theta=theta)
                # Evaluate exact discrete solution on the mesh
                u_de = np.array([exact_discrete_solution(n, I, a, theta, dt)
                                 for n in range(Nt+1)])
                diff = np.abs(u_de - u).max()  # largest deviation
                self.assertAlmostEqual(diff, 0, delta=1E-14)
        
            def test_potential_integer_division(self):
                ...
                self.assertAlmostEqual(diff, 0, delta=1E-14)
        
            def test_read_command_line_positional(self):
                ...
                for expected_arg, computed_arg in zip(expected, computed):
                    self.assertEqual(expected_arg, computed_arg)
        
            def test_read_command_line_argparse(self):
                ...
        
        if __name__ == '__main__':
            unittest.main()