.. !split Tests for verifying implementations =================================== Any module with functions should have a set of tests that can check the correctness of the implementations. There exists well-established procedures and corresponding tools for automating the execution of such tests. These tools allow large test sets to be run with a one-line command, making it easy to check of the still software works (as far as the tests tell!). Here we shall illustrate two important software testing techniques: *doctest* and *unit testing*. The first one is Python specific, while unit testing is the dominating test technique in the software industry today. Doctests -------- .. index:: doctests .. index:: single: software testing; doctests A doc string, the first string after the function header, is used to document the purpose of functions and their arguments (see the section :ref:`softeng1:basic:func`). Very often it is instructive to include an example in the doc string on how to use the function. Interactive examples in the Python shell are most illustrative as we can see the output resulting from the statements and expressions. For example, in the ``solver`` function, we can include an example on calling this function and printing the computed ``u`` and ``t`` arrays: .. code-block:: python def solver(I, a, T, dt, theta): """ Solve u'=-a*u, u(0)=I, for t in (0,T] with steps of dt. >>> u, t = solver(I=0.8, a=1.2, T=1.5, dt=0.5, theta=0.5) >>> for n in range(len(t)): ... print 't=%.1f, u=%.14f' % (t[n], u[n]) t=0.0, u=0.80000000000000 t=0.5, u=0.43076923076923 t=1.0, u=0.23195266272189 t=1.5, u=0.12489758761948 """ ... When such interactive demonstrations are inserted in doc strings, Python's `doctest `__ module can be used to automate running all commands in interactive sessions and compare new output with the output appearing in the doc string. All we have to do in the current example is to run the module file ``decay.py`` with .. code-block:: python Terminal> python -m doctest decay.py This command imports the ``doctest`` module, which runs all doctests found in the file and reports discrepancies between expected and computed output. Alternatively, the test block in a module may run all doctests by .. code-block:: python if __name__ == '__main__': import doctest doctest.testmod() Doctests can also be embedded in nose/pytest unit tests as explained in the next section. .. admonition:: Doctests prevent command-line arguments No additional command-line argument is allowed when running doctests. If your program relies on command-line input, make sure the doctests can be run *without* such input on the command line. However, you can simulate command-line input by filling ``sys.argv`` with values, e.g., .. code-block:: python import sys; sys.argv = '--I 1.0 --a 5'.split() The execution command above will report any problem if a test fails. As an illustration, let us alter the ``u`` value at ``t=1.5`` in the output of the doctest by replacing the last digit ``8`` by ``7``. This edit triggers a report: .. code-block:: text Terminal> python -m doctest decay.py ******************************************************** File "decay.py", line ... Failed example: for n in range(len(t)): print 't=%.1f, u=%.14f' % (t[n], u[n]) Expected: t=0.0, u=0.80000000000000 t=0.5, u=0.43076923076923 t=1.0, u=0.23195266272189 t=1.5, u=0.12489758761948 Got: t=0.0, u=0.80000000000000 t=0.5, u=0.43076923076923 t=1.0, u=0.23195266272189 t=1.5, u=0.12489758761947 .. admonition:: Pay attention to the number of digits in doctest results Note that in the output of ``t`` and ``u`` we write ``u`` with 14 digits. Writing all 16 digits is not a good idea: if the tests are run on different hardware, round-off errors might be different, and the ``doctest`` module detects that the numbers are not precisely the same and reports failures. In the present application, where :math:`0 < u(t) \leq 0.8`, we expect round-off errors to be of size :math:`10^{-16}`, so comparing 15 digits would probably be reliable, but we compare 14 to be on the safe side. On the other hand, comparing a small number of digits may hide software errors. Doctests are highly encouraged as they do two things: 1) demonstrate how a function is used and 2) test that the function works. Unit tests and test functions ----------------------------- .. index:: nose tests .. index:: pytest tests .. index:: unit testing .. index:: single: software testing; nose .. index:: single: software testing; pytest The unit testing technique consists of identifying smaller units of code and writing one or more tests for each unit. One unit can typically be a function. Each test should, ideally, not depend on the outcome of other tests. The recommended practice is actually to design and write the unit tests first and *then* implement the functions! In scientific computing it is not always obvious how to best perform unit testing. The units are naturally larger than in non-scientific software. Very often the solution procedure of a mathematical problem identifies a unit, such as our ``solver`` function. .. index:: test function .. index:: single: software testing; test function Two Python test frameworks: nose and pytest ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Python offers two very easy-to-use software frameworks for implementing unit tests: nose and pytest. These work (almost) in the same way, but our recommendation is to go for pytest. Test function requirements ~~~~~~~~~~~~~~~~~~~~~~~~~~ For a test to qualify as a *test function* in nose or pytest, three rules must be followed: 1. The function name must start with ``test_``. 2. Function arguments are not allowed. 3. An ``AssertionError`` exception must be raised if the test fails. A specific example might be illustrative before proceeding. We have the following function that we want to test: .. code-block:: python def double(n): return 2*n The corresponding test function could, in principle, have been written as .. code-block:: python def test_double(): """Test that double(n) works for one specific n.""" n = 4 expected = 2*4 computed = double(4) if expected != computed: raise AssertionError The last two lines, however, are never written like this in test functions. Instead, Python's ``assert`` statement is used: ``assert success, msg``, where ``success`` is a boolean variable, which is ``False`` if the test fails, and ``msg`` is *an optional* message string that is printed when the test fails. A better version of the test function is therefore .. code-block:: python def test_double(): """Test that double(n) works for one specific n.""" n = 4 expected = 2*4 computed = double(4) msg = 'expected %g, computed %g' % (expected, computed) success = expected == computed assert success, msg Comparison of real numbers ~~~~~~~~~~~~~~~~~~~~~~~~~~ Because of the finite precision arithmetics on a computer, which gives rise to round-off errors, the ``==`` operator is not suitable for checking whether two real numbers are equal. Obviously, this principle also applies to tests in test functions. We must therefore replace ``a == b`` by a comparison based on a tolerance ``tol``: ``abs(a-b) < tol``. The next example illustrates the problem and its solution. Here is a slightly different function that we want to test: .. code-block:: python def third(x): return x/3. We write a test function where the expected result is computed as :math:`\frac{1}{3}x` rather than :math:`x/3`: .. code-block:: python def test_third(): """Check that third(x) works for many x values.""" for x in np.linspace(0, 1, 21): expected = (1/3.0)*x computed = third(x) success = expected == computed assert success This ``test_third`` function executes silently, i.e., no failure, until ``x`` becomes 0.15. Then round-off errors make the ``==`` comparison ``False``. In fact, seven of the ``x`` values above face this problem. The solution is to compare ``expected`` and ``computed`` with a small tolerance: .. code-block:: python def test_third(): """Check that third(x) works for many x values.""" for x in np.linspace(0, 1, 21): expected = (1/3.)*x computed = third(x) tol = 1E-15 success = abs(expected - computed) < tol assert success .. admonition:: Always compare real numbers with a tolerance Real numbers should never be compared with the ``==`` operator, but always with the absolute value of the difference and a tolerance. So, replace ``a == b``, if ``a`` and/or ``b`` is ``float``, by .. code-block:: python tol = 1E-14 abs(a - b) < tol The suitable size of ``tol`` depends on the size of ``a`` and ``b`` (see :ref:`softeng1:exer:tol`). Special assert functions from nose ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Test frameworks often contain more tailored *assert functions* that can be called instead of using the ``assert`` statement. For example, comparing two objects within a tolerance, as in the present case, can be done by the ``assert_almost_equal`` from the nose framework: .. code-block:: python import nose.tools as nt def test_third(): x = 0.15 expected = (1/3.)*x computed = third(x) nt.assert_almost_equal( expected, computed, delta=1E-15, msg='diff=%.17E' % (expected - computed)) Whether to use the plain ``assert`` statement with a comparison based on a tolerance or to use the ready-made function ``assert_almost_equal`` depends on the programmer's preference. The examples used in the documentation of the pytest framework stick to the plain ``assert`` statement. Locating test functions ~~~~~~~~~~~~~~~~~~~~~~~ Test functions can reside in a module together with the functions they are supposed to verify, or the test functions can be collected in separate files having names starting with ``test``. Actually, nose and pytest can recursively run all test functions in all ``test*.py`` files in the current directory, as well as in all subdirectories! The `decay.py `__ module file features test functions in the module, but we could equally well have made a subdirectory ``tests`` and put the test functions in `tests/test_decay.py `__. Running tests ~~~~~~~~~~~~~ To run all test functions in the file ``decay.py`` do .. code-block:: text Terminal> nosetests -s -v decay.py Terminal> py.test -s -v decay.py The ``-s`` option ensures that output from the test functions is printed in the terminal window, while ``-v`` prints the outcome of each individual test function. Alternatively, if the test functions are located in some separate ``test*.py`` files, we can just write .. code-block:: text Terminal> py.test -s -v to *recursively* run *all* test functions in the current directory tree. The corresponding .. code-block:: text Terminal> nosetests -s -v command does the same, but requires subdirectory names to start with ``test`` or end with ``_test`` or ``_tests`` (which is a good habit anyway). An example of a ``tests`` directory with a ``test*.py`` file is found in `src/softeng1/tests `__. .. index:: doctest in test function Embedding doctests in a test function ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Doctests can also be executed from nose/pytest unit tests. Here is an example of a file `test_decay_doctest.py `__ where we in the test block run all the doctests in the imported module ``decay``, but we also include a local test function that does the same: .. code-block:: python import sys, os sys.path.insert(0, os.pardir) import decay import doctest def test_decay_module_with_doctest(): """Doctest embedded in a nose/pytest unit test.""" # Test all functions with doctest in module decay failure_count, test_count = doctest.testmod(m=decay) assert failure_count == 0 if __name__ == '__main__': # Run all functions with doctests in this module failure_count, test_count = doctest.testmod(m=decay) Running this file as a program from the command line triggers the ``doctest.testmod`` call in the test block, while applying ``py.test`` or ``nosetests`` to the file triggers an import of the file and execution of the test function ``test_decay_modue_with_doctest``. Installing nose and pytest ~~~~~~~~~~~~~~~~~~~~~~~~~~ With ``pip`` available, it is trivial to install nose and/or pytest: ``sudo pip install nose`` and ``sudo pip install pytest``. Test function for the solver ---------------------------- Finding good test problems for verifying the implementation of numerical methods is a topic on its own. The challenge is that we very seldom know what the numerical errors are. For the present model problem :ref:`(3.1) `-:ref:`(3.2) ` solved by :ref:`(3.3) ` one can, fortunately, derive a formula for the numerical approximation: .. math:: u^n = I\left( \frac{1 - (1-\theta) a\Delta t}{1 + \theta a \Delta t} \right)^n{\thinspace .} Then we know that the implementation should produce numbers that agree with this formula to machine precision. The formula for :math:`u^n` is known as an *exact discrete solution* of the problem and can be coded as .. code-block:: python def exact_discrete_solution(n, I, a, theta, dt): """Return exact discrete solution of the numerical schemes.""" dt = float(dt) # avoid integer division A = (1 - (1-theta)*a*dt)/(1 + theta*dt*a) return I*A**n A test function can evaluate this solution on a time mesh and check that the ``u`` values produced by the ``solver`` function do not deviate with more than a small tolerance: .. code-block:: python def test_exact_discrete_solution(): """Check that solver reproduces the exact discr. sol.""" theta = 0.8; a = 2; I = 0.1; dt = 0.8 Nt = int(8/dt) # no of steps u, t = solver(I=I, a=a, T=Nt*dt, dt=dt, theta=theta) # Evaluate exact discrete solution on the mesh u_de = np.array([exact_discrete_solution(n, I, a, theta, dt) for n in range(Nt+1)]) # Find largest deviation diff = np.abs(u_de - u).max() tol = 1E-14 success = diff < tol assert success Among important things to consider when constructing test functions is testing the effect of wrong input to the function being tested. In our ``solver`` function, for example, integer values of :math:`a`, :math:`\Delta t`, and :math:`\theta` may cause unintended integer division. We should therefore add a test to make sure our ``solver`` function does not fall into this potential trap: .. code-block:: python def test_potential_integer_division(): """Choose variables that can trigger integer division.""" theta = 1; a = 1; I = 1; dt = 2 Nt = 4 u, t = solver(I=I, a=a, T=Nt*dt, dt=dt, theta=theta) u_de = np.array([exact_discrete_solution(n, I, a, theta, dt) for n in range(Nt+1)]) diff = np.abs(u_de - u).max() assert diff < 1E-14 Test function for reading positional command-line arguments ----------------------------------------------------------- The function ``read_command_line_positional`` extracts numbers from the command line. To test it, we must decide on a set of values for the input data, fill ``sys.argv`` accordingly, and check that we get the expected values: .. code-block:: python def test_read_command_line_positional(): # Decide on a data set of input parameters I = 1.6; a = 1.8; T = 2.2; theta = 0.5 dt_values = [0.1, 0.2, 0.05] # Expected return from read_command_line_positional expected = [I, a, T, theta, dt_values] # Construct corresponding sys.argv array sys.argv = [sys.argv[0], str(I), str(a), str(T), 'CN'] + \ [str(dt) for dt in dt_values] computed = read_command_line_positional() for expected_arg, computed_arg in zip(expected, computed): assert expected_arg == computed_arg Note that ``sys.argv[0]`` is always the program name and that we have to copy that string from the original ``sys.argv`` array to the new one we construct in the test function. (Actually, this test function destroys the original ``sys.argv`` that Python fetched from the command line.) Any numerical code writer should always be skeptical to the use of the exact equality operator ``==`` in test functions, since round-off errors often come into play. Here, however, we set some real values, convert them to strings and convert back again to real numbers (of the same precision). This string-number conversion does not involve any finite precision arithmetics effects so we can safely use ``==`` in tests. Note also that the last element in ``expected`` and ``computed`` is the list ``dt_values``, and ``==`` works for comparing two lists as well. Test function for reading option-value pairs -------------------------------------------- The function ``read_command_line_argparse`` can be verified with a test function that has the same setup as ``test_read_command_line_positional`` above. However, the construction of the command line is a bit more complicated. We find it convenient to construct the line as a string and then split the line into words to get the desired list ``sys.argv``: .. code-block:: python def test_read_command_line_argparse(): I = 1.6; a = 1.8; T = 2.2; theta = 0.5 dt_values = [0.1, 0.2, 0.05] # Expected return from read_command_line_argparse expected = [I, a, T, theta, dt_values] # Construct corresponding sys.argv array command_line = '%s --a %s --I %s --T %s --scheme CN --dt ' % \ (sys.argv[0], a, I, T) command_line += ' '.join([str(dt) for dt in dt_values]) sys.argv = command_line.split() computed = read_command_line_argparse() for expected_arg, computed_arg in zip(expected, computed): assert expected_arg == computed_arg Recall that the Python function ``zip`` enables iteration over several lists, tuples, or arrays at the same time. .. admonition:: Let silent test functions speak up during development When you develop test functions in a module, it is common to use IPython for interactive experimentation: .. code-block:: ipy In[1]: import decay In[2]: decay.test_read_command_line_argparse() Note that a working test function is completely silent! Many find it psychologically annoying to convince themselves that a completely silent function is doing the right things. It can therefore, during development of a test function, be convenient to insert print statements in the function to monitor that the function body is indeed executed. For example, one can print the expected and computed values in the terminal window: .. code-block:: python def test_read_command_line_argparse(): ... for expected_arg, computed_arg in zip(expected, computed): print expected_arg, computed_arg assert expected_arg == computed_arg After performing this edit, we want to run the test again, but in IPython the module must first be reloaded (reimported): .. code-block:: ipy In[3]: reload(decay) # force new import In[2]: decay.test_read_command_line_argparse() 1.6 1.6 1.8 1.8 2.2 2.2 0.5 0.5 [0.1, 0.2, 0.05] [0.1, 0.2, 0.05] Now we clearly see the objects that are compared. .. _softeng1:basic:unittest: Classical class-based unit testing ---------------------------------- .. index:: unit testing .. index:: unittest .. index:: single: software testing; unit testing (class-based) The test functions written for the nose and pytest frameworks are very straightforward and to the point, with no framework-required boilerplate code. We just write the statements we need to get the computations and comparisons done, before applying the required ``assert``. The classical way of implementing unit tests (which derives from the JUnit object-oriented tool in Java) leads to much more comprehensive implementations with a lot of boilerplate code. Python comes with a built-in module ``unittest`` for doing this type of classical unit tests. Although nose or pytest are much more convenient to use than ``unittest``, class-based unit testing in the style of ``unittest`` has a very strong position in computer science and is so widespread in the software industry that even computational scientists should have an idea how such unit test code is written. A short demo of ``unittest`` is therefore included next. (Readers who are not familiar with object-oriented programming in Python may find the text hard to understand, but one can safely jump to the next section.) .. index:: unittest .. index:: TestCase (class in unittest) Suppose we have a function ``double(x)`` in a module file ``mymod.py``: .. code-block:: python def double(x): return 2*x Unit testing with the aid of the ``unittest`` module consists of writing a file ``test_mymod.py`` for testing the functions in ``mymod.py``. The individual tests must be methods with names starting with ``test_`` in a class derived from class ``TestCase`` in ``unittest``. With one test method for the function ``double``, the ``test_mymod.py`` file becomes .. code-block:: python import unittest import mymod class TestMyCode(unittest.TestCase): def test_double(self): x = 4 expected = 2*x computed = mymod.double(x) self.assertEqual(expected, computed) if __name__ == '__main__': unittest.main() The test is run by executing the test file ``test_mymod.py`` as a standard Python program. There is no support in ``unittest`` for automatically locating and running all tests in all test files in a directory tree. We could use the basic ``assert`` statement as we did with nose and pytest functions, but those who write code based on ``unittest`` almost exclusively use the wide range of built-in assert functions such as ``assertEqual``, ``assertNotEqual``, ``assertAlmostEqual``, to mention some of them. Translation of the test functions from the previous sections to ``unittest`` means making a new file ``test_decay.py`` file with a test class ``TestDecay`` where the stand-alone functions for nose/pytest now become methods in this class. .. code-block:: python import unittest import decay import numpy as np def exact_discrete_solution(n, I, a, theta, dt): ... class TestDecay(unittest.TestCase): def test_exact_discrete_solution(self): theta = 0.8; a = 2; I = 0.1; dt = 0.8 Nt = int(8/dt) # no of steps u, t = decay.solver(I=I, a=a, T=Nt*dt, dt=dt, theta=theta) # Evaluate exact discrete solution on the mesh u_de = np.array([exact_discrete_solution(n, I, a, theta, dt) for n in range(Nt+1)]) diff = np.abs(u_de - u).max() # largest deviation self.assertAlmostEqual(diff, 0, delta=1E-14) def test_potential_integer_division(self): ... self.assertAlmostEqual(diff, 0, delta=1E-14) def test_read_command_line_positional(self): ... for expected_arg, computed_arg in zip(expected, computed): self.assertEqual(expected_arg, computed_arg) def test_read_command_line_argparse(self): ... if __name__ == '__main__': unittest.main()