.. !split

.. _basics:basic:objects:

Variables, loops, lists, and arrays
===================================

.. _basics:accesspy:

Getting access to Python
------------------------

Simple mathematical calculations can be done in plain Python, but for
more advanced scientific computing, as we do here, several add-on
packages are needed. Getting all software correctly installed used to
be quite challenging, but today there are several easy-to-use approaches
to get access to Python.

Mac and Windows
~~~~~~~~~~~~~~~

We recommend to download and install
`Anaconda <https://store.continuum.io/cshop/anaconda/>`__,
which is a free distribution of Python that
comes with most of the packages you need for
advanced scientific computing.

Ubuntu
~~~~~~

Debian-based Linux systems, such as Ubuntu, can also use Anaconda,
but it is more common to use the ``apt-get`` command-line tool or the
Ubuntu installer to install a set of Debian packages. Here is a
list of the packages you need to install for this introduction:

.. code-block:: text

    Terminal> sudo apt-get install python-pip python-setuptools \ 
      python-scipy python-sympy python-cython python-matplotlib \ 
      python-dev python-profiler pydb spyder imagemagick gedit vim \ 
      emacs python-mode git mercurial lib-avtools gnome-terminal

In addition, run

.. code-block:: text

    Terminal> pip install nose
    Terminal> pip install pytest
    Terminal> pip install ipython --upgrade
    Terminal> pip install tornado --upgrade
    Terminal> pip install pyzmq --upgrade

Web access
~~~~~~~~~~

You can also access Python directly through a web browser without having
it installed on your local computer. We refer to the document
`How to access Python for doing scientific computing <http://hplgit.github.io/edu/accesspy/accesspy.html>`__ for more details on such tools and also
installations based on Anaconda and Ubuntu.

Mathematical example
--------------------

We shall use a famous mathematical formula in our
forthcoming programming examples.
The formula tells how long distance :math:`s` an object has moved in
a time interval :math:`[0,t]` if it starts
out with a velocity :math:`v_0` and undergoes constant acceleration :math:`a`:

.. _Eq:basics:seq:

.. math::

    \tag{1}
    s = v_0t + \frac{1}{2}at^2\thinspace .
        
        
We may view :math:`s` as a function of :math:`t`: :math:`s(t)`, and also include the
parameters in the notation: :math:`s(t;v_0,a)`.

.. _basics:formula:eval:

A program for evaluating a formula
----------------------------------

.. index:: text editor

.. index:: program file

Here is a Python program for computing :math:`s`, given :math:`t=0.5`, :math:`v_0=2`,
and :math:`a=0.2`:

.. code-block:: python

    t = 0.5
    v0 = 2
    a = 0.2
    s = v0*t + 0.5*a*t**2
    print s

The program is pure text and must be placed in a *pure text file* using
a *text editor*. Popular text editors are
Gedit, Nano, Emacs, and Vim on Linux,
TextWrangler on Mac OS X, and Notepad++ on Windows.
Save the text to a
program file whose name ends in ``.py``, say ``distance.py``.

The program is run in a terminal window (Command Prompt on Windows,
Terminal application on Mac OS X, ``xterm`` or ``gnome-terminal`` on Linux):

.. code-block:: text

    Terminal> python distance.py
    1.025

The result of the ``print`` statement is the number ``1.025`` in the terminal
window.

As an alternative to writing programs in a text editor and executing them
in a terminal window, you may
use the `Spyder <https://code.google.com/p/spyderlib/>`__
graphical interface which gives a more Matlab-style
environment to work in. Anaconda installations come with Spyder.

.. index:: assignment statement

.. index:: print statement

.. index:: variable

The ``distance.py`` program first contains four *assignment statements* where we
assign numbers or the results of arithmetic expressions to *variables*.
The variables have names coinciding with the mathematical notation
used in the formula: ``t``, ``v0``, ``a``, and ``s``. You may think of
variables in this programs just as variables in mathematics.

.. index:: objects

.. index:: float

.. index:: int


.. admonition:: More technical details

   A statement like ``t = 0.5`` works as follows. First, the right-hand side
   is interpreted by Python as a real number and a ``float`` object containing
   the value 0.5 is created. Then the name ``t`` is defined as a reference
   to this object.
   
   In the statement ``s = v0*t + 0.5*a*t**2``, Python will first go to the
   right-hand side and observe that it is an arithmetic expression
   involving variables with known values (or names referring to existing
   objects). The arithmetic expression is calculated, resulting in a
   real number that is saved as a ``float`` object with value 1.025 in
   the computer's memory.
   
   Everything in Python is an object of some type. Here, ``t``, ``a``, and ``s``
   are ``float`` objects, representing real (floating-point) numbers, while
   ``v0`` is an ``int`` object, representing the integer 2.
   There are many other types of objects: strings, lists, tuples,
   dictionaries, arrays, files, ...


.. _basics:printf:

Formatted output with text and numbers
--------------------------------------

.. index:: formatting of numbers

.. index:: printf syntax

For any object ``s`` in Python, ``print s`` will (normally) print the contents
of ``s``. However, sometimes we want to combine the content of an object
with some text. Say we want to print ``s=1.025`` rather than just the number.
This is easily accomplished using *printf* syntax:

.. code-block:: python

    print 's=%g' % s

The output is specified as a string, enclosed in single or double quotes.
Inside the string, there can be "slots" for numbers (or other
objects), indicated by a percentage sign followed by a specification of
kind of data that will be inserted at this place. In the string above,
there is one such slot, ``%g``, where the ``g`` implies a real number written
as compactly as possible.

It is easy to control the number of decimals using printf syntax.
Printing out ``s=1.03``, i.e., ``s`` with two decimals, is done by

.. code-block:: python

    print 's=%.2f' % s

where the ``f`` signifies a *decimal number* and the preceding ``.2`` means
2 decimals. Scientific notation, as in ``s=1.03E+00`` (:math:`1.03\cdot 10^0`),
is specified as ``%.2E`` (2 decimals).

.. index:: format string syntax

The printf syntax is available in numerous programming languages.
Python also offers a related variant, called *format string syntax*:

.. code-block:: python

    print 's={s:.2f}'.format(s=s)

.. _basics:while:

While loops
-----------

.. index:: while loop

Suppose we want to make a table with two columns, one with :math:`t` values and
one with the corresponding :math:`s` values. Now we have to repeat a lot
of calculations with the formula :ref:`(1) <Eq:basics:seq>`. This is easily
done with a *loop*. There are two types of loops in Python: *while* loops
and *for* loops.

Let the :math:`t` values go from 0 to 2 in increments of 0.1. The following
program applies a *while* loop:

.. code-block:: python

    v0 = 2
    a = 0.2
    dt = 0.1  # Increment
    t = 0     # Start value
    while t <= 2:
        s = v0*t + 0.5*a*t**2
        print t, s
        t = t + dt

The result of running this program in a terminal window
is

.. code-block:: text

    Terminal> python while.py
    0 0.0
    0.1 0.201
    0.2 0.404
    0.3 0.609
    0.4 0.816
    0.5 1.025
    0.6 1.236
    0.7 1.449
    0.8 1.664
    0.9 1.881
    1.0 2.1
    1.1 2.321
    1.2 2.544
    1.3 2.769
    1.4 2.996
    1.5 3.225
    1.6 3.456
    1.7 3.689
    1.8 3.924
    1.9 4.161

.. index:: boolean expression

.. index:: True

.. index:: False

So, how do we interpret the contents of this program? First we initialize four
variables: ``v0``, ``a``, ``dt``, and ``t``. Everything after ``#`` on
a line is a *comment* and does not affect what happens in the program, but
is meant to be of help for a human reading the program.
Then comes the while loop:

.. code-block:: python

    while condition:
        <intented statement>
        <intented statement>
        <intented statement>

Observe the colon at the end of the ``while`` line.
The set of indented statements are repeated as long as the expression or
variable ``condition`` evaluates to ``True``. In the present case, the condition
is the *boolean expression* ``t <= 2``, so as long as the value is less
than or equal to 2,
``t <= 2`` evaluates to ``True``, otherwise it evaluates to ``False``.


.. admonition:: Error in the code

   According to the code, the last pass in the while loop should correspond
   to :math:`t=2`, but looking at the output, we see that the last print
   statement has :math:`t=1.9`. The next test in the while condition involves
   :math:`t=2` and the boolean condition is expected to be ``2 == 2``. However,
   it seems that the condition is ``False`` since the computations for
   :math:`t=2` are not printed. Why do we experience this behavior?


.. index:: Python Online Tutor

The `Python Online Tutor <http://pythontutor.com/visualize.html#mode=edit>`__
is a great tool to examine the program flow. Consider this little loop
run in the Python Online
Tutor:

.. raw:: html

    <iframe width="950" height="500" frameborder="0"
            src="http://pythontutor.com/iframe-embed.html#code=a+%3D+0%0Ada+%3D+0.4%0Awhile+a+%3C%3D+1.2%3A%0A++++print+a%0A++++a+%3D+a+%2B+da&curInstr=0&py=2&cumulative=false">
    </iframe>

The Python Online Tutor executes each statement and displays the contents
of variables such that you can track the program flow and the evolution
of variables.

So, why is not ``a`` printed for 1.2? That is, why does ``a == 1.2`` evaluate to
``True`` when ``a`` is (expected to be) 2? We must look at the accuracy of
``a`` to investigate this question and write it out with 16 decimals in
scientific notation (printf specification ``%.16E``):

.. raw:: html

    <iframe width="950" height="500" frameborder="0"
            src="http://pythontutor.com/iframe-embed.html#code=a+%3D+0%0Ada+%3D+0.4%0Awhile+a+%3C%3D+1.2%3A%0A++++print+a%0A++++a+%3D+a+%2B+da%0A++++%23+Inspect+all+decimals+in+da+and+a%0A++++print+%27da%3D%25.16E%5Cna%3D%25.16E%27+%25+%28da%2C+a%29%0A++++print+%27a+%3C%3D+1.2%3A+%25g%27+%25++%28a+%3C%3D+1.2%29&curInstr=0&py=2&cumulative=false">
    </iframe>

The problem is that ``da`` is not exactly 0.4, but contains a small round-off
error. Doing ``a = a + da`` then results in a slightly inaccurate ``a``.
When the exact ``a`` should reach 1.2, the ``a`` in the program has a
an error and equals ``1.2000000000000002``. Obviously the test for
exact equality, ``a == 1.2``, becomes ``False``, and the loop is terminated.


.. admonition:: Rule: use exact equality ``==`` when comparing real numbers

   Always compare real numbers using the difference and a tolerance:
   
   .. code-block:: python
   
       a = 1.2
       b = 0.4 + 0.4 + 0.4
       boolean_condition1 = a == b              # False
       tol = 1E-14
       boolean_condition2 = abs(a - b) < tol    # True


The Python Online Tutor is ideal for demonstrating program flow and
contents of variables in small and simple programs. However, you should
use a real debugger instead when searching for errors in real programs.

.. _basics:list:

Lists
-----

.. index:: list

The table created in the previous section has two columns of data.
We could store all the numbers in each column in a *list* object.
A list is just a collection of objects, here numbers, in a given
sequence. For example,

.. code-block:: python

    L = [-1, 1, 8.0]

is a list of three numbers. Lists are enclosed in square brackets
and may contain any type of objects separated by commas. Here
we mix a filename (string), a real number, and an integer:

.. code-block:: python

    L = ['mydata.txt', 3.14, 10]

The different list *elements* can be reached via indexing: ``L[0]`` is
the first element, ``L[1]`` is the second, and ``L[-1]`` is the last element.
Here is an *interactive Python shell* where we can write Python statements
and examine the contents of variables as we perform various operations:

.. code-block:: python

    >>> L = ['mydata.txt', 3.14, 10]
    >>> print L[0]
    mydata.txt
    >>> print L[1]
    3.14
    >>> del L[0]  # delete the first element
    >>> print L
    [3.14, 10]
    >>> print len(L)  # length of L
    2
    >>> L.append(-1)  # add -1 at the end of the list
    >>> print L
    [3.14, 10, -1]

Let us store the numbers in the previous table in lists, one for each
column. We can start with empty lists ``[]`` and use ``append`` to
add a new element to the lists in each pass of the while loop. Thereafter,
we can run a new while loop and print the contents of the lists in a
nice, tabular fashion:

.. code-block:: python

    v0 = 2
    a = 0.2
    dt = 0.1  # Increment
    t = 0
    t_values = []
    s_values = []
    while t <= 2:
        s = v0*t + 0.5*a*t**2
        t_values.append(t)
        s_values.append(s)
        t = t + dt
    print s_values  # Just take a look at a created list
    
    # Print a nicely formatted table
    i = 0
    while i <= len(t_values)-1:
        print '%.2f  %.4f' % (t_values[i], s_values[i])
        i += 1   # Same as i = i + 1

The output looks like

.. code-block:: text

    [0.0, 0.201, 0.404, 0.6090000000000001, 0.8160000000000001,
     1.025, 1.236, 1.4489999999999998, 1.664, 1.8809999999999998,
     2.0999999999999996, 2.3209999999999997, ...]
    0.00  0.0000
    0.10  0.2010
    0.20  0.4040
    0.30  0.6090
    0.40  0.8160
    0.50  1.0250
    0.60  1.2360
    0.70  1.4490
    ...

Note that ``print s_values`` here leads to output with many
decimals and small round-off errors. To get complete control of the
formatting of real numbers in the table, we use the printf
syntax.

Lists come with a lot of functionality. See the
`Python Tutorial <https://docs.python.org/2/tutorial/introduction.html#lists>`__
for many more examples.

.. _basics:for:

For loops
---------

.. index:: for loop

A for loop is used for visiting elements in a list, one by one:

.. code-block:: python

    >>> L = [1, 4, 8, 9]
    >>> for e in L:
    ...     print e
    ...
    1
    4
    8
    9

The variable ``e`` is successively set equal to the elements in the list,
in the right order. Note the colon at the end of the ``for`` line.
The statements to be executed in each pass of the loop (here only
``print e``) must be indented. When ``e`` is set equal to the last element and
all indented statements are executed, the loop is over, and the program
flow continues with the next statement that is not indented.
Try the following code out in the Python Online Tutor:

.. raw:: html

    <iframe width="950" height="500" frameborder="0"
            src="http://pythontutor.com/iframe-embed.html#code=list1+%3D+%5B0%2C+0.1%2C+0.2%5D%0Alist2+%3D+%5B%5D%0Afor+element+in+somelist%3A%0A++++p+%3D+element+%2B+2%0A++++list2.append%28p%29%0Aprint+list2&curInstr=0&py=2&cumulative=false">
    </iframe>

A for loop over the valid indices in a list is created by

.. code-block:: python

    for i in range(len(somelist)):
        # Work with somelist[i]

The ``range`` function returns a list of integers: ``range(a, b, s)``
returns the integers ``a, a+s, a+2*s, ...`` up to *but not including* ``b``.
Just writing ``range(b)`` implies ``a=0`` and ``s=1``, so
``range(len(somelist))`` returns ``[0, 1, 2]``.

.. index:: list comprehension


.. admonition:: For loops over real numbers

   The ``for i in range(...)`` construction can only run a loop over integers.
   If you need a loop over real values, you have to either create a list
   with the values first, or use a formula where each value is generated
   through an integer counter:
   
   .. code-block:: python
   
       # Need loop over 0, 0.1, 0.2, ..., 1
       values = []
       for i in range(11):
           values.append(i*0.1)
       
       # Shorter version using list comprehension (same as the loop above)
       values = [i*0.1 for i in range(11)]
       
       for value in values:
           # work with value


We can now rewrite our program that used lists and while loops to use
for loops instead:

.. code-block:: python

    v0 = 2
    a = 0.2
    dt = 0.1  # Increment
    t_values = []
    s_values = []
    n = int(round(2/dt)) + 1  # No of t values
    for i in range(n):
        t = i*dt
        s = v0*t + 0.5*a*t**2
        t_values.append(t)
        s_values.append(s)
    print s_values  # Just take a look at a created list
    
    # Make nicely formatted table
    for t, s in zip(t_values, s_values):
        print '%.2f  %.4f' % (t, s)
    
    # Alternative implementation
    for i in range(len(t_values)):
        print '%.2f  %.4f' % (t_values[i], s_values[i])

Observe that we have here used a slightly different technique
for computing the :math:`t`
values inside the first loop: now we set :math:`t` as :math:`i\Delta t`, where
:math:`\Delta t` (``dt`` in the code) is the increment (0.1) between each
:math:`t` value. The computation of ``n``, the number of :math:`t` values,
makes use of ``round`` to make a correct mathematical rounding to the nearest
integer (and ``int`` makes an integer object out of the rounded real number).
(In an interval :math:`[a,b]` divided into subintervals of equal length :math:`\Delta t`,
there will be :math:`1 + (b-a)/\Delta t` points in total.)

.. index:: zip

.. index:: multiple lists traversal with zip

Running through multiple lists simultaneously is done with the ``zip``
construction:

.. code-block:: python

    for e1, e2, e3, ... in zip(list1, list2, list3, ...):

One may instead create a for loop over all the legal index values
instead and index each array,

.. code-block:: python

    for i in range(len(list1)):
        e1 = list1[i]
        e2 = list2[i]
        ...

.. _basics:numpy:

Arrays
------

.. index:: array

.. index:: numpy

Lists are useful for collecting a set of numbers or other objects
in a single variable. Arrays are much like lists, but tailored for collection
of numbers. The primary advantage of arrays is that you can use
them very efficiently and conveniently in mathematical computations,
but the downside is that an array has (in practice) a fixed length and
all elements must be of the same type. This is usually no important
restriction in scientific computations.

Much of Python's functionality are coded in *modules*. To use such
functionality, one must *import* the relevant modules. Arrays, for
example, are available in the ``numpy`` module, which must be imported.
Functions or variables in the module must be prefixed by ``numpy``.
This will be demonstrated below.
Here is an example involving basic operations with arrays:

.. code-block:: python

    >>> L = [1, 4, 10.0]    # List of numbers
    >>> import numpy
    >>> a = numpy.array(L)  # Make corresponding array
    >>> print a
    [  1.   4.  10.]
    >>> print a[1]
    4.0
    >>> print a.dtype       # Data type of an element
    float64
    >>> b = 2*a + 1
    >>> print b
    [  3.   9.  21.]

Note that all elements in the ``a`` array are of ``float`` type
(because one element in ``L`` was ``float``).
Arithmetic expressions such as ``2*a+1`` work with ``a`` as array, but not
as list. In fact, we can pass arrays to mathematical functions:

.. code-block:: python

    >>> c = numpy.log(a)
    >>> print c
    [ 0.          1.38629436  2.30258509]

.. index:: linspace

The ``numpy`` module has a lot of very useful utilities. To create
:math:`n+1` uniformly distributed
coordinates in an interval :math:`[a,b]`, stored in an array, one
can use ``linspace``:

.. code-block:: python

    t = numpy.linspace(a, b, n+1)

This construction makes it easy to create arrays for the :math:`t` and :math:`s`
values in our tables:

.. code-block:: python

    import numpy
    v0 = 2
    a = 0.2
    dt = 0.1  # Increment
    n = int(round(2/dt)) + 1  # No of t values
    
    t_values = numpy.linspace(0, 2, n+1)
    s_values = v0*t + 0.5*a*t**2
    
    # Make nicely formatted table
    for t, s in zip(t_values, s_values):
        print '%.2f  %.4f' % (t, s)

.. _basics:math:

Mathematical functions
----------------------

.. index:: mathematical functions

.. index:: math

Python offers access to all standard mathematical functions such as
:math:`\sin x`, :math:`\cos x`, :math:`\tan x`, :math:`\sinh x`, :math:`\cosh x`, :math:`\tanh x`,
all their inverses (called ``asin(x)``, ``asinh(x)``, and so forth), :math:`e^x`
(``exp(x)``), :math:`\ln x` (``log(x)``), and :math:`x!` (``factorial(x)``).
However, one has to import a module to get access to these functions.
For scalars (single numbers) the relevant module is ``math``:

.. code-block:: python

    >>> import math
    >>> print math.sin(math.pi)
    1.2246467991473532e-16

which shows that the sine function is only approximate (to 16 digits).
Many prefer to write mathematical expressions without the ``math`` prefix:

.. code-block:: python

    from math import sin, pi
    print sin(pi)
    
    # Or import everything from math
    from math import *
    print sin(pi), log(e), tanh(0.5)

The ``numpy`` module contains sine, cosine, and other mathematical functions
that work on scalars as well as arrays.

.. index:: import of modules


.. admonition:: Import of ``numpy``

   To get Python code that is as similar to Matlab as possible, one would
   do a "start import" of everything,
   
   .. code-block:: python
   
       from numpy import *
       x = linspace(0, 1, 101)
       y = exp(-x)*sin(pi*x)
   
   However, in the Python community it has been a culture to use
   a prefix ``np`` as abbreviation for ``numpy``:
   
   .. code-block:: python
   
       import numpy as np
       x = np.linspace(0, 1, 101)
       y = np.exp(-x)*np.sin(np.pi*x)
   
   Our convention is to use the ``np`` prefix for typical Matlab functions,
   but skip the prefix when working with mathematical functions like
   `exp(x)*sin(pi*x)`to get a one-to-one correspondence between formulas
   in the program and in the mathematical description of the problem.
   
   .. code-block:: python
   
       import numpy as np
       from numpy import sin, exp, pi
       x = np.linspace(0, 1, 101)
       y = exp(-x)*sin(pi*x)


.. _basics:plot:

Plotting
--------

.. index:: plotting

.. index:: graphics

.. index:: visualization

.. index:: matplotlib

We can easily make a graph of a function :math:`s(t)` using the module
``matplotlib``. The technique is to compute an array of :math:`t` values
and a corresponding array of function values :math:`s(t)`. Plotting
programs will draw straight lines between the points on the curve,
so a sufficient number of points are needed to give the impression
of a smooth curve. Our :math:`s(t)` function is plotted by the following
code:

.. code-block:: python

    import numpy as np
    import matplotlib.pyplot as plt
    
    v0 = 0.2
    a = 2
    n = 21  # No of t values for plotting
    
    t = np.linspace(0, 2, n+1)
    s = v0*t + 0.5*a*t**2
    
    plt.plot(t, s)
    plt.savefig('myplot.png')
    plt.show()

The plotfile ``myplot.png`` looks like

.. figure:: plot1_s.png
   :width: 500

Matlab users may prefer to do

.. code-block:: python

    from numpy import *
    from matplotlib.pyplot import *

such that they can use ``linspace`` and ``plot`` without any prefix, just as
in Matlab.

Two curves can easily be plotted, this time also with labels on the
axis and a box with legends such that we can distinguish the two curves:

.. code-block:: python

    import numpy as np
    import matplotlib.pyplot as plt
    
    v0 = 0.2
    a = 2
    n = 21  # No of t values for plotting
    
    t = np.linspace(0, 2, n+1)
    s = v0*t + 0.5*a*t**2
    
    plt.plot(t, s)
    plt.savefig('myplot.png')
    plt.show()

.. figure:: plot2_s.png
   :width: 500