.. !split .. _basics:basic:objects: Variables, loops, lists, and arrays =================================== .. _basics:accesspy: Getting access to Python ------------------------ Simple mathematical calculations can be done in plain Python, but for more advanced scientific computing, as we do here, several add-on packages are needed. Getting all software correctly installed used to be quite challenging, but today there are several easy-to-use approaches to get access to Python. Mac and Windows ~~~~~~~~~~~~~~~ We recommend to download and install `Anaconda `__, which is a free distribution of Python that comes with most of the packages you need for advanced scientific computing. Ubuntu ~~~~~~ Debian-based Linux systems, such as Ubuntu, can also use Anaconda, but it is more common to use the ``apt-get`` command-line tool or the Ubuntu installer to install a set of Debian packages. Here is a list of the packages you need to install for this introduction: .. code-block:: text Terminal> sudo apt-get install python-pip python-setuptools \ python-scipy python-sympy python-cython python-matplotlib \ python-dev python-profiler pydb spyder imagemagick gedit vim \ emacs python-mode git mercurial lib-avtools gnome-terminal In addition, run .. code-block:: text Terminal> pip install nose Terminal> pip install pytest Terminal> pip install ipython --upgrade Terminal> pip install tornado --upgrade Terminal> pip install pyzmq --upgrade Web access ~~~~~~~~~~ You can also access Python directly through a web browser without having it installed on your local computer. We refer to the document `How to access Python for doing scientific computing `__ for more details on such tools and also installations based on Anaconda and Ubuntu. Mathematical example -------------------- We shall use a famous mathematical formula in our forthcoming programming examples. The formula tells how long distance :math:`s` an object has moved in a time interval :math:`[0,t]` if it starts out with a velocity :math:`v_0` and undergoes constant acceleration :math:`a`: .. _Eq:basics:seq: .. math:: \tag{1} s = v_0t + \frac{1}{2}at^2\thinspace . We may view :math:`s` as a function of :math:`t`: :math:`s(t)`, and also include the parameters in the notation: :math:`s(t;v_0,a)`. .. _basics:formula:eval: A program for evaluating a formula ---------------------------------- .. index:: text editor .. index:: program file Here is a Python program for computing :math:`s`, given :math:`t=0.5`, :math:`v_0=2`, and :math:`a=0.2`: .. code-block:: python t = 0.5 v0 = 2 a = 0.2 s = v0*t + 0.5*a*t**2 print s The program is pure text and must be placed in a *pure text file* using a *text editor*. Popular text editors are Gedit, Nano, Emacs, and Vim on Linux, TextWrangler on Mac OS X, and Notepad++ on Windows. Save the text to a program file whose name ends in ``.py``, say ``distance.py``. The program is run in a terminal window (Command Prompt on Windows, Terminal application on Mac OS X, ``xterm`` or ``gnome-terminal`` on Linux): .. code-block:: text Terminal> python distance.py 1.025 The result of the ``print`` statement is the number ``1.025`` in the terminal window. As an alternative to writing programs in a text editor and executing them in a terminal window, you may use the `Spyder `__ graphical interface which gives a more Matlab-style environment to work in. Anaconda installations come with Spyder. .. index:: assignment statement .. index:: print statement .. index:: variable The ``distance.py`` program first contains four *assignment statements* where we assign numbers or the results of arithmetic expressions to *variables*. The variables have names coinciding with the mathematical notation used in the formula: ``t``, ``v0``, ``a``, and ``s``. You may think of variables in this programs just as variables in mathematics. .. index:: objects .. index:: float .. index:: int .. admonition:: More technical details A statement like ``t = 0.5`` works as follows. First, the right-hand side is interpreted by Python as a real number and a ``float`` object containing the value 0.5 is created. Then the name ``t`` is defined as a reference to this object. In the statement ``s = v0*t + 0.5*a*t**2``, Python will first go to the right-hand side and observe that it is an arithmetic expression involving variables with known values (or names referring to existing objects). The arithmetic expression is calculated, resulting in a real number that is saved as a ``float`` object with value 1.025 in the computer's memory. Everything in Python is an object of some type. Here, ``t``, ``a``, and ``s`` are ``float`` objects, representing real (floating-point) numbers, while ``v0`` is an ``int`` object, representing the integer 2. There are many other types of objects: strings, lists, tuples, dictionaries, arrays, files, ... .. _basics:printf: Formatted output with text and numbers -------------------------------------- .. index:: formatting of numbers .. index:: printf syntax For any object ``s`` in Python, ``print s`` will (normally) print the contents of ``s``. However, sometimes we want to combine the content of an object with some text. Say we want to print ``s=1.025`` rather than just the number. This is easily accomplished using *printf* syntax: .. code-block:: python print 's=%g' % s The output is specified as a string, enclosed in single or double quotes. Inside the string, there can be "slots" for numbers (or other objects), indicated by a percentage sign followed by a specification of kind of data that will be inserted at this place. In the string above, there is one such slot, ``%g``, where the ``g`` implies a real number written as compactly as possible. It is easy to control the number of decimals using printf syntax. Printing out ``s=1.03``, i.e., ``s`` with two decimals, is done by .. code-block:: python print 's=%.2f' % s where the ``f`` signifies a *decimal number* and the preceding ``.2`` means 2 decimals. Scientific notation, as in ``s=1.03E+00`` (:math:`1.03\cdot 10^0`), is specified as ``%.2E`` (2 decimals). .. index:: format string syntax The printf syntax is available in numerous programming languages. Python also offers a related variant, called *format string syntax*: .. code-block:: python print 's={s:.2f}'.format(s=s) .. _basics:while: While loops ----------- .. index:: while loop Suppose we want to make a table with two columns, one with :math:`t` values and one with the corresponding :math:`s` values. Now we have to repeat a lot of calculations with the formula :ref:`(1) `. This is easily done with a *loop*. There are two types of loops in Python: *while* loops and *for* loops. Let the :math:`t` values go from 0 to 2 in increments of 0.1. The following program applies a *while* loop: .. code-block:: python v0 = 2 a = 0.2 dt = 0.1 # Increment t = 0 # Start value while t <= 2: s = v0*t + 0.5*a*t**2 print t, s t = t + dt The result of running this program in a terminal window is .. code-block:: text Terminal> python while.py 0 0.0 0.1 0.201 0.2 0.404 0.3 0.609 0.4 0.816 0.5 1.025 0.6 1.236 0.7 1.449 0.8 1.664 0.9 1.881 1.0 2.1 1.1 2.321 1.2 2.544 1.3 2.769 1.4 2.996 1.5 3.225 1.6 3.456 1.7 3.689 1.8 3.924 1.9 4.161 .. index:: boolean expression .. index:: True .. index:: False So, how do we interpret the contents of this program? First we initialize four variables: ``v0``, ``a``, ``dt``, and ``t``. Everything after ``#`` on a line is a *comment* and does not affect what happens in the program, but is meant to be of help for a human reading the program. Then comes the while loop: .. code-block:: python while condition: Observe the colon at the end of the ``while`` line. The set of indented statements are repeated as long as the expression or variable ``condition`` evaluates to ``True``. In the present case, the condition is the *boolean expression* ``t <= 2``, so as long as the value is less than or equal to 2, ``t <= 2`` evaluates to ``True``, otherwise it evaluates to ``False``. .. admonition:: Error in the code According to the code, the last pass in the while loop should correspond to :math:`t=2`, but looking at the output, we see that the last print statement has :math:`t=1.9`. The next test in the while condition involves :math:`t=2` and the boolean condition is expected to be ``2 == 2``. However, it seems that the condition is ``False`` since the computations for :math:`t=2` are not printed. Why do we experience this behavior? .. index:: Python Online Tutor The `Python Online Tutor `__ is a great tool to examine the program flow. Consider this little loop run in the Python Online Tutor: .. raw:: html The Python Online Tutor executes each statement and displays the contents of variables such that you can track the program flow and the evolution of variables. So, why is not ``a`` printed for 1.2? That is, why does ``a == 1.2`` evaluate to ``True`` when ``a`` is (expected to be) 2? We must look at the accuracy of ``a`` to investigate this question and write it out with 16 decimals in scientific notation (printf specification ``%.16E``): .. raw:: html The problem is that ``da`` is not exactly 0.4, but contains a small round-off error. Doing ``a = a + da`` then results in a slightly inaccurate ``a``. When the exact ``a`` should reach 1.2, the ``a`` in the program has a an error and equals ``1.2000000000000002``. Obviously the test for exact equality, ``a == 1.2``, becomes ``False``, and the loop is terminated. .. admonition:: Rule: use exact equality ``==`` when comparing real numbers Always compare real numbers using the difference and a tolerance: .. code-block:: python a = 1.2 b = 0.4 + 0.4 + 0.4 boolean_condition1 = a == b # False tol = 1E-14 boolean_condition2 = abs(a - b) < tol # True The Python Online Tutor is ideal for demonstrating program flow and contents of variables in small and simple programs. However, you should use a real debugger instead when searching for errors in real programs. .. _basics:list: Lists ----- .. index:: list The table created in the previous section has two columns of data. We could store all the numbers in each column in a *list* object. A list is just a collection of objects, here numbers, in a given sequence. For example, .. code-block:: python L = [-1, 1, 8.0] is a list of three numbers. Lists are enclosed in square brackets and may contain any type of objects separated by commas. Here we mix a filename (string), a real number, and an integer: .. code-block:: python L = ['mydata.txt', 3.14, 10] The different list *elements* can be reached via indexing: ``L[0]`` is the first element, ``L[1]`` is the second, and ``L[-1]`` is the last element. Here is an *interactive Python shell* where we can write Python statements and examine the contents of variables as we perform various operations: .. code-block:: python >>> L = ['mydata.txt', 3.14, 10] >>> print L[0] mydata.txt >>> print L[1] 3.14 >>> del L[0] # delete the first element >>> print L [3.14, 10] >>> print len(L) # length of L 2 >>> L.append(-1) # add -1 at the end of the list >>> print L [3.14, 10, -1] Let us store the numbers in the previous table in lists, one for each column. We can start with empty lists ``[]`` and use ``append`` to add a new element to the lists in each pass of the while loop. Thereafter, we can run a new while loop and print the contents of the lists in a nice, tabular fashion: .. code-block:: python v0 = 2 a = 0.2 dt = 0.1 # Increment t = 0 t_values = [] s_values = [] while t <= 2: s = v0*t + 0.5*a*t**2 t_values.append(t) s_values.append(s) t = t + dt print s_values # Just take a look at a created list # Print a nicely formatted table i = 0 while i <= len(t_values)-1: print '%.2f %.4f' % (t_values[i], s_values[i]) i += 1 # Same as i = i + 1 The output looks like .. code-block:: text [0.0, 0.201, 0.404, 0.6090000000000001, 0.8160000000000001, 1.025, 1.236, 1.4489999999999998, 1.664, 1.8809999999999998, 2.0999999999999996, 2.3209999999999997, ...] 0.00 0.0000 0.10 0.2010 0.20 0.4040 0.30 0.6090 0.40 0.8160 0.50 1.0250 0.60 1.2360 0.70 1.4490 ... Note that ``print s_values`` here leads to output with many decimals and small round-off errors. To get complete control of the formatting of real numbers in the table, we use the printf syntax. Lists come with a lot of functionality. See the `Python Tutorial `__ for many more examples. .. _basics:for: For loops --------- .. index:: for loop A for loop is used for visiting elements in a list, one by one: .. code-block:: python >>> L = [1, 4, 8, 9] >>> for e in L: ... print e ... 1 4 8 9 The variable ``e`` is successively set equal to the elements in the list, in the right order. Note the colon at the end of the ``for`` line. The statements to be executed in each pass of the loop (here only ``print e``) must be indented. When ``e`` is set equal to the last element and all indented statements are executed, the loop is over, and the program flow continues with the next statement that is not indented. Try the following code out in the Python Online Tutor: .. raw:: html A for loop over the valid indices in a list is created by .. code-block:: python for i in range(len(somelist)): # Work with somelist[i] The ``range`` function returns a list of integers: ``range(a, b, s)`` returns the integers ``a, a+s, a+2*s, ...`` up to *but not including* ``b``. Just writing ``range(b)`` implies ``a=0`` and ``s=1``, so ``range(len(somelist))`` returns ``[0, 1, 2]``. .. index:: list comprehension .. admonition:: For loops over real numbers The ``for i in range(...)`` construction can only run a loop over integers. If you need a loop over real values, you have to either create a list with the values first, or use a formula where each value is generated through an integer counter: .. code-block:: python # Need loop over 0, 0.1, 0.2, ..., 1 values = [] for i in range(11): values.append(i*0.1) # Shorter version using list comprehension (same as the loop above) values = [i*0.1 for i in range(11)] for value in values: # work with value We can now rewrite our program that used lists and while loops to use for loops instead: .. code-block:: python v0 = 2 a = 0.2 dt = 0.1 # Increment t_values = [] s_values = [] n = int(round(2/dt)) + 1 # No of t values for i in range(n): t = i*dt s = v0*t + 0.5*a*t**2 t_values.append(t) s_values.append(s) print s_values # Just take a look at a created list # Make nicely formatted table for t, s in zip(t_values, s_values): print '%.2f %.4f' % (t, s) # Alternative implementation for i in range(len(t_values)): print '%.2f %.4f' % (t_values[i], s_values[i]) Observe that we have here used a slightly different technique for computing the :math:`t` values inside the first loop: now we set :math:`t` as :math:`i\Delta t`, where :math:`\Delta t` (``dt`` in the code) is the increment (0.1) between each :math:`t` value. The computation of ``n``, the number of :math:`t` values, makes use of ``round`` to make a correct mathematical rounding to the nearest integer (and ``int`` makes an integer object out of the rounded real number). (In an interval :math:`[a,b]` divided into subintervals of equal length :math:`\Delta t`, there will be :math:`1 + (b-a)/\Delta t` points in total.) .. index:: zip .. index:: multiple lists traversal with zip Running through multiple lists simultaneously is done with the ``zip`` construction: .. code-block:: python for e1, e2, e3, ... in zip(list1, list2, list3, ...): One may instead create a for loop over all the legal index values instead and index each array, .. code-block:: python for i in range(len(list1)): e1 = list1[i] e2 = list2[i] ... .. _basics:numpy: Arrays ------ .. index:: array .. index:: numpy Lists are useful for collecting a set of numbers or other objects in a single variable. Arrays are much like lists, but tailored for collection of numbers. The primary advantage of arrays is that you can use them very efficiently and conveniently in mathematical computations, but the downside is that an array has (in practice) a fixed length and all elements must be of the same type. This is usually no important restriction in scientific computations. Much of Python's functionality are coded in *modules*. To use such functionality, one must *import* the relevant modules. Arrays, for example, are available in the ``numpy`` module, which must be imported. Functions or variables in the module must be prefixed by ``numpy``. This will be demonstrated below. Here is an example involving basic operations with arrays: .. code-block:: python >>> L = [1, 4, 10.0] # List of numbers >>> import numpy >>> a = numpy.array(L) # Make corresponding array >>> print a [ 1. 4. 10.] >>> print a[1] 4.0 >>> print a.dtype # Data type of an element float64 >>> b = 2*a + 1 >>> print b [ 3. 9. 21.] Note that all elements in the ``a`` array are of ``float`` type (because one element in ``L`` was ``float``). Arithmetic expressions such as ``2*a+1`` work with ``a`` as array, but not as list. In fact, we can pass arrays to mathematical functions: .. code-block:: python >>> c = numpy.log(a) >>> print c [ 0. 1.38629436 2.30258509] .. index:: linspace The ``numpy`` module has a lot of very useful utilities. To create :math:`n+1` uniformly distributed coordinates in an interval :math:`[a,b]`, stored in an array, one can use ``linspace``: .. code-block:: python t = numpy.linspace(a, b, n+1) This construction makes it easy to create arrays for the :math:`t` and :math:`s` values in our tables: .. code-block:: python import numpy v0 = 2 a = 0.2 dt = 0.1 # Increment n = int(round(2/dt)) + 1 # No of t values t_values = numpy.linspace(0, 2, n+1) s_values = v0*t + 0.5*a*t**2 # Make nicely formatted table for t, s in zip(t_values, s_values): print '%.2f %.4f' % (t, s) .. _basics:math: Mathematical functions ---------------------- .. index:: mathematical functions .. index:: math Python offers access to all standard mathematical functions such as :math:`\sin x`, :math:`\cos x`, :math:`\tan x`, :math:`\sinh x`, :math:`\cosh x`, :math:`\tanh x`, all their inverses (called ``asin(x)``, ``asinh(x)``, and so forth), :math:`e^x` (``exp(x)``), :math:`\ln x` (``log(x)``), and :math:`x!` (``factorial(x)``). However, one has to import a module to get access to these functions. For scalars (single numbers) the relevant module is ``math``: .. code-block:: python >>> import math >>> print math.sin(math.pi) 1.2246467991473532e-16 which shows that the sine function is only approximate (to 16 digits). Many prefer to write mathematical expressions without the ``math`` prefix: .. code-block:: python from math import sin, pi print sin(pi) # Or import everything from math from math import * print sin(pi), log(e), tanh(0.5) The ``numpy`` module contains sine, cosine, and other mathematical functions that work on scalars as well as arrays. .. index:: import of modules .. admonition:: Import of ``numpy`` To get Python code that is as similar to Matlab as possible, one would do a "start import" of everything, .. code-block:: python from numpy import * x = linspace(0, 1, 101) y = exp(-x)*sin(pi*x) However, in the Python community it has been a culture to use a prefix ``np`` as abbreviation for ``numpy``: .. code-block:: python import numpy as np x = np.linspace(0, 1, 101) y = np.exp(-x)*np.sin(np.pi*x) Our convention is to use the ``np`` prefix for typical Matlab functions, but skip the prefix when working with mathematical functions like `exp(x)*sin(pi*x)`to get a one-to-one correspondence between formulas in the program and in the mathematical description of the problem. .. code-block:: python import numpy as np from numpy import sin, exp, pi x = np.linspace(0, 1, 101) y = exp(-x)*sin(pi*x) .. _basics:plot: Plotting -------- .. index:: plotting .. index:: graphics .. index:: visualization .. index:: matplotlib We can easily make a graph of a function :math:`s(t)` using the module ``matplotlib``. The technique is to compute an array of :math:`t` values and a corresponding array of function values :math:`s(t)`. Plotting programs will draw straight lines between the points on the curve, so a sufficient number of points are needed to give the impression of a smooth curve. Our :math:`s(t)` function is plotted by the following code: .. code-block:: python import numpy as np import matplotlib.pyplot as plt v0 = 0.2 a = 2 n = 21 # No of t values for plotting t = np.linspace(0, 2, n+1) s = v0*t + 0.5*a*t**2 plt.plot(t, s) plt.savefig('myplot.png') plt.show() The plotfile ``myplot.png`` looks like .. figure:: plot1_s.png :width: 500 Matlab users may prefer to do .. code-block:: python from numpy import * from matplotlib.pyplot import * such that they can use ``linspace`` and ``plot`` without any prefix, just as in Matlab. Two curves can easily be plotted, this time also with labels on the axis and a box with legends such that we can distinguish the two curves: .. code-block:: python import numpy as np import matplotlib.pyplot as plt v0 = 0.2 a = 2 n = 21 # No of t values for plotting t = np.linspace(0, 2, n+1) s = v0*t + 0.5*a*t**2 plt.plot(t, s) plt.savefig('myplot.png') plt.show() .. figure:: plot2_s.png :width: 500