
This chapter is taken from the book A Primer on Scientific Programming with Python by H. P. Langtangen, 5th edition, Springer, 2016.

Summary

Chapter topics

While loops

Loops are used to repeat a collection of program statements several times. The statements that belong to the loop must be consistently indented in Python. A while loop runs as long as a condition evaluates to True:

>>> t = 0; dt = 0.5; T = 2
>>> while t <= T:
...      print t
...      t += dt
...
0
0.5
1.0
1.5
2.0
>>> print 'Final t:', t, '; t <= T is', t <= T
Final t: 2.5 ; t <= T is False
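
As a small illustration that combines a while loop with a list, the sketch below collects the loop values instead of printing them. The function name time_points is made up for this example and is not part of the book's code; the code avoids the print statement's special syntax so that it runs under both Python 2 and Python 3.

```python
def time_points(T, dt):
    """Return the values t = 0, dt, 2*dt, ... that satisfy t <= T."""
    points = []           # empty list to be filled in the loop
    t = 0
    while t <= T:         # the condition is tested before every pass
        points.append(t)
        t += dt
    return points

points = time_points(T=2, dt=0.5)
print(points)             # same values as in the session above
```

With a step like dt = 0.1 that has no exact binary representation, round-off errors in t may change the number of passes, so the exact end condition deserves care in such cases.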

Lists

A list is used to collect a number of values or variables in an ordered sequence.

>>> mylist = [t, dt, T, 'mynumbers.dat', 100]
A list element can be any Python object, for instance a number, a string, a function, or another list.

The table below shows some important list operations (only a subset of these are explained in the present document).

Construction              Meaning
a = []                    initialize an empty list
a = [1, 4.4, 'run.py']    initialize a list
a.append(elem)            add elem object to the end
a + [1,3]                 concatenate two lists (returns a new list)
a.insert(i, e)            insert element e before index i
a[3]                      index a list element
a[-1]                     get last list element
a[1:3]                    slice: copy data to sublist (here: index 1, 2)
del a[3]                  delete an element (index 3)
a.remove(e)               remove the first element with value e
a.index('run.py')         find index corresponding to an element's value
'run.py' in a             test if a value is contained in the list
a.count(v)                count how many elements have the value v
len(a)                    number of elements in list a
min(a)                    the smallest element in a
max(a)                    the largest element in a
sum(a)                    add all elements in a
sorted(a)                 return sorted version of list a
reversed(a)               return an iterator over a in reverse order
b[3][0][2]                nested list indexing
isinstance(a, list)       is True if a is a list
type(a) is list           is True if a is a list
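
The short script below is a sketch that exercises several of the operations in the table on one list, with the expected list content given in the comments. The variable names (last, sub, ones, and so on) are chosen only for this illustration.

```python
a = [1, 4.4, 'run.py']
a.append(7)                    # [1, 4.4, 'run.py', 7]
a = a + [1, 3]                 # [1, 4.4, 'run.py', 7, 1, 3]
a.insert(1, 'new')             # [1, 'new', 4.4, 'run.py', 7, 1, 3]

last = a[-1]                   # 3
sub = a[1:3]                   # ['new', 4.4] (a copy, a is unchanged)

del a[3]                       # removes 'run.py'
a.remove('new')                # a is now [1, 4.4, 7, 1, 3]

ones = a.count(1)              # 2 elements have the value 1
where = a.index(7)             # 7 is found at index 2
total = sum(a)                 # approximately 16.4 (float arithmetic)
ordered = sorted(a)            # [1, 1, 3, 4.4, 7], a itself is unchanged
backwards = list(reversed(a))  # [3, 1, 7, 4.4, 1]
```

Note that sum, min, max, and sorted require elements that can be added or compared, which is why the string elements are removed before these calls.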

Nested lists

If the list elements are also lists, we have a nested list. The following session summarizes indexing and loop traversal of nested lists:

>>> nl = [[0, 0, 1], [-1, -1, 2], [-10, 10, 5]]
>>> nl[0]
[0, 0, 1]
>>> nl[-1]
[-10, 10, 5]
>>> nl[0][2]
1
>>> nl[-1][0]
-10
>>> for p in nl:
...     print p
...
[0, 0, 1]
[-1, -1, 2]
[-10, 10, 5]
>>> for a, b, c in nl:
...     print '%3d %3d %3d' % (a, b, c)
...
  0   0   1
 -1  -1   2
-10  10   5

Tuples

A tuple can be viewed as a constant list: no changes in the contents of the tuple are allowed. Tuples are written with standard parentheses or with no parentheses, and the elements are separated by commas as in lists:

>>> mytuple = (t, dt, T, 'mynumbers.dat', 100)
>>> mytuple =  t, dt, T, 'mynumbers.dat', 100
Many list operations are also valid for tuples (indexing, slicing, index, count, and len, for instance), but those that change the content cannot be used with tuples (examples are append, del, remove, insert, and sort).
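
The following sketch illustrates the difference: reading from a tuple works as for lists, while an attempt to change an element raises a TypeError. The numerical values stand in for t, dt, and T, and the variable names are chosen only for this example.

```python
mytuple = (0, 0.5, 2, 'mynumbers.dat', 100)

first = mytuple[0]                         # indexing works as for lists
part = mytuple[1:3]                        # slicing gives a new tuple
position = mytuple.index('mynumbers.dat')  # index only reads the tuple

try:
    mytuple[0] = 10                        # in-place change is not allowed
    changed = True
except TypeError:
    changed = False

# Adding tuples creates a new tuple and leaves mytuple untouched
longer = mytuple + (200,)
```

The trailing comma in (200,) is what makes the expression a tuple with one element rather than a number in parentheses.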

An object a containing an ordered collection of other objects, such that a[i] refers to the object with index i in the collection, is known as a sequence in Python. Lists, tuples, strings, and arrays are examples of sequences. You choose a sequence type when there is a natural ordering of elements. For a collection of unordered objects a dictionary is often more convenient.

For loops

A for loop is used to run through the elements of a list or a tuple:

>>> for elem in [10, 20, 25, 27, 28.5]:
...     print elem,
...
10 20 25 27 28.5
The trailing comma after the print statement suppresses the newline character that print would otherwise add automatically.

The range function is frequently used in for loops over a sequence of integers. Recall that range(start, stop, inc) does not include the upper limit stop among the list items.

>>> for elem in range(1, 5, 2):
...     print elem,
...
1 3
>>> range(1, 5, 2)
[1, 3]

A sum \( \sum_{j=M}^N q(j) \), where \( q(j) \) is some mathematical expression involving the integer counter \( j \), is normally implemented using a for loop. Choosing, e.g., \( q(j) = 1/j^{2} \), the sum is calculated by

s = 0  # accumulation variable
for j in range(M, N+1, 1):
    s += 1./j**2
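
As a quick check of such a loop, the sketch below wraps it in a function (the name partial_sum is hypothetical) and compares the result for M = 1 and a large N with the known limit \( \sum_{j=1}^{\infty} 1/j^2 = \pi^2/6 \).

```python
from math import pi

def partial_sum(M, N):
    """Return the sum of 1/j**2 for j = M, M+1, ..., N."""
    s = 0                          # accumulation variable
    for j in range(M, N+1):        # N+1 since range excludes the stop value
        s += 1.0/j**2              # 1.0 forces float division
    return s

s = partial_sum(1, 1000)
error = pi**2/6 - s                # what remains of the infinite sum
print('sum: %.6f, remaining part: %.6f' % (s, error))
```

The remaining part decays roughly like 1/N, so this particular sum converges slowly.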

Pretty print

To print a list a, print a can be used, but the pprint function in the pprint and scitools.pprint2 modules gives a nicer layout of the output for long and nested lists. The scitools.pprint2 module can also control the formatting of floating-point numbers.
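
A minimal sketch of pretty printing with the standard pprint module is shown below (scitools.pprint2 is not used here). The nested list table is made up for the illustration; pformat returns the formatted text as a string, while pprint.pprint prints it directly.

```python
import pprint

# A nested list that is too long to fit on one line
table = [[i*j for j in range(12)] for i in range(6)]

text = pprint.pformat(table)       # formatted text as a string
print(text)                        # the sublists appear on separate lines

# The width argument sets the maximum line length (default 80)
narrow = pprint.pformat(table, width=30)
print(narrow)
```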

Terminology

The important computer science terms in this document are

Example: Analyzing list data

Problem

The file src/misc/Oxford_sun_hours.txt contains data on the number of sun hours in Oxford, UK, for every month since January 1929. The data are already in a suitable nested list format:

[
[43.8, 60.5, 190.2, ...],
[49.9, 54.3, 109.7, ...],
[63.7, 72.0, 142.3, ...],
...
]
The list in every line holds the number of sun hours for each of the year's 12 months. That is, the first index in the nested list corresponds to year and the second index corresponds to the month number. More precisely, the double index [i][j] corresponds to year 1929 + i and month 1 + j (January being month number 1).

The task is to define this nested list in a program and do the following data analysis.

Solution

Initializing the data is easy: just copy the data from the Oxford_sun_hours.txt file into the program file and set a variable name on the left hand side (the long and wide code is only indicated here):

data = [
[43.8, 60.5, 190.2, ...],
[49.9, 54.3, 109.7, ...],
[63.7, 72.0, 142.3, ...],
...
]

For task 1, we need to establish a list monthly_mean with the results from the computation, i.e., monthly_mean[2] holds the average number of sun hours for March in the period 1929-2009. The average is computed in the standard way: for each month, we run through all the years, sum up the values, and finally divide by the number of years (len(data) or \( 2009-1929+1 \)).

monthly_mean = []
n = len(data)   # no of years
for m in range(12): # counter for month indices
    s = 0           # sum
    for y in data:  # loop over "rows" (first index) in data
        s += y[m]   # add value for month m
    monthly_mean.append(s/n)

An alternative solution would be to introduce separate variables for the monthly averages, say Jan_mean, Feb_mean, etc. The reader should as an exercise write the code associated with such a solution and realize that using the monthly_mean list is more elegant and yields much simpler and shorter code. Separate variables might be an okay solution for 2-3 variables, but not for as many as 12.
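
The averaging loop can also be tried on a small excerpt. The sketch below is an illustration only: the function name column_means is hypothetical, and the excerpt holds just the three values per year that are visible in the listing above, so the resulting means cover the first three months of three years and differ from the full 1929-2009 averages.

```python
def column_means(data):
    """Return the mean of each column in the nested list data."""
    n = len(data)                  # number of rows (years)
    means = []
    for m in range(len(data[0])):  # counter for column (month) indices
        s = 0
        for row in data:           # loop over rows (first index)
            s += row[m]            # add value for column m
        means.append(s/float(n))   # float(n) avoids integer division
    return means

# Only the values shown in the text; the rest of each year is elided
excerpt = [
    [43.8, 60.5, 190.2],
    [49.9, 54.3, 109.7],
    [63.7, 72.0, 142.3],
]
means = column_means(excerpt)
for value in means:
    print('%.1f' % value)
```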

Perhaps we want a nice-looking printout of the results. This can elegantly be created by first defining a tuple (or list) of the names of the months and then running through this list in parallel with monthly_mean:

month_names = 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',\
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
for name, value in zip(month_names, monthly_mean):
    print '%s: %.1f' % (name, value)
The printout becomes

Jan: 56.6
Feb: 72.7
Mar: 116.5
Apr: 153.2
May: 191.1
Jun: 198.5
Jul: 193.8
Aug: 184.3
Sep: 138.3
Oct: 104.6
Nov: 67.4
Dec: 52.4

Task 2 can be solved by pure inspection of the above printout, which reveals that June is the winner. However, since we are learning programming, we should be able to replace our eyes with some computer code to automate the task. The maximum value max_value of a list like monthly_mean is simply obtained by max(monthly_mean). The corresponding index, needed to find the right name of the corresponding month, is found from monthly_mean.index(max_value). The code for task 2 is then

max_value = max(monthly_mean)
month = month_names[monthly_mean.index(max_value)]
print '%s has best weather with %.1f sun hours on average' % \
      (month, max_value)
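
To see what max and index do behind the scenes, the sketch below repeats the search with an explicit loop. The monthly_mean values are the rounded numbers from the printout above, not recomputed from the data file, and best_index is a name introduced for this illustration.

```python
month_names = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')
# Rounded values copied from the printout of the monthly means
monthly_mean = [56.6, 72.7, 116.5, 153.2, 191.1, 198.5,
                193.8, 184.3, 138.3, 104.6, 67.4, 52.4]

# Built-in version: largest value first, then its index
max_value = max(monthly_mean)
month = month_names[monthly_mean.index(max_value)]

# Explicit loop: remember the index of the largest value seen so far
best_index = 0
for i in range(1, len(monthly_mean)):
    if monthly_mean[i] > monthly_mean[best_index]:
        best_index = i

print('%s and %s should be the same month' % (month, month_names[best_index]))
```

If the largest value occurs more than once, both versions pick its first occurrence.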

Task 3 requires us to first develop an algorithm for how to compute the decade averages. The algorithm, expressed in words, goes as follows. We loop over the decades, and for each decade, we loop over its years, and for each year, we add the December data of the previous year and the January data of the current year to an accumulation variable. Dividing this accumulation variable by \( 10\cdot 2\cdot 30 \) (10 years, 2 months per year, and about 30 days per month) gives the average number of sun hours per day in the winter time for the particular decade. The code segment below expresses this algorithm in the Python language:

decade_mean = []
Jan_index = 0; Dec_index = 11  # month indices, the same for all decades
for decade_start in range(1930, 2010, 10):
    s = 0
    for year in range(decade_start, decade_start+10):
        y = year - 1929  # list index
        s += data[y-1][Dec_index] + data[y][Jan_index]
    decade_mean.append(s/(20.*30))
for i in range(len(decade_mean)):
    print 'Decade %d-%d: %.1f' % \
          (1930+i*10, 1939+i*10, decade_mean[i])
The output becomes

Decade 1930-1939: 1.7
Decade 1940-1949: 1.8
Decade 1950-1959: 1.8
Decade 1960-1969: 1.8
Decade 1970-1979: 1.6
Decade 1980-1989: 2.0
Decade 1990-1999: 1.8
Decade 2000-2009: 2.1
The complete code is found in the file sun_data.py.
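
One way to gain confidence in the decade algorithm is to run it on data where the answer is known in advance. The sketch below is such a check and not part of sun_data.py: the function name winter_decade_means and the synthetic data set (30 sun hours in every month, which must give exactly 1.0 hour per day) are assumptions made for this illustration.

```python
def winter_decade_means(data, first_year=1929, start=1930, stop=2010):
    """Average sun hours per day in Dec/Jan for each decade in data."""
    Jan_index = 0; Dec_index = 11          # month indices
    means = []
    for decade_start in range(start, stop, 10):
        s = 0
        for year in range(decade_start, decade_start + 10):
            y = year - first_year          # list index of this year
            # December of the previous year plus January of this year
            s += data[y-1][Dec_index] + data[y][Jan_index]
        means.append(s/(20.*30))           # 20 months of about 30 days
    return means

# Synthetic test data, NOT the Oxford measurements
fake_data = [[30.0]*12 for year in range(1929, 2010)]
means = winter_decade_means(fake_data)
print(means)
```

The loop starts in 1930 because the first winter needs December data from 1929, the first year in the data set.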

Remark

The file Oxford_sun_hours.txt is based on data from the UK Met Office.

How to find more Python information

This document contains only fragments of the Python language. When doing your own projects or exercises you will certainly feel the need for looking up more detailed information on modules, objects, etc. Fortunately, there is a lot of excellent documentation on the Python programming language.

The primary reference is the official Python documentation website: docs.python.org. Here you can find a Python tutorial, the very useful Library Reference [1], and a Language Reference, to mention some key documents. When you wonder what functions you can find in a module, say the math module, you can go to the Library Reference and search for math, which quickly leads you to the official documentation of the math module. Alternatively, you can go to the index of the Library Reference and pick the math (module) item directly. Similarly, if you want to look up more details of the printf formatting syntax, go to the index and follow the printf-style formatting entry.

Warning.

A word of caution is probably necessary here. Reference manuals are very technical and written primarily for experts, so it can be quite difficult for a newbie to understand the information. An important ability is to browse such manuals and dig out the key information you are looking for, without being annoyed by all the text you do not understand. As with programming, reading manuals efficiently requires a lot of training.

A tool somewhat similar to the Python Standard Library documentation is the pydoc program. In a terminal window you write

Terminal> pydoc math
In IPython there are two corresponding possibilities, either

In [1]: !pydoc math
or

In [2]: import math
In [3]: help(math)
The documentation of the complete math module is shown as plain text. If a specific function is wanted, we can ask for that directly, e.g., pydoc math.tan. Since pydoc is very fast, many prefer pydoc over web pages, but pydoc often has less information than the web documentation of modules.
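
The same information can also be reached from inside a program. The sketch below uses the built-in dir function and the __doc__ attribute, which hold the names and documentation strings that pydoc and help display; the exact wording of the documentation string may vary between Python versions.

```python
import math

# All public names (functions and constants) in the math module
names = [name for name in dir(math) if not name.startswith('_')]
print('math provides %d public names' % len(names))

# The documentation string of a single function
doc = math.tan.__doc__
print(doc)
```

If the pydoc command is not available in the terminal, running python -m pydoc math.tan gives the same output as pydoc math.tan.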

There are also a large number of books about Python. Beazley [2] is an excellent reference that improves and extends the information in the web documentation. The Learning Python book [3] has been very popular for many years as an introduction to the language. There is a special web page listing most Python books on the market. Very few books target scientific computing with Python, but [4] gives an introduction to Python for mathematical applications and is more compact and advanced than the present book. It also serves as an excellent reference for the capabilities of Python in a scientific context. A comprehensive book on the use of Python for assisting and automating scientific work is [5].

Quick references, which list almost all Python functionality in compact tabular form, are very handy. We recommend in particular the one by Richard Gruet [6].

The website http://www.python.org/doc/ contains a list of useful Python introductions and reference manuals.