This chapter is taken from the book A Primer on Scientific Programming with Python by H. P. Langtangen, 5th edition, Springer, 2016.
Loops are used to repeat a collection of program statements several times. The statements that belong to the loop must be consistently indented in Python. A while loop runs as long as a condition evaluates to True:
>>> t = 0; dt = 0.5; T = 2
>>> while t <= T:
...     print t
...     t += dt
...
0
0.5
1.0
1.5
2.0
>>> print 'Final t:', t, '; t <= T is', t <= T
Final t: 2.5 ; t <= T is False
A list is used to collect a number of values or variables in an ordered sequence.
>>> mylist = [t, dt, T, 'mynumbers.dat', 100]
A list element can be any Python object, for instance a number, a string, a function, or another list.
The table below shows some important list operations (only a subset of these is explained in the present document).
Construction | Meaning |
---|---|
a = [] | initialize an empty list |
a = [1, 4.4, 'run.py'] | initialize a list |
a.append(elem) | add elem object to the end |
a + [1,3] | add two lists |
a.insert(i, e) | insert element e before index i |
a[3] | index a list element |
a[-1] | get last list element |
a[1:3] | slice: copy data to sublist (here: index 1, 2) |
del a[3] | delete an element (index 3) |
a.remove(e) | remove the first element with value e |
a.index('run.py') | find index corresponding to an element's value |
'run.py' in a | test if a value is contained in the list |
a.count(v) | count how many elements have the value v |
len(a) | number of elements in list a |
min(a) | the smallest element in a |
max(a) | the largest element in a |
sum(a) | add all elements in a |
sorted(a) | return sorted version of list a |
reversed(a) | return an iterator over the elements of a in reverse order (not sorted) |
b[3][0][2] | nested list indexing |
isinstance(a, list) | is True if a is a list |
type(a) is list | is True if a is a list |
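A few of the operations in the table can be tried out in a short script. The following is only a sketch: the list contents are made up for illustration, and print is called with a single parenthesized argument so that the code runs under both Python 2 and Python 3.

```python
# Hypothetical example list, used only to exercise operations from the table
a = [1, 4.4, 'run.py']
a.append(7)                  # a is now [1, 4.4, 'run.py', 7]
a.insert(1, 'new')           # a is now [1, 'new', 4.4, 'run.py', 7]
b = a + [1, 3]               # a new, longer list; a itself is unchanged

last = a[-1]                 # 7
sub = a[1:3]                 # ['new', 4.4], a copy of the elements at index 1 and 2
pos = a.index('run.py')      # 3
found = 'run.py' in a        # True

del a[3]                     # removes 'run.py' by its index
a.remove('new')              # removes the first element equal to 'new'

numbers = [3, 1, 2]
ascending = sorted(numbers)           # [1, 2, 3]; numbers is unchanged
backwards = list(reversed(numbers))   # [2, 1, 3]; reverse order, not sorted

print(a)
print(ascending)
print(backwards)
```

Note that sorted returns a new list, while reversed returns an iterator, which is why it is wrapped in list here.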
If the list elements are also lists, we have a nested list. The following session summarizes indexing and loop traversal of nested lists:
>>> nl = [[0, 0, 1], [-1, -1, 2], [-10, 10, 5]]
>>> nl[0]
[0, 0, 1]
>>> nl[-1]
[-10, 10, 5]
>>> nl[0][2]
1
>>> nl[-1][0]
-10
>>> for p in nl:
...     print p
...
[0, 0, 1]
[-1, -1, 2]
[-10, 10, 5]
>>> for a, b, c in nl:
...     print '%3d %3d %3d' % (a, b, c)
...
  0   0   1
 -1  -1   2
-10  10   5
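Nested lists can also be traversed with integer indices instead of looping directly over the elements. The sketch below sums all numbers in the nl list from the session above in both ways; the variable names total and total2 are chosen here for illustration.

```python
nl = [[0, 0, 1], [-1, -1, 2], [-10, 10, 5]]

# Index-based traversal: i runs over the rows, j over the elements in row i
total = 0
for i in range(len(nl)):
    for j in range(len(nl[i])):
        total += nl[i][j]

# Element-based traversal of the same nested list
total2 = 0
for row in nl:
    for value in row:
        total2 += value

print(total)
print(total2)
```

The index-based form is needed when the position itself matters, for example when the index encodes a year or a month.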
A tuple can be viewed as a constant list: no changes in the contents of the tuple are allowed. Tuples employ standard parentheses or no parentheses, and elements are separated by commas as in lists:
>>> mytuple = (t, dt, T, 'mynumbers.dat', 100)
>>> mytuple = t, dt, T, 'mynumbers.dat', 100
Many list operations are also valid for tuples, but those that change the list content cannot be used with tuples (examples are append, del, remove, and sort). Operations that only read the content, such as index, work for tuples too.
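The difference between reading and changing a tuple can be checked directly. This is a minimal sketch with made-up tuple contents; the names changed, has_append, and longer are introduced only for the demonstration.

```python
mytuple = (0, 0.5, 2, 'mynumbers.dat', 100)   # hypothetical values

# Operations that only read the tuple work as for lists
n = len(mytuple)                       # 5
pos = mytuple.index('mynumbers.dat')   # 3
first_two = mytuple[:2]                # (0, 0.5)

# Assigning to an element would change the contents and therefore fails
try:
    mytuple[0] = 1.5
    changed = True
except TypeError:
    changed = False

has_append = hasattr(mytuple, 'append')   # False: tuples have no append method

# Adding two tuples creates a new tuple; mytuple itself is untouched
longer = mytuple + (-1,)

print(changed)
print(has_append)
print(longer)
```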
An object a containing an ordered collection of other objects, such that a[i] refers to the object with index i in the collection, is known as a sequence in Python. Lists, tuples, strings, and arrays are examples of sequences. You choose a sequence type when there is a natural ordering of elements. For a collection of unordered objects a dictionary is often more convenient.
A for loop is used to run through the elements of a list or a tuple:
>>> for elem in [10, 20, 25, 27, 28.5]:
...     print elem,
...
10 20 25 27 28.5
The trailing comma after the print statement prevents the newline character, which print would otherwise add automatically.
The range function is frequently used in for loops over a sequence of integers. Recall that range(start, stop, inc) does not include the upper limit stop among the list items.
>>> for elem in range(1, 5, 2):
...     print elem,
...
1 3
>>> range(1, 5, 2)
[1, 3]
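The exclusion of the upper limit is easy to get wrong, so a few more cases may help. In the sketch below every range call is wrapped in list, which is harmless in Python 2 (where range already returns a list) and makes the same code work in Python 3; the variable names are chosen here for illustration.

```python
odd = list(range(1, 5, 2))        # [1, 3]: 5 is not included
upto4 = list(range(1, 5))         # [1, 2, 3, 4]: the increment defaults to 1
from0 = list(range(3))            # [0, 1, 2]: the start defaults to 0
down = list(range(5, 1, -2))      # [5, 3]: a negative increment counts down

# To include an upper limit N, pass N+1 as the stop value
N = 4
inclusive = list(range(1, N+1))   # [1, 2, 3, 4]

print(odd)
print(down)
print(inclusive)
```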
A sum \( \sum_{j=M}^N q(j) \), where \( q(j) \) is some mathematical expression involving the integer counter \( j \), is normally implemented using a for loop. Choosing, e.g., \( q(j) = 1/j^{2} \), the sum is calculated by
s = 0 # accumulation variable
for j in range(M, N+1, 1):
    s += 1./j**2
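To run the loop, M and N must be given values. The sketch below assumes M = 1 and N = 100 (arbitrary choices for illustration) and compares the loop with the same sum written using the built-in sum function.

```python
M = 1      # assumed lower limit
N = 100    # assumed upper limit

s = 0  # accumulation variable
for j in range(M, N+1):
    s += 1./j**2   # ** binds tighter than /, so this is 1/(j*j)

# The same sum expressed with the built-in sum function
s2 = sum(1./j**2 for j in range(M, N+1))

print('%.6f %.6f' % (s, s2))
```

Writing 1. rather than 1 ensures float division also in Python 2, where 1/j**2 with integers would give zero for j > 1.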
To print a list a, print a can be used, but the pprint and scitools.pprint2 modules and their pprint function give a nicer layout of the output for long and nested lists. The scitools.pprint2 module has the possibility to control the formatting of floating-point numbers.
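As an illustration of the standard library's pprint module only (scitools.pprint2 is not used here), the sketch below formats a small nested list. The width argument is set to an artificially low value so that the list does not fit on one line.

```python
import pprint

# Small nested list; with width=30 it is too long for a single line
nl = [[0, 0, 1], [-1, -1, 2], [-10, 10, 5]]

text = pprint.pformat(nl, width=30)   # returns the formatted string
print(text)

pprint.pprint(nl, width=30)           # prints the same layout directly
```

pformat is convenient when the formatted text should be stored or written to a file rather than printed at once.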
The important computer science terms in this document include slices (a[i:j]), the while loop, and the for loop.
The file src/misc/Oxford_sun_hours.txt contains data of the number of sun hours in Oxford, UK, for every month since January 1929. The data are already in a suitable nested list format:
[
[43.8, 60.5, 190.2, ...],
[49.9, 54.3, 109.7, ...],
[63.7, 72.0, 142.3, ...],
...
]
The list in every line holds the number of sun hours for each of the year's 12 months. That is, the first index in the nested list corresponds to year and the second index corresponds to the month number. More precisely, the double index [i][j] corresponds to year 1929 + i and month 1 + j (January being month number 1).
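The task is to define this nested list in a program and do the following data analysis: (1) compute the average number of sun hours for each month during the period 1929-2009, (2) find which month has the best weather according to these averages, and (3) for each decade from 1930-1939 to 2000-2009, compute the average number of sun hours per day in the winter months December and January.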
Initializing the data is easy: just copy the data from the Oxford_sun_hours.txt file into the program file and set a variable name on the left hand side (the long and wide code is only indicated here):
data = [
[43.8, 60.5, 190.2, ...],
[49.9, 54.3, 109.7, ...],
[63.7, 72.0, 142.3, ...],
...
]
For task 1, we need to establish a list monthly_mean with the results from the computation, i.e., monthly_mean[2] holds the average number of sun hours for March in the period 1929-2009. The average is computed in the standard way: for each month, we run through all the years, sum up the values, and finally divide by the number of years (len(data) or \( 2009-1929+1 \)).
monthly_mean = []
n = len(data)            # no of years
for m in range(12):      # counter for month indices
    s = 0                # sum
    for y in data:       # loop over "rows" (first index) in data
        s += y[m]        # add value for month m
    monthly_mean.append(s/n)
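The double loop can be tested on a small data set where the answer is known. The sketch below uses made-up numbers for three "years" with four "months" each (not the Oxford data) and, as an alternative that is not part of the original solution, computes the same averages with zip(*data), which regroups the rows so that each tuple holds the values of one month.

```python
# Made-up numbers: three "years" with four "months" each (not the Oxford data)
data = [
    [40.0, 60.0, 100.0, 150.0],
    [50.0, 70.0, 110.0, 160.0],
    [60.0, 80.0, 120.0, 170.0],
]
n = len(data)  # no of years

# Nested-loop version as in the text, but for len(data[0]) months
monthly_mean = []
for m in range(len(data[0])):
    s = 0
    for y in data:
        s += y[m]
    monthly_mean.append(s/n)

# Alternative: zip(*data) yields one tuple per month (column)
monthly_mean2 = [sum(month)/n for month in zip(*data)]

print(monthly_mean)
print(monthly_mean2)
```

Since the data values are floats, s/n is a float division in both Python 2 and Python 3.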
An alternative solution would be to introduce separate variables for the monthly averages, say Jan_mean, Feb_mean, etc. The reader should as an exercise write the code associated with such a solution and realize that using the monthly_mean list is more elegant and yields much simpler and shorter code. Separate variables might be an okay solution for 2-3 variables, but not for as many as 12.
Perhaps we want a nice-looking printout of the results. This can elegantly be created by first defining a tuple (or list) of the names of the months and then running through this list in parallel with monthly_mean:
month_names = 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',\
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
for name, value in zip(month_names, monthly_mean):
    print '%s: %.1f' % (name, value)
The printout becomes
Jan: 56.6
Feb: 72.7
Mar: 116.5
Apr: 153.2
May: 191.1
Jun: 198.5
Jul: 193.8
Aug: 184.3
Sep: 138.3
Oct: 104.6
Nov: 67.4
Dec: 52.4
Task 2 can be solved by pure inspection of the above printout, which reveals that June is the winner. However, since we are learning programming, we should be able to replace our eyes with some computer code to automate the task. The maximum value max_value of a list like monthly_mean is simply obtained by max(monthly_mean). The corresponding index, needed to find the right name of the corresponding month, is found from monthly_mean.index(max_value).
The code for task 2 is then
max_value = max(monthly_mean)
month = month_names[monthly_mean.index(max_value)]
print '%s has best weather with %.1f sun hours on average' % \
(month, max_value)
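A possible alternative, sketched here and not part of the original solution, is to let max work on (value, name) pairs so that the value and the month name are found in one step. The monthly averages are copied from the printout above, rounded to one decimal.

```python
month_names = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')
# Monthly averages copied from the printout (one decimal)
monthly_mean = [56.6, 72.7, 116.5, 153.2, 191.1, 198.5,
                193.8, 184.3, 138.3, 104.6, 67.4, 52.4]

# Tuples are compared element by element, so the largest value wins
max_value, month = max(zip(monthly_mean, month_names))

print('%s has best weather with %.1f sun hours on average' %
      (month, max_value))
```

If two months had exactly the same average, this version would pick the one whose name comes last alphabetically, whereas the index-based code picks the first month in the list.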
Task 3 requires us to first develop an algorithm for computing the decade averages. Expressed in words, the algorithm goes as follows. We loop over the decades, and for each decade, we loop over its years, and for each year, we add the December data of the previous year and the January data of the current year to an accumulation variable. Dividing this accumulation variable by \( 10\cdot 2\cdot 30 \) (10 years, 2 months per year, and approximately 30 days per month) gives the average number of sun hours per day in the winter time for the particular decade. The code segment below expresses this algorithm in the Python language:
decade_mean = []
for decade_start in range(1930, 2010, 10):
    Jan_index = 0; Dec_index = 11  # indices
    s = 0
    for year in range(decade_start, decade_start+10):
        y = year - 1929  # list index
        s += data[y-1][Dec_index] + data[y][Jan_index]
    decade_mean.append(s/(20.*30))
for i in range(len(decade_mean)):
    print 'Decade %d-%d: %.1f' % \
          (1930+i*10, 1939+i*10, decade_mean[i])
The output becomes
Decade 1930-1939: 1.7
Decade 1940-1949: 1.8
Decade 1950-1959: 1.8
Decade 1960-1969: 1.8
Decade 1970-1979: 1.6
Decade 1980-1989: 2.0
Decade 1990-1999: 1.8
Decade 2000-2009: 2.1
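One way to gain confidence in the index arithmetic is to run the algorithm on synthetic data where the answer is obvious. The sketch below wraps the inner loop in a function, winter_mean_per_day, which is a hypothetical name introduced here, and feeds it 81 years with 60 sun hours in every month. The expected result is then 60/30 = 2 sun hours per day for every decade.

```python
def winter_mean_per_day(data, first_year, decade_start):
    """Average sun hours per day in December (previous year) and January
    over the ten years starting at decade_start, assuming 30-day months."""
    Jan_index = 0; Dec_index = 11  # indices
    s = 0
    for year in range(decade_start, decade_start+10):
        y = year - first_year  # list index
        s += data[y-1][Dec_index] + data[y][Jan_index]
    return s/(20.*30)

# Synthetic data (not the Oxford data): 81 years, 60 sun hours every month
data = [[60.0]*12 for i in range(81)]

decade_mean = [winter_mean_per_day(data, 1929, start)
               for start in range(1930, 2010, 10)]
print(decade_mean)
```

The first decade must start in 1930 rather than 1929, since the algorithm needs the December value of the preceding year.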
The complete code is found in the file sun_data.py. The file Oxford_sun_hours.txt is based on data from the UK Met Office.
This document contains only fragments of the Python language. When doing your own projects or exercises you will certainly feel the need for looking up more detailed information on modules, objects, etc. Fortunately, there is a lot of excellent documentation on the Python programming language.
The primary reference is the official Python documentation website: docs.python.org. Here you can find a Python tutorial, the very useful Library Reference [1], and a Language Reference, to mention some key documents. When you wonder what functions you can find in a module, say the math module, you can go to the Library Reference and search for math, which quickly leads you to the official documentation of the math module. Alternatively, you can go to the index of this document and pick the math (module) item directly. Similarly, if you want to look up more details of the printf formatting syntax, go to the index and follow the printf-style formatting entry.
A word of caution is probably necessary here. Reference manuals are very technical and written primarily for experts, so it can be quite difficult for a newbie to understand the information. An important ability is to browse such manuals and dig out the key information you are looking for, without being annoyed by all the text you do not understand. As with programming, reading manuals efficiently requires a lot of training.
A tool somewhat similar to the Python Standard Library documentation is the pydoc program. In a terminal window you write
Terminal> pydoc math
In IPython there are two corresponding possibilities, either
In [1]: !pydoc math
or
In [2]: import math
In [3]: help(math)
The documentation of the complete math module is shown as plain text. If a specific function is wanted, we can ask for that directly, e.g., pydoc math.tan. Since pydoc is very fast, many prefer pydoc over web pages, but pydoc often has less information than the web documentation of modules.
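The text shown by pydoc and help is built from documentation strings stored in the objects themselves, and these can also be inspected from a program. The following is a minimal sketch using only the standard math module; the variable names doc and names are chosen here for illustration.

```python
import math

# The documentation string of a function is stored in its __doc__ attribute
doc = math.tan.__doc__
print(doc)

# dir lists the names defined in a module; skip the internal ones
names = [name for name in dir(math) if not name.startswith('_')]
print(names[:5])
```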
There are also a large number of books about Python. Beazley [2] is an excellent reference that improves and extends the information in the web documentation. The Learning Python book [3] has been very popular for many years as an introduction to the language. There is a special web page listing most Python books on the market. Very few books target scientific computing with Python, but [4] gives an introduction to Python for mathematical applications and is more compact and advanced than the present book. It also serves as an excellent reference for the capabilities of Python in a scientific context. A comprehensive book on the use of Python for assisting and automating scientific work is [5].
Quick references, which list almost all Python functionality in compact tabular form, are very handy. We recommend in particular the one by Richard Gruet [6].
The website http://www.python.org/doc/ contains a list of useful Python introductions and reference manuals.