Variables, loops, lists, and arrays¶
Getting access to Python¶
Simple mathematical calculations can be done in plain Python, but for more advanced scientific computing, as we do here, several add-on packages are needed. Getting all software correctly installed used to be quite challenging, but today there are several easy-to-use approaches to get access to Python.
Mac and Windows¶
We recommend to download and install Anaconda, which is a free distribution of Python that comes with most of the packages you need for advanced scientific computing.
Ubuntu¶
Debian-based Linux systems, such as Ubuntu, can also use Anaconda,
but it is more common to use the apt-get
command-line tool or the
Ubuntu installer to install a set of Debian packages. Here is a
list of the packages you need to install for this introduction:
Terminal> sudo apt-get install python-pip python-setuptools \
python-scipy python-sympy python-cython python-matplotlib \
python-dev python-profiler pydb spyder imagemagick gedit vim \
emacs python-mode git mercurial lib-avtools gnome-terminal
In addition, run
Terminal> pip install nose
Terminal> pip install pytest
Terminal> pip install ipython --upgrade
Terminal> pip install tornado --upgrade
Terminal> pip install pyzmq --upgrade
Web access¶
You can also access Python directly through a web browser without having it installed on your local computer. We refer to the document How to access Python for doing scientific computing for more details on such tools and also installations based on Anaconda and Ubuntu.
Mathematical example¶
We shall use a famous mathematical formula in our forthcoming programming examples. The formula tells how long distance \(s\) an object has moved in a time interval \([0,t]\) if it starts out with a velocity \(v_0\) and undergoes constant acceleration \(a\):
We may view \(s\) as a function of \(t\): \(s(t)\), and also include the parameters in the notation: \(s(t;v_0,a)\).
A program for evaluating a formula¶
Here is a Python program for computing \(s\), given \(t=0.5\), \(v_0=2\), and \(a=0.2\):
t = 0.5
v0 = 2
a = 0.2
s = v0*t + 0.5*a*t**2
print s
The program is pure text and must be placed in a pure text file using
a text editor. Popular text editors are
Gedit, Nano, Emacs, and Vim on Linux,
TextWrangler on Mac OS X, and Notepad++ on Windows.
Save the text to a
program file whose name ends in .py
, say distance.py
.
The program is run in a terminal window (Command Prompt on Windows,
Terminal application on Mac OS X, xterm
or gnome-terminal
on Linux):
Terminal> python distance.py
1.025
The result of the print
statement is the number 1.025
in the terminal
window.
As an alternative to writing programs in a text editor and executing them in a terminal window, you may use the Spyder graphical interface which gives a more Matlab-style environment to work in. Anaconda installations come with Spyder.
The distance.py
program first contains four assignment statements where we
assign numbers or the results of arithmetic expressions to variables.
The variables have names coinciding with the mathematical notation
used in the formula: t
, v0
, a
, and s
. You may think of
variables in this programs just as variables in mathematics.
More technical details
A statement like t = 0.5
works as follows. First, the right-hand side
is interpreted by Python as a real number and a float
object containing
the value 0.5 is created. Then the name t
is defined as a reference
to this object.
In the statement s = v0*t + 0.5*a*t**2
, Python will first go to the
right-hand side and observe that it is an arithmetic expression
involving variables with known values (or names referring to existing
objects). The arithmetic expression is calculated, resulting in a
real number that is saved as a float
object with value 1.025 in
the computer’s memory.
Everything in Python is an object of some type. Here, t
, a
, and s
are float
objects, representing real (floating-point) numbers, while
v0
is an int
object, representing the integer 2.
There are many other types of objects: strings, lists, tuples,
dictionaries, arrays, files, ...
Formatted output with text and numbers¶
For any object s
in Python, print s
will (normally) print the contents
of s
. However, sometimes we want to combine the content of an object
with some text. Say we want to print s=1.025
rather than just the number.
This is easily accomplished using printf syntax:
print 's=%g' % s
The output is specified as a string, enclosed in single or double quotes.
Inside the string, there can be “slots” for numbers (or other
objects), indicated by a percentage sign followed by a specification of
kind of data that will be inserted at this place. In the string above,
there is one such slot, %g
, where the g
implies a real number written
as compactly as possible.
It is easy to control the number of decimals using printf syntax.
Printing out s=1.03
, i.e., s
with two decimals, is done by
print 's=%.2f' % s
where the f
signifies a decimal number and the preceding .2
means
2 decimals. Scientific notation, as in s=1.03E+00
(\(1.03\cdot 10^0\)),
is specified as %.2E
(2 decimals).
The printf syntax is available in numerous programming languages. Python also offers a related variant, called format string syntax:
print 's={s:.2f}'.format(s=s)
While loops¶
Suppose we want to make a table with two columns, one with \(t\) values and one with the corresponding \(s\) values. Now we have to repeat a lot of calculations with the formula (1). This is easily done with a loop. There are two types of loops in Python: while loops and for loops.
Let the \(t\) values go from 0 to 2 in increments of 0.1. The following program applies a while loop:
v0 = 2
a = 0.2
dt = 0.1 # Increment
t = 0 # Start value
while t <= 2:
s = v0*t + 0.5*a*t**2
print t, s
t = t + dt
The result of running this program in a terminal window is
Terminal> python while.py
0 0.0
0.1 0.201
0.2 0.404
0.3 0.609
0.4 0.816
0.5 1.025
0.6 1.236
0.7 1.449
0.8 1.664
0.9 1.881
1.0 2.1
1.1 2.321
1.2 2.544
1.3 2.769
1.4 2.996
1.5 3.225
1.6 3.456
1.7 3.689
1.8 3.924
1.9 4.161
So, how do we interpret the contents of this program? First we initialize four
variables: v0
, a
, dt
, and t
. Everything after #
on
a line is a comment and does not affect what happens in the program, but
is meant to be of help for a human reading the program.
Then comes the while loop:
while condition:
<intented statement>
<intented statement>
<intented statement>
Observe the colon at the end of the while
line.
The set of indented statements are repeated as long as the expression or
variable condition
evaluates to True
. In the present case, the condition
is the boolean expression t <= 2
, so as long as the value is less
than or equal to 2,
t <= 2
evaluates to True
, otherwise it evaluates to False
.
Error in the code
According to the code, the last pass in the while loop should correspond
to \(t=2\), but looking at the output, we see that the last print
statement has \(t=1.9\). The next test in the while condition involves
\(t=2\) and the boolean condition is expected to be 2 == 2
. However,
it seems that the condition is False
since the computations for
\(t=2\) are not printed. Why do we experience this behavior?
The Python Online Tutor is a great tool to examine the program flow. Consider this little loop run in the Python Online Tutor:
The Python Online Tutor executes each statement and displays the contents of variables such that you can track the program flow and the evolution of variables.
So, why is not a
printed for 1.2? That is, why does a == 1.2
evaluate to
True
when a
is (expected to be) 2? We must look at the accuracy of
a
to investigate this question and write it out with 16 decimals in
scientific notation (printf specification %.16E
):
The problem is that da
is not exactly 0.4, but contains a small round-off
error. Doing a = a + da
then results in a slightly inaccurate a
.
When the exact a
should reach 1.2, the a
in the program has a
an error and equals 1.2000000000000002
. Obviously the test for
exact equality, a == 1.2
, becomes False
, and the loop is terminated.
Rule: use exact equality ==
when comparing real numbers
Always compare real numbers using the difference and a tolerance:
a = 1.2
b = 0.4 + 0.4 + 0.4
boolean_condition1 = a == b # False
tol = 1E-14
boolean_condition2 = abs(a - b) < tol # True
The Python Online Tutor is ideal for demonstrating program flow and contents of variables in small and simple programs. However, you should use a real debugger instead when searching for errors in real programs.
Lists¶
The table created in the previous section has two columns of data. We could store all the numbers in each column in a list object. A list is just a collection of objects, here numbers, in a given sequence. For example,
L = [-1, 1, 8.0]
is a list of three numbers. Lists are enclosed in square brackets and may contain any type of objects separated by commas. Here we mix a filename (string), a real number, and an integer:
L = ['mydata.txt', 3.14, 10]
The different list elements can be reached via indexing: L[0]
is
the first element, L[1]
is the second, and L[-1]
is the last element.
Here is an interactive Python shell where we can write Python statements
and examine the contents of variables as we perform various operations:
>>> L = ['mydata.txt', 3.14, 10]
>>> print L[0]
mydata.txt
>>> print L[1]
3.14
>>> del L[0] # delete the first element
>>> print L
[3.14, 10]
>>> print len(L) # length of L
2
>>> L.append(-1) # add -1 at the end of the list
>>> print L
[3.14, 10, -1]
Let us store the numbers in the previous table in lists, one for each
column. We can start with empty lists []
and use append
to
add a new element to the lists in each pass of the while loop. Thereafter,
we can run a new while loop and print the contents of the lists in a
nice, tabular fashion:
v0 = 2
a = 0.2
dt = 0.1 # Increment
t = 0
t_values = []
s_values = []
while t <= 2:
s = v0*t + 0.5*a*t**2
t_values.append(t)
s_values.append(s)
t = t + dt
print s_values # Just take a look at a created list
# Print a nicely formatted table
i = 0
while i <= len(t_values)-1:
print '%.2f %.4f' % (t_values[i], s_values[i])
i += 1 # Same as i = i + 1
The output looks like
[0.0, 0.201, 0.404, 0.6090000000000001, 0.8160000000000001,
1.025, 1.236, 1.4489999999999998, 1.664, 1.8809999999999998,
2.0999999999999996, 2.3209999999999997, ...]
0.00 0.0000
0.10 0.2010
0.20 0.4040
0.30 0.6090
0.40 0.8160
0.50 1.0250
0.60 1.2360
0.70 1.4490
...
Note that print s_values
here leads to output with many
decimals and small round-off errors. To get complete control of the
formatting of real numbers in the table, we use the printf
syntax.
Lists come with a lot of functionality. See the Python Tutorial for many more examples.
For loops¶
A for loop is used for visiting elements in a list, one by one:
>>> L = [1, 4, 8, 9]
>>> for e in L:
... print e
...
1
4
8
9
The variable e
is successively set equal to the elements in the list,
in the right order. Note the colon at the end of the for
line.
The statements to be executed in each pass of the loop (here only
print e
) must be indented. When e
is set equal to the last element and
all indented statements are executed, the loop is over, and the program
flow continues with the next statement that is not indented.
Try the following code out in the Python Online Tutor:
A for loop over the valid indices in a list is created by
for i in range(len(somelist)):
# Work with somelist[i]
The range
function returns a list of integers: range(a, b, s)
returns the integers a, a+s, a+2*s, ...
up to but not including b
.
Just writing range(b)
implies a=0
and s=1
, so
range(len(somelist))
returns [0, 1, 2]
.
For loops over real numbers
The for i in range(...)
construction can only run a loop over integers.
If you need a loop over real values, you have to either create a list
with the values first, or use a formula where each value is generated
through an integer counter:
# Need loop over 0, 0.1, 0.2, ..., 1
values = []
for i in range(11):
values.append(i*0.1)
# Shorter version using list comprehension (same as the loop above)
values = [i*0.1 for i in range(11)]
for value in values:
# work with value
We can now rewrite our program that used lists and while loops to use for loops instead:
v0 = 2
a = 0.2
dt = 0.1 # Increment
t_values = []
s_values = []
n = int(round(2/dt)) + 1 # No of t values
for i in range(n):
t = i*dt
s = v0*t + 0.5*a*t**2
t_values.append(t)
s_values.append(s)
print s_values # Just take a look at a created list
# Make nicely formatted table
for t, s in zip(t_values, s_values):
print '%.2f %.4f' % (t, s)
# Alternative implementation
for i in range(len(t_values)):
print '%.2f %.4f' % (t_values[i], s_values[i])
Observe that we have here used a slightly different technique
for computing the \(t\)
values inside the first loop: now we set \(t\) as \(i\Delta t\), where
\(\Delta t\) (dt
in the code) is the increment (0.1) between each
\(t\) value. The computation of n
, the number of \(t\) values,
makes use of round
to make a correct mathematical rounding to the nearest
integer (and int
makes an integer object out of the rounded real number).
(In an interval \([a,b]\) divided into subintervals of equal length \(\Delta t\),
there will be \(1 + (b-a)/\Delta t\) points in total.)
Running through multiple lists simultaneously is done with the zip
construction:
for e1, e2, e3, ... in zip(list1, list2, list3, ...):
One may instead create a for loop over all the legal index values instead and index each array,
for i in range(len(list1)):
e1 = list1[i]
e2 = list2[i]
...
Arrays¶
Lists are useful for collecting a set of numbers or other objects in a single variable. Arrays are much like lists, but tailored for collection of numbers. The primary advantage of arrays is that you can use them very efficiently and conveniently in mathematical computations, but the downside is that an array has (in practice) a fixed length and all elements must be of the same type. This is usually no important restriction in scientific computations.
Much of Python’s functionality are coded in modules. To use such
functionality, one must import the relevant modules. Arrays, for
example, are available in the numpy
module, which must be imported.
Functions or variables in the module must be prefixed by numpy
.
This will be demonstrated below.
Here is an example involving basic operations with arrays:
>>> L = [1, 4, 10.0] # List of numbers
>>> import numpy
>>> a = numpy.array(L) # Make corresponding array
>>> print a
[ 1. 4. 10.]
>>> print a[1]
4.0
>>> print a.dtype # Data type of an element
float64
>>> b = 2*a + 1
>>> print b
[ 3. 9. 21.]
Note that all elements in the a
array are of float
type
(because one element in L
was float
).
Arithmetic expressions such as 2*a+1
work with a
as array, but not
as list. In fact, we can pass arrays to mathematical functions:
>>> c = numpy.log(a)
>>> print c
[ 0. 1.38629436 2.30258509]
The numpy
module has a lot of very useful utilities. To create
\(n+1\) uniformly distributed
coordinates in an interval \([a,b]\), stored in an array, one
can use linspace
:
t = numpy.linspace(a, b, n+1)
This construction makes it easy to create arrays for the \(t\) and \(s\) values in our tables:
import numpy
v0 = 2
a = 0.2
dt = 0.1 # Increment
n = int(round(2/dt)) + 1 # No of t values
t_values = numpy.linspace(0, 2, n+1)
s_values = v0*t + 0.5*a*t**2
# Make nicely formatted table
for t, s in zip(t_values, s_values):
print '%.2f %.4f' % (t, s)
Mathematical functions¶
Python offers access to all standard mathematical functions such as
\(\sin x\), \(\cos x\), \(\tan x\), \(\sinh x\), \(\cosh x\), \(\tanh x\),
all their inverses (called asin(x)
, asinh(x)
, and so forth), \(e^x\)
(exp(x)
), \(\ln x\) (log(x)
), and \(x!\) (factorial(x)
).
However, one has to import a module to get access to these functions.
For scalars (single numbers) the relevant module is math
:
>>> import math
>>> print math.sin(math.pi)
1.2246467991473532e-16
which shows that the sine function is only approximate (to 16 digits).
Many prefer to write mathematical expressions without the math
prefix:
from math import sin, pi
print sin(pi)
# Or import everything from math
from math import *
print sin(pi), log(e), tanh(0.5)
The numpy
module contains sine, cosine, and other mathematical functions
that work on scalars as well as arrays.
Import of numpy
To get Python code that is as similar to Matlab as possible, one would do a “start import” of everything,
from numpy import *
x = linspace(0, 1, 101)
y = exp(-x)*sin(pi*x)
However, in the Python community it has been a culture to use
a prefix np
as abbreviation for numpy
:
import numpy as np
x = np.linspace(0, 1, 101)
y = np.exp(-x)*np.sin(np.pi*x)
Our convention is to use the np
prefix for typical Matlab functions,
but skip the prefix when working with mathematical functions like
`exp(x)*sin(pi*x)`to get a one-to-one correspondence between formulas
in the program and in the mathematical description of the problem.
import numpy as np
from numpy import sin, exp, pi
x = np.linspace(0, 1, 101)
y = exp(-x)*sin(pi*x)
Plotting¶
We can easily make a graph of a function \(s(t)\) using the module
matplotlib
. The technique is to compute an array of \(t\) values
and a corresponding array of function values \(s(t)\). Plotting
programs will draw straight lines between the points on the curve,
so a sufficient number of points are needed to give the impression
of a smooth curve. Our \(s(t)\) function is plotted by the following
code:
import numpy as np
import matplotlib.pyplot as plt
v0 = 0.2
a = 2
n = 21 # No of t values for plotting
t = np.linspace(0, 2, n+1)
s = v0*t + 0.5*a*t**2
plt.plot(t, s)
plt.savefig('myplot.png')
plt.show()
The plotfile myplot.png
looks like
Matlab users may prefer to do
from numpy import *
from matplotlib.pyplot import *
such that they can use linspace
and plot
without any prefix, just as
in Matlab.
Two curves can easily be plotted, this time also with labels on the axis and a box with legends such that we can distinguish the two curves:
import numpy as np
import matplotlib.pyplot as plt
v0 = 0.2
a = 2
n = 21 # No of t values for plotting
t = np.linspace(0, 2, n+1)
s = v0*t + 0.5*a*t**2
plt.plot(t, s)
plt.savefig('myplot.png')
plt.show()