We have seen that a group of numbers may be stored in an array that we may treat as a whole, or element by element. In Python, there is another way of organizing data that is much used, at least in non-numerical contexts, and that is a construction called a list.
A list is quite similar to an array in many ways, but there are pros and cons to consider. For example, the number of elements in a list is allowed to change, whereas arrays have a fixed length that must be known at the time of memory allocation. Elements in a list can be of different types, i.e., you may mix integers, floats and strings, whereas elements in an array must be of the same type. In general, lists provide more flexibility than arrays. On the other hand, arrays give faster computations than lists, making arrays the prime choice unless the flexibility of lists is needed. Arrays also require less memory, and there is a lot of ready-made code for various mathematical operations on them. Vectorization requires arrays to be used.
The range() function that we used above in our for loop actually returns a list (in Python 2, which is used here; in Python 3, range returns a range object that list(range(5)) turns into a list). If you, for example, write range(5) at the prompt in ipython, you get [0, 1, 2, 3, 4] in return, i.e., a list with 5 numbers. In a for loop, the line for i in range(5) makes i take on each of the numbers \( 0, 1, 2, 3, 4 \) in turn, as we saw above. Writing, e.g., x = range(5), gives a list by the name x, containing those five numbers. These numbers may now be accessed (e.g., as x[2], which contains the number 2) and used in computations just as we saw for array elements. As with arrays, indices run from \( 0 \) to \( n - 1 \), where n is the number of elements in the list. You may convert a list L to an array by x = array(L).
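As a small illustration of the indexing just described, the sketch below builds the list explicitly with list(range(5)). The list() call is our addition, made so that the lines behave the same in Python 2 and Python 3; the names n, third and last are only illustrative.

```python
# list() is redundant in Python 2 but needed in Python 3,
# where range() gives a range object instead of a list
x = list(range(5))      # x is [0, 1, 2, 3, 4]

n = len(x)              # number of elements, here 5
third = x[2]            # indices start at 0, so x[2] is the number 2
last = x[n - 1]         # the last legal index is n - 1

print(x)
print(third)
print(last)
```

Asking for x[n] would fail with an IndexError, since n is one past the last legal index.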
A list may also be created by simply writing, e.g.,
x = ['hello', 4, 3.14, 6]
giving a list where x[0] contains the string hello, x[1] contains the integer 4, etc. We may add and/or delete elements anywhere in the list, as shown in the following example.
x = ['hello', 4, 3.14, 6]
x.insert(0, -2) # x then becomes [-2, 'hello', 4, 3.14, 6]
del x[3] # x then becomes [-2, 'hello', 4, 6]
x.append(3.14) # x then becomes [-2, 'hello', 4, 6, 3.14]
Note the ways of writing the different operations here. Using append() always adds the new element at the end of the list. If you like, you may create an empty list as x = [] before you enter a loop that appends element by element. If you need to know the length of the list, you get the number of elements from len(x), which in our case is 5 after appending 3.14 above. This function is handy if you want to traverse all list elements by index, since range(len(x)) gives you all legal indices. Note that many more operations on lists are possible than shown here.
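The pattern of starting from an empty list, appending in a loop, and then traversing by index may be sketched as follows. The list name squares and the computations are made up for the illustration.

```python
# Start from an empty list and let it grow inside a loop
squares = []
for i in range(5):
    squares.append(i**2)          # append always adds at the end

# range(len(squares)) gives all legal indices: 0, 1, ..., len(squares)-1
for i in range(len(squares)):
    squares[i] = squares[i] + 1   # index access lets us overwrite elements

print(squares)
print(len(squares))
```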
Previously, we saw how a for loop may run over array elements. When we want to do the same with a list in Python, we may do it as this little example shows:
x = ['hello', 4, 3.14, 6]
for e in x:
    print 'x element: ', e
print 'This was all the elements in the list x'
This is how it is usually done in Python, and we see that e runs over the elements of x directly, avoiding the need for indexing. Be aware, however, that when loops are written like this, you cannot change any element in x by "changing" e. That is, writing e += 2 will not change anything in x, since the statement only makes the name e refer to a new value and leaves the list itself untouched. To overwrite an element, assign to it through its index, as in x[i] = x[i] + 2.
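The difference between the two kinds of loops can be checked with a short experiment. This is a sketch with a made-up list of integers; the name after_first_loop is only there to keep a copy for comparison.

```python
x = [1, 2, 3]

# Attempt 1: "changing" the loop variable only rebinds the name e
for e in x:
    e += 2
after_first_loop = list(x)   # copy of x, still [1, 2, 3]
print(after_first_loop)

# Attempt 2: assigning through an index overwrites the list element
for i in range(len(x)):
    x[i] += 2
print(x)                     # x is now [3, 4, 5]
```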
There is a special construct in Python that allows you to run through all elements of a list, do the same operation on each, and store the new elements in another list. It is referred to as list comprehension and may be demonstrated as follows.
List_1 = [1, 2, 3, 4]
List_2 = [e*10 for e in List_1]
This will produce a new list by the name List_2, containing the elements 10, 20, 30 and 40, in that order. Notice the syntax within the brackets for List_2: for e in List_1 signals that e is to successively be each of the list elements in List_1, and for each e, the next element in List_2 is created by computing e*10. More generally, the syntax may be written as
List_2 = [E(e) for e in List_1]
where E(e) means some expression involving e.
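To make the general form concrete, the sketch below picks sqrt(e) + 1 as the expression E(e) and compares the list comprehension with the equivalent explicit loop. The choice of expression and the name List_3 are ours, for illustration only.

```python
from math import sqrt

List_1 = [1, 4, 9, 16]

# List comprehension: E(e) is here the expression sqrt(e) + 1
List_2 = [sqrt(e) + 1 for e in List_1]

# The equivalent explicit loop with append
List_3 = []
for e in List_1:
    List_3.append(sqrt(e) + 1)

print(List_2)
print(List_2 == List_3)
```

Both constructions give the same list; the comprehension just expresses it in one line.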
In some cases, it is necessary to run through two (or more) lists at the same time. Python has a handy function called zip for this purpose. An example of how to use zip is provided in the code file_handling.py below.
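In isolation, the use of zip may be sketched like this, with two short made-up lists. The names sums and sums_by_index are illustrative; the second one shows the index-based alternative that zip saves us from writing.

```python
x = [1.0, 2.0, 3.5]
y = [3.44, 4.8, 6.61]

# zip pairs up x[0] with y[0], x[1] with y[1], and so on
sums = []
for xi, yi in zip(x, y):
    sums.append(xi + yi)
    print('%g %g' % (xi, yi))

# The index-based construction below visits the same pairs
sums_by_index = [x[i] + y[i] for i in range(len(x))]
```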
We should also briefly mention tuples, which are very much like lists, the main difference being that tuples cannot be changed. To a newcomer, it may seem strange that such "constant lists" could ever be preferable to lists. However, the property of being constant is a good safeguard against unintentional changes. Also, it is quicker for Python to handle data in a tuple than in a list, which contributes to faster code. With the data from above, we may create a tuple and print the content by writing
x = ('hello', 4, 3.14, 6)
for e in x:
    print 'x element: ', e
print 'This was all the elements in the tuple x'
Trying insert or append on the tuple gives an error message (because a tuple cannot be changed), stating that the tuple object has no such attribute.
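The error can be provoked and inspected without stopping the program by wrapping the attempt in try-except. The sketch below is only an illustration; the names error_message and assignment_allowed are ours. It also shows that assigning to an element by index is refused.

```python
x = ('hello', 4, 3.14, 6)

# Reading by index works just as for lists
first = x[0]

# Tuples have no append method, so this raises AttributeError
try:
    x.append(3.14)
    error_message = None
except AttributeError as err:
    error_message = str(err)

# Item assignment is also refused, this time with a TypeError
try:
    x[0] = 'goodbye'
    assignment_allowed = True
except TypeError:
    assignment_allowed = False

print(error_message)
```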
Input data for a program often come from files and the results of the computations are often written to file. To illustrate basic file handling, we consider an example where we read \( x \) and \( y \) coordinates from two columns in a file, apply a function \( f \) to the \( y \) coordinates, and write the results to a new two-column data file. The first line of the input file is a heading that we can just skip:
# x and y coordinates
1.0 3.44
2.0 4.8
3.5 6.61
4.0 5.0
The relevant Python lines for reading the numbers and writing out a similar file are given in the file file_handling.py:
filename = 'tmp.dat'
infile = open(filename, 'r') # Open file for reading
line = infile.readline() # Read first line
# Read x and y coordinates from the file and store in lists
x = []
y = []
for line in infile:
    words = line.split() # Split line into words
    x.append(float(words[0]))
    y.append(float(words[1]))
infile.close()
# Transform y coordinates
from math import log
def f(y):
    return log(y)
for i in range(len(y)):
    y[i] = f(y[i])
# Write out x and y to a two-column file
filename = 'tmp_out.dat'
outfile = open(filename, 'w') # Open file for writing
outfile.write('# x and y coordinates\n')
for xi, yi in zip(x, y):
    outfile.write('%10.5f %10.5f\n' % (xi, yi))
outfile.close()
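A variant of the same steps may be sketched with Python's with statement, which closes each file automatically when its block ends. This is an alternative formulation, not the code in file_handling.py. To make the sketch self-contained, it first writes the sample data to a temporary directory; the names workdir, in_name and out_name are illustrative.

```python
import os
import tempfile
from math import log

# Create a small input file so the sketch can run on its own
workdir = tempfile.mkdtemp()
in_name = os.path.join(workdir, 'tmp.dat')
out_name = os.path.join(workdir, 'tmp_out.dat')
with open(in_name, 'w') as outfile:
    outfile.write('# x and y coordinates\n')
    outfile.write('1.0 3.44\n2.0 4.8\n3.5 6.61\n4.0 5.0\n')

x = []
y = []
with open(in_name, 'r') as infile:   # closed automatically after the block
    infile.readline()                # skip the heading
    for line in infile:
        words = line.split()
        x.append(float(words[0]))
        y.append(log(float(words[1])))   # transform y while reading

with open(out_name, 'w') as outfile:
    outfile.write('# x and y coordinates\n')
    for xi, yi in zip(x, y):
        outfile.write('%10.5f %10.5f\n' % (xi, yi))
```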
Such a file with a comment line and numbers in tabular format is very common, so numpy has functionality to ease reading and writing. Here is the same example using the loadtxt and savetxt functions in numpy for tabular data (file file_handling_numpy.py):
filename = 'tmp.dat'
import numpy
data = numpy.loadtxt(filename, comments='#')
x = data[:,0]
y = data[:,1]
data[:,1] = numpy.log(y) # insert transformed y back in array
filename = 'tmp_out.dat'
outfile = open(filename, 'w') # open file for writing
outfile.write('# x and y coordinates\n')
numpy.savetxt(outfile, data, fmt='%10.5f')
outfile.close()
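As a possible shortcut, savetxt can write the comment line itself through its header argument, so that no file object needs to be opened by hand. The sketch below assumes a numpy version where savetxt accepts header (1.7 or later) and, to be runnable on its own, first creates the sample input in a temporary directory; workdir, in_name, out_name and check are illustrative names.

```python
import os
import tempfile
import numpy

# Create the sample input file so the sketch can run on its own
workdir = tempfile.mkdtemp()
in_name = os.path.join(workdir, 'tmp.dat')
out_name = os.path.join(workdir, 'tmp_out.dat')
with open(in_name, 'w') as f:
    f.write('# x and y coordinates\n1.0 3.44\n2.0 4.8\n3.5 6.61\n4.0 5.0\n')

data = numpy.loadtxt(in_name, comments='#')
data[:,1] = numpy.log(data[:,1])   # transform the y column in place

# The header text is written first, prefixed by the default comment string '# '
numpy.savetxt(out_name, data, fmt='%10.5f', header='x and y coordinates')

# Reading the result back skips the header line again
check = numpy.loadtxt(out_name)
```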