Functions are widely used in programming and are a concept that needs to
be mastered. In the simplest case, a function in a program is much like
a mathematical function: some input number \( x \) is transformed to some output
number.
One example is the \( \tan^{-1}(x) \) function, called atan in computer
code: it takes one real number as input and returns another number.
Functions in Python are more general and can take a series of
variables as input and return one or more variables, or simply nothing.
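The purpose of functions is two-fold: to group statements that naturally
belong together into separate units of code, and to parameterize a set of
statements so that they can be written once and easily re-executed with
variations.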
If we modify the program ball.py from the chapter A Python program with
variables slightly, and include a function, we could let this be a new
program ball_function.py as
def y(t):
    v0 = 5                    # Initial velocity
    g = 9.81                  # Acceleration of gravity
    return v0*t - 0.5*g*t**2

time = 0.6       # Just pick one point in time
print y(time)
time = 0.9       # Pick another point in time
print y(time)
When Python reads and interprets this program from the top, it takes
the code from the line with def, to the line with return, to be
the definition of a function with the name y (note colon and
indentation). The return statement of the function y, i.e.,
return v0*t - 0.5*g*t**2
will be understood by Python as first compute the expression, then
send the result back (i.e., return) to where the function was called
from. Both def and return are reserved words. The function
depends on t, i.e., one variable (or we say that it takes one argument or
input parameter), the value of which must be provided when the
function is called.
What actually happens when Python meets this code? The def line
just tells Python that here is a function with name y and it
has one argument t. Python does not look into the function at
this stage (except that it checks the code for syntax errors).
When Python later on meets the statement print y(time), it
recognizes a function call y(time) and recalls that there is a
function y defined with one argument. The value of time is
then transferred to the y(t) function such that t = time
becomes the first action in the y function. Then Python
executes one line at a time in the y function.
In the final line, the arithmetic expression v0*t - 0.5*g*t**2
is computed, resulting in a number, and this number (or more
precisely, the Python object representing the number) replaces
the call y(time) in the calling code such that the word print now
precedes a number rather than a function call.
Python proceeds with the next line and sets time to a new value.
The next print statement triggers a new call to y(t). This time
t is set to 0.9, the computations are done line by line in the
y function, and the returned result replaces y(time).
Instead of writing print y(time), we could alternatively have stored
the returned result from the y function in a variable,
h = y(time)
print h
Note that when a function contains if-elif-else constructions,
return may be done from within any of the branches. This may be
illustrated by the following function containing three return
statements:
def check_sign(x):
    if x > 0:
        return 'x is positive'
    elif x < 0:
        return 'x is negative'
    else:
        return 'x is zero'
Remember that only one of the branches is executed for a single call
on check_sign, so depending on the number x, the return may take
place from any of the three return alternatives.
Programmers disagree on whether it is a good idea to use return
inside a function where you want, or if there should only be
one single return statement at the end of the function.
The authors of this book emphasize readable code and
think that return can be useful in branches as in the example above
when the function is short. For longer or more complicated functions,
it might be better to have one single return statement.
Be prepared for critical comments if you return wherever you want...
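As a sketch of the single-return style mentioned above, check_sign could be rewritten so that each branch only assigns to a local variable and the function returns once, at the end. The names check_sign_single_return and message are made up for this illustration, and print is written with parentheses so that the snippet also runs under Python 3.

```python
def check_sign_single_return(x):
    """Hypothetical single-return variant of check_sign."""
    if x > 0:
        message = 'x is positive'
    elif x < 0:
        message = 'x is negative'
    else:
        message = 'x is zero'
    return message        # The only exit point of the function

print(check_sign_single_return(-2.5))
```

Both variants give the same answers. Which one reads better is a matter of taste for a function this short.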
An expression you will often encounter in programming is main
program, or that some code is in main. This is nothing particular
to Python, and simply refers to the part of the program that is
outside functions. However, note that the def line of functions is
counted into main. So, in ball_function.py above, all
statements outside the function y are in main, and so is the line
def y(t):.
A function may take no arguments, or many, in which case they are just
listed within the parentheses (following the function name) and
separated by commas. Let us illustrate. Take a slight variation of the
ball example and assume that the ball is not thrown straight up, but
at an angle, so that two coordinates are needed to specify its
position at any time. According to Newton's laws (when air resistance
is negligible), the vertical position is given by \( y(t) = v_{0y}t - 0.5gt^2 \) and
the horizontal position by \( x(t) = v_{0x}t \). We can include both these
expressions in a new version of our program that prints the position
of the ball for chosen times. Assume we want to evaluate these expressions at two
points in time, \( t = 0.6 \) s and \( t = 0.9 \) s. We can pick some numbers
for the initial velocity components v0y and v0x, name the program
ball_position.py, and write it for example as
def y(v0y, t):
    g = 9.81                  # Acceleration of gravity
    return v0y*t - 0.5*g*t**2

def x(v0x, t):
    return v0x*t

initial_velocity_x = 2.0
initial_velocity_y = 5.0

time = 0.6       # Just pick one point in time
print x(initial_velocity_x, time), y(initial_velocity_y, time)
time = 0.9       # ... Pick another point in time
print x(initial_velocity_x, time), y(initial_velocity_y, time)
Now we compute and print the two components for the position, for each of the two chosen points in time. Notice how each of the two functions now takes two arguments. Running the program gives the output
1.2 1.2342
1.8 0.52695
A function may also have no return value, in which case we simply drop the return statement, or it may return more than one value. For example, the two functions we just defined could alternatively have been written as one:
def xy(v0x, v0y, t):
    g = 9.81                  # acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2
Notice the two return values which are simply separated by a comma. When calling the function (and printing), arguments must appear in the same order as in the function definition. With the variables from ball_position.py, we would then write
print xy(initial_velocity_x, initial_velocity_y, time)
The two returned values from the function could alternatively have been assigned to variables, e.g., as
x_pos, y_pos = xy(initial_velocity_x, initial_velocity_y, time)
The variables x_pos and y_pos could then have been printed or used
in other ways in the code.
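A function without a return statement, as mentioned above, still hands something back to the caller: the special object None. The following is a minimal sketch. The function name print_position is an assumption made for this example, and print is written with parentheses so that the snippet also runs under Python 3.

```python
def print_position(v0x, v0y, t):
    """Print the position of the ball; there is no return statement."""
    g = 9.81                  # Acceleration of gravity
    x_pos = v0x*t
    y_pos = v0y*t - 0.5*g*t**2
    print('x = %g, y = %g' % (x_pos, y_pos))

result = print_position(2.0, 5.0, 0.6)
print(result)                 # A function without return gives back None
```

Such functions are called for their effect (here, printing) rather than for a value to be used in further computations.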
There are possibilities for having a variable number of function input
and output parameters (using *args and **kwargs constructions
for the arguments). However, we do not go further into that topic here.
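For the curious reader, the sketch below gives only a first impression of the two constructions. The function names mean and describe are hypothetical and not part of the ball programs. Inside the function, args is a tuple of the positional arguments and kwargs is a dictionary of the named ones.

```python
def mean(*args):
    """Return the average of any number of positional arguments."""
    return sum(args)/float(len(args))

def describe(**kwargs):
    """Return the named arguments as sorted 'name=value' strings."""
    return ['%s=%s' % (name, kwargs[name]) for name in sorted(kwargs)]

print(mean(1, 2, 3, 4))
print(describe(v0x=2.0, t=0.6))
```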
Variables that are defined inside a function, e.g., g in the last
xy function, are local variables. This means they are only known
inside the function. Therefore, if you had accidentally used g in
some calculation outside the function, you would have got an error
message. The variable time is defined outside the function and is
therefore a global variable. It is known both outside and inside the
function(s). If you define one global and one local variable, both
with the same name, the function only sees the local one, so the
global variable is not affected by what happens with the local
variable of the same name.
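These scoping rules can be tried out with a small experiment. In the sketch below, the global value of g is an arbitrary number chosen only to differ from the local one, and the names height and distance are introduced for this illustration. The print calls use parentheses so that the code also runs under Python 3.

```python
g = 1.62                  # Global g (an arbitrary value for illustration)
time = 0.6                # Global variable

def y(v0y):
    g = 9.81              # Local g; hides the global g inside y
    return v0y*time - 0.5*g*time**2   # time is found in the global scope

def x(v0x):
    distance = v0x*time   # distance is local to x
    return distance

height = y(5.0)
print(height)             # Computed with the local g = 9.81
print(g)                  # The global g is still 1.62

x(2.0)
try:
    print(distance)       # distance does not exist outside x
except NameError as e:
    print(e)              # Python reports that the name is not defined
```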
The arguments named in the heading of a function definition are by
rule local variables inside the function. If you want to change the
value of a global variable inside a function, you need to declare the
variable as global inside the function. That is, if the global
variable was x, we would need to write global x inside the
function definition before we let the function change it. After function
execution, x would then have a changed value. One should
strive to define variables mostly where they are needed and not
everywhere.
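The effect of the global declaration can be seen by comparing two otherwise identical functions. This is a minimal sketch, with the function names try_to_change and change invented for the purpose:

```python
x = 1                     # Global variable

def try_to_change():
    x = 10                # Creates a new local x; the global x is untouched

def change():
    global x              # Declare that x refers to the global variable
    x = 10                # Now the global x is changed

try_to_change()
print(x)                  # Still the original value
change()
print(x)                  # The new value set inside change
```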
Another very useful way of handling function parameters in Python is by defining parameters as keyword arguments. This gives default values to parameters and allows more freedom in function calls, since the order and number of parameters may vary.
Let us illustrate the use of keyword arguments with the function
xy. Assume we defined xy as
def xy(t, v0x=0, v0y=0):
    g = 9.81                  # acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2
Here, t is an ordinary or positional argument, whereas v0x and
v0y are keyword arguments or named arguments. Generally, there
can be many positional arguments and many keyword arguments, but the
positional arguments must always be listed before the keyword
arguments in the function definition. Keyword arguments are given default
values, as shown here with v0x and v0y, both having zero as default
value. In a script, the function xy may now be called in many
different ways. For example,
print xy(0.6)
would make xy perform the computations with t = 0.6 and the default
values (i.e., zero) of v0x and v0y. The two numbers returned from
xy are printed to the screen. If we wanted to use another initial
value for v0y, we could, e.g., write
print xy(0.6, v0y=4.0)
which would make xy perform the calculations with t = 0.6, v0x = 0
(i.e., the default value) and v0y = 4.0. When there are several
positional arguments, they have to appear in the same order as defined
in the function definition, unless we explicitly use the names of
these also in the function call. With explicit name specification in
the call, any order of parameters is acceptable. To illustrate, we
could, e.g., call xy as
print xy(v0y=4.0, v0x=1.0, t=0.6)
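To check that the call styles described above are interchangeable, the following sketch compares three of them. It repeats the definition of xy so that it is self-contained. The names a, b and c are introduced only for this comparison, and print is written with parentheses so that the code also runs under Python 3.

```python
def xy(t, v0x=0, v0y=0):
    g = 9.81                  # Acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2

# All three calls pass t = 0.6, v0x = 1.0 and v0y = 4.0
a = xy(0.6, 1.0, 4.0)                 # Purely positional
b = xy(0.6, v0y=4.0, v0x=1.0)         # Positional t, then names in any order
c = xy(v0y=4.0, v0x=1.0, t=0.6)       # Everything named

print(a == b == c)
```

One restriction applies in the call as well: once a named argument has been given, no positional argument may follow it, so a call like xy(v0y=4.0, 0.6) is rejected as a syntax error.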
In any programming language, it is a good habit to include a little
explanation of what the function is doing, unless what is done by the
function is obvious, e.g., when having only a few simple code lines. This
explanation is called a doc string, which in Python should be placed
at the top of the function, right after the def line.
This explanation is meant for a human who wants
to understand the code, so it should say something about the purpose
of the code and possibly explain the arguments and return values if
needed. If we do that with our xy function from above, we may write
the first lines of the function as
def xy(v0x, v0y, t):
    """Compute the x and y position of the ball at time t"""
Note that functions may be called from within other functions, and function input parameters are not required to be numbers. Any object will do, e.g., string variables or other functions.
Functions are straightforwardly passed as arguments to other functions, as illustrated by the following script function_as_argument.py:
def sum_xy(x, y):
    return x + y

def prod_xy(x, y):
    return x*y

def treat_xy(f, x, y):
    return f(x, y)

x = 2;  y = 3
print treat_xy(sum_xy, x, y)
print treat_xy(prod_xy, x, y)
When run, this program first prints the sum of x and y (i.e., 5),
and then it prints the product (i.e., 6). We see that treat_xy takes
a function name as its first parameter. Inside treat_xy, that
name is used to call the function that was given as
input parameter. Therefore, as shown, we may call treat_xy with
either sum_xy or prod_xy, depending on whether we want the sum or
product of x and y to be calculated.
Functions may also be defined within other functions. In that case, they become local functions, or nested functions, known only to the function inside which they are defined. Functions defined in main are referred to as global functions. A nested function has full access to all variables in the parent function, i.e., the function within which it is defined.
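A minimal sketch of a nested function, reusing the ball formulas, could look as follows. The outer function name position is an assumption made for this example. The nested function y reads v0y and g from its parent without receiving them as arguments.

```python
def position(v0x, v0y, t):
    g = 9.81                  # Local to position, but visible inside y

    def y(t):                 # Nested function, known only inside position
        return v0y*t - 0.5*g*t**2

    return v0x*t, y(t)

print(position(2.0, 5.0, 0.6))
```

If this code is run on its own, calling y(0.6) from main fails with a NameError, since y is local to position.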
Short functions can be defined in a compact way, using what is known as a lambda function:
f = lambda x, y: x + 2*y

# Equivalent
def f(x, y):
    return x + 2*y
The syntax consists of lambda followed by a series of arguments, colon,
and some Python expression resulting in an object to be returned from
the function. Lambda functions are particularly convenient as
function arguments:
print treat_xy(lambda x, y: x*y, x, y)
Here is a little example in IPython where we calculate the CPU time for doing array computations with and without a helper function:
In [1]: import numpy as np
In [2]: a = np.zeros(1000000)
In [3]: def add(a, b):
   ...:     return a + b
   ...:
In [4]: %timeit for i in range(len(a)): a[i] = add(i, i+1)
The slowest run took 16.01 times longer than the fastest.
This could mean that an intermediate result is being cached
1 loops, best of 3: 178 ms per loop
In [5]: %timeit for i in range(len(a)): a[i] = i + (i+1)
10 loops, best of 3: 109 ms per loop
We notice that there is some overhead in function calls. The impact of the overhead reduces quickly with the amount of computational work inside the function.
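Outside IPython, a similar comparison can be made with the timeit module from the standard library. The script below is a sketch under that assumption, not a reproduction of the session above. The function names loop_with_call and loop_inline and the problem size n are made up for the illustration, and the timings will vary from machine to machine.

```python
import timeit

def add(a, b):
    return a + b

def loop_with_call(n):
    total = 0
    for i in range(n):
        total = add(total, i)     # One function call in every pass
    return total

def loop_inline(n):
    total = 0
    for i in range(n):
        total = total + i         # Same arithmetic, but no function call
    return total

n = 100000
# Time 10 repetitions of each loop; the results are in seconds
t_call = timeit.timeit(lambda: loop_with_call(n), number=10)
t_inline = timeit.timeit(lambda: loop_inline(n), number=10)
print('with call: %.4f s, inline: %.4f s' % (t_call, t_inline))
```

Replacing the body of add with something more expensive than one addition is a simple way to see how the relative cost of the call itself changes with the work done inside the function.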