Functions

Functions are widely used in programming, and they are a concept that needs to be mastered. In the simplest case, a function in a program is much like a mathematical function: some input number $x$ is transformed to some output number. One example is the $\tan^{-1}(x)$ function, called atan in computer code: it takes one real number as input and returns another number. Functions in Python are more general and can take a series of variables as input and return one or more variables, or simply nothing. The purpose of functions is two-fold:

to group statements into separate units of code lines that naturally belong together (a strategy which may dramatically ease the problem solving process), and/or
to parameterize a set of statements such that they can be written only once and easily be re-executed with variations.

Examples will be given to illustrate how functions can be written in various contexts.

If we modify the program ball.py from the chapter A Python program with variables slightly by including a function, we get a new program, ball_function.py:

def y(t):
    v0 = 5                    # Initial velocity
    g = 9.81                  # Acceleration of gravity
    return v0*t - 0.5*g*t**2

time = 0.6       # Just pick one point in time
print y(time)
time = 0.9       # Pick another point in time
print y(time)

When Python reads and interprets this program from the top, it takes the code from the line with def, to the line with return, to be the definition of a function with the name y (note colon and indentation). The return statement of the function y, i.e.

return v0*t - 0.5*g*t**2

will be understood by Python as first compute the expression, then send the result back (i.e., return) to where the function was called from. Both def and return are reserved words. The function depends on t, i.e., one variable (or we say that it takes one argument or input parameter), the value of which must be provided when the function is called.

What actually happens when Python meets this code? The def line just tells Python that here is a function with name y and it has one argument t. Python does not look into the function at this stage (except that it checks the code for syntax errors). When Python later on meets the statement print y(time), it recognizes a function call y(time) and recalls that there is a function y defined with one argument. The value of time is then transferred to the y(t) function such that t = time becomes the first action in the y function. Then Python executes one line at a time in the y function. In the final line, the arithmetic expression v0*t - 0.5*g*t**2 is computed, resulting in a number, and this number (or more precisely, the Python object representing the number) replaces the call y(time) in the calling code such that the word print now precedes a number rather than a function call.

Python proceeds with the next line and sets time to a new value. The next print statement triggers a new call to y(t), this time t is set to 0.9, the computations are done line by line in the y function, and the returned result replaces y(time). Instead of writing print y(time), we could alternatively have stored the returned result from the y function in a variable,

h = y(time)
print h

Note that when a function contains if-elif-else constructions, return may be done from within any of the branches. This may be illustrated by the following function containing three return statements:

def check_sign(x):
    if x > 0:
        return 'x is positive'
    elif x < 0:
        return 'x is negative'
    else:
        return 'x is zero'

Remember that only one of the branches is executed for a single call on check_sign, so depending on the number x, the return may take place from any of the three return alternatives.

To return at the end or not. Programmers disagree whether it is a good idea to use return inside a function where you want, or if there should only be one single return statement at the end of the function. The authors of this book emphasize readable code and think that return can be useful in branches as in the example above when the function is short. For longer or more complicated functions, it might be better to have one single return statement. Be prepared for critical comments if you return wherever you want...
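
As a sketch of the alternative style, the function below stores the answer in a local variable and returns it once at the end. The name check_sign_single_return is our own choice for this illustration, and the function should behave the same as check_sign above. Here print is written with parentheses and a single argument, so the snippet runs under both Python 2 and Python 3.

```python
def check_sign_single_return(x):
    # Store the answer in a local variable instead of returning in each branch
    if x > 0:
        message = 'x is positive'
    elif x < 0:
        message = 'x is negative'
    else:
        message = 'x is zero'
    return message            # The only return statement

print(check_sign_single_return(-2.5))
```

Which of the two versions reads better is largely a matter of taste for a function this short.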

An expression you will often encounter in programming is main program, or that some code is in main. This is nothing particular to Python, and simply refers to the part of the program that is outside functions. Note, however, that the def line of a function counts as part of main. So, in ball_function.py above, all statements outside the function y are in main, and so is the line def y(t):.

A function may take no arguments, or many, in which case they are just listed within the parentheses (following the function name) and separated by commas. Let us illustrate. Take a slight variation of the ball example and assume that the ball is not thrown straight up, but at an angle, so that two coordinates are needed to specify its position at any time. According to Newton's laws (when air resistance is negligible), the vertical position is given by $y(t) = v_{0y}t - 0.5gt^2$ and the horizontal position by $x(t) = v_{0x}t$. We can include both these expressions in a new version of our program that prints the position of the ball for chosen times. Assume we want to evaluate these expressions at two points in time, $t = 0.6\,\mathrm{s}$ and $t = 0.9\,\mathrm{s}$. We can pick some numbers for the initial velocity components v0y and v0x, name the program ball_position.py, and write it for example as

def y(v0y, t):
    g = 9.81                  # Acceleration of gravity
    return v0y*t - 0.5*g*t**2

def x(v0x, t):
    return v0x*t

initial_velocity_x = 2.0
initial_velocity_y = 5.0

time = 0.6       # Just pick one point in time
print x(initial_velocity_x, time), y(initial_velocity_y, time)
time = 0.9       # ... Pick another point in time
print x(initial_velocity_x, time), y(initial_velocity_y, time)

Now we compute and print the two components for the position, for each of the two chosen points in time. Notice how each of the two functions now takes two arguments. Running the program gives the output

1.2 1.2342
1.8 0.52695

A function may also have no return value, in which case we simply drop the return statement, or it may return more than one value. For example, the two functions we just defined could alternatively have been written as one:

def xy(v0x, v0y, t):
    g = 9.81                  # Acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2

Notice the two return values which are simply separated by a comma. When calling the function (and printing), arguments must appear in the same order as in the function definition. We would then write

print xy(initial_velocity_x, initial_velocity_y, time)

The two returned values from the function could alternatively have been assigned to variables, e.g., as

x_pos, y_pos = xy(initial_velocity_x, initial_velocity_y, time)

The variables x_pos and y_pos could then have been printed or used in other ways in the code.
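
The case with no return value, mentioned above, can be sketched as follows. The function name print_position is our own choice for this illustration. A function without a return statement still hands something back to the caller, namely the special Python object None.

```python
def print_position(v0x, v0y, t):
    g = 9.81                  # Acceleration of gravity
    x_pos = v0x*t
    y_pos = v0y*t - 0.5*g*t**2
    print('x = %g, y = %g' % (x_pos, y_pos))
    # No return statement: the function only prints

result = print_position(2.0, 5.0, 0.6)
print(result is None)         # True, since nothing was returned
```

Such functions are called for their effect (here, printing) rather than for a computed value.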

There are possibilities for having a variable number of function input and output parameters (using *args and **kwargs constructions for the arguments). However, we do not go further into that topic here.
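
For the curious reader, a minimal sketch of the two constructions is given here. The function names mean and argument_names are hypothetical and chosen only for this illustration.

```python
def mean(*args):
    # args is a tuple holding all positional arguments given in the call
    return sum(args)/float(len(args))

def argument_names(**kwargs):
    # kwargs is a dictionary mapping argument names to their values
    return sorted(kwargs.keys())

print(mean(1, 2, 3, 4))
print(argument_names(v0x=2.0, v0y=5.0))
```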

Variables that are defined inside a function, e.g., g in the last xy function, are local variables. This means they are only known inside the function. Therefore, if you had accidentally used g in some calculation outside the function, you would have got an error message. The variable time is defined outside the function and is therefore a global variable. It is known both outside and inside the function(s). If you define one global and one local variable, both with the same name, the function only sees the local one, so the global variable is not affected by what happens with the local variable of the same name.
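
These rules can be tried out in a small experiment. In the sketch below, the global value 1.62 is an arbitrary number picked for illustration, and the try-except construction is used only to catch the error message so that the program can continue.

```python
g = 1.62                      # Global variable (arbitrary value for illustration)

def y(t):
    v0 = 5                    # Local variable
    g = 9.81                  # Local variable, hides the global g inside y
    return v0*t - 0.5*g*t**2

height = y(0.6)
print(g)                      # The global g is unchanged by the call

try:
    print(v0)                 # v0 is only known inside y
except NameError as e:
    print('NameError: %s' % e)
```

The call y(0.6) computes with the local g = 9.81, while the global g keeps its value.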

The arguments named in the heading of a function definition are by rule local variables inside the function. If you want to change the value of a global variable inside a function, you need to declare the variable as global inside the function. That is, if the global variable was x, we would need to write global x inside the function definition before we let the function change it. After function execution, x would then have a changed value. One should strive to define variables mostly where they are needed and not everywhere.
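
The difference between assigning to a local variable and to a declared global variable can be sketched as follows. The function names change_local and change_global are hypothetical and serve only this illustration.

```python
x = 1                         # Global variable

def change_local():
    x = 2                     # Creates a new local x; the global x is untouched

def change_global():
    global x                  # Declare that x refers to the global variable
    x = 3

change_local()
after_local = x               # Still 1
change_global()
after_global = x              # Now 3
print('%d %d' % (after_local, after_global))
```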

Another very useful way of handling function parameters in Python is to define parameters as keyword arguments. This gives default values to parameters and allows more freedom in function calls, since the order and number of arguments in the call may vary.

Let us illustrate the use of keyword arguments with the function xy. Assume we defined xy as

def xy(t, v0x=0, v0y=0):
    g = 9.81                  # Acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2

Here, t is an ordinary or positional argument, whereas v0x and v0y are keyword arguments or named arguments. Generally, there can be many positional arguments and many keyword arguments, but the positional arguments must always be listed before the keyword arguments in the function definition. Keyword arguments are given default values, as shown here with v0x and v0y, both having zero as default value. In a script, the function xy may now be called in many different ways. For example,

print xy(0.6)

would make xy perform the computations with t = 0.6 and the default values (i.e., zero) of v0x and v0y. The two numbers returned from xy are printed to the screen. If we wanted to use another initial value for v0y, we could, e.g., write

print xy(0.6, v0y=4.0)

which would make xy perform the calculations with t = 0.6, v0x = 0 (i.e., the default value) and v0y = 4.0. When there are several positional arguments, they have to appear in the same order as in the function definition, unless we explicitly use the names of these also in the function call. With explicit name specification in the call, any order of parameters is acceptable. To illustrate, we could, e.g., call xy as

print xy(v0y=4.0, v0x=1.0, t=0.6)
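
That these call styles are interchangeable can be checked with a short sketch. The variable names p1, p2, and p3 are our own, and the numbers are the ones used above.

```python
def xy(t, v0x=0, v0y=0):
    g = 9.81                  # Acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2

# Three equivalent calls: all positional, mixed, and all arguments named
p1 = xy(0.6, 1.0, 4.0)
p2 = xy(0.6, v0y=4.0, v0x=1.0)
p3 = xy(v0y=4.0, v0x=1.0, t=0.6)
print(p1 == p2 == p3)         # True

# Not allowed: a positional argument after a named one,
# e.g., xy(v0x=1.0, 0.6) is a syntax error
```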

In any programming language, it is a good habit to include a little explanation of what the function is doing, unless what is done by the function is obvious, e.g., when having only a few simple code lines. This explanation is called a doc string, which in Python should be placed just at the top of the function. This explanation is meant for a human who wants to understand the code, so it should say something about the purpose of the code and possibly explain the arguments and return values if needed. If we do that with our xy function from above, we may write the first lines of the function as

def xy(v0x, v0y, t):
    """Compute the x and y position of the ball at time t"""

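A doc string is more than a comment, since Python stores it together with the function. The sketch below completes the function body from the earlier xy example and reads the doc string back through the __doc__ attribute.

```python
def xy(v0x, v0y, t):
    """Compute the x and y position of the ball at time t"""
    g = 9.81                  # Acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2

print(xy.__doc__)             # The doc string is stored with the function
```

In an interactive session, help(xy) displays the same text.
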
Note that functions may be called from within other functions, and function input parameters are not required to be numbers. Any object will do, e.g., string variables or other functions.

Functions are straightforwardly passed as arguments to other functions, as illustrated by the following script function_as_argument.py:

def sum_xy(x, y):
    return x + y

def prod_xy(x, y):
    return x*y

def treat_xy(f, x, y):
    return f(x, y)

x = 2;  y = 3
print treat_xy(sum_xy, x, y)
print treat_xy(prod_xy, x, y)

When run, this program first prints the sum of x and y (i.e., 5), and then it prints the product (i.e., 6). We see that treat_xy takes a function name as its first parameter. Inside treat_xy, that function is used to actually call the function that was given as input parameter. Therefore, as shown, we may call treat_xy with either sum_xy or prod_xy, depending on whether we want the sum or product of x and y to be calculated.

Functions may also be defined within other functions. In that case, they become local functions, or nested functions, known only to the function inside which they are defined. Functions defined in main are referred to as global functions. A nested function has full access to all variables in the parent function, i.e., the function within which it is defined.
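
A minimal sketch of nested functions, reusing the ball example, could look as follows. The parent function name position is our own choice. Note how the nested functions x and y take no arguments of their own, but use v0x, v0y, t, and g from the parent function.

```python
def position(v0x, v0y, t):
    g = 9.81                  # Acceleration of gravity

    def x():
        # Nested function: v0x and t are taken from position
        return v0x*t

    def y():
        # Nested function: v0y, t, and g are taken from position
        return v0y*t - 0.5*g*t**2

    return x(), y()

print('%g %g' % position(2.0, 5.0, 0.6))
```

Outside position, the names x and y defined here are not available.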

Short functions can be defined in a compact way, using what is known as a lambda function:

f = lambda x, y: x + 2*y

# Equivalent
def f(x, y):
    return x + 2*y

The syntax consists of lambda followed by a series of arguments, colon, and some Python expression resulting in an object to be returned from the function. Lambda functions are particularly convenient as function arguments:

print treat_xy(lambda x, y: x*y, x, y)

Overhead of function calls. Function calls have the downside of slowing down program execution. Usually, it is a good thing to split a program into functions, but in very compute-intensive parts, e.g., inside long loops, one must balance the convenience of calling a function against the computational efficiency of avoiding function calls. It is a good rule to develop a program using plenty of functions and then, in a later optimization stage when everything computes correctly, remove the function calls that are measured to slow down the code.

Here is a little example in IPython where we measure the run time of array computations with and without a helper function:

In [1]: import numpy as np

In [2]: a = np.zeros(1000000)

In [3]: def add(a, b):
   ...:     return a + b
   ...:

In [4]: %timeit for i in range(len(a)): a[i] = add(i, i+1)
The slowest run took 16.01 times longer than the fastest.
This could mean that an intermediate result is being cached
1 loops, best of 3: 178 ms per loop

In [5]: %timeit for i in range(len(a)): a[i] = i + (i+1)
10 loops, best of 3: 109 ms per loop

We notice that there is some overhead in function calls. The impact of the overhead reduces quickly with the amount of computational work inside the function.
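
Outside IPython, a similar measurement can be sketched with the timeit module from the standard Python library. The function names loop_with_call and loop_inline are hypothetical, and the measured times depend on the machine and the Python version, so no particular numbers should be expected.

```python
import timeit

def add(a, b):
    return a + b

def loop_with_call(n=100000):
    s = 0
    for i in range(n):
        s = add(s, i)         # One function call per pass in the loop
    return s

def loop_inline(n=100000):
    s = 0
    for i in range(n):
        s = s + i             # Same arithmetic, no function call
    return s

# Run each loop 10 times and report the total time in seconds
t_call = timeit.timeit(loop_with_call, number=10)
t_inline = timeit.timeit(loop_inline, number=10)
print('with function call: %.3f s' % t_call)
print('inlined expression: %.3f s' % t_inline)
```

Both loops compute the same sum, so any difference in time stems from the calls to add.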