Programming for Computations - A Gentle Introduction to Numerical Simulations with Python

Functions are widely used in programming and are a concept that needs to be mastered. In the simplest case, a function in a program is much like a mathematical function: some input number \( x \) is transformed to some output number. One example is the \( \tan^{-1}(x) \) function, called atan in computer code: it takes one real number as input and returns another number. Functions in Python are more general and can take a series of variables as input and return one or more variables, or simply nothing. The purpose of functions is two-fold:

When Python reads and interprets this program from the top, it takes the code from the line with def, to the line with return, to be the definition of a function with the name y (note colon and indentation). The return statement of the function y, i.e.

What actually happens when Python meets this code? The def line just tells Python that here is a function with name y and it has one argument t. Python does not look into the function at this stage (except that it checks the code for syntax errors). When Python later on meets the statement print y(time), it recognizes a function call y(time) and recalls that there is a function y defined with one argument. The value of time is then transferred to the y(t) function such that t = time becomes the first action in the y function. Then Python executes one line at a time in the y function. In the final line, the arithmetic expression v0*t - 0.5*g*t**2 is computed, resulting in a number, and this number (or more precisely, the Python object representing the number) replaces the call y(time) in the calling code such that the word print now precedes a number rather than a function call.
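The program discussed here can be sketched as follows. The values v0 = 5 and g = 9.81 and the first time value 0.6 are assumptions made for illustration. Also, print is written in its function form, print(...), so that the sketch runs under Python 3 as well; the text uses the Python 2 statement form print y(time).

```python
def y(t):
    v0 = 5                    # initial velocity (assumed value)
    g = 9.81                  # acceleration of gravity
    return v0*t - 0.5*g*t**2

time = 0.6                    # first point in time (assumed value)
print(y(time))                # the call y(time) is replaced by the returned number
time = 0.9                    # another point in time
print(y(time))
```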

Python proceeds with the next line and sets time to a new value. The next print statement triggers a new call to y(t). This time t is set to 0.9, the computations are done line by line in the y function, and the returned result replaces y(time). Instead of writing print y(time), we could alternatively have stored the returned result from the y function in a variable,
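A minimal sketch of that alternative could read as below. The variable name h and the values of v0 and g are assumptions chosen for illustration.

```python
def y(t):
    v0 = 5                    # initial velocity (assumed value)
    g = 9.81                  # acceleration of gravity
    return v0*t - 0.5*g*t**2

time = 0.6
h = y(time)     # store the returned number in a variable
print(h)        # h can now be printed or reused in later calculations
```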

Note that when a function contains if-elif-else constructions, return may be done from within any of the branches. This may be illustrated by the following function containing three return statements:
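One possible such function is sketched here. The name check_sign and the returned strings are hypothetical and only serve to show a return in each branch.

```python
def check_sign(x):
    # Exactly one of the three return statements is executed per call
    if x > 0:
        return 'x is positive'
    elif x < 0:
        return 'x is negative'
    else:
        return 'x is zero'

print(check_sign(2.5))
print(check_sign(-1))
print(check_sign(0))
```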

To return at the end or not.

Programmers disagree whether it is a good idea to use return inside a function where you want, or if there should only be one single return statement at the end of the function. The authors of this book emphasize readable code and think that return can be useful in branches as in the example above when the function is short. For longer or more complicated functions, it might be better to have one single return statement. Be prepared for critical comments if you return wherever you want...

An expression you will often encounter in programming is main program, or that some code is in main. This is nothing particular to Python, and simply refers to the part of the program that is outside functions. However, note that the def line of a function is counted as part of main. So, in ball_function.py above, all statements outside the function y are in main, and so is the line def y(t):.

A function may take no arguments, or many, in which case they are just listed within the parentheses (following the function name) and separated by a comma. Let us illustrate. Take a slight variation of the ball example and assume that the ball is not thrown straight up, but at an angle, so that two coordinates are needed to specify its position at any time. According to Newton's laws (when air resistance is negligible), the vertical position is given by \( y(t) = v_{0y}t - 0.5gt^2 \) and the horizontal position by \( x(t) = v_{0x}t \). We can include both these expressions in a new version of our program that prints the position of the ball for chosen times. Assume we want to evaluate these expressions at two points in time, \( t = 0.6 \) s and \( t = 0.9 \) s. We can pick some numbers for the initial velocity components v0y and v0x, name the program ball_position.py, and write it for example as
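A sketch of such a program is given here. The initial velocity components (2.0 and 5.0) and the variable names holding them are assumed values, not prescribed by the text, and print is written in its Python 3 function form.

```python
def y(v0y, t):
    g = 9.81                  # acceleration of gravity
    return v0y*t - 0.5*g*t**2

def x(v0x, t):
    return v0x*t

initial_velocity_x = 2.0      # assumed value
initial_velocity_y = 5.0      # assumed value

time = 0.6                    # first point in time
print(x(initial_velocity_x, time), y(initial_velocity_y, time))
time = 0.9                    # second point in time
print(x(initial_velocity_x, time), y(initial_velocity_y, time))
```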

Now we compute and print the two components for the position, for each of the two chosen points in time. Notice how each of the two functions now takes two arguments. Running the program gives the output

A function may also have no return value, in which case we simply drop the return statement, or it may return more than one value. For example, the two functions we just defined could alternatively have been written as one:
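A sketch of the combined function xy is shown here, together with a hypothetical helper print_position that has no return statement. The numerical arguments are assumed values.

```python
def xy(v0x, v0y, t):
    g = 9.81                  # acceleration of gravity
    # Two values separated by a comma are returned together as a tuple
    return v0x*t, v0y*t - 0.5*g*t**2

def print_position(v0x, v0y, t):
    # No return statement: the function only prints
    x, y = xy(v0x, v0y, t)
    print(x, y)

x, y = xy(2.0, 5.0, 0.6)      # unpack the two returned values
result = print_position(2.0, 5.0, 0.6)
print(result)                 # a function without return gives None
```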

There are possibilities for having a variable number of function input and output parameters (using *args and **kwargs constructions for the arguments). However, we do not go further into that topic here.

Variables that are defined inside a function, e.g., g in the last xy function, are local variables. This means they are only known inside the function. Therefore, if you had accidentally used g in some calculation outside the function, you would have got an error message. The variable time is defined outside the function and is therefore a global variable. It is known both outside and inside the function(s). If you define one global and one local variable, both with the same name, the function only sees the local one, so the global variable is not affected by what happens with the local variable of the same name.
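These scoping rules can be tried out in a small sketch. The helper functions y_at_global_time and shadow, and the flag g_is_known_outside, are hypothetical names introduced only for this illustration.

```python
time = 0.6                      # global variable

def xy(v0x, v0y, t):
    g = 9.81                    # local variable, known only inside xy
    return v0x*t, v0y*t - 0.5*g*t**2

def y_at_global_time(v0y):
    # time is not defined in this function, so the global one is used
    return v0y*time - 0.5*9.81*time**2

def shadow():
    time = 100.0                # a new local variable with the same name
    return time                 # the global time is not affected

xy(2.0, 5.0, time)
try:
    print(g)                    # g does not exist outside xy
    g_is_known_outside = True
except NameError:
    g_is_known_outside = False

print(shadow(), time)
```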

The arguments named in the heading of a function definition are by rule local variables inside the function. If you want to change the value of a global variable inside a function, you need to declare the variable as global inside the function. That is, if the global variable was x, we would need to write global x inside the function definition before we let the function change it. After function execution, x would then have a changed value. One should strive to define variables mostly where they are needed and not everywhere.
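The effect of the global declaration can be sketched as follows. The function names try_to_change and change, and the values assigned, are assumptions made for illustration.

```python
x = 1                 # global variable

def try_to_change():
    x = 10            # without a declaration, this creates a local x

def change():
    global x          # from here on, x refers to the global variable
    x = 10

try_to_change()
after_local = x       # the global x is unchanged
change()
after_global = x      # the global x has been changed by the function
print(after_local, after_global)
```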

Another very useful way of handling function parameters in Python is to define parameters as keyword arguments. This gives default values to parameters and allows more freedom in function calls, since the order and number of arguments given in a call may vary.

Let us illustrate the use of keyword arguments with the function xy. Assume we defined xy as
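One possible definition is sketched here, with t as an ordinary (positional) argument and v0x and v0y as keyword arguments. The default values of 0 and the numbers used in the calls are assumptions.

```python
def xy(t, v0x=0, v0y=0):
    g = 9.81                  # acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2

# Both keyword arguments take their default values
print(xy(0.6))
# Only v0y is given; v0x keeps its default value
print(xy(0.6, v0y=4.0))
# With names given for all arguments, the order is free
print(xy(v0y=4.0, v0x=1.0, t=0.6))
```

Arguments without default values, such as t, must always be supplied in the call, and in the function definition they must be listed before the keyword arguments.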

In any programming language, it is a good habit to include a short explanation of what the function is doing, unless this is obvious, e.g., when the function has only a few simple code lines. In Python, this explanation is called a doc string (or docstring), and it should be placed right after the def line, as the first statement in the function body. The explanation is meant for a human who wants to understand the code, so it should say something about the purpose of the code and possibly explain the arguments and return values if needed. If we do that with our xy function from above, we may write the first lines of the function as
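A sketch of what this might look like follows. The wording of the doc string and the signature of xy (with keyword arguments and default values of 0) are assumptions.

```python
def xy(t, v0x=0, v0y=0):
    """Compute the x and y position of the ball at time t.

    v0x and v0y are the horizontal and vertical components
    of the initial velocity. The pair (x, y) is returned.
    """
    g = 9.81                  # acceleration of gravity
    return v0x*t, v0y*t - 0.5*g*t**2

# The doc string is stored with the function and can be inspected
print(xy.__doc__)
```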

Note that functions may be called from within other functions, and function input parameters are not required to be numbers. Any object will do, e.g., string variables or other functions.

Functions are straightforwardly passed as arguments to other functions, as illustrated by the following script function_as_argument.py:
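A minimal sketch of such a script is given here, using the function names sum_xy, prod_xy and treat_xy from the discussion. The values x = 2 and y = 3 are inferred from the stated sum 5 and product 6, and the parameter name f is an assumption.

```python
def sum_xy(x, y):
    return x + y

def prod_xy(x, y):
    return x*y

def treat_xy(f, x, y):
    # f is whatever function was passed in; call it with x and y
    return f(x, y)

x = 2
y = 3
print(treat_xy(sum_xy, x, y))
print(treat_xy(prod_xy, x, y))
```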

When run, this program first prints the sum of x and y (i.e., 5), and then it prints the product (i.e., 6). We see that treat_xy takes a function name as its first parameter. Inside treat_xy, that function is used to actually call the function that was given as input parameter. Therefore, as shown, we may call treat_xy with either sum_xy or prod_xy, depending on whether we want the sum or product of x and y to be calculated.

Functions may also be defined within other functions. In that case, they become local functions, or nested functions, known only to the function inside which they are defined. Functions defined in main are referred to as global functions. A nested function has full access to all variables in the parent function, i.e. the function within which it is defined.
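A small sketch of a nested function follows. The parent function positions, its arguments, and the flag y_is_global are hypothetical names used only for illustration.

```python
def positions(v0, times):
    g = 9.81                  # variable in the parent function

    def y(t):
        # Nested function: v0 and g are taken from the parent function
        return v0*t - 0.5*g*t**2

    return [y(t) for t in times]

result = positions(5, [0.6, 0.9])
print(result)

try:
    y(0.6)                    # y is not known outside positions
    y_is_global = True
except NameError:
    y_is_global = False
```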

Short functions can be defined in a compact way, using what is known as a lambda function:
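As a sketch, the lambda function f below is equivalent to the ordinary function g defined after it. The expression x + 2*y and both names are assumptions chosen for illustration.

```python
# Syntax: lambda <arguments>: <expression to return>
f = lambda x, y: x + 2*y

# The same function written with def
def g(x, y):
    return x + 2*y

print(f(2, 3), g(2, 3))
```

Lambda functions are particularly handy when a short function is needed only as an argument to another function.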

Overhead of function calls.

Function calls have the downside of slowing down program execution. Usually, it is a good thing to split a program into functions, but in very compute-intensive parts, e.g., inside long loops, one must balance the convenience of calling a function against the computational efficiency of avoiding function calls. It is a good rule to develop a program using plenty of functions and then, in a later optimization stage when everything computes correctly, remove those function calls that measurements show to slow down the code.

Here is a little example in IPython where we calculate the CPU time for doing array computations with and without a helper function:

In [1]: import numpy as np

In [2]: a = np.zeros(1000000)

In [3]: def add(a, b):
   ...:     return a + b
   ...:

In [4]: %timeit for i in range(len(a)): a[i] = add(i, i+1)
The slowest run took 16.01 times longer than the fastest.
This could mean that an intermediate result is being cached
1 loops, best of 3: 178 ms per loop

In [5]: %timeit for i in range(len(a)): a[i] = i + (i+1)
10 loops, best of 3: 109 ms per loop

We notice that there is some overhead in function calls. The impact of the overhead reduces quickly with the amount of computational work inside the function.
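Outside IPython, a comparable measurement can be sketched with the timeit module from the Python standard library. This is only a sketch under some assumptions: a plain list replaces the NumPy array, the size is reduced to keep the run short, and the names loop_with_call and loop_inline are hypothetical. Timings vary between machines, so no numbers are quoted.

```python
import timeit

n = 100000
a = [0.0]*n                   # plain list instead of a NumPy array

def add(a, b):
    return a + b

def loop_with_call():
    for i in range(n):
        a[i] = add(i, i+1)    # one function call per pass

def loop_inline():
    for i in range(n):
        a[i] = i + (i+1)      # same arithmetic, no function call

# Total time in seconds for 3 repetitions of each loop
t_call = timeit.timeit(loop_with_call, number=3)
t_inline = timeit.timeit(loop_inline, number=3)
print('with function call: %.3f s' % t_call)
print('inlined expression: %.3f s' % t_inline)
```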