This chapter is taken from the book A Primer on Scientific Programming with Python by H. P. Langtangen, 5th edition, Springer, 2016.

More advanced vectorization of functions

So far we have seen that vectorization of a Python function f(x) implementing some mathematical function $ f(x) $ seems trivial: f(x) works right away with an array argument x and, in that case, returns an array where $ f $ is applied to each element in x. When the expression for $ f(x) $ is given in terms of a string and the StringFunction tool is used to generate the corresponding Python function f(x), one extra step must be performed to vectorize the Python function. This step is explained in the section Vectorization of StringFunction objects.

The described vectorization works well as long as the expression $ f(x) $ is a mathematical formula without any if test. As soon as we have if tests (conditional mathematical expressions) the vectorization becomes more challenging. Some useful techniques are explained through two examples in the sections Vectorization of the Heaviside function and Vectorization of a hat function. The described techniques are considered advanced material and only necessary when the time spent on evaluating a function at a very large set of points needs to be significantly decreased.

Vectorization of StringFunction objects

The StringFunction object from scitools.std can convert a formula, available as a string, to a callable Python function (see the document User input and error handling [3]). However, the function cannot work with array arguments unless we explicitly tell the StringFunction object to do so. The recipe is very simple. Say f is some StringFunction object. To allow array arguments we are required to call f.vectorize(globals()) once:

from numpy import *
x = linspace(0, 1, 30)
# f(x) will in general not work

f.vectorize(globals())
values = f(x)            # f works with array arguments

It is important that you import everything from numpy (or scitools.std) before calling f.vectorize, exactly as shown.

You may take the f.vectorize call as a magic recipe. Still, some readers want to know what problem f.vectorize solves. Inside the StringFunction module we need to have access to mathematical functions for expressions like sin(x)*exp(x) to be evaluated. These mathematical functions are by default taken from the math module and hence they do not work with array arguments. If the user, in the main program, has imported mathematical functions that work with array arguments, these functions are registered in a dictionary returned from globals(). By the f.vectorize call we supply the StringFunction module with the user's global namespace so that the evaluation of the string expression can make use of the mathematical functions for arrays from the user's program. Unless you use np.sin(x)*np.cos(x) etc. in the string formulas, make sure you do a from numpy import * so that the function names are defined without any prefix.

Even after calling f.vectorize(globals()), a StringFunction object may face problems with vectorization. One example is a piecewise constant function as specified by a string expression '1 if x > 2 else 0'. The section Vectorization of the Heaviside function explains why if tests fail for arrays and what the remedies are.

Vectorization of the Heaviside function

We consider the widely used Heaviside function defined by $$ \begin{equation*} H(x) = \left\lbrace\begin{array}{ll} 0, & x < 0\\ 1, & x\geq 0 \end{array}\right. \end{equation*} $$ The most compact way if implementing this function is

def H(x):
    return (0 if x < 0 else 1)

Trying to call H(x) with an array argument x fails:

>>> def H(x): return (0 if x < 0 else 1)
...
>>> import numpy as np
>>> x = np.linspace(-10, 10, 5)
>>> x
array([-10.,  -5.,   0.,   5.,  10.])
>>> H(x)
...
ValueError: The truth value of an array with more than
one element is ambiguous. Use a.any() or a.all()

The problem is related to the test x < 0, which results in an array of boolean values, while the if test needs a single boolean value (essentially taking bool(x < 0)):

>>> b = x < 0
>>> b
array([ True,  True, False, False, False], dtype=bool)
>>> bool(b)  # evaluate b in a boolean context
...
ValueError: The truth value of an array with more than
one element is ambiguous. Use a.any() or a.all()
>>> b.any()  # True if any element in b is True
True
>>> b.all()  # True if all elements in b are True
False

The any and all calls do not help us since we want to take actions element by element depending on whether x[i] < 0 or not.

There are four ways to find a remedy to our problems with the if x < 0 test: (i) we can write an explicit loop for computing the elements, (ii) we can use a tool for automatically vectorize H(x), (iii) we can mix boolean and floating-point calculations, or (iv) we can manually vectorize the H(x) function. All four methods will be illustrated next.

Loop

The following function works well for arrays if we insert a simple loop over the array elements (such that H(x) operates on scalars only):

def H_loop(x):
    r = np.zeros(len(x))
    for i in xrange(len(x)):
        r[i] = H(x[i])
    return r

# Example:
x = np.linspace(-5, 5, 6)
y = H_loop(x)

Automatic vectorization

Numerical Python contains a method for automatically vectorizing a Python function H(x) that works with scalars (pure numbers) as x argument:

import numpy as np
H_vec = np.vectorize(H)

The H_vec(x) function will now work with vector/array arguments x. Unfortunately, such automatically vectorized functions runs at a fairly slow speed compared to the implementations below (see the end of the section Vectorization of a hat function for specific timings).

Mixing boolean and floating-point calculations

It appears that a very simple solution to vectorizing the H(x) function is to implement it as

def H(x):
    return x >= 0

The return value is now a bool object, not an int or float as we would mathematically expect to be the proper type of the result. However, the bool object works well in both scalar and vectorized operations as long as we involve the returned H(x) in some arithmetic expression. The True and False values are then interpreted as 1 and 0. Here is a demonstration:

>>> x = np.linspace(-1, 1, 5)
>>> H(x)
array([False, False,  True,  True,  True], dtype=bool)
>>> 1*H(x)
array([0, 0, 1, 1, 1])
>>> H(x) - 2
array([-2, -2, -1, -1, -1])
>>>
>>> x = 0.2   # test scalar argument
>>> H(x)
True
>>> 1*H(x)
1
>>> H(x) - 2
-1

If returning a boolean value is considered undesirable, we can turn the bool object into the proper type by

def H(x):
    r = x >= 0
    if isinstance(x, (int,float)):
        return int(r)
    elif isinstance(x, np.ndarray):
        return np.asarray(r, dtype=np.int)

Manual vectorization

By manual vectorization we normally mean translating the algorithm into a set of calls to functions in the numpy package such that no loops are visible in the Python code. The last version of the H(x) is a manual vectorization, but now we shall look at a more general technique when the result is not necessarily 0 or 1. In general, manual vectorization is non-trivial and requires knowledge of and experience with the underlying library for array computations. Fortunately, there is a simple numpy recipe for turning functions of the form

def f(x):
    if condition:
        r = <expression1>
    else:
        r = <expression2>
    return r

into vectorized form:

def f_vectorized(x):
    x1 = <expression1>
    x2 = <expression2>
    r = np.where(condition, x1, x2)
    return r

The np.where function returns an array of the same length as condition, whose % (for one-dimensional arrays x1 and x2) element no. i equals x1[i] if condition[i] is True, and x2[i] otherwise. With Python loops we can express this principle as

def my_where(condition, x1, x2):
    r = np.zeros(len(condition))   # result
    for i in xrange(condition):
        r[i] = x1[i] if condition[i] else x2[i]
    return r

The x1 and x2 variables can be pure numbers too in the call to np.where.

In our case we can use the np.where function as follows:

def Hv(x):
    return np.where(x < 0, 0.0, 1.0)

Instead of using np.where we can apply boolean indexing. The idea is that an array a allows to be indexed by an array b of boolean values: a[b]. The result a[b] is a new array with all the elements a[i] where b[i] is True:

>>> a
array([  0. ,   2.5,   5. ,   7.5,  10. ])
>>> b = a > 5
>>> b
array([False, False, False,  True,  True], dtype=bool)
>>> a[b]
array([  7.5,  10. ])

We can assign new values to the elements in a where b is True:

>>> a[b]
array([  7.5,  10. ])
>>> a[b] = np.array([-10, -20], dtype=np.float)
>>> a
array([  0. ,   2.5,   5. , -10. , -20. ])
>>> a[b] = -4
>>> a
array([ 0. ,  2.5,  5. , -4. , -4. ])

To implement the Heaviside function, we start with an array of zeros and then assign 1 to the elements where x >= 0:

def Hv(x):
    r = np.zeros(len(x), dtype=np.int)
    r[x >= 0] = 1
    return r

Vectorization of a hat function

We now turn the attention to the hat function $ N(x) $ defined by $$ \begin{equation*} N(x) = \left\lbrace\begin{array}{ll} 0, & x < 0\\ x, & 0\leq x < 1\\ 2-x, & 1\leq x < 2\\ 0, & x \geq 2 \end{array}\right. \end{equation*} $$ The corresponding Python implementation N(x) is

def N(x):
    if x < 0:
        return 0.0
    elif 0 <= x < 1:
        return x
    elif 1 <= x < 2:
        return 2 - x
    elif x >= 2:
        return 0.0

Unfortunately, this N(x) function does not work with array arguments x, because the boolean expressions, like x < 0, are arrays and they cannot yield a single True or False value for the if tests, as explained in the section Vectorization of the Heaviside function.

The simplest remedy is to use np.vectorize from the section Vectorization of the Heaviside function:

N_vec = np.vectorize(N)

It is then important that N(x) returns float and not int values, otherwise the vectorized version will produce int values and hence be incorrect.

A manual rewrite, yielding a faster vectorized function, is more demanding than for the Heaviside function because we now have multiple branches in the if test. One sketch is to replace

if condition1:
   r = <expression1>
elif condition2:
   r = <expression2>
elif condition3:
   r = <expression3>
else:
   r = <expression4>

x1 = <expression1>
x2 = <expression2>
x3 = <expression3>
x4 = <expression4>
r = np.where(condition1, x1, x4)  # initialize with "else" expr.
r = np.where(condition2, x2, r)
r = np.where(condition3, x3, r)

Alternatively, we can use boolean indexing. Assuming that <expressionX> is some expression involving an array x and coded as a Python function fX(x) (X is 1, 2, 3, or 4), we can write

r = f4(x)
r[condition1] = f1(x[condition1])
r[condition2] = f2(x[condition2])
r[condition3] = f2(x[condition3])

Specifically, when the function for scalar arguments x reads

def N(x):
    if x < 0:
        return 0.0
    elif 0 <= x < 1:
        return x
    elif 1 <= x < 2:
        return 2 - x
    elif x >= 2:
        return 0.0

a vectorized attempt would be

def Nv(x):
    r = np.where(x < 0,      0.0,  0.0)
    r = np.where(0 <= x < 1, x,    r)
    r = np.where(1 <= x < 2, 2-x,  r)
    r = np.where(x >= 2,     0.0,  r)
    return r

The first and last line are not strictly necessary as we could just start with a zero vector (making the insertion of zeros for the first and last condition a redundant operation).

However, any condition like 0 <= x < 1, which is equivalent to 0 <= x and x < 1, does not work because the and operator does not work with array arguments. Fortunately, there is a simple solution to this problem: the function logical_and in numpy. A working Nv function must apply logical_and instead in each condition:

def Nv1(x):
    condition1 = x < 0
    condition2 = np.logical_and(0 <= x, x < 1)
    condition3 = np.logical_and(1 <= x, x < 2)
    condition4 = x >= 2

    r = np.where(condition1, 0.0, 0.0)
    r = np.where(condition2, x,   r)
    r = np.where(condition3, 2-x, r)
    r = np.where(condition4, 0.0, r)
    return r

With boolean indexing we get the alternative form

def Nv2(x):
    condition1 = x < 0
    condition2 = np.logical_and(0 <= x, x < 1)
    condition3 = np.logical_and(1 <= x, x < 2)
    condition4 = x >= 2

    r = np.zeros(len(x))
    r[condition1] = 0.0
    r[condition2] = x[condition2]
    r[condition3] = 2-x[condition3]
    r[condition4] = 0.0
    return r

Again, the first and last assignment to r can be omitted in this special case where we start with a zero vector.

The file hat.py implements four vectorized versions of the N(x) function: N_loop, which is a plain loop calling up N(x) for each x[i] element in the array x; N_vec, which is the result of automatic vectorization via np.vectorize; the Nv1 function shown above, which uses the np.where constructions; and the Nv2 function, which uses boolean indexing. With a length of x of 1,000,000, the results on my computer (MacBook Air 11'', 2 1.6GHz Intel CPU, running Ubuntu 12.04 in a VMWare virtual machine) became 4.8 s for N_loop, 1 s N_vec, 0.3 s for Nv1, and 0.08 s for Nv2. Boolean indexing is clearly the fastest method.