This chapter is taken from the book A Primer on Scientific Programming with Python by H. P. Langtangen, 5th edition, Springer, 2016.
So far we have seen that vectorization of a Python function f(x)
implementing some mathematical function \( f(x) \) seems trivial: f(x)
works right away with an array argument x
and, in that case, returns
an array where \( f \) is applied to each element in x
. When the
expression for \( f(x) \) is given in terms of a string and the
StringFunction
tool is used to generate the corresponding Python
function f(x)
, one extra step must be performed to vectorize the
Python function. This step is explained in the section Vectorization of StringFunction objects.
The described vectorization works well as long as the expression
\( f(x) \) is a mathematical formula without any if
test. As soon as we
have if
tests (conditional mathematical expressions) the
vectorization becomes more challenging. Some useful techniques are
explained through two examples in the sections Vectorization of the Heaviside function and
Vectorization of a hat function. The described techniques are considered
advanced material and only necessary when the time spent on evaluating
a function at a very large set of points needs to be significantly
decreased.
The StringFunction
object from scitools.std
can convert a formula,
available as a string, to a callable Python function (see the
document User input and error handling
[3]). However, the function cannot work
with array arguments unless we explicitly tell the StringFunction
object to do so. The recipe is very simple. Say f
is some
StringFunction
object. To allow array arguments we are required to
call f.vectorize(globals())
once:
from numpy import *
x = linspace(0, 1, 30)
# f(x) will in general not work
f.vectorize(globals())
values = f(x) # f works with array arguments
It is important that you import everything from numpy
(or
scitools.std
) before calling f.vectorize
, exactly as shown.
You may take the f.vectorize
call as a magic recipe. Still, some
readers want to know what problem f.vectorize
solves. Inside the
StringFunction
module we need to have access to mathematical
functions for expressions like sin(x)*exp(x)
to be evaluated. These
mathematical functions are by default taken from the math
module and
hence they do not work with array arguments. If the user, in the main
program, has imported mathematical functions that work with array
arguments, these functions are registered in a dictionary returned
from globals()
. By the f.vectorize
call we supply the
StringFunction
module with the user's global namespace so that the
evaluation of the string expression can make use of the mathematical
functions for arrays from the user's program. Unless you use
np.sin(x)*np.cos(x)
etc. in the string formulas, make sure you do a
from numpy import *
so that the function names are defined without
any prefix.
Even after calling f.vectorize(globals())
, a StringFunction
object
may face problems with vectorization. One example is a piecewise
constant function as specified by a string expression '1 if x > 2
else 0'
. The section Vectorization of the Heaviside function explains why if
tests fail
for arrays and what the remedies are.
We consider the widely used Heaviside function defined by $$ \begin{equation*} H(x) = \left\lbrace\begin{array}{ll} 0, & x < 0\\ 1, & x\geq 0 \end{array}\right. \end{equation*} $$ The most compact way if implementing this function is
def H(x):
return (0 if x < 0 else 1)
Trying to call H(x)
with an array argument x
fails:
>>> def H(x): return (0 if x < 0 else 1)
...
>>> import numpy as np
>>> x = np.linspace(-10, 10, 5)
>>> x
array([-10., -5., 0., 5., 10.])
>>> H(x)
...
ValueError: The truth value of an array with more than
one element is ambiguous. Use a.any() or a.all()
The problem is related to the test x < 0
, which results
in an array of boolean values, while the if
test needs a
single boolean value (essentially taking bool(x < 0)
):
>>> b = x < 0
>>> b
array([ True, True, False, False, False], dtype=bool)
>>> bool(b) # evaluate b in a boolean context
...
ValueError: The truth value of an array with more than
one element is ambiguous. Use a.any() or a.all()
>>> b.any() # True if any element in b is True
True
>>> b.all() # True if all elements in b are True
False
The any
and all
calls do not help us since we
want to take actions element by element depending on whether
x[i] < 0
or not.
There are four ways to find a remedy to our problems with the if x <
0
test: (i) we can write an explicit loop for computing the elements,
(ii) we can use a tool for automatically vectorize H(x)
, (iii) we
can mix boolean and floating-point calculations, or (iv) we can
manually vectorize the H(x)
function. All four methods will be
illustrated next.
The following function works well for arrays if we insert
a simple loop over the array elements (such that H(x)
operates
on scalars only):
def H_loop(x):
r = np.zeros(len(x))
for i in xrange(len(x)):
r[i] = H(x[i])
return r
# Example:
x = np.linspace(-5, 5, 6)
y = H_loop(x)
Numerical Python contains a method for automatically vectorizing
a Python function H(x)
that works with scalars
(pure numbers) as x
argument:
import numpy as np
H_vec = np.vectorize(H)
The H_vec(x)
function will now work with vector/array arguments
x
. Unfortunately, such automatically vectorized functions runs
at a fairly slow speed compared to the implementations below
(see the end of the section Vectorization of a hat function for specific timings).
It appears that a very simple solution to vectorizing the H(x)
function is to implement it as
def H(x):
return x >= 0
The return value is now a bool
object, not an int
or float
as we
would mathematically expect to be the proper type of the
result. However, the bool
object works well in both scalar and
vectorized operations as long as we involve the returned H(x)
in
some arithmetic expression. The True
and False
values are then
interpreted as 1 and 0. Here is a demonstration:
>>> x = np.linspace(-1, 1, 5)
>>> H(x)
array([False, False, True, True, True], dtype=bool)
>>> 1*H(x)
array([0, 0, 1, 1, 1])
>>> H(x) - 2
array([-2, -2, -1, -1, -1])
>>>
>>> x = 0.2 # test scalar argument
>>> H(x)
True
>>> 1*H(x)
1
>>> H(x) - 2
-1
If returning a boolean value is considered undesirable, we can turn
the bool
object into the proper type by
def H(x):
r = x >= 0
if isinstance(x, (int,float)):
return int(r)
elif isinstance(x, np.ndarray):
return np.asarray(r, dtype=np.int)
By manual vectorization we normally mean translating the algorithm
into a set of calls to functions in the numpy
package such that no
loops are visible in the Python code. The last version of the H(x)
is a manual vectorization, but now we shall look at a more general
technique when the result is not necessarily 0 or 1. In general,
manual vectorization is non-trivial and requires knowledge of and
experience with the underlying library for array computations.
Fortunately, there is a simple numpy
recipe for turning functions of
the form
def f(x):
if condition:
r = <expression1>
else:
r = <expression2>
return r
into vectorized form:
def f_vectorized(x):
x1 = <expression1>
x2 = <expression2>
r = np.where(condition, x1, x2)
return r
The np.where
function returns an array of the same length as
condition
, whose % (for one-dimensional arrays x1
and x2
)
element no. i
equals x1[i]
if condition[i]
is True
, and
x2[i]
otherwise. With Python loops we can express this principle as
def my_where(condition, x1, x2):
r = np.zeros(len(condition)) # result
for i in xrange(condition):
r[i] = x1[i] if condition[i] else x2[i]
return r
The x1
and x2
variables can be pure numbers too in the call
to np.where
.
In our case we can use the np.where
function as follows:
def Hv(x):
return np.where(x < 0, 0.0, 1.0)
Instead of using np.where
we can apply boolean indexing. The idea
is that an array a
allows to be indexed by an array b
of boolean
values: a[b]
. The result a[b]
is a new array with all the
elements a[i]
where b[i]
is True
:
>>> a
array([ 0. , 2.5, 5. , 7.5, 10. ])
>>> b = a > 5
>>> b
array([False, False, False, True, True], dtype=bool)
>>> a[b]
array([ 7.5, 10. ])
We can assign new values to the elements in a
where b
is True
:
>>> a[b]
array([ 7.5, 10. ])
>>> a[b] = np.array([-10, -20], dtype=np.float)
>>> a
array([ 0. , 2.5, 5. , -10. , -20. ])
>>> a[b] = -4
>>> a
array([ 0. , 2.5, 5. , -4. , -4. ])
To implement the Heaviside function, we start with an array of zeros
and then assign 1 to the elements where x >= 0
:
def Hv(x):
r = np.zeros(len(x), dtype=np.int)
r[x >= 0] = 1
return r
We now turn the attention to the hat function \( N(x) \) defined by
$$
\begin{equation*}
N(x) = \left\lbrace\begin{array}{ll}
0, & x < 0\\
x, & 0\leq x < 1\\
2-x, & 1\leq x < 2\\
0, & x \geq 2
\end{array}\right.
\end{equation*}
$$
The corresponding Python implementation N(x)
is
def N(x):
if x < 0:
return 0.0
elif 0 <= x < 1:
return x
elif 1 <= x < 2:
return 2 - x
elif x >= 2:
return 0.0
Unfortunately, this N(x)
function does not work with array arguments
x
, because the boolean expressions, like x < 0
, are arrays and
they cannot yield a single True
or False
value for the if
tests,
as explained in the section Vectorization of the Heaviside function.
The simplest remedy is to use np.vectorize
from the section Vectorization of the Heaviside function:
N_vec = np.vectorize(N)
It is then important that N(x)
returns float
and not int
values,
otherwise the vectorized version will produce int
values and hence
be incorrect.
A manual rewrite, yielding a faster vectorized function, is more
demanding than for the Heaviside function because we now have multiple
branches in the if
test. One sketch is to replace
if condition1:
r = <expression1>
elif condition2:
r = <expression2>
elif condition3:
r = <expression3>
else:
r = <expression4>
by
x1 = <expression1>
x2 = <expression2>
x3 = <expression3>
x4 = <expression4>
r = np.where(condition1, x1, x4) # initialize with "else" expr.
r = np.where(condition2, x2, r)
r = np.where(condition3, x3, r)
Alternatively, we can use boolean indexing. Assuming that
<expressionX>
is some expression involving an array x
and coded as
a Python function fX(x)
(X
is 1
, 2
, 3
, or 4
), we can write
r = f4(x)
r[condition1] = f1(x[condition1])
r[condition2] = f2(x[condition2])
r[condition3] = f2(x[condition3])
Specifically, when the function for scalar arguments x
reads
def N(x):
if x < 0:
return 0.0
elif 0 <= x < 1:
return x
elif 1 <= x < 2:
return 2 - x
elif x >= 2:
return 0.0
a vectorized attempt would be
def Nv(x):
r = np.where(x < 0, 0.0, 0.0)
r = np.where(0 <= x < 1, x, r)
r = np.where(1 <= x < 2, 2-x, r)
r = np.where(x >= 2, 0.0, r)
return r
The first and last line are not strictly necessary as we could just start with a zero vector (making the insertion of zeros for the first and last condition a redundant operation).
However, any condition like 0 <= x < 1
, which is equivalent to 0 <=
x and x < 1
, does not work because the and
operator does not work
with array arguments. Fortunately, there is a simple solution to this
problem: the function logical_and
in numpy
. A working Nv
function must apply logical_and
instead in each condition:
def Nv1(x):
condition1 = x < 0
condition2 = np.logical_and(0 <= x, x < 1)
condition3 = np.logical_and(1 <= x, x < 2)
condition4 = x >= 2
r = np.where(condition1, 0.0, 0.0)
r = np.where(condition2, x, r)
r = np.where(condition3, 2-x, r)
r = np.where(condition4, 0.0, r)
return r
With boolean indexing we get the alternative form
def Nv2(x):
condition1 = x < 0
condition2 = np.logical_and(0 <= x, x < 1)
condition3 = np.logical_and(1 <= x, x < 2)
condition4 = x >= 2
r = np.zeros(len(x))
r[condition1] = 0.0
r[condition2] = x[condition2]
r[condition3] = 2-x[condition3]
r[condition4] = 0.0
return r
Again, the first and last assignment to r
can be omitted in this
special case where we start with a zero vector.
The file hat.py
implements four vectorized versions of the N(x)
function: N_loop
, which is a plain loop calling up N(x)
for each
x[i]
element in the array x
; N_vec
, which is the result of
automatic vectorization via np.vectorize
; the Nv1
function shown
above, which uses the np.where
constructions; and the Nv2
function,
which uses boolean indexing. With a length of x
of 1,000,000, the
results on my computer (MacBook Air 11'', 2 1.6GHz Intel CPU, running
Ubuntu 12.04 in a VMWare virtual machine) became 4.8 s for N_loop
, 1
s N_vec
, 0.3 s for Nv1
, and 0.08 s for Nv2
. Boolean indexing is
clearly the fastest method.