This chapter is taken from the book A Primer on Scientific Programming with Python by H. P. Langtangen, 5th edition, Springer, 2016.
Make a program that simulates flipping a coin \( N \) times. Print out tail or head for each flip and let the program count and print the number of heads. Use r = random.random() and define head as r <= 0.5, or draw an integer among \( \{0,1\} \) with r = random.randint(0,1) and define head when r is 0.
Filename: flip_coin.
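A minimal loop-based sketch of this exercise (the function name and the N=10 demo call are illustrative choices, not prescribed by the text):

```python
import random

def flip_coin(N):
    """Flip a coin N times, print head/tail for each flip,
    and return the number of heads."""
    heads = 0
    for _ in range(N):
        if random.random() <= 0.5:   # head
            print('head')
            heads += 1
        else:
            print('tail')
    return heads

print('Number of heads: %d' % flip_coin(10))
```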
What is the probability of getting a number between 0.5 and 0.6 when drawing uniformly distributed random numbers from the interval \( [0,1) \)? To answer this question empirically, let a program draw \( N \) such random numbers using Python's standard random module, count how many of them, \( M \), fall in the interval \( [0.5,0.6] \), and compute the probability as \( M/N \). Run the program with the four values \( N=10^{i} \) for \( i=1,2,3,6 \).
Filename: compute_prob.
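One possible loop-based solution sketch (the helper name estimate_prob is made up for illustration):

```python
import random

def estimate_prob(N):
    """Estimate P(0.5 <= r <= 0.6) by drawing N uniform numbers on [0, 1)."""
    M = sum(1 for _ in range(N) if 0.5 <= random.random() <= 0.6)
    return M / float(N)

for i in 1, 2, 3, 6:
    print('N=10^%d: probability %.6f' % (i, estimate_prob(10**i)))
```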
Suppose we have eight different colors. Make a program that chooses one of these colors at random and writes out the color. Use a list of color names and use the choice function in the random module to pick a list element.
Filename: choose_color.
Suppose there are 40 balls in a hat, of which 10 are red,
10 are blue, 10 are yellow, and 10 are purple.
What is the probability of getting two blue and two purple balls
when drawing 10 balls at random from the hat?
Filename: draw_10balls.
This exercise deals with four questions:
Filename: rolling_dice.
Make a program for estimating the probability of getting at least one die with six eyes when throwing \( n \) dice. Read \( n \) and the number of experiments from the command line. As a partial verification, compare the Monte Carlo simulation results to the exact answer 11/36 for \( n=2 \) and observe that the approximate probabilities approach the exact probability as the number of simulations grows.
Filename: one6_ndice.
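A possible Monte Carlo sketch (shown with hard-coded arguments instead of command-line reading, to keep the example self-contained):

```python
import random

def at_least_one_six(n, num_experiments):
    """Estimate the probability of at least one six among n dice."""
    hits = 0
    for _ in range(num_experiments):
        if any(random.randint(1, 6) == 6 for _ in range(n)):
            hits += 1
    return hits / float(num_experiments)

p = at_least_one_six(n=2, num_experiments=10**5)
print('estimated: %.4f, exact: %.4f' % (p, 11/36.0))
```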
Use the Deck.py module (see the section Example: Class implementation of a deck) and the same_rank and same_suit functions from the cards module (see the section Example: Drawing cards from a deck) to compute the following probabilities by Monte Carlo simulation:
Filename: card_hands.
Somebody suggests the following game. You pay 1 euro and are allowed
to throw four dice. If the sum of the eyes on the dice is less than
9, you get paid \( r \) euros back, otherwise you lose the 1 euro
investment. Assume \( r=10 \). Will you then in the long run win or lose
money by playing this game? Answer the question by making a program
that simulates the game. Read \( r \) and the number of experiments \( N \)
from the command line.
Filename: sum_4dice.
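A sketch of such a simulation (command-line reading of r and N is replaced by hard-coded demo values here):

```python
import random

def simulate_game(r, N):
    """Play the game N times: pay 1 euro, throw four dice,
    win r euros back if the sum of eyes is less than 9.
    Return the average net profit per game."""
    profit = 0.0
    for _ in range(N):
        s = sum(random.randint(1, 6) for _ in range(4))
        profit += r - 1 if s < 9 else -1
    return profit / N

print('average profit per game: %.3f euros' % simulate_game(10, 10**5))
```

A negative average profit indicates that the player loses money in the long run.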
It turns out that the game in Exercise 8: Decide if a dice game is fair is not fair, since you lose money in the long run. The purpose of this exercise is to adjust the winning award so that the game becomes fair, i.e., that you neither lose nor win money in the long run.
Make a Python function that computes the probability \( p \) of getting a sum less
than \( s \) when rolling \( n \) dice. Use the reasoning in
the section Example: Throwing dice to find
the award per game, \( r \), that makes the game fair. Run the
program from Exercise 8: Decide if a dice game is fair with this \( r \) on
the command line and verify that the game is now (approximately) fair.
Filename: sum_ndice_fair.
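The section Example: Throwing dice is referred to for the reasoning; as an alternative sketch, for small \( n \) the probability \( p \) can also be computed exactly by brute-force enumeration of all \( 6^n \) outcomes, and the fair award is then \( r=1/p \) (the expected payback \( pr \) must equal the 1 euro cost):

```python
import itertools

def prob_sum_less_than(s, n):
    """Exact probability that the sum of n dice is less than s,
    found by enumerating all 6**n outcomes (feasible for small n)."""
    outcomes = itertools.product(range(1, 7), repeat=n)
    favorable = sum(1 for dice in outcomes if sum(dice) < s)
    return favorable / float(6**n)

p = prob_sum_less_than(9, 4)
print('p = %g, fair award r = 1/p = %g euros' % (p, 1/p))
```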
We consider the Python function in Exercise 9: Adjust a game to make it fair for computing a probability \( p \) that the sum of the eyes on \( n \) dice is less than \( s \). The aim is to write a test function for verifying the computation of \( p \).
a) Find some combinations of \( n \) and \( s \) that must result in \( p=0 \) and \( p=1 \) and make the appropriate code in the test function.
b) Fix the seed of the random number generator and record the first eight random numbers to 16 digits. Set \( n=2 \), perform four experiments, and compute by hand what the probability estimate becomes (choose any appropriate \( s \)). Write the necessary code in the test function to compare this manually calculated result with what is produced by the function from Exercise 9: Adjust a game to make it fair.
Filename: test_sum_ndice.
Consider the game in the section Example: Throwing dice. A generalization is to think as follows: you throw one die until the number of eyes is less than or equal to the previous throw. Let \( m \) be the number of throws in a game.
a) Use Monte Carlo simulation to compute the probability of getting \( m=2,3,4,\ldots \).
For \( m\geq 6 \) the throws must be exactly \( 1,2,3,4,5,6,6,6,\ldots \), and the probability of each is 1/6, giving the total probability \( 6^{-m} \). Use \( N=10^6 \) experiments as this should suffice to estimate the probabilities for \( m\leq 5 \), and beyond that we have the analytical expression.
b) If you pay 1 euro to play this game, what is the fair amount to get paid when you win? Answer this question for each of the cases \( m=2,3,4,5 \).
Filename: incr_eyes.
Suggest a player strategy for the game in the section Rolling two dice. Remove the question in the player_guess function in the file ndice2.py, and implement the chosen strategy instead. Let the program play a large number of games, and record the number of times the computer wins. Which strategy is best in the long run: the computer's or yours?
Filename: simulate_strategies1.
Extend the program from Exercise 12: Compare two playing strategies such that
the computer and the player can use a different number of dice.
Let the computer choose a random number of dice between 2 and 20.
Experiment to find out if there is a favorable number of dice for
the player.
Filename: simulate_strategies2.
An amusement park offers the following game. A hat contains 20 balls: 5 red, 5 yellow, 3 green, and 7 brown. At a cost of \( 2n \) euros you can draw \( 4 \leq n\leq 10 \) balls at random from the hat (without putting them back). Before you are allowed to look at the drawn balls, you must choose one of the following options:
Filename: draw_balls.
Throw two dice a large number of times in a program. Record the sum of the eyes each time and count how many times each of the possibilities for the sum (2, 3, \( \ldots \), 12) appears. Compute the corresponding probabilities and compare them with the exact values. (To find the exact probabilities, set up all the \( 6\times 6 \) possible outcomes of throwing two dice, and then count how many of them have a sum \( s \) for \( s=2,3,\ldots,12 \).)
Filename: freq_2dice.
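A compact sketch of the simulation and the comparison with exact values (using the fact that the number of the 36 outcomes giving sum \( s \) is \( 6-|s-7| \)):

```python
import random

N = 10**5
counts = {s: 0 for s in range(2, 13)}
for _ in range(N):
    counts[random.randint(1, 6) + random.randint(1, 6)] += 1

for s in range(2, 13):
    exact = (6 - abs(s - 7)) / 36.0   # 1/36, 2/36, ..., 6/36, ..., 1/36
    print('sum %2d: estimated %.4f, exact %.4f'
          % (s, counts[s] / float(N), exact))
```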
Simulate flipping a coin \( N \) times and write out the number of tails. The code should be vectorized, i.e., there must be no loops in Python. Constructions like numpy.where(r<=0.5, 1, 0) combined with numpy.sum, or r[r<=0.5].size, are useful, where r is an array of random numbers between 0 and 1.
Filename: flip_coin_vec.
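A vectorized sketch along these lines (counting tails as r > 0.5, consistent with heads being r <= 0.5):

```python
import numpy as np

def flip_coin_vec(N):
    """Vectorized coin flipping: return the number of tails
    among N flips (tail: r > 0.5, head: r <= 0.5)."""
    r = np.random.random(N)
    return np.where(r > 0.5, 1, 0).sum()

print('number of tails:', flip_coin_vec(10**6))
```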
The purpose of this exercise is to speed up the code in Exercise 2: Compute a probability by vectorization. For an array r of uniformly distributed random numbers on \( [0,1) \), make use of r1 = r[r>0.5] and r1[r1<0.6]. An alternative is numpy.where combined with a compound boolean expression with numpy.logical_and(0.5<=r, r<=0.6). See the discussion of this topic in the document Array computing and curve plotting [4].
Filename: compute_prob_vec.
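A vectorized sketch using the logical_and alternative (the function name is illustrative):

```python
import numpy as np

def estimate_prob_vec(N):
    """Vectorized estimate of P(0.5 <= r <= 0.6), r uniform on [0, 1)."""
    r = np.random.random(N)
    M = np.sum(np.logical_and(0.5 <= r, r <= 0.6))
    return M / float(N)

print(estimate_prob_vec(10**6))
```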
Use Monte Carlo simulation to compute the probability of getting 6 eyes on all dice when rolling \( 7 \) dice.
You need a large number of experiments in this case because of the small probability (see the first paragraph of the section Computing probabilities), so a vectorized implementation may be important.
Filename: roll_7dice.
A democracy takes decisions based on majority votes. We shall investigate if this is a good idea or if a single person would produce better decisions.
We shall ask about pure facts, not opinions. This means that the question to be answered by a population has a definite "yes" or "no" answer. For example, "Can Python lists contain tuples as elements?" The correct answer is "yes". Asking a population such a question and relying on the majority of votes, is a reliable procedure if the competence level in the population is sufficiently high.
a) Assume that the competence level in a population can be modeled by a probability \( p \) such that if you ask \( N \) people a question, \( M=pN \) of them will give the correct answer (as \( N\rightarrow\infty \)). Here we make the questionable assumption of a homogeneous population, in the sense that \( p \) is the same for every individual.
Make a function homogeneous(p, N) for simulating whether the majority vote of a population of \( N \) individuals arrives at the right answer, if the probability of answering correctly is \( p \) for an individual. Make another function homogeneous_ex() that runs 10 tests of the specific case \( N=5 \) (as when relying on the majority of a student group) and 10 tests when asking a whole city of \( N=1,000,000 \) voters. Try \( p=0.49 \), \( p=0.51 \), and \( p=0.8 \). Are the results as you would expect from intuition?
Asking one individual is like flipping a biased coin that has probability \( p \) of giving head (right answer) and probability \( 1-p \) of giving tail (wrong answer).
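A possible sketch of the two functions, here using numpy for the coin flips (a plain loop with random.random() works equally well):

```python
import numpy as np

def homogeneous(p, N):
    """Return True if the majority of N individuals, each answering
    correctly with probability p, produces the right answer."""
    answers = np.random.random(N) <= p   # True means a correct answer
    return np.sum(answers) > N / 2.0

def homogeneous_ex():
    for p in 0.49, 0.51, 0.8:
        for N in 5, 1000000:
            votes = [homogeneous(p, N) for _ in range(10)]
            print('p=%.2f, N=%7d: majority right in %d of 10 tests'
                  % (p, N, votes.count(True)))

homogeneous_ex()
```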
b) The problem in a) can be exactly solved, since each question is a Bernoulli trial with success probability \( p \), and the probability of a correct majority vote is the same as the probability of getting \( N/2 \) or more successes in \( N \) trials. For large \( N \), the probability of \( M \) successes in \( N \) trials can be well approximated by a normal (Gaussian) density function: $$ g(M) = (\sqrt{2\pi Np(1-p)})^{-1}\exp{(-\frac{1}{2}(M-Np)^2/(Np(1-p)))}\tp $$ The majority vote is correct when \( M>N/2 \), and the probability of this event is given by \( 1-\Phi(N/2) \), where \( \Phi \) is the cumulative normal distribution with mean \( Np \) and variance \( Np(1-p) \).
Plot the probability of being right against \( p \).
Say 5 questions are of importance. What competence level \( p \) does a king need to get all 5 right, compared to the level the population needs to get all 5 right by majority vote?
c) We shall now simulate voting in a heterogeneous population. The probability that an individual no. \( i \) answers correctly is \( p_i \), where \( p_i \) is drawn from a normal (Gaussian) probability density with mean \( p \) and standard deviation \( s \). The competence level varies between individuals, with \( s \) expressing the spreading of knowledge and \( p \) the mean competence level.
Make a function heterogeneous(p, N, s) that returns whether the majority vote is right or wrong in the heterogeneous case. Rerun the examples from a) with \( s=0.2 \).
d) With a somewhat large variation in the population, i.e., \( s \) somewhat large, there will be some individuals that always provide wrong or right answers according to this model. To learn about reasonable values of \( s \), we can investigate whether unreasonably large fractions of the population are always right or always wrong.
The probability of always being wrong is the probability of \( p_i < 0 \).
This is given by \( \Phi(-p/s) \), where \( \Phi \) is the cumulative
normal distribution with mean zero and unit standard deviation.
It can be reached in Python as scipy.stats.norm.cdf. The probability of always being right is the probability of \( p_i>1 \), which can be computed as \( 1-\Phi((1-p)/s) \). Plot curves of the probability of always being right and always wrong against \( s\in [0.1,0.6] \). Perform this curve plotting in a function extremes(p).
Filename: democracy.
Simple random number generators are based on simulating difference
equations. Here is a typical set of two equations:
$$
\begin{align}
x_n &= (ax_{n-1} + c) \hbox{ mod } m,
\tag{11}\\
y_n &= x_n/m,
\tag{12}
\end{align}
$$
for \( n=1,2,\ldots \).
A seed \( x_0 \) must be given to start the sequence.
The numbers \( y_1,y_2,\ldots \) represent the random numbers and
\( x_0,x_1,\ldots \) are "help" numbers.
Although \( y_n \) is
completely deterministic from (11)-(12), the
sequence \( y_n \) appears random.
The mathematical expression \( p\hbox{ mod }q \) is coded as p % q in Python. Use \( a=8121 \), \( c=28411 \), and \( m=134456 \). Solve the system (11)-(12) in a function to generate \( N \) random numbers. Make a histogram to examine the distribution of the numbers (the \( y_n \) numbers are uniformly distributed if the histogram is approximately flat).
Filename: diffeq_random.
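The scheme (11)-(12) translates directly into code; a sketch (without the histogram part):

```python
def lcg(N, x0=1, a=8121, c=28411, m=134456):
    """Generate N pseudo-random numbers in [0, 1) by the linear
    congruential scheme x_n = (a*x_{n-1} + c) mod m, y_n = x_n/m."""
    x = x0
    y = []
    for _ in range(N):
        x = (a*x + c) % m
        y.append(x / float(m))
    return y

numbers = lcg(1000)
print(numbers[:5])
```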
Consider the example about drawing colored balls from a hat in the section Example: Drawing balls from a hat. It could be handy to have an object that acts as a hat:
hat = Hat(red=3, blue=4, green=6)
balls = hat.draw(3)
if balls.count('red') == 1 and balls.count('green') == 2:
    ...
a) Write such a class Hat with the shown functionality. The flexible syntax in the constructor, where the colors of the balls and the number of balls of each color are freely specified, requires use of a dictionary (**kwargs) for handling a variable number of keyword arguments, see the document Variable number of function arguments in Python [5]. You can borrow useful code from the balls_in_hat.py program and ideas from the section Example: Class implementation of a deck.
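A possible minimal sketch of class Hat, using **kwargs as described (storing the balls as a flat list of color names is one design choice among several):

```python
import random

class Hat:
    """A hat with colored balls; colors and counts given as keyword args."""
    def __init__(self, **balls):
        self.balls = []
        for color, count in balls.items():
            self.balls.extend([color]*count)

    def draw(self, n):
        """Draw n balls at random, without replacement."""
        drawn = random.sample(self.balls, n)
        for ball in drawn:
            self.balls.remove(ball)
        return drawn

hat = Hat(red=3, blue=4, green=6)
print(hat.draw(3))
```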
b) Apply class Hat to compute the probability of getting 2 brown and 2 blue balls when drawing 6 balls from a hat with 6 blue, 8 brown, and 3 green balls.
Filename: Hat.
a) Generate a sequence of \( N \) independent random variables with values 0 or 1 and print out this sequence without space between the numbers (i.e., as 001011010110111010).
b) The purpose now is to generate random zeros and ones that are dependent. If the last generated number was 0, the probability of generating a new 0 is \( p \) and a new 1 is \( 1-p \). Conversely, if the last generated was 1, the probability of generating a new 1 is \( p \) and a new 0 is \( 1-p \). Since the new value depends on the last one, we say the variables are dependent. Implement this algorithm in a function returning an array of \( N \) zeros and ones. Print out this array in the condensed format as described above.
c) Choose \( N=80 \) and try the probabilities \( p=0.5 \), \( p=0.8 \) and \( p=0.9 \). Can you by visual inspection of the output characterize the differences between sequences of independent and dependent random variables?
Filename: dependent_random_numbers.
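A sketch of the algorithm in b) (the function name is made up for illustration):

```python
import random

def dependent_01(N, p):
    """Generate N dependent 0/1 numbers: repeat the previous value
    with probability p, switch with probability 1-p."""
    numbers = [random.randint(0, 1)]
    for _ in range(N - 1):
        if random.random() <= p:
            numbers.append(numbers[-1])        # repeat last value
        else:
            numbers.append(1 - numbers[-1])    # switch
    return numbers

print(''.join(str(v) for v in dependent_01(80, 0.9)))
```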
a) Simulate flipping a coin \( N \) times. Draw \( N \) random integers 0 and 1 using numpy.random.randint.
b) Look at a subset \( N_1\leq N \) of the experiments in a) and compute the probability of getting a head (\( M_1/N_1 \), where \( M_1 \) is the number of heads in \( N_1 \) experiments). Choose \( N=1000 \) and print out the probability for \( N_1=10,100,500,1000 \). Generate just \( N \) numbers once in the program. How do you think the accuracy of the computed probability varies with \( N_1 \)? Is the output compatible with this expectation?
c) Now we want to study the probability of getting a head, \( p \), as a function of \( N_1 \), i.e., for \( N_1=1,\ldots,N \). A first try to compute the probability array for \( p \) is
import numpy as np
h = np.where(r <= 0.5, 1, 0)
p = np.zeros(N)
for i in range(N):
    p[i] = np.sum(h[:i+1])/float(i+1)
Implement these computations in a function.
d) An array with entries q[i] = np.sum(h[:i+1]) reflects a cumulative sum and can be efficiently generated by np.cumsum: q = np.cumsum(h). Thereafter we can compute p by q/I, where I[i]=i+1 and I can be computed by np.arange(1,N+1) or r_[1:N+1] (integers 1, 2, \( \ldots \), up to but not including N+1). Use cumsum to make an alternative vectorized version of the function in c).
e) Write a test function that verifies that the implementations in c) and d) give the same results. Use numpy.allclose to compare two arrays.
f) Make a function that applies the time module to measure the relative efficiency of the implementations in c) and d).
g) Plot p against I for the case where \( N=10000 \). Annotate the axes and the plot with relevant text.
Filename: flip_coin_prob.
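The cumsum idea in d) can be sketched as follows (the function name is made up for illustration):

```python
import numpy as np

def head_prob_cumsum(r):
    """Vectorized running probability of heads: element i is the
    fraction of heads among the first i+1 flips (head: r <= 0.5)."""
    h = np.where(r <= 0.5, 1, 0)
    q = np.cumsum(h)                  # q[i] = number of heads in h[:i+1]
    I = np.arange(1, len(r) + 1)      # 1, 2, ..., N
    return q / I.astype(float)

r = np.random.random(1000)
p = head_prob_cumsum(r)
print(p[-1])
```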
We consider so-called Bernoulli trials where the outcome of an experiment is either success, with probability \( p \), or failure, with probability \( 1-p \). The outcome of an experiment does not depend on the outcome of other experiments. The probability of getting success \( x \) times, and consequently failure \( n-x \) times, is given by $$ \begin{equation} B(x,n,p) = {n!\over x! (n-x)!} p^x(1-p)^{n-x}\tp \tag{13} \end{equation} $$
Make a general function simulate_binomial(p, n, x) for running \( n \) experiments, where each experiment has two outcomes, with probabilities \( p \) and \( 1-p \). The \( n \) experiments constitute a success if the outcome with probability \( p \) occurs exactly \( x \) times. The simulate_binomial function must repeat the \( n \) experiments \( N \) times. If \( M \) is the number of successes in the \( N \) experiments, the probability estimate is \( M/N \). Let the function return this probability estimate together with the error (the exact result is (13)). Simulate the three cases in ref{sec:input:ex15} using this function.
Filename: simulate_binomial.
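A possible sketch, with the exact formula (13) used to compute the error (the default value of N is chosen here only to make the demo self-contained):

```python
import random
from math import factorial

def binomial(x, n, p):
    """Exact probability (13) of x successes in n Bernoulli trials."""
    ncx = factorial(n) // (factorial(x) * factorial(n - x))
    return ncx * p**x * (1 - p)**(n - x)

def simulate_binomial(p, n, x, N=10**5):
    """Estimate the probability of exactly x successes in n trials;
    return the estimate and its error relative to the exact value."""
    M = 0
    for _ in range(N):
        successes = sum(1 for _ in range(n) if random.random() <= p)
        if successes == x:
            M += 1
    estimate = M / float(N)
    return estimate, estimate - binomial(x, n, p)

est, err = simulate_binomial(p=0.5, n=4, x=2)
print('estimate %.4f, error %.1e' % (est, err))
```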
Make a program for simulating the development of a poker (or simplified poker) game among \( n \) players. Use ideas from the section Example: Drawing cards from a deck.
Filename: poker.
The simulation model in the section Example: Policies for limiting population growth predicts the number of individuals from generation to generation. Make a simulation of the one son policy with 10 generations, a male portion of 0.51 among newborn babies, set the fertility to 0.92, and assume that a fraction 0.06 of the population will break the law and want 6 children in a family. These parameters imply a significant growth of the population. See if you can find a factor \( r \) such that the number of individuals in generation \( n \) fulfills the difference equation $$ \begin{equation*} x_n = (1+r)x_{n-1}\tp\end{equation*} $$
Compute \( r \) for two consecutive generations \( x_{n-1} \) and \( x_n \) (\( r = x_n/x_{n-1} -1 \)) and see if \( r \) is approximately constant as \( n \) increases.
Filename: estimate_growth.
In the game from the section Guessing a number it is smart to use the feedback from the program to track an interval \( [p,q] \) that must contain the secret number. Start with \( p=1 \) and \( q=100 \). If the user guesses at some number \( n \), update \( p \) to \( n+1 \) if \( n \) is less than the secret number (no need to care about numbers smaller than \( n+1 \)), or update \( q \) to \( n-1 \) if \( n \) is larger than the secret number (no need to care about numbers larger than \( n-1 \)).
Are there any smart strategies to pick a new guess \( s\in [p,q] \)?
To answer this question, investigate two possible
strategies: \( s \) as the midpoint in the interval \( [p, q] \),
or \( s \) as a uniformly
distributed random integer in \( [p, q] \).
Make a program that implements both strategies, i.e., the player is not
prompted for a guess but the computer computes the guess based on the
chosen strategy. Let the program run a large
number of games and see if one of the strategies can be considered as
superior in the long run.
Filename: strategies4guess.
Vectorize the simulation program from Exercise 8: Decide if a dice game is fair with the aid of the module numpy.random and the numpy.sum function.
Filename: sum9_4dice_vec.
Use the method in the section Area computing by throwing random points to compute
\( \pi \) by computing the area of a circle.
Choose \( G \) as the circle with its center at the origin and with unit radius,
and choose \( B \) as the rectangle \( [-1,1]\times [-1,1] \).
A point \( (x,y) \) lies within \( G \) if \( x^2+y^2 < 1 \).
Compare the approximate \( \pi \) with math.pi.
Filename: MC_pi.
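A loop-based sketch of the method (a vectorized variant with numpy arrays is a natural refinement):

```python
import random, math

def estimate_pi(N):
    """Monte Carlo estimate of pi: the fraction of N random points in
    B = [-1,1]x[-1,1] that fall inside the unit circle G, times the
    area 4 of B."""
    M = 0
    for _ in range(N):
        x = random.uniform(-1, 1)
        y = random.uniform(-1, 1)
        if x*x + y*y < 1:
            M += 1
    return 4.0 * M / N

approx = estimate_pi(10**6)
print('approx: %.5f, math.pi: %.5f' % (approx, math.pi))
```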
This exercise has the same purpose of computing \( \pi \) as in
Exercise 29: Compute \( \pi \) by a Monte Carlo method, but this time you should choose
\( G \) as a circle with center at \( (2,1) \) and radius 4.
Select an appropriate rectangle \( B \).
A point \( (x,y) \) lies within a circle with center at
\( (x_c,y_c) \) and with radius \( R \) if \( (x-x_c)^2 + (y-y_c)^2 < R^2 \).
Filename: MC_pi2.
a) Let \( x_0,\ldots,x_N \) be \( N+1 \) uniformly distributed random numbers between 0 and 1. Explain why the random sum \( S_N=(N+1)^{-1}\sum_{i=0}^N 2(1-x_i^2)^{-1/2} \) is an approximation to \( \pi \).
Interpret the sum as Monte Carlo integration and compute the corresponding integral by hand or with sympy.
b) Compute \( S_0,S_1,\ldots,S_N \) (using just one set of \( N+1 \) random numbers). Plot this sequence versus \( N \). Also plot the horizontal line corresponding to the value of \( \pi \). Choose \( N \) large, e.g., \( N=10^6 \).
Filename: MC_pi_plot.
Modify the walk1D.py program such
that the probability of going to the right is \( r \) and the probability
of going to the left is \( 1-r \) (draw numbers in \( [0,1) \) rather than
integers in \( \{1,2\} \)). Compute the average position of \( n_p \)
particles after 100 steps, where \( n_p \) is read from the command
line. Mathematically one can show that the average position approaches
\( rn_s - (1-r)n_s \) as \( n_p\rightarrow\infty \) (\( n_s \) is the number of
walks). Write out this exact
result together with the computed mean position with a finite number
of particles.
Filename: walk1D_drift.
Set np=1 in the walk1Dv.py program and modify the program to measure how many steps it takes for one particle to reach a given point \( x=x_p \). Give \( x_p \) on the command line. Report results for \( x_p=5, 50, 5000, 50000 \).
Filename: walk1Dv_hit_point.
A man plays a game where the probability of winning is \( p \) and that of losing is consequently \( 1-p \). When winning he earns 1 euro and when losing he loses 1 euro. Let \( x_i \) be the man's fortune after playing this game \( i \) times. The starting fortune is \( x_0 \). We assume that the man gets a necessary loan if \( x_i < 0 \) such that the gaming can continue. The target is a fortune \( F \), meaning that the playing stops when \( x=F \) is reached.
a) Explain why \( x_i \) is a 1D random walk.
b) Modify one of the 1D random walk programs to simulate the average number of games it takes to reach the target fortune \( x=F \). This average must be computed by running a large number of random walks that start at \( x_0 \) and reach \( F \). Use \( x_0=10 \), \( F=100 \), and \( p=0.49 \) as an example.
c) Suppose the average number of games to reach \( x=F \) is proportional to \( (F-x_0)^r \), where \( r \) is some exponent. Try to find \( r \) by experimenting with the program. The \( r \) value indicates how difficult it is to make a substantial fortune by playing this game. Note that the expected earning is negative when \( p < 0.5 \), but there is still a small probability for hitting \( x=F \).
Filename: game_as_walk1D.
The motion of single particles can often be described as random walks. On a water surface, 1000 grains of pollen are placed in a single point. The movement of the pollen grains can be modeled by a random walk model, where for each second each grain will move a random distance, along a two-dimensional vector, whose two components are independently normally distributed with expectation 0 mm and standard deviation 0.05 mm.
a) Make a function that implements this kind of 2D random walk. Return an array with the position of each grain for each step.
b) Make a movie that shows the position of the pollen grains from 0 to 100 seconds.
c) Make a plot of the mean distance from the origin versus time. What do you see?
Filename: pollen.
The purpose of this exercise is to reimplement the walk2D.py program from the section Basic implementation with the aid of classes.
a) Make a class Particle with the coordinates \( (x,y) \) and the time step number of a particle as data attributes. A method move moves the particle in one of the four directions and updates the \( (x,y) \) coordinates. Another class, Particles, holds a list of Particle objects and a plotstep parameter (as in walk2D.py). A method move moves all the particles one step, a method plot can make a plot of all particles, while a method moves performs a loop over time steps and calls move and plot in each step.
b) Equip the Particle and Particles classes with print functionality such that one can print out all particles in a nice way by saying print p (for a Particles instance p) or print self (inside a method). In __str__, apply the pformat function from the pprint module to the list of particles, and make sure that __repr__ just reuses __str__ in both classes so the output looks nice.
c) Make a test function that compares the first three positions of four particles with the corresponding results computed by the walk2D.py program. The seed of the random number generator must of course be fixed identically in the two programs.
d) Organize the complete code as a module such that the classes Particle and Particles can be reused in other programs. The test block should read the number of particles from the command line and perform a simulation.
e) Compare the efficiency of the class version against the vectorized version in walk2Dv.py, using the techniques in the document Evaluating the efficiency of Python programs [6].
f) The program developed above cannot be vectorized as long as we base the implementation on class Particle. However, if we remove that class and focus on class Particles, the latter can employ arrays for holding the positions of all particles and vectorized updates of these positions in the moves method. Use ideas from the walk2Dv.py program to make a new class Particles_vec which vectorizes Particles.
g) Verify the code against the walk2Dv.py program as explained in c). Automate the verification in a test function.
h) Write a Python function that measures the computational efficiency of the vectorized class Particles_vec and the scalar class Particles.
Filename: walk2D_class.
Modify the walk2D.py or walk2Dc.py programs from Exercise 36: Make classes for 2D random walk so that the walkers cannot walk outside a rectangular area \( A = [x_L,x_H]\times [y_L,y_H] \). Do not move the particle if its new position is outside \( A \).
Filename: walk2D_barrier.
Modify the walk2Dv.py program so that the walkers cannot walk outside a rectangular area \( A = [x_L,x_H]\times [y_L,y_H] \). First perform the moves in one direction. Then test if new positions are outside \( A \). Such a test returns a boolean array that can be used as an index in the position arrays to pick out the particles that have moved outside \( A \) and move them back to the relevant boundary of \( A \).
Filename: walk2Dv_barrier.
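The boolean-indexing technique can be illustrated in isolation; the position array x and wall coordinate xH below are made-up demo data, not taken from walk2Dv.py:

```python
import numpy as np

# Sketch of the boolean-indexing technique for a wall at x = xH:
# after a move, particles with x > xH are put back on the boundary.
np.random.seed(10)
xH = 1.0
x = np.random.uniform(0.9, 1.1, size=8)   # positions after a move
outside = x > xH                          # boolean index array
x[outside] = xH                           # move offenders back to the wall
print(x.max())
```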
Suppose we have a box with a wall dividing the box into two equally sized parts. In one part we have a gas where the molecules are uniformly distributed in a random fashion. At \( t=0 \) we remove the wall. The gas molecules will now move around and eventually fill the whole box.
This physical process can be simulated by a 2D random walk inside
a fixed area \( A \) as introduced in Exercise 37: 2D random walk with walls; scalar version
and Exercise 38: 2D random walk with walls; vectorized version (in reality the motion is three-dimensional,
but we only simulate the two-dimensional part of it since we already
have programs for doing this).
Use the program from either
Exercise 37: 2D random walk with walls; scalar version
or Exercise 38: 2D random walk with walls; vectorized version to simulate the process for \( A=[0,1]\times[0,1] \).
Initially, place 10000 particles at uniformly distributed random positions in
\( [0,1/2]\times [0,1] \).
Then start the random walk and visualize what happens.
Simulate for a long time and make a hardcopy of the animation
(an animated GIF file, for instance).
Is the end result what you would expect?
Filename: disorder1.
Molecules tend to move randomly because of collisions and forces between molecules. We do not model collisions between particles in the random walk, but the nature of this walk, with random movements, simulates the effect of collisions. Therefore, the random walk can be used to model molecular motion in many simple cases. In particular, the random walk can be used to investigate how a quite ordered system, where one gas fills one half of a box, evolves through time to a more disordered system.
Solve Exercise 39: Simulate mixing of gas molecules when the wall dividing the
box is not completely removed, but instead has a small
hole.
Filename: disorder2.
You are presented \( n \) glasses of beer, each containing a different brand. You are informed that there are \( m\geq n \) possible brands in total, and the names of all brands are given. For each glass, you can pay \( p \) euros to taste the beer, and if you guess the right brand, you get \( q\geq p \) euros back. Suppose you have done this before and experienced that you typically manage to guess the right brand \( T \) times out of 100, so that your probability of guessing the right brand is \( b=T/100 \).
Make a function simulate(m, n, p, q, b) for simulating the beer tasting process. Let the function return the amount of money earned and how many correct guesses (\( \leq n \)) you made. Call simulate a large number of times and compute the average earnings and the probability of getting full score in the case \( m=n=4 \), \( p=3 \), \( q=6 \), and \( b=1/m \) (i.e., four glasses with four brands, completely random guessing, and a payback of twice as much as the cost). How much more can you earn from this game if your ability to guess the right brand is better, say \( b=1/2 \)?
Filename: simulate_beer_tasting.
A common mathematical model for the evolution of stock prices can be formulated as a difference equation $$ \begin{equation} x_n = x_{n-1} + \Delta t \mu x_{n-1} + \sigma x_{n-1}\sqrt{\Delta t}r_{n-1}, \tag{14} \end{equation} $$ where \( x_n \) is the stock price at time \( t_n \), \( \Delta t \) is the time interval between two time levels (\( \Delta t = t_{n}-t_{n-1} \)), \( \mu \) is the growth rate of the stock price, \( \sigma \) is the volatility of the stock price, and \( r_0,\ldots,r_{n-1} \) are normally distributed random numbers with mean zero and unit standard deviation. An initial stock price \( x_0 \) must be prescribed together with the input data \( \mu \), \( \sigma \), and \( \Delta t \).
We remark that (14) is a Forward Euler discretization of a stochastic differential equation for a continuous price function \( x(t) \): $$ \begin{equation*} \frac{dx}{dt} = \mu x + \sigma N(t),\end{equation*} $$ where \( N(t) \) is a so-called white noise random time series signal. Such equations play a central role in modeling of stock prices.
Make \( R \) realizations of (14) for \( n=0,\ldots,N \) for \( N=5000 \) steps over a time period of \( T=180 \) days with a step size \( \Delta t = T/N \).
Filename: stock_prices.
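A sketch of how (14) can be advanced for all R realizations at once with numpy (the values of mu and sigma in the demo call are illustrative placeholders; the exercise does not prescribe them):

```python
import numpy as np

def stock_prices(x0, mu, sigma, T, N, R):
    """Simulate R realizations of the stock price model (14):
    x_n = x_{n-1} + dt*mu*x_{n-1} + sigma*x_{n-1}*sqrt(dt)*r_{n-1}."""
    dt = float(T)/N
    x = np.zeros((R, N+1))
    x[:, 0] = x0
    for n in range(1, N+1):
        r = np.random.normal(0, 1, size=R)   # one normal draw per realization
        x[:, n] = x[:, n-1]*(1 + dt*mu + sigma*np.sqrt(dt)*r)
    return x

paths = stock_prices(x0=100, mu=0.001, sigma=0.01, T=180, N=5000, R=5)
print(paths.shape)
```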
In this exercise we are going to consider the pricing of so-called Asian options. An Asian option is a financial contract where the owner earns money when certain market conditions are satisfied.
The contract is specified by a strike price \( K \) and a maturity time \( T \). It is written on the average price of the underlying stock, and if this average is bigger than the strike \( K \), the owner of the option will earn the difference. If, on the other hand, the average becomes less, the owner receives nothing, and the option matures in the value zero. The average is calculated from the last trading price of the stock for each day.
From the theory of options in finance, the price of the Asian option will be the expected present value of the payoff. We assume the stock price dynamics given as $$ \begin{equation} S(t+1)=(1+r)S(t)+\sigma S(t)\epsilon(t), \tag{15} \end{equation} $$ where \( r \) is the interest rate, and \( \sigma \) is the volatility of the stock price. The time \( t \) is supposed to be measured in days, \( t=0,1,2,\ldots \), while \( \epsilon(t) \) are independent identically distributed normal random variables with mean zero and unit standard deviation. To find the option price, we must calculate the expectation $$ \begin{equation} p=(1+r)^{-T}\hbox{E}\left[\max\left(\frac1T\sum_{t=1}^{T}S(t)-K,0\right)\right] \tp \tag{16} \end{equation} $$ The price is thus given as the expected discounted payoff. We will use Monte Carlo simulations to estimate the expectation. Typically, \( r \) and \( \sigma \) can be set to \( r=0.0002 \) and \( \sigma=0.015 \). Assume further \( S(0)=100 \).
a) Make a function that simulates a path of \( S(t) \), that is, the function computes \( S(t) \) for \( t=1,\ldots,T \) for a given \( T \) based on the recursive definition in (15). The function should return the path as an array.
b) Create a function that finds the average of \( S(t) \) from \( t=1 \) to \( t=T \). Make another function that calculates the price of the Asian option based on \( N \) simulated averages. You may choose \( T=100 \) days and \( K=102 \).
c) Plot the price \( p \) as a function of \( N \). You may start with \( N=1000 \).
d) Plot the error in the price estimation as a function \( N \) (assume that the \( p \) value corresponding to the largest \( N \) value is the "right" price). Try to fit a curve of the form \( c/\sqrt{N} \) for some \( c \) to this error plot. The purpose is to show that the error is reduced as \( 1/\sqrt{N} \).
Filename: option_price.
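A minimal sketch of parts a) and b), using the parameter values quoted in the exercise (the function names are our own):

```python
import numpy as np

def simulate_path(T=100, S0=100, r=0.0002, sigma=0.015):
    """Simulate S(t) for t = 1,...,T by the recursion (15)."""
    S = np.zeros(T + 1)
    S[0] = S0
    eps = np.random.normal(size=T)            # i.i.d. N(0,1) shocks
    for t in range(T):
        S[t + 1] = (1 + r)*S[t] + sigma*S[t]*eps[t]
    return S

def option_price(N=1000, T=100, K=102, r=0.0002, sigma=0.015, S0=100):
    """Monte Carlo estimate of (16): the discounted expected payoff."""
    payoffs = np.zeros(N)
    for i in range(N):
        S = simulate_path(T, S0, r, sigma)
        avg = S[1:].mean()                    # average of S(1),...,S(T)
        payoffs[i] = max(avg - K, 0)
    return (1 + r)**(-T)*payoffs.mean()
```

For parts c) and d), `option_price` can be called for a range of \( N \) values and the estimates plotted against \( N \).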
If you wonder where the values for \( r \) and \( \sigma \) come from, here is the explanation. A reasonable level for the yearly interest rate is around 5 percent, which corresponds to a daily rate of \( 0.05/250=0.0002 \). The number 250 is chosen because a stock exchange is open for trading about this many days per year. The value for \( \sigma \) is the volatility of the stock price, corresponding to the standard deviation of the daily returns of the stock, defined as \( (S(t+1)-S(t))/S(t) \). "Normally", the volatility is around 1.5 percent a day. Finally, there are theoretical reasons why we assume that the stock price dynamics is driven by \( r \), meaning that we consider the risk-neutral dynamics of the stock price when pricing options. There is an exciting theory explaining the appearance of \( r \) in the dynamics of the stock price. If we want to simulate a stock price dynamics mimicking what we see in the market, \( r \) in (15) must be replaced by \( \mu \), the expected return of the stock. Usually, \( \mu \) is higher than \( r \).
In a laboratory experiment, waves are generated through the impact of a model slide into a wave tank. (The intention of the experiment is to model a future tsunami event in a fjord, generated by loose rocks that fall into the fjord.) At a certain location, the elevation of the surface, denoted by \( \eta \), is measured at discrete points in time using an ultrasound wave gauge. The result is a time series of vertical positions of the water surface elevation in meters: \( \eta (t_0), \eta(t_1), \eta(t_2),\ldots,\eta(t_n) \). There are 300 observations per second, meaning that the time difference between two neighboring measurement values \( \eta(t_i) \) and \( \eta (t_{i+1}) \) is \( h=1/300 \) seconds.
a) Read the \( \eta \) values in the file gauge.dat into an array eta. Read \( h \) from the command line.
b) Plot eta versus the time values.
c) Compute the velocity \( v \) of the surface by the formula $$ v_i \approx (\eta_{i+1}-\eta_{i-1})/(2h), \quad i=1,\ldots,n-1\tp$$ Plot \( v \) versus time values in a separate plot.
d) Compute the acceleration \( a \) of the surface by the formula $$ a_i \approx (\eta_{i+1}-2\eta_i + \eta_{i-1})/h^2,\quad i=1,\ldots,n-1. $$ Plot \( a \) versus the time values in a separate plot.
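Both difference formulas in parts c) and d) vectorize naturally with NumPy slicing; a sketch, with the array eta and spacing h as in parts a) and b):

```python
import numpy as np

def velocity(eta, h):
    """Centered difference: v_i ~ (eta[i+1] - eta[i-1])/(2h), i=1,...,n-1."""
    return (eta[2:] - eta[:-2])/(2.0*h)

def acceleration(eta, h):
    """Second difference: a_i ~ (eta[i+1] - 2*eta[i] + eta[i-1])/h**2."""
    return (eta[2:] - 2*eta[1:-1] + eta[:-2])/h**2
```

Both functions return arrays of length \( n-1 \), corresponding to the inner points \( i=1,\ldots,n-1 \), so they should be plotted against the inner time values.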
e) If we have a noisy signal \( \eta_i \), where \( i=0,\ldots,n \) counts time levels, the noise can be reduced by computing a new signal where the value at a point is a weighted average of the values at that point and the neighboring points on each side. More precisely, given the signal \( \eta_i \), \( i=0,\ldots,n \), we compute a filtered (averaged) signal with values \( \eta^{(1)}_i \) by the formula
$$
\begin{equation}
\eta^{(1)}_i = \frac{1}{4}(\eta_{i+1} + 2\eta_i + \eta_{i-1}),\quad
i=1,\ldots,n-1,\ \eta^{(1)}_0=\eta_0,\ \eta^{(1)}_n=\eta_n\tp
\tag{17}
\end{equation}
$$
Make a function filter that takes the \( \eta_i \) values in an array eta as input and returns the filtered \( \eta^{(1)}_i \) values in an array.
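A vectorized sketch of such a function (note that the exercise's chosen name filter shadows Python's built-in filter):

```python
import numpy as np

def filter(eta):
    """Return the filtered signal eta^(1) defined by (17)."""
    eta1 = eta.copy()                 # end values are kept unchanged
    eta1[1:-1] = 0.25*(eta[2:] + 2*eta[1:-1] + eta[:-2])
    return eta1
```

Applying the function \( k \) times, as in part f), is just a loop calling filter repeatedly on its own output.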
f) Let \( \eta^{(k)}_i \) be the signal arising from applying the filter function \( k \) times to the same signal. Make a plot with curves for the original \( \eta_i \) and the filtered \( \eta^{(k)}_i \) values for \( k=1, 10, 100 \). Make similar plots for the velocity and acceleration, computed from both the original, measured \( \eta \) data and the filtered data. Discuss the results.
Filename: labstunami.
The purpose of this exercise is to look into numerical differentiation of time series signals that contain measurement errors. This insight might be helpful when analyzing the noise in real data from a laboratory experiment in Exercise 44: Differentiate noise measurements.
a) Compute a signal
$$
\begin{equation*} \bar\eta_i = A\sin ({2\pi\over T} t_i),\quad t_i=i{T\over 40},\ i=0,\ldots,200\tp\end{equation*}
$$
Display \( \bar\eta_i \) versus time \( t_i \) in a plot. Choose \( A=1 \) and \( T=2\pi \). Store the \( \bar\eta \) values in an array etabar.
b) Compute a signal with random noise \( E_i \),
$$
\begin{equation*} \eta_i = \bar\eta_i + E_i,\end{equation*}
$$
where \( E_i \) is drawn from the normal distribution with mean zero and standard deviation \( \sigma = 0.04A \). Plot this \( \eta_i \) signal as circles in the same plot as \( \bar\eta_i \). Store the \( E_i \) values in an array E for later use.
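Parts a) and b) can be sketched as follows (the plotting calls are omitted):

```python
import numpy as np

A = 1.0
T = 2*np.pi
n = 200
i = np.arange(n + 1)
t = i*T/40.0                                 # t_i = i*T/40, i = 0,...,200
etabar = A*np.sin(2*np.pi/T*t)               # smooth signal
sigma = 0.04*A
E = np.random.normal(0, sigma, size=n + 1)   # noise, stored for later use
eta = etabar + E                             # noisy signal
```

With matplotlib, part b)'s plot would be e.g. `plot(t, etabar)` followed by `plot(t, eta, 'o')`.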
c) Compute the first derivative of \( \bar\eta_i \) by the formula
$$
\begin{equation*} {\bar\eta_{i+1}-\bar\eta_{i-1}\over 2h}, \quad i=1,\ldots,n-1, \end{equation*}
$$
and store the values in an array detabar. Display the graph.
d) Compute the first derivative of the error term by the formula
$$
\begin{equation*} {E_{i+1}-E_{i-1}\over 2h}, \quad i=1,\ldots,n-1, \end{equation*}
$$
and store the values in an array dE. Calculate the mean and the standard deviation of dE.
e) Plot detabar and detabar + dE. Use the result of the standard deviation calculations to explain the qualitative features of the graphs.
f) The second derivative of a time signal \( \eta_i \) can be computed by
$$
\begin{equation*}
{\eta_{i+1}-2\eta_i + \eta_{i-1}\over h^2},
\quad i=1,\ldots,n-1\tp
\end{equation*}
$$
Use this formula on the etabar data and save the result in d2etabar. Also apply the formula to the E data and save the result in d2E. Plot d2etabar and d2etabar + d2E. Compute the standard deviation of d2E and compare with the standard deviations of dE and E. Discuss the plot in light of these standard deviations.
Filename: sine_noise.
We assume that the measured data can be modeled as a smooth time signal \( \bar\eta (t) \) plus a random variation \( E(t) \). Computing the velocity of \( \eta = \bar\eta + E \) results in a smooth velocity from the \( \bar\eta \) term and a noisy signal from the \( E \) term.
a) We can estimate the level of noise in the first derivative of \( E \) as follows. The random numbers \( E(t_i) \) are assumed to be independent and normally distributed with mean zero and standard deviation \( \sigma \). It can then be shown that $$ \begin{equation*} {E_{i+1}-E_{i-1}\over 2h}\end{equation*} $$ produces numbers that come from a normal distribution with mean zero and standard deviation \( 2^{-1/2}h^{-1}\sigma \). How much is the original noise, reflected by \( \sigma \), magnified when we use this numerical approximation of the velocity?
b) The fraction $$ \begin{equation*}{E_{i+1}-2E_i + E_{i-1}\over h^2}\end{equation*} $$ will also generate numbers from a normal distribution with mean zero, but this time with standard deviation \( \sqrt{6}\,h^{-2}\sigma \) (the weights \( 1,-2,1 \) give variance \( 6\sigma^2/h^4 \)). Find out how much the noise is magnified in the computed acceleration signal.
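The stated standard deviations can be checked empirically; the sketch below draws a long sequence of independent \( N(0,\sigma) \) numbers and compares the sample standard deviations of the two difference quotients with the formulas \( 2^{-1/2}h^{-1}\sigma \) and \( \sqrt{6}\,h^{-2}\sigma \) (the latter from the squared weights \( 1+4+1=6 \)):

```python
import numpy as np

h = 1/300.0
sigma = 1e-4
E = np.random.normal(0, sigma, size=10**6)
dE = (E[2:] - E[:-2])/(2*h)               # noise in the velocity
d2E = (E[2:] - 2*E[1:-1] + E[:-2])/h**2   # noise in the acceleration
print(dE.std(),  2**(-0.5)*sigma/h)       # the two numbers should be close
print(d2E.std(), np.sqrt(6)*sigma/h**2)   # and so should these
```

The sample values agree with the theoretical standard deviations to well within a percent for a sequence this long.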
c) The numbers in the gauge.dat file in Exercise 44: Differentiate noise measurements are given with 5 digits. This is no certain indication of the accuracy of the measurements, but as a test we may assume \( \sigma \) is of the order \( 10^{-4} \). Check if the visual results for the velocity and acceleration are consistent with the standard deviation of the noise in these signals as modeled above.
The functions transition and mutate_via_markov_chain from the section Random mutations of genes were written to be easy to read and understand. Upon closer inspection, we realize that the transition function constructs the interval_limits every time a random transition is to be computed, and we want to run a large number of transitions. By merging the two functions, pre-computing the interval limits for each from_base, and adding a loop over N mutations, one can reduce the computation of interval limits to a minimum. Perform such an efficiency enhancement. Measure the CPU time of this new function versus the mutate_via_markov_chain function for 1 million mutations.
Filename: markov_chain_mutation2.
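Since the original transition and mutate_via_markov_chain functions are not repeated here, the sketch below only illustrates the optimization pattern: the cumulative interval limits are pre-computed once per from_base, outside the mutation loop. The layout of markov_matrix (a dict mapping each base to a dict of transition probabilities summing to 1) is an assumption for this sketch:

```python
import random

def mutate_N(dna, N, markov_matrix):
    """Perform N random mutations of the DNA string dna.

    Pre-computes cumulative probability limits for each from_base
    once, instead of rebuilding them on every transition.
    """
    limits = {}
    for from_base, probs in markov_matrix.items():
        bases, cum, s = [], [], 0.0
        for to_base, p in probs.items():
            s += p
            bases.append(to_base)
            cum.append(s)
        limits[from_base] = (bases, cum)

    dna = list(dna)
    for _ in range(N):
        i = random.randrange(len(dna))          # pick a random position
        bases, cum = limits[dna[i]]
        r = random.random()
        for to_base, limit in zip(bases, cum):  # locate r's interval
            if r <= limit:
                dna[i] = to_base
                break
    return ''.join(dna)
```

The CPU time of this function versus the original can then be compared with, e.g., the `time.perf_counter` or `timeit` tools from the standard library.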