This chapter is taken from the book A Primer on Scientific Programming with Python by H. P. Langtangen, 5th edition, Springer, 2016.
Make a program that simulates flipping a coin \( N \) times. Print out tail or head for each flip and let the program count and print the number of heads. Use r = random.random() and define head as r <= 0.5, or draw an integer among \( \{0,1\} \) with r = random.randint(0,1) and define head when r is 0.
Filename: flip_coin.
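A minimal loop-based sketch of this exercise (the function name and the N=10 demo call are illustrative choices, not prescribed by the text):

```python
import random

def flip_coin(N):
    """Flip a coin N times, print head/tail for each flip,
    and return the number of heads."""
    heads = 0
    for _ in range(N):
        if random.random() <= 0.5:   # head
            print('head')
            heads += 1
        else:
            print('tail')
    return heads

print('Number of heads: %d' % flip_coin(10))
```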
What is the probability of getting a number between 0.5 and 0.6 when drawing uniformly distributed random numbers from the interval \( [0,1) \)? To answer this question empirically, let a program draw \( N \) such random numbers using Python's standard random module, count how many of them, \( M \), fall in the interval \( [0.5,0.6] \), and compute the probability as \( M/N \). Run the program with the four values \( N=10^{i} \) for \( i=1,2,3,6 \).
Filename: compute_prob.
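One possible loop-based solution sketch (the helper name estimate_prob is made up for illustration):

```python
import random

def estimate_prob(N):
    """Estimate P(0.5 <= r <= 0.6) by drawing N uniform numbers on [0, 1)."""
    M = sum(1 for _ in range(N) if 0.5 <= random.random() <= 0.6)
    return M / float(N)

for i in 1, 2, 3, 6:
    print('N=10^%d: probability %.6f' % (i, estimate_prob(10**i)))
```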
Suppose we have eight different colors. Make a program that chooses one of these colors at random and writes out the color. Use a list of color names and use the choice function in the random module to pick a list element.
Filename: choose_color.
Suppose there are 40 balls in a hat, of which 10 are red,
10 are blue, 10 are yellow, and 10 are purple.
What is the probability of getting two blue and two purple balls
when drawing 10 balls at random from the hat?
Filename: draw_10balls.
This exercise deals with four questions:
Filename: rolling_dice.
Make a program for estimating the probability of getting at least one die with six eyes when throwing \( n \) dice. Read \( n \) and the number of experiments from the command line. As a partial verification, compare the Monte Carlo simulation results to the exact answer 11/36 for \( n=2 \) and observe that the approximate probabilities approach the exact probability as the number of simulations grows.
Filename: one6_ndice.
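A possible Monte Carlo sketch (shown with hard-coded arguments instead of command-line reading, to keep the example self-contained):

```python
import random

def at_least_one_six(n, num_experiments):
    """Estimate the probability of at least one six among n dice."""
    hits = 0
    for _ in range(num_experiments):
        if any(random.randint(1, 6) == 6 for _ in range(n)):
            hits += 1
    return hits / float(num_experiments)

p = at_least_one_six(n=2, num_experiments=10**5)
print('estimated: %.4f, exact: %.4f' % (p, 11/36.0))
```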
Use the Deck.py module (see the section Example: Class implementation of a deck) and the same_rank and same_suit functions from the cards module (see the section Example: Drawing cards from a deck) to compute the following probabilities by Monte Carlo simulation:
Filename: card_hands.
Somebody suggests the following game. You pay 1 euro and are allowed
to throw four dice. If the sum of the eyes on the dice is less than
9, you get paid \( r \) euros back, otherwise you lose the 1 euro
investment. Assume \( r=10 \). Will you then in the long run win or lose
money by playing this game? Answer the question by making a program
that simulates the game. Read \( r \) and the number of experiments \( N \)
from the command line.
Filename: sum_4dice.
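A sketch of such a simulation (command-line reading of r and N is replaced by hard-coded demo values here):

```python
import random

def simulate_game(r, N):
    """Play the game N times: pay 1 euro, throw four dice,
    win r euros back if the sum of eyes is less than 9.
    Return the average net profit per game."""
    profit = 0.0
    for _ in range(N):
        s = sum(random.randint(1, 6) for _ in range(4))
        profit += r - 1 if s < 9 else -1
    return profit / N

print('average profit per game: %.3f euros' % simulate_game(10, 10**5))
```

A negative average profit indicates that the player loses money in the long run.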
It turns out that the game in Exercise 8: Decide if a dice game is fair is not fair, since you lose money in the long run. The purpose of this exercise is to adjust the winning award so that the game becomes fair, i.e., that you neither lose nor win money in the long run.
Make a Python function that computes the probability \( p \) of getting a sum less
than \( s \) when rolling \( n \) dice. Use the reasoning in
the section Example: Throwing dice to find
the award per game, \( r \), that makes the game fair. Run the
program from Exercise 8: Decide if a dice game is fair with this \( r \) on
the command line and verify that the game is now (approximately) fair.
Filename: sum_ndice_fair.
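The section Example: Throwing dice is referred to for the reasoning; as an alternative sketch, for small \( n \) the probability \( p \) can also be computed exactly by brute-force enumeration of all \( 6^n \) outcomes, and the fair award is then \( r=1/p \) (the expected payback \( pr \) must equal the 1 euro cost):

```python
import itertools

def prob_sum_less_than(s, n):
    """Exact probability that the sum of n dice is less than s,
    found by enumerating all 6**n outcomes (feasible for small n)."""
    outcomes = itertools.product(range(1, 7), repeat=n)
    favorable = sum(1 for dice in outcomes if sum(dice) < s)
    return favorable / float(6**n)

p = prob_sum_less_than(9, 4)
print('p = %g, fair award r = 1/p = %g euros' % (p, 1/p))
```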
We consider the Python function in Exercise 9: Adjust a game to make it fair for computing a probability \( p \) that the sum of the eyes on \( n \) dice is less than \( s \). The aim is to write a test function for verifying the computation of \( p \).
a) Find some combinations of \( n \) and \( s \) that must result in \( p=0 \) and \( p=1 \) and make the appropriate code in the test function.
b) Fix the seed of the random number generator and record the first eight random numbers to 16 digits. Set \( n=2 \), perform four experiments, and compute by hand what the probability estimate becomes (choose any appropriate \( s \)). Write the necessary code in the test function to compare this manually calculated result with what is produced by the function from Exercise 9: Adjust a game to make it fair.
Filename: test_sum_ndice.
Consider the game in the section Example: Throwing dice. A generalization is to think as follows: you throw one die until the number of eyes is less than or equal to the previous throw. Let \( m \) be the number of throws in a game.
a) Use Monte Carlo simulation to compute the probability of getting \( m=2,3,4,\ldots \).
For \( m\geq 6 \) the throws must be exactly \( 1,2,3,4,5,6,6,6,\ldots \), and the probability of each is 1/6, giving the total probability \( 6^{-m} \). Use \( N=10^6 \) experiments as this should suffice to estimate the probabilities for \( m\leq 5 \), and beyond that we have the analytical expression.
b) If you pay 1 euro to play this game, what is the fair amount to get paid when you win? Answer this question for each of the cases \( m=2,3,4,5 \).
Filename: incr_eyes.
Suggest a player strategy for the game in the section Rolling two dice. Remove the question in the player_guess function in the file ndice2.py, and implement the chosen strategy instead. Let the program play a large number of games, and record the number of times the computer wins. Which strategy is best in the long run: the computer's or yours?
Filename: simulate_strategies1.
Extend the program from Exercise 12: Compare two playing strategies such that
the computer and the player can use a different number of dice.
Let the computer choose a random number of dice between 2 and 20.
Experiment to find out if there is a favorable number of dice for
the player.
Filename: simulate_strategies2.
An amusement park offers the following game. A hat contains 20 balls: 5 red, 5 yellow, 3 green, and 7 brown. At a cost of \( 2n \) euros you can draw \( 4 \leq n\leq 10 \) balls at random from the hat (without putting them back). Before you are allowed to look at the drawn balls, you must choose one of the following options:
Filename: draw_balls.
Throw two dice a large number of times in a program. Record the sum of the eyes each time and count how many times each of the possibilities for the sum (2, 3, \( \ldots \), 12) appears. Compute the corresponding probabilities and compare them with the exact values. (To find the exact probabilities, set up all the \( 6\times 6 \) possible outcomes of throwing two dice, and then count how many of them have a sum \( s \) for \( s=2,3,\ldots,12 \).)
Filename: freq_2dice.
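A compact sketch of the simulation and the comparison with exact values (using the fact that the number of the 36 outcomes giving sum \( s \) is \( 6-|s-7| \)):

```python
import random

N = 10**5
counts = {s: 0 for s in range(2, 13)}
for _ in range(N):
    counts[random.randint(1, 6) + random.randint(1, 6)] += 1

for s in range(2, 13):
    exact = (6 - abs(s - 7)) / 36.0   # 1/36, 2/36, ..., 6/36, ..., 1/36
    print('sum %2d: estimated %.4f, exact %.4f'
          % (s, counts[s] / float(N), exact))
```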
Simulate flipping a coin \( N \) times and write out the number of tails. The code should be vectorized, i.e., there must be no loops in Python. Constructions like numpy.where(r<=0.5, 1, 0) combined with numpy.sum, or r[r<=0.5].size, are useful, where r is an array of random numbers between 0 and 1.
Filename: flip_coin_vec.
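A vectorized sketch along these lines (counting tails as r > 0.5, consistent with heads being r <= 0.5):

```python
import numpy as np

def flip_coin_vec(N):
    """Vectorized coin flipping: return the number of tails
    among N flips (tail: r > 0.5, head: r <= 0.5)."""
    r = np.random.random(N)
    return np.where(r > 0.5, 1, 0).sum()

print('number of tails:', flip_coin_vec(10**6))
```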
The purpose of this exercise is to speed up the code in Exercise 2: Compute a probability by vectorization. For an array r of uniformly distributed random numbers on \( [0,1) \), make use of r1 = r[r>0.5] and r1[r1<0.6]. An alternative is numpy.where combined with a compound boolean expression with numpy.logical_and(0.5<=r, r<=0.6). See the discussion of this topic in the document Array computing and curve plotting [4].
Filename: compute_prob_vec.
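A vectorized sketch using the logical_and alternative (the function name is illustrative):

```python
import numpy as np

def estimate_prob_vec(N):
    """Vectorized estimate of P(0.5 <= r <= 0.6), r uniform on [0, 1)."""
    r = np.random.random(N)
    M = np.sum(np.logical_and(0.5 <= r, r <= 0.6))
    return M / float(N)

print(estimate_prob_vec(10**6))
```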
Use Monte Carlo simulation to compute the probability of getting 6 eyes on all dice when rolling \( 7 \) dice.
You need a large number of experiments in this case because of the small probability (see the first paragraph of the section Computing probabilities), so a vectorized implementation may be important.
Filename: roll_7dice.
A democracy takes decisions based on majority votes. We shall investigate if this is a good idea or if a single person would produce better decisions.
We shall ask about pure facts, not opinions. This means that the question to be answered by a population has a definite "yes" or "no" answer. For example, "Can Python lists contain tuples as elements?" The correct answer is "yes". Asking a population such a question and relying on the majority of votes, is a reliable procedure if the competence level in the population is sufficiently high.
a) Assume that the competence level in a population can be modeled by a probability \( p \) such that if you ask \( N \) people a question, \( M=pN \) of them will give the correct answer (as \( N\rightarrow\infty \)). Here we make the questionable assumption of a homogeneous population, in the sense that \( p \) is the same for every individual.
Make a function homogeneous(p, N) for simulating whether the majority vote of a population of \( N \) individuals arrives at the right answer, if the probability of answering correctly is \( p \) for an individual. Make another function homogeneous_ex() that runs 10 tests of the specific case \( N=5 \) (as when relying on the majority of a student group) and 10 tests when asking a whole city of \( N=1,000,000 \) voters. Try \( p=0.49 \), \( p=0.51 \), and \( p=0.8 \). Are the results as you would expect from intuition?
Asking one individual is like flipping a biased coin that has probability \( p \) of giving head (right answer) and probability \( 1-p \) of giving tail (wrong answer).
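A possible sketch of the two functions, here using numpy for the coin flips (a plain loop with random.random() works equally well):

```python
import numpy as np

def homogeneous(p, N):
    """Return True if the majority of N individuals, each answering
    correctly with probability p, produces the right answer."""
    answers = np.random.random(N) <= p   # True means a correct answer
    return np.sum(answers) > N / 2.0

def homogeneous_ex():
    for p in 0.49, 0.51, 0.8:
        for N in 5, 1000000:
            votes = [homogeneous(p, N) for _ in range(10)]
            print('p=%.2f, N=%7d: majority right in %d of 10 tests'
                  % (p, N, votes.count(True)))

homogeneous_ex()
```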
b) The problem in a) can be exactly solved, since each question is a Bernoulli trial with success probability \( p \), and the probability of a correct majority vote is the same as the probability of getting \( N/2 \) or more successes in \( N \) trials. For large \( N \), the probability of \( M \) successes in \( N \) trials can be well approximated by a normal (Gaussian) density function: $$ g(M) = (\sqrt{2\pi Np(1-p)})^{-1}\exp{(-\frac{1}{2}(M-Np)^2/(Np(1-p)))}\tp $$ The majority vote is correct when \( M>N/2 \), and the probability of this event is given by \( 1-\Phi(N/2) \), where \( \Phi \) is the cumulative normal distribution with mean \( Np \) and variance \( Np(1-p) \).
Plot the probability of being right against \( p \).
Say 5 questions are of importance. What competence level \( p \) does a king need to get all 5 right, compared to the level the population needs to get all 5 right by majority vote?
c) We shall now simulate voting in a heterogeneous population. The probability that an individual no. \( i \) answers correctly is \( p_i \), where \( p_i \) is drawn from a normal (Gaussian) probability density with mean \( p \) and standard deviation \( s \). The competence level varies between individuals, with \( s \) expressing the spreading of knowledge and \( p \) the mean competence level.
Make a function heterogeneous(p, N, s) that returns whether the majority vote is right or wrong in the heterogeneous case. Rerun the examples from a) with \( s=0.2 \).
d) With a somewhat large variation in the population, i.e., \( s \) somewhat large, there will be some individuals that always provide wrong or right answers according to this model. To learn about reasonable values of \( s \), we can investigate whether unreasonably large fractions of the population are always right or always wrong.
The probability of always being wrong is the probability of \( p_i < 0 \).
This is given by \( \Phi(-p/s) \), where \( \Phi \) is the cumulative
normal distribution with mean zero and unit standard deviation.
It can be reached in Python as scipy.stats.norm.cdf. The probability of always being right is the probability of \( p_i>1 \), which can be computed as \( 1-\Phi((1-p)/s) \). Plot curves of the probability of always being right and always wrong against \( s\in [0.1,0.6] \). Perform this curve plotting in a function extremes(p).
Filename: democracy.
Simple random number generators are based on simulating difference
equations. Here is a typical set of two equations:
$$
\begin{align}
x_n &= (ax_{n-1} + c) \hbox{ mod } m,
\tag{11}\\
y_n &= x_n/m,
\tag{12}
\end{align}
$$
for \( n=1,2,\ldots \).
A seed \( x_0 \) must be given to start the sequence.
The numbers \( y_1,y_2,\ldots \) represent the random numbers and
\( x_0,x_1,\ldots \) are "help" numbers.
Although \( y_n \) is
completely deterministic from (11)-(12), the
sequence \( y_n \) appears random.
The mathematical expression \( p\hbox{ mod }q \) is coded as p % q in Python. Use \( a=8121 \), \( c=28411 \), and \( m=134456 \). Solve the system (11)-(12) in a function to generate \( N \) random numbers. Make a histogram to examine the distribution of the numbers (the \( y_n \) numbers are uniformly distributed if the histogram is approximately flat).
Filename: diffeq_random.
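The scheme (11)-(12) translates directly into code; a sketch (without the histogram part):

```python
def lcg(N, x0=1, a=8121, c=28411, m=134456):
    """Generate N pseudo-random numbers in [0, 1) by the linear
    congruential scheme x_n = (a*x_{n-1} + c) mod m, y_n = x_n/m."""
    x = x0
    y = []
    for _ in range(N):
        x = (a*x + c) % m
        y.append(x / float(m))
    return y

numbers = lcg(1000)
print(numbers[:5])
```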
Consider the example about drawing colored balls from a hat in the section Example: Drawing balls from a hat. It could be handy to have an object that acts as a hat:
hat = Hat(red=3, blue=4, green=6)
balls = hat.draw(3)
if balls.count('red') == 1 and balls.count('green') == 2:
    ...
a) Write such a class Hat with the shown functionality. The flexible syntax in the constructor, where the colors of the balls and the number of balls of each color are freely specified, requires use of a dictionary (**kwargs) for handling a variable number of keyword arguments, see the document Variable number of function arguments in Python [5]. You can borrow useful code from the balls_in_hat.py program and ideas from the section Example: Class implementation of a deck.
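A possible minimal sketch of class Hat, using **kwargs as described (storing the balls as a flat list of color names is one design choice among several):

```python
import random

class Hat:
    """A hat with colored balls; colors and counts given as keyword args."""
    def __init__(self, **balls):
        self.balls = []
        for color, count in balls.items():
            self.balls.extend([color]*count)

    def draw(self, n):
        """Draw n balls at random, without replacement."""
        drawn = random.sample(self.balls, n)
        for ball in drawn:
            self.balls.remove(ball)
        return drawn

hat = Hat(red=3, blue=4, green=6)
print(hat.draw(3))
```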
b) Apply class Hat to compute the probability of getting 2 brown and 2 blue balls when drawing 6 balls from a hat with 6 blue, 8 brown, and 3 green balls.
Filename: Hat.
a) Generate a sequence of \( N \) independent random variables with values 0 or 1 and print out this sequence without space between the numbers (i.e., as 001011010110111010).
b) The purpose now is to generate random zeros and ones that are dependent. If the last generated number was 0, the probability of generating a new 0 is \( p \) and a new 1 is \( 1-p \). Conversely, if the last generated was 1, the probability of generating a new 1 is \( p \) and a new 0 is \( 1-p \). Since the new value depends on the last one, we say the variables are dependent. Implement this algorithm in a function returning an array of \( N \) zeros and ones. Print out this array in the condensed format as described above.
c) Choose \( N=80 \) and try the probabilities \( p=0.5 \), \( p=0.8 \) and \( p=0.9 \). Can you by visual inspection of the output characterize the differences between sequences of independent and dependent random variables?
Filename: dependent_random_numbers.
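A sketch of the algorithm in b) (the function name is made up for illustration):

```python
import random

def dependent_01(N, p):
    """Generate N dependent 0/1 numbers: repeat the previous value
    with probability p, switch with probability 1-p."""
    numbers = [random.randint(0, 1)]
    for _ in range(N - 1):
        if random.random() <= p:
            numbers.append(numbers[-1])        # repeat last value
        else:
            numbers.append(1 - numbers[-1])    # switch
    return numbers

print(''.join(str(v) for v in dependent_01(80, 0.9)))
```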
a) Simulate flipping a coin \( N \) times. Draw \( N \) random integers 0 and 1 using numpy.random.randint.
b) Look at a subset \( N_1\leq N \) of the experiments in a) and compute the probability of getting a head (\( M_1/N_1 \), where \( M_1 \) is the number of heads in \( N_1 \) experiments). Choose \( N=1000 \) and print out the probability for \( N_1=10,100,500,1000 \). Generate just \( N \) numbers once in the program. How do you think the accuracy of the computed probability varies with \( N_1 \)? Is the output compatible with this expectation?
c) Now we want to study the probability of getting a head, \( p \), as a function of \( N_1 \), i.e., for \( N_1=1,\ldots,N \). A first try to compute the probability array for \( p \) is
import numpy as np
h = np.where(r <= 0.5, 1, 0)
p = np.zeros(N)
for i in range(N):
    p[i] = np.sum(h[:i+1])/float(i+1)
Implement these computations in a function.
d) An array with entries q[i] = np.sum(h[:i+1]) reflects a cumulative sum and can be efficiently generated by np.cumsum: q = np.cumsum(h). Thereafter we can compute p by q/I, where I[i]=i+1 and I can be computed by np.arange(1,N+1) or r_[1:N+1] (integers 1, 2, \( \ldots \), up to but not including N+1). Use cumsum to make an alternative vectorized version of the function in c).
e) Write a test function that verifies that the implementations in c) and d) give the same results. Use numpy.allclose to compare two arrays.
f) Make a function that applies the time module to measure the relative efficiency of the implementations in c) and d).
g) Plot p against I for the case where \( N=10000 \). Annotate the axes and the plot with relevant text.
Filename: flip_coin_prob.
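The cumsum idea in d) can be sketched as follows (the function name is made up for illustration):

```python
import numpy as np

def head_prob_cumsum(r):
    """Vectorized running probability of heads: element i is the
    fraction of heads among the first i+1 flips (head: r <= 0.5)."""
    h = np.where(r <= 0.5, 1, 0)
    q = np.cumsum(h)                  # q[i] = number of heads in h[:i+1]
    I = np.arange(1, len(r) + 1)      # 1, 2, ..., N
    return q / I.astype(float)

r = np.random.random(1000)
p = head_prob_cumsum(r)
print(p[-1])
```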
We consider so-called Bernoulli trials where the outcome of an experiment is either success, with probability \( p \), or failure, with probability \( 1-p \). The outcome of an experiment does not depend on the outcome of other experiments. The probability of getting success \( x \) times, and consequently failure \( n-x \) times, is given by $$ \begin{equation} B(x,n,p) = {n!\over x! (n-x)!} p^x(1-p)^{n-x}\tp \tag{13} \end{equation} $$
Make a general function simulate_binomial(p, n, x) for running \( n \) experiments, where each experiment has two outcomes, with probabilities \( p \) and \( 1-p \). The \( n \) experiments constitute a success if the outcome with probability \( p \) occurs exactly \( x \) times. The simulate_binomial function must repeat the \( n \) experiments \( N \) times. If \( M \) is the number of successes in the \( N \) experiments, the probability estimate is \( M/N \). Let the function return this probability estimate together with the error (the exact result is (13)). Simulate the three cases in ref{sec:input:ex15} using this function.
Filename: simulate_binomial.
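A possible sketch, with the exact formula (13) used to compute the error (the default value of N is chosen here only to make the demo self-contained):

```python
import random
from math import factorial

def binomial(x, n, p):
    """Exact probability (13) of x successes in n Bernoulli trials."""
    ncx = factorial(n) // (factorial(x) * factorial(n - x))
    return ncx * p**x * (1 - p)**(n - x)

def simulate_binomial(p, n, x, N=10**5):
    """Estimate the probability of exactly x successes in n trials;
    return the estimate and its error relative to the exact value."""
    M = 0
    for _ in range(N):
        successes = sum(1 for _ in range(n) if random.random() <= p)
        if successes == x:
            M += 1
    estimate = M / float(N)
    return estimate, estimate - binomial(x, n, p)

est, err = simulate_binomial(p=0.5, n=4, x=2)
print('estimate %.4f, error %.1e' % (est, err))
```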
Make a program for simulating the development of a poker (or simplified poker) game among \( n \) players. Use ideas from the section Example: Drawing cards from a deck.
Filename: poker.
The simulation model in the section Example: Policies for limiting population growth predicts the number of individuals from generation to generation. Make a simulation of the one son policy with 10 generations, a male portion of 0.51 among newborn babies, set the fertility to 0.92, and assume that a fraction 0.06 of the population will break the law and want 6 children in a family. These parameters imply a significant growth of the population. See if you can find a factor \( r \) such that the number of individuals in generation \( n \) fulfills the difference equation $$ \begin{equation*} x_n = (1+r)x_{n-1}\tp\end{equation*} $$
Compute \( r \) for two consecutive generations \( x_{n-1} \) and \( x_n \) (\( r = x_n/x_{n-1} -1 \)) and see if \( r \) is approximately constant as \( n \) increases.
Filename: estimate_growth.
In the game from the section Guessing a number it is smart to use the feedback from the program to track an interval \( [p,q] \) that must contain the secret number. Start with \( p=1 \) and \( q=100 \). If the user guesses at some number \( n \), update \( p \) to \( n+1 \) if \( n \) is less than the secret number (no need to care about numbers smaller than \( n+1 \)), or update \( q \) to \( n-1 \) if \( n \) is larger than the secret number (no need to care about numbers larger than \( n-1 \)).
Are there any smart strategies to pick a new guess \( s\in [p,q] \)?
To answer this question, investigate two possible
strategies: \( s \) as the midpoint in the interval \( [p, q] \),
or \( s \) as a uniformly
distributed random integer in \( [p, q] \).
Make a program that implements both strategies, i.e., the player is not
prompted for a guess but the computer computes the guess based on the
chosen strategy. Let the program run a large
number of games and see if one of the strategies can be considered as
superior in the long run.
Filename: strategies4guess.
Vectorize the simulation program from Exercise 8: Decide if a dice game is fair with the aid of the module numpy.random and the numpy.sum function.
Filename: sum9_4dice_vec.
Use the method in the section Area computing by throwing random points to compute
\( \pi \) by computing the area of a circle.
Choose \( G \) as the circle with its center at the origin and with unit radius,
and choose \( B \) as the rectangle \( [-1,1]\times [-1,1] \).
A point \( (x,y) \) lies within \( G \) if \( x^2+y^2 < 1 \).
Compare the approximate \( \pi \) with math.pi.
Filename: MC_pi.
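A loop-based sketch of the method (a vectorized variant with numpy arrays is a natural refinement):

```python
import random, math

def estimate_pi(N):
    """Monte Carlo estimate of pi: the fraction of N random points in
    B = [-1,1]x[-1,1] that fall inside the unit circle G, times the
    area 4 of B."""
    M = 0
    for _ in range(N):
        x = random.uniform(-1, 1)
        y = random.uniform(-1, 1)
        if x*x + y*y < 1:
            M += 1
    return 4.0 * M / N

approx = estimate_pi(10**6)
print('approx: %.5f, math.pi: %.5f' % (approx, math.pi))
```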
This exercise has the same purpose of computing \( \pi \) as in
Exercise 29: Compute \( \pi \) by a Monte Carlo method, but this time you should choose
\( G \) as a circle with center at \( (2,1) \) and radius 4.
Select an appropriate rectangle \( B \).
A point \( (x,y) \) lies within a circle with center at
\( (x_c,y_c) \) and with radius \( R \) if \( (x-x_c)^2 + (y-y_c)^2 < R^2 \).
Filename: MC_pi2.
a) Let \( x_0,\ldots,x_N \) be \( N+1 \) uniformly distributed random numbers between 0 and 1. Explain why the random sum \( S_N=(N+1)^{-1}\sum_{i=0}^N 2(1-x_i^2)^{-1/2} \) is an approximation to \( \pi \).
Interpret the sum as Monte Carlo integration and compute the corresponding integral by hand or with sympy.
b) Compute \( S_0,S_1,\ldots,S_N \) (using just one set of \( N+1 \) random numbers). Plot this sequence versus \( N \). Also plot the horizontal line corresponding to the value of \( \pi \). Choose \( N \) large, e.g., \( N=10^6 \).
Filename: MC_pi_plot.
Modify the walk1D.py program such
that the probability of going to the right is \( r \) and the probability
of going to the left is \( 1-r \) (draw numbers in \( [0,1) \) rather than
integers in \( \{1,2\} \)). Compute the average position of \( n_p \)
particles after 100 steps, where \( n_p \) is read from the command
line. Mathematically one can show that the average position approaches
\( rn_s - (1-r)n_s \) as \( n_p\rightarrow\infty \) (\( n_s \) is the number of
walks). Write out this exact
result together with the computed mean position with a finite number
of particles.
Filename: walk1D_drift.
Set np=1 in the walk1Dv.py program and modify the program to measure how many steps it takes for one particle to reach a given point \( x=x_p \). Give \( x_p \) on the command line. Report results for \( x_p=5, 50, 5000, 50000 \).
Filename: walk1Dv_hit_point.
A man plays a game where the probability of winning is \( p \) and that of losing is consequently \( 1-p \). When winning he earns 1 euro and when losing he loses 1 euro. Let \( x_i \) be the man's fortune after playing this game \( i \) times. The starting fortune is \( x_0 \). We assume that the man gets a necessary loan if \( x_i < 0 \) such that the gaming can continue. The target is a fortune \( F \), meaning that the playing stops when \( x=F \) is reached.
a) Explain why \( x_i \) is a 1D random walk.
b) Modify one of the 1D random walk programs to simulate the average number of games it takes to reach the target fortune \( x=F \). This average must be computed by running a large number of random walks that start at \( x_0 \) and reach \( F \). Use \( x_0=10 \), \( F=100 \), and \( p=0.49 \) as an example.
c) Suppose the average number of games to reach \( x=F \) is proportional to \( (F-x_0)^r \), where \( r \) is some exponent. Try to find \( r \) by experimenting with the program. The \( r \) value indicates how difficult it is to make a substantial fortune by playing this game. Note that the expected earning is negative when \( p < 0.5 \), but there is still a small probability for hitting \( x=F \).
Filename: game_as_walk1D.
The motion of single particles can often be described as random walks. On a water surface, 1000 grains of pollen are placed in a single point. The movement of the pollen grains can be modeled by a random walk model, where for each second each grain will move a random distance, along a two-dimensional vector, whose two components are independently normally distributed with expectation 0 mm and standard deviation 0.05 mm.
a) Make a function that implements this kind of 2D random walk. Return an array with the position of each grain for each step.
b) Make a movie that shows the position of the pollen grains from 0 to 100 seconds.
c) Make a plot of the mean distance from the origin versus time. What do you see?
Filename: pollen.
The purpose of this exercise is to reimplement the walk2D.py program from the section Basic implementation with the aid of classes.
a) Make a class Particle with the coordinates \( (x,y) \) and the time step number of a particle as data attributes. A method move moves the particle in one of the four directions and updates the \( (x,y) \) coordinates. Another class, Particles, holds a list of Particle objects and a plotstep parameter (as in walk2D.py). A method move moves all the particles one step, a method plot can make a plot of all particles, while a method moves performs a loop over time steps and calls move and plot in each step.
b) Equip the Particle and Particles classes with print functionality such that one can print out all particles in a nice way by saying print p (for a Particles instance p) or print self (inside a method). In __str__, apply the pformat function from the pprint module to the list of particles, and make sure that __repr__ just reuses __str__ in both classes so the output looks nice.
c) Make a test function that compares the first three positions of four particles with the corresponding results computed by the walk2D.py program. The seed of the random number generator must of course be fixed identically in the two programs.
d) Organize the complete code as a module such that the classes Particle and Particles can be reused in other programs. The test block should read the number of particles from the command line and perform a simulation.
e) Compare the efficiency of the class version against the vectorized version in walk2Dv.py, using the techniques in the document Evaluating the efficiency of Python programs [6].
f) The program developed above cannot be vectorized as long as we base the implementation on class Particle. However, if we remove that class and focus on class Particles, the latter can employ arrays for holding the positions of all particles and vectorized updates of these positions in the moves method. Use ideas from the walk2Dv.py program to make a new class Particles_vec which vectorizes Particles.
g) Verify the code against the walk2Dv.py program as explained in c). Automate the verification in a test function.
h) Write a Python function that measures the computational efficiency of the vectorized class Particles_vec and the scalar class Particles.
Filename: walk2D_class.
Modify the walk2D.py or walk2Dc.py programs from Exercise 36: Make classes for 2D random walk so that the walkers cannot walk outside a rectangular area \( A = [x_L,x_H]\times [y_L,y_H] \). Do not move the particle if its new position is outside \( A \).
Filename: walk2D_barrier.
Modify the walk2Dv.py program so that the walkers cannot walk outside a rectangular area \( A = [x_L,x_H]\times [y_L,y_H] \). First perform the moves in one direction. Then test if new positions are outside \( A \). Such a test returns a boolean array that can be used as an index in the position arrays to pick out the particles that have moved outside \( A \) and move them back to the relevant boundary of \( A \).
Filename: walk2Dv_barrier.
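The boolean-indexing technique can be illustrated in isolation; the position array x and wall coordinate xH below are made-up demo data, not taken from walk2Dv.py:

```python
import numpy as np

# Sketch of the boolean-indexing technique for a wall at x = xH:
# after a move, particles with x > xH are put back on the boundary.
np.random.seed(10)
xH = 1.0
x = np.random.uniform(0.9, 1.1, size=8)   # positions after a move
outside = x > xH                          # boolean index array
x[outside] = xH                           # move offenders back to the wall
print(x.max())
```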
Suppose we have a box with a wall dividing the box into two equally sized parts. In one part we have a gas where the molecules are uniformly distributed in a random fashion. At \( t=0 \) we remove the wall. The gas molecules will now move around and eventually fill the whole box.
This physical process can be simulated by a 2D random walk inside
a fixed area \( A \) as introduced in Exercise 37: 2D random walk with walls; scalar version
and Exercise 38: 2D random walk with walls; vectorized version (in reality the motion is three-dimensional,
but we only simulate the two-dimensional part of it since we already
have programs for doing this).
Use the program from either
Exercise 37: 2D random walk with walls; scalar version
or Exercise 38: 2D random walk with walls; vectorized version to simulate the process for \( A=[0,1]\times[0,1] \).
Initially, place 10000 particles at uniformly distributed random positions in
\( [0,1/2]\times [0,1] \).
Then start the random walk and visualize what happens.
Simulate for a long time and make a hardcopy of the animation
(an animated GIF file, for instance).
Is the end result what you would expect?
Filename: disorder1.
Molecules tend to move randomly because of collisions and forces between molecules. We do not model collisions between particles in the random walk, but the nature of this walk, with random movements, simulates the effect of collisions. Therefore, the random walk can be used to model molecular motion in many simple cases. In particular, the random walk can be used to investigate how a quite ordered system, where one gas fills one half of a box, evolves through time to a more disordered system.
Solve Exercise 39: Simulate mixing of gas molecules when the wall dividing the
box is not completely removed, but instead has a small
hole.
Filename: disorder2.
You are presented \( n \) glasses of beer, each containing a different brand. You are informed that there are \( m\geq n \) possible brands in total, and the names of all brands are given. For each glass, you can pay \( p \) euros to taste the beer, and if you guess the right brand, you get \( q\geq p \) euros back. Suppose you have done this before and experienced that you typically manage to guess the right brand \( T \) times out of 100, so that your probability of guessing the right brand is \( b=T/100 \).
Make a function simulate(m, n, p, q, b) for simulating the beer tasting process. Let the function return the amount of money earned and how many correct guesses (\( \leq n \)) you made. Call simulate a large number of times and compute the average earnings and the probability of getting full score in the case \( m=n=4 \), \( p=3 \), \( q=6 \), and \( b=1/m \) (i.e., four glasses with four brands, completely random guessing, and a payback of twice as much as the cost). How much more can you earn from this game if your ability to guess the right brand is better, say \( b=1/2 \)?
Filename: simulate_beer_tasting.
A common mathematical model for the evolution of stock prices can be formulated as a difference equation $$ \begin{equation} x_n = x_{n-1} + \Delta t \mu x_{n-1} + \sigma x_{n-1}\sqrt{\Delta t}r_{n-1}, \tag{14} \end{equation} $$ where \( x_n \) is the stock price at time \( t_n \), \( \Delta t \) is the time interval between two time levels (\( \Delta t = t_{n}-t_{n-1} \)), \( \mu \) is the growth rate of the stock price, \( \sigma \) is the volatility of the stock price, and \( r_0,\ldots,r_{n-1} \) are normally distributed random numbers with mean zero and unit standard deviation. An initial stock price \( x_0 \) must be prescribed together with the input data \( \mu \), \( \sigma \), and \( \Delta t \).
We remark that (14) is a Forward Euler discretization of a stochastic differential equation for a continuous price function \( x(t) \): $$ \begin{equation*} \frac{dx}{dt} = \mu x + \sigma N(t),\end{equation*} $$ where \( N(t) \) is a so-called white noise random time series signal. Such equations play a central role in modeling of stock prices.
Make \( R \) realizations of (14) for \( n=0,\ldots,N \) for \( N=5000 \) steps over a time period of \( T=180 \) days with a step size \( \Delta t = T/N \).
Filename: stock_prices.
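A sketch of how (14) can be advanced for all R realizations at once with numpy (the values of mu and sigma in the demo call are illustrative placeholders; the exercise does not prescribe them):

```python
import numpy as np

def stock_prices(x0, mu, sigma, T, N, R):
    """Simulate R realizations of the stock price model (14):
    x_n = x_{n-1} + dt*mu*x_{n-1} + sigma*x_{n-1}*sqrt(dt)*r_{n-1}."""
    dt = float(T)/N
    x = np.zeros((R, N+1))
    x[:, 0] = x0
    for n in range(1, N+1):
        r = np.random.normal(0, 1, size=R)   # one normal draw per realization
        x[:, n] = x[:, n-1]*(1 + dt*mu + sigma*np.sqrt(dt)*r)
    return x

paths = stock_prices(x0=100, mu=0.001, sigma=0.01, T=180, N=5000, R=5)
print(paths.shape)
```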
In this exercise we are going to consider the pricing of so-called Asian options. An Asian option is a financial contract where the owner earns money when certain market conditions are satisfied.
The contract is specified by a strike price \( K \) and a maturity time \( T \). It is written on the average price of the underlying stock, and if this average is bigger than the strike \( K \), the owner of the option will earn the difference. If, on the other hand, the average becomes less, the owner receives nothing, and the option matures in the value zero. The average is calculated from the last trading price of the stock for each day.
From the theory of options in finance, the price of the Asian option will be the expected present value of the payoff. We assume the stock price dynamics given as $$ \begin{equation} S(t+1)=(1+r)S(t)+\sigma S(t)\epsilon(t), \tag{15} \end{equation} $$ where \( r \) is the interest rate, and \( \sigma \) is the volatility of the stock price. The time \( t \) is supposed to be measured in days, \( t=0,1,2,\ldots \), while \( \epsilon(t) \) are independent identically distributed normal random variables with mean zero and unit standard deviation. To find the option price, we must calculate the expectation $$ \begin{equation} p=(1+r)^{-T}\hbox{E}\left[\max\left(\frac1T\sum_{t=1}^{T}S(t)-K,0\right)\right] \tp \tag{16} \end{equation} $$ The price is thus given as the expected discounted payoff. We will use Monte Carlo simulations to estimate the expectation. Typically, \( r \) and \( \sigma \) can be set to \( r=0.0002 \) and \( \sigma=0.015 \). Assume further \( S(0)=100 \).
a) Make a function that simulates a path of \( S(t) \), that is, the function computes \( S(t) \) for \( t=1,\ldots,T \) for a given \( T \) based on the recursive definition in (15). The function should return the path as an array.
b) Create a function that finds the average of \( S(t) \) from \( t=1 \) to \( t=T \). Make another function that calculates the price of the Asian option based on \( N \) simulated averages. You may choose \( T=100 \) days and \( K=102 \).
c) Plot the price \( p \) as a function of \( N \). You may start with \( N=1000 \).
d) Plot the error in the price estimation as a function \( N \) (assume that the \( p \) value corresponding to the largest \( N \) value is the "right" price). Try to fit a curve of the form \( c/\sqrt{N} \) for some \( c \) to this error plot. The purpose is to show that the error is reduced as \( 1/\sqrt{N} \).
Filename: option_price.
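A minimal sketch of parts a) and b), using the parameter values quoted in the exercise (the function names are our own):

```python
import numpy as np

def simulate_path(T=100, S0=100, r=0.0002, sigma=0.015):
    """Simulate S(t) for t = 1,...,T by the recursion (15)."""
    S = np.zeros(T + 1)
    S[0] = S0
    eps = np.random.normal(size=T)            # i.i.d. N(0,1) shocks
    for t in range(T):
        S[t + 1] = (1 + r)*S[t] + sigma*S[t]*eps[t]
    return S

def option_price(N=1000, T=100, K=102, r=0.0002, sigma=0.015, S0=100):
    """Monte Carlo estimate of (16): the discounted expected payoff."""
    payoffs = np.zeros(N)
    for i in range(N):
        S = simulate_path(T, S0, r, sigma)
        avg = S[1:].mean()                    # average of S(1),...,S(T)
        payoffs[i] = max(avg - K, 0)
    return (1 + r)**(-T)*payoffs.mean()
```

For parts c) and d), `option_price` can be called for a range of \( N \) values and the estimates plotted against \( N \).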
If you wonder where the values for \( r \) and \( \sigma \) come from, here is the explanation. A reasonable level for the yearly interest rate is around 5 percent, which corresponds to a daily rate of \( 0.05/250=0.0002 \). The number 250 is chosen because a stock exchange is open for trading about this many days per year. The value for \( \sigma \) is the volatility of the stock price, corresponding to the standard deviation of the daily returns of the stock, defined as \( (S(t+1)-S(t))/S(t) \). "Normally", the volatility is around 1.5 percent a day. Finally, there are theoretical reasons why we assume that the stock price dynamics is driven by \( r \), meaning that we consider the risk-neutral dynamics of the stock price when pricing options. There is an exciting theory explaining the appearance of \( r \) in the dynamics of the stock price. If we want to simulate a stock price dynamics mimicking what we see in the market, \( r \) in (15) must be replaced by \( \mu \), the expected return of the stock. Usually, \( \mu \) is higher than \( r \).
In a laboratory experiment, waves are generated through the impact of a model slide into a wave tank. (The intention of the experiment is to model a future tsunami event in a fjord, generated by loose rocks that fall into the fjord.) At a certain location, the elevation of the surface, denoted by \( \eta \), is measured at discrete points in time using an ultrasound wave gauge. The result is a time series of vertical positions of the water surface elevation in meters: \( \eta (t_0), \eta(t_1), \eta(t_2),\ldots,\eta(t_n) \). There are 300 observations per second, meaning that the time difference between two neighboring measurement values \( \eta(t_i) \) and \( \eta (t_{i+1}) \) is \( h=1/300 \) seconds.
a) Read the \( \eta \) values in the file gauge.dat into an array eta. Read \( h \) from the command line.
b) Plot eta versus the time values.
c) Compute the velocity \( v \) of the surface by the formula $$ v_i \approx (\eta_{i+1}-\eta_{i-1})/(2h), \quad i=1,\ldots,n-1\tp$$ Plot \( v \) versus time values in a separate plot.
d) Compute the acceleration \( a \) of the surface by the formula $$ a_i \approx (\eta_{i+1}-2\eta_i + \eta_{i-1})/h^2,\quad i=1,\ldots,n-1. $$ Plot \( a \) versus the time values in a separate plot.
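Both difference formulas in parts c) and d) vectorize naturally with NumPy slicing; a sketch, with the array eta and spacing h as in parts a) and b):

```python
import numpy as np

def velocity(eta, h):
    """Centered difference: v_i ~ (eta[i+1] - eta[i-1])/(2h), i=1,...,n-1."""
    return (eta[2:] - eta[:-2])/(2.0*h)

def acceleration(eta, h):
    """Second difference: a_i ~ (eta[i+1] - 2*eta[i] + eta[i-1])/h**2."""
    return (eta[2:] - 2*eta[1:-1] + eta[:-2])/h**2
```

Both functions return arrays of length \( n-1 \), corresponding to the inner points \( i=1,\ldots,n-1 \), so they should be plotted against the inner time values.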
e) If we have a noisy signal \( \eta_i \), where \( i=0,\ldots,n \) counts time levels, the noise can be reduced by computing a new signal where the value at a point is a weighted average of the values at that point and the neighboring points on each side. More precisely, given the signal \( \eta_i \), \( i=0,\ldots,n \), we compute a filtered (averaged) signal with values \( \eta^{(1)}_i \) by the formula
$$
\begin{equation}
\eta^{(1)}_i = \frac{1}{4}(\eta_{i+1} + 2\eta_i + \eta_{i-1}),\quad
i=1,\ldots,n-1,\ \eta^{(1)}_0=\eta_0,\ \eta^{(1)}_n=\eta_n\tp
\tag{17}
\end{equation}
$$
Make a function filter that takes the \( \eta_i \) values in an array eta as input and returns the filtered \( \eta^{(1)}_i \) values in an array.
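A vectorized sketch of such a function (note that the exercise's chosen name filter shadows Python's built-in filter):

```python
import numpy as np

def filter(eta):
    """Return the filtered signal eta^(1) defined by (17)."""
    eta1 = eta.copy()                 # end values are kept unchanged
    eta1[1:-1] = 0.25*(eta[2:] + 2*eta[1:-1] + eta[:-2])
    return eta1
```

Applying the function \( k \) times, as in part f), is just a loop calling filter repeatedly on its own output.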
f) Let \( \eta^{(k)}_i \) be the signal arising from applying the filter function \( k \) times to the same signal. Make a plot with curves for the original \( \eta_i \) and the filtered \( \eta^{(k)}_i \) values for \( k=1, 10, 100 \). Make similar plots for the velocity and acceleration, computed from both the original, measured \( \eta \) data and the filtered data. Discuss the results.
Filename: labstunami.
The purpose of this exercise is to look into numerical differentiation of time series signals that contain measurement errors. This insight might be helpful when analyzing the noise in real data from a laboratory experiment in Exercise 44: Differentiate noise measurements.
a) Compute a signal
$$
\begin{equation*} \bar\eta_i = A\sin ({2\pi\over T} t_i),\quad t_i=i{T\over 40},\ i=0,\ldots,200\tp\end{equation*}
$$
Display \( \bar\eta_i \) versus time \( t_i \) in a plot. Choose \( A=1 \) and \( T=2\pi \). Store the \( \bar\eta \) values in an array etabar.
b) Compute a signal with random noise \( E_i \),
$$
\begin{equation*} \eta_i = \bar\eta_i + E_i,\end{equation*}
$$
where \( E_i \) is drawn from the normal distribution with mean zero and standard deviation \( \sigma = 0.04A \). Plot this \( \eta_i \) signal as circles in the same plot as \( \bar\eta_i \). Store the \( E_i \) values in an array E for later use.
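Parts a) and b) can be sketched as follows (the plotting calls are omitted):

```python
import numpy as np

A = 1.0
T = 2*np.pi
n = 200
i = np.arange(n + 1)
t = i*T/40.0                                 # t_i = i*T/40, i = 0,...,200
etabar = A*np.sin(2*np.pi/T*t)               # smooth signal
sigma = 0.04*A
E = np.random.normal(0, sigma, size=n + 1)   # noise, stored for later use
eta = etabar + E                             # noisy signal
```

With matplotlib, part b)'s plot would be e.g. `plot(t, etabar)` followed by `plot(t, eta, 'o')`.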
c) Compute the first derivative of \( \bar\eta_i \) by the formula
$$
\begin{equation*} {\bar\eta_{i+1}-\bar\eta_{i-1}\over 2h}, \quad i=1,\ldots,n-1, \end{equation*}
$$
and store the values in an array detabar. Display the graph.
d) Compute the first derivative of the error term by the formula
$$
\begin{equation*} {E_{i+1}-E_{i-1}\over 2h}, \quad i=1,\ldots,n-1, \end{equation*}
$$
and store the values in an array dE. Calculate the mean and the standard deviation of dE.
e) Plot detabar and detabar + dE. Use the result of the standard deviation calculations to explain the qualitative features of the graphs.
f) The second derivative of a time signal \( \eta_i \) can be computed by
$$
\begin{equation*}
{\eta_{i+1}-2\eta_i + \eta_{i-1}\over h^2},
\quad i=1,\ldots,n-1\tp
\end{equation*}
$$
Use this formula on the etabar data and save the result in d2etabar. Also apply the formula to the E data and save the result in d2E. Plot d2etabar and d2etabar + d2E. Compute the standard deviation of d2E and compare with the standard deviations of dE and E. Discuss the plot in light of these standard deviations.
Filename: sine_noise.
We assume that the measured data can be modeled as a smooth time signal \( \bar\eta (t) \) plus a random variation \( E(t) \). Computing the velocity of \( \eta = \bar\eta + E \) results in a smooth velocity from the \( \bar\eta \) term and a noisy signal from the \( E \) term.
a) We can estimate the level of noise in the first derivative of \( E \) as follows. The random numbers \( E(t_i) \) are assumed to be independent and normally distributed with mean zero and standard deviation \( \sigma \). It can then be shown that $$ \begin{equation*} {E_{i+1}-E_{i-1}\over 2h}\end{equation*} $$ produces numbers that come from a normal distribution with mean zero and standard deviation \( 2^{-1/2}h^{-1}\sigma \). How much is the original noise, reflected by \( \sigma \), magnified when we use this numerical approximation of the velocity?
b) The fraction $$ \begin{equation*}{E_{i+1}-2E_i + E_{i-1}\over h^2}\end{equation*} $$ will also generate numbers from a normal distribution with mean zero, but this time with standard deviation \( \sqrt{6}\,h^{-2}\sigma \) (the weights \( 1,-2,1 \) give variance \( 6\sigma^2/h^4 \)). Find out how much the noise is magnified in the computed acceleration signal.
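The stated standard deviations can be checked empirically; the sketch below draws a long sequence of independent \( N(0,\sigma) \) numbers and compares the sample standard deviations of the two difference quotients with the formulas \( 2^{-1/2}h^{-1}\sigma \) and \( \sqrt{6}\,h^{-2}\sigma \) (the latter from the squared weights \( 1+4+1=6 \)):

```python
import numpy as np

h = 1/300.0
sigma = 1e-4
E = np.random.normal(0, sigma, size=10**6)
dE = (E[2:] - E[:-2])/(2*h)               # noise in the velocity
d2E = (E[2:] - 2*E[1:-1] + E[:-2])/h**2   # noise in the acceleration
print(dE.std(),  2**(-0.5)*sigma/h)       # the two numbers should be close
print(d2E.std(), np.sqrt(6)*sigma/h**2)   # and so should these
```

The sample values agree with the theoretical standard deviations to well within a percent for a sequence this long.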
c) The numbers in the gauge.dat file in Exercise 44: Differentiate noise measurements are given with 5 digits. This is no certain indication of the accuracy of the measurements, but as a test we may assume \( \sigma \) is of the order \( 10^{-4} \). Check if the visual results for the velocity and acceleration are consistent with the standard deviation of the noise in these signals as modeled above.
The functions transition and mutate_via_markov_chain from the section Random mutations of genes were written to be easy to read and understand. Upon closer inspection, we realize that the transition function constructs the interval_limits every time a random transition is to be computed, and we want to run a large number of transitions. By merging the two functions, pre-computing the interval limits for each from_base, and adding a loop over N mutations, one can reduce the computation of interval limits to a minimum. Perform such an efficiency enhancement. Measure the CPU time of this new function versus the mutate_via_markov_chain function for 1 million mutations.
Filename: markov_chain_mutation2.
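Since the original transition and mutate_via_markov_chain functions are not repeated here, the sketch below only illustrates the optimization pattern: the cumulative interval limits are pre-computed once per from_base, outside the mutation loop. The layout of markov_matrix (a dict mapping each base to a dict of transition probabilities summing to 1) is an assumption for this sketch:

```python
import random

def mutate_N(dna, N, markov_matrix):
    """Perform N random mutations of the DNA string dna.

    Pre-computes cumulative probability limits for each from_base
    once, instead of rebuilding them on every transition.
    """
    limits = {}
    for from_base, probs in markov_matrix.items():
        bases, cum, s = [], [], 0.0
        for to_base, p in probs.items():
            s += p
            bases.append(to_base)
            cum.append(s)
        limits[from_base] = (bases, cum)

    dna = list(dna)
    for _ in range(N):
        i = random.randrange(len(dna))          # pick a random position
        bases, cum = limits[dna[i]]
        r = random.random()
        for to_base, limit in zip(bases, cum):  # locate r's interval
            if r <= limit:
                dna[i] = to_base
                break
    return ''.join(dna)
```

The CPU time of this function versus the original can then be compared with, e.g., the `time.perf_counter` or `timeit` tools from the standard library.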