Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466

Topics in
Probability Theory and Stochastic Processes
Steven R. Dunbar

__________________________________________________________________________

The Hitting Time Theorem

_______________________________________________________________________

Note: These pages are prepared with MathJax. MathJax is an open source JavaScript display engine for mathematics that works in all browsers. See http://mathjax.org for details on supported browsers, accessibility, copy-and-paste, and other features.

_______________________________________________________________________________________________

Rating

Mathematicians Only: prolonged scenes of intense rigor.

_______________________________________________________________________________________________

Section Starter Question

Suppose a simple random walk of 6 steps with \(Y_i = \pm 1\) has 4 steps \(+1\) and 2 steps \(-1\). Explicitly enumerate all the possible walks. In how many of these walks does the walk stay strictly below 2 at every step before step 6?
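As a check on the enumeration, a short brute-force script (a sketch added for illustration, not part of the original notes) lists all \(\binom{6}{2} = 15\) arrangements of the steps and counts those that stay strictly below 2 before step 6:

```python
from itertools import accumulate, combinations

n, k = 6, 2
total = staying = 0
for downs in combinations(range(n), 2):      # positions of the two -1 steps
    steps = [-1 if i in downs else 1 for i in range(n)]
    partial = list(accumulate(steps))        # the walk T_1, ..., T_6
    total += 1
    # stays strictly below 2 before step 6 and first reaches 2 at step 6
    if all(t < k for t in partial[:-1]) and partial[-1] == k:
        staying += 1

print(total, staying)  # 15 walks in all, 5 of which stay below 2 until step 6
```

The count 5 is \(k/n = 2/6\) of the 15 equally likely walks, anticipating the Hitting Time Theorem.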

_______________________________________________________________________________________________

Key Concepts

  1. The Hitting Time Theorem expresses a conditional probability. It says that under reasonable, or even relatively weak, conditions on the steps, the conditional probability that an n-step random walk starting at 0, given that it ends at \(k > 0\), first hits \(k\) at time \(n\) is \(k/n\).
  2. If the \(Y_i\) are independent, identically distributed random variables, or the joint probability distribution of \((Y_1, \dots, Y_n)\) is invariant under rotations or is exchangeable, then
    \[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \frac{k}{n}\, \mathbb{P}\left[T_n = k\right]. \]

  3. The Hitting Time Theorem is equivalent to the Ballot Theorem.
  4. The conditions of invariance under rotations and of exchangeability are weaker than the traditional requirement that the \(Y_i\) be independent and identically distributed.

__________________________________________________________________________

Vocabulary

  1. The Hitting Time Theorem shows that the conditional probability that an n-step random walk with steps \((Y_1, \dots, Y_n)\), having a joint probability distribution satisfying reasonable conditions and starting at 0, given that it ends at height \(k > 0\), first hits \(k\) at time \(n\) is \(k/n\).
  2. If the joint probability distribution of \((Y_1, \dots, Y_n)\) is the same as the joint probability distribution of the rotated sequence
    \[ (Y_{r+1}, \dots, Y_n, Y_1, \dots, Y_r), \]

    then we say the distribution is invariant under rotations.

  3. A random vector \((Y_1, \dots, Y_n)\) is exchangeable if the joint probability distribution is invariant under every permutation of \((Y_1, \dots, Y_n)\).

__________________________________________________________________________

Mathematical Ideas

Background and History

The Ballot Theorem can be interpreted as giving the probability that, during the vote counting in an election between two candidates, the winning candidate is always ahead of the other. Suppose that in an election with \(n\) votes, candidate A receives \(a\) votes and candidate B receives \(b\) votes, where \(a - b = k\) for some positive integer \(k\). Assume that all ballot permutations are equally likely during the counting. Dividing the count from the specialized Ballot Theorem by the total number of ballot permutations shows the probability is \(k/n\); see The Ballot Theorem and the Reflection Principle.

The problem can be restated as follows: the probability that an n-step random walk taking independent and identically distributed \(\pm 1\) steps stays positive after time 0, given that it ends at height \(k > 0\), is \(k/n\). If each step is \(+1\) with probability \(1/2\), this probability can be derived easily using the reflection principle, an argument appearing in many textbooks. In fact, it is possible to prove the same result under weaker hypotheses, [?], [?].

The Hitting Time Theorem says the conditional probability that an n-step random walk starting at 0, given that it ends at \(k > 0\), first hits \(k\) at time \(n\) is \(k/n\).

Remark. Note that:

In 1949 Otter gave the first proof of the Hitting Time Theorem for \(k = 1\). Kemperman in 1961 and Dwass in 1969 gave proofs for \(k \ge 2\) using generating functions and the Lagrange inversion formula. For simple Bernoulli random walks making steps of size \(\pm 1\), a proof using the Reflection Principle is in [?]. See [?] for references and discussion.

The Hitting Time Theorem for Independent Identically Distributed Random Variables

Fix \(n \ge 1\). Let \((Y_1, \dots, Y_n)\) be a sequence of random variables \(Y_i\) taking values in \(\{\dots, -2, -1, 0, 1\}\). Note that this is more general than, but includes, the standard random walk where the random variables take values in \(\{-1, 1\}\). Define the walk \(T = (T_0, \dots, T_n)\) as the usual piecewise linear function determined by the values \(T_0 = 0\) and \(T_i = \sum_{j=1}^{i} Y_j\) at the integers \(i = 0, 1, 2, \dots, n\). Note that because the only positive value assumed by the steps \(Y_i\) is 1, for the random walk \(T\) to gain a height \(k\), it has to pass through all \(k - 1\) intermediate integer values.

Theorem 1 (Hitting Time Theorem). Let \(k \ge 0\). For a random walk starting at 0 with independent, identically distributed integer-valued steps \(Y_i\) with \(Y_i \le 1\) for \(i = 1, 2, 3, \dots\), then

\[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \frac{k}{n}\, \mathbb{P}\left[T_n = k\right]. \]

Proof. The proof proceeds by induction on \(n \ge 1\). When \(n = 1\), both sides are 0 when \(k > 1\) and when \(k = 0\), and both equal \(\mathbb{P}\left[Y_1 = 1\right]\) when \(k = 1\). This is the base case of the induction.

For the induction step, take \(n \ge 2\) and consider first the case \(k = 0\). In this case, the left side requires \(T_t < 0\) for all \(t < n\), yet \(T_0 = 0\), so the event is empty and the probability is 0. The right side is also equal to 0, so the case \(k = 0\) is satisfied. Thus, we may assume that \(k \ge 1\).

So take \(n \ge 2\) and consider the case \(k > 0\). Condition on the first step to obtain

\[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \sum_{s \le 1} \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n \mid Y_1 = s\right] \mathbb{P}\left[Y_1 = s\right]. \]

For a given \(s\) and \(m = 0, \dots, n-1\), let \(T_m^{(s)} = T_{m+1} - s\). Because the steps are independent, identically distributed random variables, by the random walk Markov property, with \(T_0 = 0\) and conditionally on \(Y_1 = s\), the random walk \(\{T_m^{(s)}\}_{m=0}^{n-1}\) has the same distribution as the random walk \(\{T_m\}_{m=0}^{n-1}\).



Figure 1: Illustration of \(\{T_m\}_{m=0}^{n-1}\) and \(\{T_m^{(s)}\}_{m=0}^{n-1}\) for \(s = 1\), shown in red.

That is,

\[ \begin{aligned} \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n \mid Y_1 = s\right] &= \mathbb{P}\left[T_{n-1}^{(s)} = k - s,\ T_t^{(s)} < k - s \text{ for all } t < n - 1\right] \\ &= \frac{k - s}{n - 1}\, \mathbb{P}\left[T_{n-1}^{(s)} = k - s\right] \end{aligned} \]

where in the last equality we have used the induction hypothesis, which is allowed since \(n - 1 \ge 1\) and \(k \ge 1\) with \(s \le 1\), so \(k - s \ge 0\). Because the steps are independent, identically distributed random variables, \(\mathbb{P}\left[T_{n-1}^{(s)} = k - s\right] = \mathbb{P}\left[T_n = k \mid Y_1 = s\right]\) (see Figure 1). Substitute into the summation to obtain

\[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \sum_{s \le 1} \frac{k - s}{n - 1}\, \mathbb{P}\left[T_n = k \mid Y_1 = s\right] \mathbb{P}\left[Y_1 = s\right]. \]

Temporarily ignore the \(n - 1\) in the denominator and consider the summation

\[ \sum_{s \le 1} (k - s)\, \mathbb{P}\left[T_n = k \mid Y_1 = s\right] \mathbb{P}\left[Y_1 = s\right]. \]

By the properties of conditional probability

\[ \mathbb{P}\left[T_n = k \mid Y_1 = s\right] \mathbb{P}\left[Y_1 = s\right] = \mathbb{P}\left[T_n = k,\ Y_1 = s\right] = \mathbb{P}\left[Y_1 = s \mid T_n = k\right] \mathbb{P}\left[T_n = k\right]. \qquad (1) \]

Thus,

\[ \begin{aligned} \sum_{s \le 1} (k - s)\, \mathbb{P}\left[T_n = k \mid Y_1 = s\right] \mathbb{P}\left[Y_1 = s\right] &= \sum_{s \le 1} (k - s)\, \mathbb{P}\left[Y_1 = s \mid T_n = k\right] \mathbb{P}\left[T_n = k\right] \\ &= \mathbb{P}\left[T_n = k\right] \left(k - \mathbb{E}\left[Y_1 \mid T_n = k\right]\right). \end{aligned} \]

By the assumption of independent, identically distributed steps, the conditional expectation \(\mathbb{E}\left[Y_i \mid T_n = k\right]\) is independent of \(i\), so that

\[ \mathbb{E}\left[Y_i \mid T_n = k\right] = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\left[Y_i \mid T_n = k\right] = \frac{1}{n}\, \mathbb{E}\left[\sum_{i=1}^{n} Y_i \,\middle|\, T_n = k\right] = \frac{k}{n}, \]

since \(\sum_{i=1}^{n} Y_i = T_n = k\) on the event \(T_n = k\). Recalling the temporarily ignored factor of \(\frac{1}{n-1}\), we arrive at

\[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \frac{1}{n-1} \left(k - \frac{k}{n}\right) \mathbb{P}\left[T_n = k\right] = \frac{k}{n}\, \mathbb{P}\left[T_n = k\right]. \]

This completes the proof by induction. □
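The theorem can also be checked numerically by exhaustive enumeration of a small case. The step distribution below (values \(-2, -1, 0, 1\) with probabilities \(0.1, 0.2, 0.3, 0.4\)) is an arbitrary illustrative choice satisfying the hypothesis \(Y_i \le 1\); this sketch is not part of the original notes.

```python
from itertools import product

values = [-2, -1, 0, 1]           # integer steps, all at most 1
probs = [0.1, 0.2, 0.3, 0.4]      # arbitrary illustrative i.i.d. distribution
n, k = 5, 2

lhs = rhs = 0.0                   # P[T_n = k, T_t < k for t < n] and P[T_n = k]
for choice in product(range(len(values)), repeat=n):
    weight = 1.0
    for c in choice:
        weight *= probs[c]
    walk, T = [], 0
    for c in choice:
        T += values[c]
        walk.append(T)            # the walk T_1, ..., T_n
    if walk[-1] == k:
        rhs += weight
        if all(t < k for t in walk[:-1]):
            lhs += weight

print(lhs, (k / n) * rhs)         # the two values agree, as the theorem asserts
```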

Example. See Figure 2 for an example.



Figure 2: Example of the Hitting Time Theorem with \(n = 6\), \(k = 2\), \(a = 4\), \(b = 2\).

Example Application of the Hitting Time Theorem

The following example is adapted from [?] and [?]. The original problem is:

The Kentucky Derby is on Saturday, and a field of 20 horses is slated to run “the fastest two minutes in sports” in pursuit of the right to be draped with a blanket of roses. But let’s consider, instead, the Lucky Derby, where things are a little more bizarre:

The bugle sounds, and 20 horses make their way to the starting gate for the first annual Lucky Derby. These horses, all trained at the mysterious Riddler Stables, are special. Each second, every Riddler-trained horse takes one step. Each step is exactly one meter long. But what these horses exhibit in precision, they lack in sense of direction. Most of the time, their steps are forward (toward the finish line) but the rest of the time they are backward (away from the finish line). As an avid fan of the Lucky Derby, you’ve done exhaustive research on these 20 competitors. You know that Horse One goes forward 52 percent of the time, Horse Two 54 percent of the time, Horse Three 56 percent, and so on, up to the favorite filly, Horse Twenty, who steps forward 90 percent of the time. The horses’ steps are taken independently of one another, and the finish line is 200 meters from the starting gate.

Handicap this race and place your bets! In other words, what are the odds (a percentage is fine) that each horse wins?

It is easy to create a simulation for this scenario; a sample script is in the scripts section below. A simulation of 2000 races suggests that the probability of horse 20 winning is about 70.48% and the probability of horse 19 winning is about 22.86%.

Consider a particular horse. For each step, suppose it moves forward with probability \(p\) (in this example \(p > 1/2\)) or backward with probability \(1 - p\). Recall that reaching an even-numbered position \(2k\) (such as 200 in the problem statement) can only occur in an even number of steps, say \(2n\). Then the horse must take \(n + k\) forward steps and \(n - k\) backward steps. The horse starts at position 0 and we would like to know the probability that it will reach position \(2k\) (in the example, \(k = 100\) with \(2k = 200\)) for the first time in exactly \(2n\) steps. By the Hitting Time Theorem, this probability is \(\frac{2k}{2n}\, \mathbb{P}\left[T_{2n} = 2k\right]\). For convenience, adopt the notation

\[ P(2n;\, 2k,\, p) = \frac{k}{n} \binom{2n}{n+k} p^{n+k} (1-p)^{n-k}. \]

This probability mass distribution does not seem to have a common name, but it is still possible to compute the statistics of this distribution.

Theorem 2. Let \(H\) denote the number of steps for the walk to first reach position \(2k\).

  1. \(P(2n;\, 2k,\, p)\) is a probability mass distribution, that is,
    \[ \sum_{n=k}^{\infty} P(2n;\, 2k,\, p) = 1. \]

  2. \(\mathbb{E}\left[H\right] = \dfrac{2k}{2p-1}\), that is,
    \[ \sum_{n=k}^{\infty} (2n)\, P(2n;\, 2k,\, p) = \frac{2k}{2p-1}. \]

  3. \(\operatorname{Var}\left[H\right] = \dfrac{8kp(1-p)}{(2p-1)^3}\), that is,
    \[ \sum_{n=k}^{\infty} \left(2n - \frac{2k}{2p-1}\right)^{2} P(2n;\, 2k,\, p) = \frac{8kp(1-p)}{(2p-1)^3}. \]

Proof. Starting with

\[ \sum_{n=k}^{\infty} \frac{k}{n} \binom{2n}{n+k} p^{n+k} (1-p)^{n-k}, \]

change variables with \(n = j + k\) to rewrite the summation as

\[ p^{2k} \sum_{j=0}^{\infty} \frac{k}{j+k} \binom{2(j+k)}{j} p^{j} (1-p)^{j}. \]

This series resembles a binomial series expansion, suggesting that it might have a closed form. In fact, using Maple to find a closed form solution, obtain

\[ \sum_{j=0}^{\infty} \frac{k}{j+k} \binom{2(j+k)}{j} x^{j} = \frac{2^{2k}}{\left(1 + \sqrt{1 - 4x}\right)^{2k}}. \]

In turn, the closed form for the power series suggests that it might be possible to express the moment-generating function in closed form. Again using Maple to find a closed form expression,

\[ F_H(t) = \mathbb{E}\left[e^{tH}\right] = \sum_{n=k}^{\infty} \frac{k}{n} \binom{2n}{n+k} p^{n+k} (1-p)^{n-k} e^{2nt} = \frac{(2p)^{2k} e^{2kt}}{\left(1 + \sqrt{1 - 4p(1-p)e^{2t}}\right)^{2k}}. \]

Then, once more using Maple, it is easy to check that

\[ F_H(0) = \sum_{n=k}^{\infty} P(2n;\, 2k,\, p) = 1 \]

and

\[ \mathbb{E}\left[H\right] = F_H'(0) = \frac{2k}{2p-1}. \]

The calculation of the variance is left as an exercise. □
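The three identities can also be confirmed numerically by truncating the sums, since the terms decay like \((4p(1-p))^{n}\) up to polynomial factors. The values \(p = 0.7\) and \(k = 2\) are an arbitrary illustrative choice; this check is a sketch added to these notes.

```python
from math import comb

def P(n, k, p):
    """Summand P(2n; 2k, p): probability of first reaching 2k at step 2n."""
    return (k / n) * comb(2 * n, n + k) * p ** (n + k) * (1 - p) ** (n - k)

p, k = 0.7, 2
total = mean = second = 0.0
for n in range(k, 300):           # the tail beyond n = 300 is negligible here
    w = P(n, k, p)
    total += w
    mean += 2 * n * w
    second += (2 * n) ** 2 * w
variance = second - mean ** 2

# Theorem 2 predicts: total = 1, mean = 2k/(2p-1) = 10,
# and variance = 8kp(1-p)/(2p-1)^3 = 52.5
print(total, mean, variance)
```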

For the horse race example, the mean, variance, and standard deviation of the number of steps required for each horse to reach position 200 are in Table 1.


Horse    p      Mean        Variance    StdDev
  1     0.52    5000.       3120000.    1766.3522
  2     0.54    2500.        388125.     622.99679
  3     0.56    1666.6667    114074.07   337.74853
  4     0.58    1250.         47578.125  218.12410
  5     0.60    1000.         24000.     154.91933
  6     0.62     833.33333    13634.259  116.76583
  7     0.64     714.28571     8396.5015  91.632426
  8     0.66     625.          5478.5156  74.016995
  9     0.68     555.55556     3731.1385  61.083046
 10     0.70     500.          2625.      51.234754
 11     0.72     454.54545     1893.3133  43.512220
 12     0.74     416.66667     1391.7824  37.306600
 13     0.76     384.61538     1037.7788  32.214574
 14     0.78     357.14286      781.70554 27.958997
 15     0.80     333.33333      592.59259 24.343225
 16     0.82     312.5          450.43945 21.223559
 17     0.84     294.11765      341.94993 18.491888
 18     0.86     277.77778      258.05898 16.064214
 19     0.88     263.15789      192.44788 13.872559
 20     0.90     250.           140.625   11.858541

Table 1: Statistics for the hitting time distribution for each horse in the race.

The probability distributions seem to be approximately normal in shape; see Figure 3. This is reasonable since the hitting time to level \(2k\) is the sum of \(2k\) independent one-level hitting times: first the hitting time from the initial point 0 to first reach 1, then the independent hitting time to first reach 2 from 1, and so on. The one-level hitting times are independent since the individual steps of the random walk are independent. Thus, as the sum of many independent random variables, the distribution of the hitting time to \(2k\) is approximately normal.
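A direct simulation of a single horse supports the mean in Table 1. This sketch (not part of the original scripts) uses horse 20's parameters, \(p = 0.9\) and target 200, with a fixed seed for reproducibility:

```python
import random

random.seed(17)

def hitting_time(p, target):
    """Steps for a +1/-1 walk with up-probability p to first reach target."""
    position = steps = 0
    while position < target:
        position += 1 if random.random() < p else -1
        steps += 1
    return steps

p, target, races = 0.9, 200, 2000
times = [hitting_time(p, target) for _ in range(races)]
mean = sum(times) / races
# Table 1 gives mean 250 and standard deviation about 11.86 for horse 20,
# so the sample mean over 2000 runs should fall within a step or so of 250.
print(mean)
```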



Figure 3: Representative probability mass distributions for horses 20, 16, 12, and 8.

Suppose each horse \(i\) finishes in \(h_i\) steps. Computing the probability that a specific horse \(j\) wins the race amounts to finding the probability that \(h_i > h_j\) for all \(i \ne j\). Since the probability distributions are approximately normal, approximate this probability with

\[ \mathbb{P}\left[h_i > h_j,\ i \ne j\right] = \int \mathbb{P}\left[h_j = x\right] \prod_{i \ne j} \mathbb{P}\left[h_i > x\right] dx = \int \frac{1}{\sigma_j}\, \phi\!\left(\frac{x - \mu_j}{\sigma_j}\right) \prod_{i \ne j} \left(1 - \Phi\!\left(\frac{x - \mu_i}{\sigma_i}\right)\right) dx. \]

The probability calculation uses the usual notation: \(\phi\) is the standard normal probability density (pdf), \(\Phi\) is the cumulative distribution function (cdf) of the standard normal distribution, and the \(\mu_k\) and \(\sigma_k\) are the means and standard deviations computed in Table 1.

Analytic integration of the probability is impractical, and even the numerical integration of

\[ H_j(x) = \frac{1}{\sigma_j}\, \phi\!\left(\frac{x - \mu_j}{\sigma_j}\right) \prod_{i \ne j} \left(1 - \Phi\!\left(\frac{x - \mu_i}{\sigma_i}\right)\right) \]

is challenging. Taking \(H_{20}(x)\) as an example, the product term

\[ \prod_{i \ne 20} \left(1 - \Phi\!\left(\frac{x - \mu_i}{\sigma_i}\right)\right) \]

is approximately 1 for \(x < \mu_{19} - 3\sigma_{19} \approx 221.5402\). Since each factor is less than 1 and \(1 - \Phi((x - \mu_{19})/\sigma_{19}) \approx 0\) for \(x > \mu_{19} + 3\sigma_{19} \approx 304.7756\), the product decreases to approximately 0 for \(x > 304.7756\). The other factor \(\frac{1}{\sigma_{20}} \phi\!\left(\frac{x - \mu_{20}}{\sigma_{20}}\right) \approx 0\) outside the interval \((\mu_{20} - 3\sigma_{20},\, \mu_{20} + 3\sigma_{20}) \approx (214.4244,\, 285.5756)\) and has a maximum value of \(\phi(0)/\sigma_{20} \approx 0.03364\) at \(x = 250\). Therefore, \(H_{20}(x)\) is approximately 0 outside the interval \((214.4244,\, 285.5756)\) and reaches a maximum of approximately \(H_{20}(250) \approx 0.02633\). Computing the product of so many small-magnitude factors could lead to numerical underflow, so instead compute the exponential of the sum and difference of the respective logarithms.

The low variation, with an aspect ratio of roughly \(\frac{\phi(0)/\sigma_{20}}{6\sigma_{20}} \approx 1/2115\), makes even adaptive integration challenging. The default integration routine of R gives a nonsensical answer of 1.311008. With the additional R library pracma of advanced numerical analysis routines, the quad function using adaptive Simpson quadrature yields 0.7086704 and the quadl function using adaptive Lobatto quadrature yields the very similar 0.7078003. Numerical integration with Scientific Python using either general-purpose integration or Romberg integration gives similar results.

Therefore we estimate horse 20 has about a 71% chance of winning. Repeating the analysis with the necessary changes, horse 19 has about a 21% chance of winning. One of the other 18 horses winning has about an 8% chance.

The Hitting Time Theorem Under Rotation Invariance

Fix \(n \ge 1\). Let \((Y_1, \dots, Y_n)\) be a sequence of random variables \(Y_i\) taking values in \(\{\dots, -2, -1, 0, 1\}\). Define the walk \(T = (T_0, \dots, T_n)\) as the usual piecewise linear function determined by the values \(T_0 = 0\) and \(T_i = \sum_{j=1}^{i} Y_j\) at the integers \(i = 0, 1, 2, \dots, n\). Note again that because the only positive value assumed by the steps \(Y_i\) is 1, for the random walk \(T\) to gain a height \(k\), it has to pass through all \(k - 1\) intermediate integer values. Given \(r\) with \(0 < r \le n\), define the rotation of \(T\) by \(r\) as the walk \(T^{(r)} = (T_0^{(r)}, \dots, T_n^{(r)})\) corresponding to the rotated sequence \((Y_{r+1}, \dots, Y_n, Y_1, \dots, Y_r)\). Another way to represent the rotation of the sequence is

\[ T_t^{(r)} = \begin{cases} T_{t+r} - T_r, & 0 \le t \le n - r; \\ T_n - T_r + T_{t+r-n}, & n - r < t \le n. \end{cases} \]

Note that \(T_n^{(r)} = T_n - T_r + T_{n+r-n} = T_n\), and the entire walk rotated by \(n\) is the original walk, \(T^{(n)} = T\). We say that \(T^{(r)}\) peaks at \(n\) if \(T_t^{(r)} < T_n^{(r)} = T_n\) for all \(t < n\). See Table 3 for several examples.
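The rotation is easy to express in code. This sketch (with the hypothetical helper name rotate_walk, added for illustration) reproduces the \(r = 5\) rotation of Table 2 and checks the invariant \(T_n^{(r)} = T_n\):

```python
from itertools import accumulate

def rotate_walk(Y, r):
    """Partial sums of the rotated increments (Y_{r+1}, ..., Y_n, Y_1, ..., Y_r)."""
    return list(accumulate(Y[r:] + Y[:r]))

Y = [-1, 1, -1, -1, 1, 1, -1, -1, 1, 1]   # the walk of Table 2
T5 = rotate_walk(Y, 5)
print(T5)  # [1, 0, -1, 0, 1, 0, 1, 0, -1, 0], ending at T_10 = 0
```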


j          1   2   3   4   5   6   7   8   9  10

Y_j       -1   1  -1  -1   1   1  -1  -1   1   1
T_j       -1   0  -1  -2  -1   0  -1  -2  -1   0

Y_j^(5)    1  -1  -1   1   1  -1   1  -1  -1   1
T_j^(5)    1   0  -1   0   1   0   1   0  -1   0

Table 2: Example of a rotation of a Bernoulli random walk with \(Y_i = \pm 1\), \(n = 10\), and \(r = 5\).

Lemma 3. If \(T_n = k \ge 1\), then exactly \(k\) rotations of \(T\) peak at \(n\).

Proof. Set \(M = \max\{T_0, \dots, T_n\}\). If \(T_r \le T_s\) for some \(r\) and \(s\) such that \(0 \le s < r \le n\), then \(T_{n-r+s}^{(r)} = T_n + T_s - T_r \ge T_n\), so \(T^{(r)}\) can only peak at \(n\) if \(r = r_i = \min\{t > 0 : T_t = i\}\) for some \(i\) satisfying \(0 = T_0 < i \le M\). That is, \(T^{(r)}\) can only peak at \(n\) if no index prior to \(r\) has a value greater than or equal to \(T_r\). Since the values of \(T_j\) have to pass through all intermediate integer values, \(r\) must be \(r_i = \min\{t > 0 : T_t = i\}\). The property of passing through all intermediate values implies that \(0 < r_1 < r_2 < \cdots < r_M\). Therefore, for \(1 \le i \le M\), \(\max\{T_t^{(r_i)} : 0 \le t \le n - r_i\} = \max\{T_t : r_i \le t \le n\} - T_{r_i} = M - i\), and \(\max\{T_t^{(r_i)} : n - r_i < t < n\} = T_n + \max\{T_t : 0 < t < r_i\} - T_{r_i} = T_n - 1\). It follows that \(T^{(r_i)}\) peaks at \(n\) if and only if \(M - T_n < i \le M\), and exactly \(T_n = k\) values of \(i\) satisfy this condition. □
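The lemma can be verified directly for the walk of the following example (Table 3), for which \(T_{10} = 4\): exactly four rotations peak at \(n = 10\), namely those by \(r_1 = 1\), \(r_2 = 2\), \(r_3 = 9\), and \(r_4 = 10\). The helper name peaks_at_n below is hypothetical, introduced only for this check.

```python
from itertools import accumulate

def peaks_at_n(Y, r):
    """True if the rotation by r peaks at n, i.e. T_t^(r) < T_n for all t < n."""
    T = list(accumulate(Y[r:] + Y[:r]))
    return all(t < T[-1] for t in T[:-1])

Y = [1, 1, -1, 1, -1, 1, -1, 1, 1, 1]     # the walk of Table 3, with T_10 = 4
peaking = [r for r in range(1, 11) if peaks_at_n(Y, r)]
print(peaking)  # [1, 2, 9, 10]: exactly k = 4 rotations peak
```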

Example. This example is a specific illustration of the proof of the lemma with n = 10 and the random walk given in Table 3.


j          1   2   3   4   5   6   7   8   9  10

Y_j        1   1  -1   1  -1   1  -1   1   1   1
T_j        1   2   1   2   1   2   1   2   3   4

Y_j^(1)    1  -1   1  -1   1  -1   1   1   1   1
T_j^(1)    1   0   1   0   1   0   1   2   3   4

Y_j^(2)   -1   1  -1   1  -1   1   1   1   1   1
T_j^(2)   -1   0  -1   0  -1   0   1   2   3   4

Y_j^(3)    1  -1   1  -1   1   1   1   1   1  -1
T_j^(3)    1   0   1   0   1   2   3   4   5   4

Y_j^(4)   -1   1  -1   1   1   1   1   1  -1   1
T_j^(4)   -1   0  -1   0   1   2   3   4   3   4

Y_j^(5)    1  -1   1   1   1   1   1  -1   1  -1
T_j^(5)    1   0   1   2   3   4   5   4   5   4

Y_j^(6)   -1   1   1   1   1   1  -1   1  -1   1
T_j^(6)   -1   0   1   2   3   4   3   4   3   4

Y_j^(7)    1   1   1   1   1  -1   1  -1   1  -1
T_j^(7)    1   2   3   4   5   4   5   4   5   4

Y_j^(8)    1   1   1   1  -1   1  -1   1  -1   1
T_j^(8)    1   2   3   4   3   4   3   4   3   4

Y_j^(9)    1   1   1  -1   1  -1   1  -1   1   1
T_j^(9)    1   2   3   2   3   2   3   2   3   4

Table 3: Example of the lemma with \(n = 10\), counting the rotations which peak at 10.

For the example, \(M = \max\{T_0, \dots, T_{10}\} = 4\). Here \(1 = T_5 \le T_4 = 2\) for \(r = 5\) and \(s = 4\), with \(0 \le 4 = s < r = 5 \le n = 10\). Now consider \(T_{10-5+4}^{(5)} = T_9^{(5)} = T_{10} + T_4 - T_5 = 4 + 2 - 1 = 5 \ge 4 = T_{10}\). Therefore, \(T^{(5)}\) cannot peak at \(n = 10\). Table 3 also shows directly that \(T^{(5)}\) cannot peak at \(n = 10\).

Now consider \(i = 2\), satisfying \(0 = T_0 < i = 2 \le M = 4\). Then \(2 = r_2 = \min\{t > 0 : T_t = 2\}\). Note that \(T^{(2)}\) peaks at \(n = 10\).

The definition of \(r_i\) is \(r_i = \min\{t > 0 : T_t = i\}\). For the example, \(0 < 1 = r_1 < 2 = r_2 < 9 = r_3 < 10 = r_4\). The next point in the lemma is that for \(1 \le i \le M\), \(\max\{T_t^{(r_i)} : 0 \le t \le n - r_i\} = \max\{T_t : r_i \le t \le n\} - T_{r_i} = M - i\). For this example, Table 3 gives all values for comparison, recalling that \(T^{(10)} = T\).

r_i   max{T_t^(r_i) : 0 ≤ t ≤ n−r_i}   max{T_t : r_i ≤ t ≤ n} − T_{r_i}   M − i
 1                 3                           4 − 1 = 3                     3
 2                 2                           4 − 2 = 2                     2
 9                 1                           4 − 3 = 1                     1
10                 0                           4 − 4 = 0                     0

Finally, the last point in the lemma is that for \(1 \le i \le M\), \(\max\{T_t^{(r_i)} : n - r_i < t < n\} = T_n + \max\{T_t : 0 < t < r_i\} - T_{r_i} = T_n - 1\). For this example, Table 3 gives all values for comparison, recalling that \(T^{(10)} = T\).

Definition. Call two walks equivalent if one is a rotation of the other. This defines an equivalence relation on the set of n-step walks.

Theorem 4 (Hitting Time Theorem). If the joint probability distribution of \((Y_1, \dots, Y_n)\) is invariant under rotations, then

\[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \frac{k}{n}\, \mathbb{P}\left[T_n = k\right]. \]

Proof. Consider a single equivalence class, and let \(x = (x_1, \dots, x_n)\) be the sequence of increments of one member of the class. Every walk in the class ends at the same height \(k\), and the class contains either \(n\) walks, or \(n/p\) walks for some divisor \(p\) of \(n\) if \((x_1, \dots, x_n)\) happens to be periodic. In either case, the lemma implies that if \(k \ge 1\) and the joint probability distribution of \(Y = (Y_1, \dots, Y_n)\) is invariant under rotations, then

\[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n,\ Y \text{ equivalent to } x\right] = \frac{k}{n}\, \mathbb{P}\left[T_n = k,\ Y \text{ equivalent to } x\right]. \]

Choose one sequence of increments x from each equivalence class and sum over them to obtain the result. □

Remark. The condition that the joint probability distribution of \((Y_1, \dots, Y_n)\) is invariant under rotations is weaker than the traditional requirement that the \(Y_i\) be independent and identically distributed.

Remark. This proof is another version of the Cycle Lemma from the proof of the Ballot Theorem.

The Hitting Time Theorem Under Exchangeability

Definition. A random vector \((Y_1, \dots, Y_n)\) is exchangeable if the joint probability distribution of \((Y_1, \dots, Y_n)\) is invariant under every permutation of \((Y_1, \dots, Y_n)\).

Theorem 5 (Hitting Time Theorem). If the joint probability distribution of \((Y_1, \dots, Y_n)\) is exchangeable, then

\[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \frac{k}{n}\, \mathbb{P}\left[T_n = k\right]. \]

Remark. The proof is not given here; see [?]. The idea of the proof is that the joint probability distribution of \((Y_1, \dots, Y_n)\) conditioned on \(\sum_{i=1}^{n} Y_i = k\) is still exchangeable. This implies the conditional expectation \(\mathbb{E}\left[Y_i \mid T_n = k\right]\) is independent of \(i\). Then the conditional expectation step still holds, allowing the completion of the induction argument.
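A minimal exchangeable but not independent example is a fixed multiset of steps in uniformly random order (sampling without replacement). The sketch below, added for illustration, checks the theorem by enumerating all orderings of three \(+1\) steps and two \(-1\) steps, so \(n = 5\) and \(k = 1\):

```python
from itertools import accumulate, permutations

steps = (1, 1, 1, -1, -1)             # fixed multiset: exchangeable, not independent
orderings = set(permutations(steps))  # 10 distinct, equally likely walks, each with T_5 = 1
n, k = 5, 1

# walks that stay below k before step n, hence first hit k = 1 at time n = 5
first_hits = [w for w in orderings
              if all(t < k for t in list(accumulate(w))[:-1])]
print(len(orderings), len(first_hits))  # 10 and 2: the fraction is k/n = 1/5
```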

Sources

The history and background is adapted from [?]. The proof for i. i. d. random variables is adapted from [?]. The proof under rotation invariance is adapted from the article by Kager, [?]. The comments about exchangeability are adapted from [?].

_______________________________________________________________________________________________

Algorithms, Scripts, Simulations

Algorithm

Enumerating all walks

Set the number \(a\) of upsteps and the number \(b\) of downsteps. Then the total number of steps is the length \(l = a + b\), and there are \(\binom{a+b}{b}\) ways to place the downsteps. In a sequence of nested loops, systematically place the downsteps for B in an \((a+b) \times \binom{a+b}{b}\) matrix of ones. Then cumulatively sum this matrix down the columns to get all possible walks, and save the matrix for later plotting or analysis. Note that this algorithm does not scale well because of the inherent growth of the binomial coefficients.

Simulating the Lucky Derby Example

Set all parameters for the simulations, using a length of the race sufficient for most of the fastest horses to complete it. Simulate all races in a vectorized manner, using the usual technique of creating uniformly distributed random variates, comparing row-wise to the Bernoulli probability, and cumulatively summing. Then, using vectorized match functions, extract the index of the winning horse (breaking ties with the minimum index), and count the number of times each horse wins with the histogram counter.

Computing the Win Probability in the Lucky Derby Example

Using a normal approximation to the first hitting time distribution for a Bernoulli random walk, construct the probability density for the 20th walker winning by multiplying its density by the survival functions of the others, assuming independence. Then, using different numerical quadrature routines, compute the approximate total probability of the 20th walker winning.

Scripts

Octave

Octave script for plotting walks.

k = 1;
a = 4;
b = 2;

steps = ones( a+b, nchoosek(a+b,b) );

# Systematically place the downsteps for B
# using b nested loops, here b = 2 loops
c = 0;
for i = 1:(a+b)-1
  for j = (i+1):(a+b)
    c = c + 1;
    steps(i,c) = -k;
    steps(j,c) = -k;
  endfor;
endfor;

walk = cumsum(steps);

# Put the walk in a matrix whose
# first column is x-coordinates
# and first row is 0 to start at origin
plot_walk = zeros((a+b)+1, nchoosek(a+b,b) + 1);
plot_walk(2:(a+b)+1, 2:nchoosek(a+b,b) + 1) = walk;
plot_walk(:,1) = 0:(a+b);

save plot_walk.dat plot_walk

GnuPlot script for plotting the walks.

set term pdf size 5,5.25 linewidth 2
set multiplot layout 5,3
#
set pointsize 0.5
set border 0          # no border (mathematical, not eng or physics style)
set xtics nomirror    # no tic marks on top
set ytics nomirror    # ... or right
set xzeroaxis linetype -1
set xtics axis
set yzeroaxis linetype -1
unset key
# ls 7 is black filled circles, with black lines
do for [i=2:16] {
    plot "plot_walk.dat" using 1:i:xticlabel(""):yticlabel("") \
             with linespoints ls 7
}

unset multiplot
R

R script for simulating the horse race example.

speeds <- seq(from=0.52, to=0.90, by=0.02)

nRaces <- 2000
nHorses <- 20
lengthRace <- 600
distance <- 200

randVals <- array( runif(nHorses * lengthRace * nRaces), c(nHorses,
lengthRace, nRaces))
stepsSim <- 2*(  randVals <= speeds ) - 1
racesSim <- apply( stepsSim, c(1,3), cumsum)
whenFinish <- apply( racesSim, c(1,3), ( function(x) match(distance, x)))
winnerTime <- apply(whenFinish, 2, function(x) min(which(!is.na(x))))
winner <- whenFinish[ cbind(winnerTime, 1:nRaces) ]
hist(winner, breaks=seq(from=0.5, to=nHorses+0.5, by=1), plot=FALSE)
R

R script for the winning probability in the horse race example.

distance <- 200
p <- seq(0.52, 0.90, length=20)
mu <- distance/(2*p-1)
sigma <- sqrt( (4*distance*p*(1-p))/((2*p-1)^3) )

h20 <- function(x) {
    (1/sigma[20]) *
        exp( dnorm((x-mu[20])/sigma[20],
                   log=TRUE
                  ) +
             sum( pnorm((x - mu)/sigma,
                        lower.tail=FALSE, log.p = TRUE
                        )
                ) -
             pnorm( (x - mu[20])/sigma[20],
                   lower.tail=FALSE, log.p = TRUE
                  )
           )
}

win20Integrate <- integrate(h20,
                          mu[20]-3*sigma[20],
                          mu[20]+3*sigma[20]
                          )
cat("integrate: ", win20Integrate$value, "\n")

library("pracma")
win20Quad <- quad(h20,
                mu[20]-3*sigma[20],
                mu[20]+3*sigma[20]
                )
cat("quad: ", win20Quad, "\n")

win20QuadL <- quadl(h20,
                  mu[20]-3*sigma[20],
                  mu[20]+3*sigma[20])
cat("quadl: ", win20QuadL, "\n")
SciPy

Scientific Python script for the winning probability in the horse race example.

import numpy
from scipy.stats import norm
import scipy.integrate as integrate

distance = 200
p = numpy.linspace(0.52, 0.90, 20)
mu = distance / (2 * p - 1)
sigma = numpy.sqrt( (4 * distance * p * (1-p)) / ((2 * p - 1) ** 3))

def h20(x):
    p = ( (1 / sigma[20 - 1]) *
        numpy.exp( norm.logpdf((x - mu[20-1]) / sigma[20-1] ) +
             numpy.sum( norm.logsf((x - mu) / sigma )) -
             norm.logsf( (x - mu[20 - 1]) / sigma[20 - 1] )
           )
        )
    return p

win20Quad = integrate.quad(h20,
                           mu[20 - 1] - 3 * sigma[20 - 1],
                           mu[20 - 1] + 3 * sigma[20 - 1])
print("quad", win20Quad[0])

win20Romberg = integrate.romberg(h20,
                                 mu[20 - 1] - 3 * sigma[20 - 1],
                                 mu[20 - 1] + 3 * sigma[20 - 1])
print("Romberg", win20Romberg)

__________________________________________________________________________

Problems to Work

Problems to Work for Understanding

  1. Use the Hitting Time Theorem to prove the Ballot Theorem.
  2. Using the Reflection Principle, prove that the probability that an n-step random walk taking independent and identically distributed steps \(\pm 1\), each with probability \(1/2\), stays positive after time 0, given that it ends at height \(k > 0\), is \(k/n\).
  3. Write a paragraph explaining in detail the differences and connections among the Hitting Time Theorem, the Positive Walks Theorem, and the Gambler’s Ruin. Provide a specific example using numerical values for the parameters.
  4. Consider the case of a 7-step walk with \(Y_i = \pm 1\), with 4 steps \(+1\) and 3 steps \(-1\). Make a chart of all possible walks, counting those that satisfy the Hitting Time Theorem.
  5. Show that
    \[ \mathbb{P}\left[h_i > h_j,\ i \ne j\right] = \int \mathbb{P}\left[h_j = x\right] \prod_{i \ne j} \mathbb{P}\left[h_i > x\right] dx = \int \frac{1}{\sigma_j}\, \phi\!\left(\frac{x - \mu_j}{\sigma_j}\right) \prod_{i \ne j} \left(1 - \Phi\!\left(\frac{x - \mu_i}{\sigma_i}\right)\right) dx. \]

    The calculation uses the usual notation: \(\phi\) is the standard normal probability density (pdf), \(\Phi\) is the cumulative distribution function (cdf) of the standard normal distribution, and the \(\mu_k\) and \(\sigma_k\) are the means and standard deviations.

  6. Show that \(\sum_{n=k}^{\infty} P(2n;\, 2k,\, p) = 1\) (the distribution sums to 1), that \(\sum_{n=k}^{\infty} 2n\, P(2n;\, 2k,\, p) = \frac{2k}{2p-1}\) (computing the mean), and that \(\sum_{n=k}^{\infty} \left(2n - \frac{2k}{2p-1}\right)^{2} P(2n;\, 2k,\, p) = \frac{8kp(1-p)}{(2p-1)^3}\) (computing the variance).
  7. Modify the scripts for the probability of winning the horse race to calculate the probability of horses 18, 17, … winning. At what stage do the calculations become infeasible?
  8. If available, modify the scripts for the probability of winning the horse race to calculate the probability using either different numerical integration algorithms or infinite intervals or both. How do the results compare to the results in the horse race example?
  9. Modify the scripts to display all walks with 4 steps +1 and 3 steps 1 to illustrate the Hitting Time Theorem.
  10. Consider
    \[ \mathbb{P}\left[T_n = k,\ T_t < k \text{ for all } t < n\right] = \frac{k}{n}\, \mathbb{P}\left[T_n = k\right]. \]

    Provide a detailed argument that when \(n = 1\), both sides are 0 when \(k > 1\) and when \(k = 0\), and both equal \(\mathbb{P}\left[Y_1 = 1\right]\) when \(k = 1\). This is the base case of the induction.

    1. Give an example of a set of random variables whose joint probability distribution is exchangeable, but which violates the condition “independent and identically distributed”. Keep the example as small and simple as possible.
    2. Give an example of a set of random variables whose joint probability distribution is invariant under rotations, but which violates the condition “independent and identically distributed”. Keep the example as small and simple as possible.
    3. Give an example of a set of random variables whose joint probability distribution is invariant under rotations, but which violates the condition of exchangeability. Keep the example as small and simple as possible.

__________________________________________________________________________

Books

Reading Suggestion:

__________________________________________________________________________

Links

Outside Readings and Links:

__________________________________________________________________________

I check all the information on each page for correctness and typographical errors. Nevertheless, some errors may occur and I would be grateful if you would alert me to such errors. I make every reasonable effort to present current and accurate information for public use, however I do not guarantee the accuracy or timeliness of information on this website. Your use of the information from this website is strictly voluntary and at your risk.

I have checked the links to external sites for usefulness. Links to external websites are provided as a convenience. I do not endorse, control, monitor, or guarantee the information contained in any external website. I don’t guarantee that the links are active at all times. Use the links here with the same caution as you would all information on the Internet. This website reflects the thoughts, interests and opinions of its author. They do not explicitly represent official positions or policies of my employer.

Information on this website is subject to change without notice.

Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1

Email to Steve Dunbar, sdunbar1 at unl dot edu

Last modified: Processed from LaTeX source on August 14, 2017