Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466

Topics in
Probability Theory and Stochastic Processes
Steven R. Dunbar

__________________________________________________________________________

The Hitting Time Theorem

_______________________________________________________________________

Note: These pages are prepared with MathJax. MathJax is an open source JavaScript display engine for mathematics that works in all browsers. See http://mathjax.org for details on supported browsers, accessibility, copy-and-paste, and other features.

_______________________________________________________________________________________________

### Rating

Mathematicians Only: prolonged scenes of intense rigor.

_______________________________________________________________________________________________

### Section Starter Question

Suppose a simple random walk of $6$ steps with ${Y}_{i}=±1$ has $4$ steps $+1$ and $2$ steps $-1$. Explicitly enumerate all the possible walks. In how many ways does the walk stay less than $2$ throughout the entire walk until step $6$?

_______________________________________________________________________________________________

### Key Concepts

1. The Hitting Time Theorem expresses a conditional probability. The Hitting Time Theorem says that under reasonable, or even relatively weak conditions on the steps, the conditional probability that an $n$-step random walk starting at $0$, given that it ends at $k>0$, ﬁrst hits $k$ at time $n$ is $k∕n$.
2. If the ${Y}_{i}$ are i. i. d. r. v. or the joint probability distribution of $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is invariant under rotations or exchangeable, then

3. The Hitting Time Theorem is equivalent to the Ballot Theorem.
4. The conditions of invariance under rotations and exchangeability are weaker than the traditional requirement that ${Y}_{i}$ be independent and identically distributed.

__________________________________________________________________________

### Vocabulary

1. The Hitting Time Theorem shows the conditional probability that an $n$-step random walk with steps $\left({Y}_{1},\dots ,{Y}_{n}\right)$ having a joint probability distribution with reasonable conditions starting at $0$, given that it ends at height $k>0$, ﬁrst hits $k$ at time $n$ is $k∕n$.
2. If the joint probability distribution of $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is the same as the joint probability distribution of the rotated sequence
$\left({Y}_{r+1},\dots ,{Y}_{n},{Y}_{1},\dots ,{Y}_{r}\right)$

then we say the distribution is invariant under rotations.

3. A random vector $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is exchangeable if the joint probability distribution is invariant for any permutation of $\left({Y}_{1},\dots ,{Y}_{n}\right)$.

__________________________________________________________________________

### Mathematical Ideas

#### Background and History

The Ballot Theorem can be interpreted as the probability that, during the vote counting in an election between two candidates, the winning candidate is always ahead of the other. Suppose that in an election with $n$ votes, candidate A receives $a$ votes and candidate B receives $b$ votes, where $a-b=k$ for some positive integer $k$. Assume that all ballot permutations are equally likely during the counting. By dividing by the total number of ballot permutations in the counting of the specialized ballot theorem, the probability is $k∕n$, see The Ballot Theorem and the Reﬂection Principle.

The problem can be restated as the probability that an $n$-step random walk taking independent and identically distributed $±1$ steps stays positive after time $0$, given that it ends at height $k>0$, is $k∕n$. If each step has probability $1∕2$, this probability can be derived easily using the reﬂection principle, an argument appearing in many textbooks. In fact, it is possible to prove the same result under weaker hypotheses, [2], [5].

The Hitting Time Theorem says the conditional probability that an $n$-step random walk starting at $0$, given that it ends at $k>0$, ﬁrst hits $k$ at time $n$ is $k∕n$.

Remark. Note that:

• The Hitting Time Theorem expresses a conditional probability.
• By time reversal and symmetry around $k$, the Hitting Time Theorem is equivalent to the statement that an $n$-step random walk taking independent and identically distributed $±1$ steps stays positive after time $0$, given that it ends at height $k>0$.
• Note that the Hitting Time Theorem is diﬀerent from the Gambler’s Ruin. In its usual form the Gambler’s Ruin gives the probability that a random walk, given that it starts from a positive value ever hits the value $k$ before it hits the value $0$.
• The Hitting Time Theorem is also diﬀerent from the Positive Walks Theorem. The Positive Walks Theorem gives the probability that an $n$-step random walk starting at $0$ is always positive throughout the remainder of the walk, without regard to the value of the endpoint.
• The Hitting Time Theorem has other applications to branching processes and random graphs, see [5] for references.

In 1949 Otter gave the ﬁrst proof of the Hitting Time Theorem for $k=1$. Kemperman in 1961 and Dwass in 1969 gave proofs for $k\ge 2$ using generating functions and the Lagrange inversion formula. For simple Bernoulli random walks making steps of size $±1$, a proof using the Reﬂection Principle is in [1]. See [5] for references and discussion.

#### The Hitting Time Theorem for Independent Identically Distributed Random Variables

Fix $n\ge 1$. Let $\left({Y}_{1},\dots ,{Y}_{n}\right)$ be a sequence of random variables ${Y}_{i}$ taking values in $\left\{\dots ,-2,-1,0,1\right\}$. Note that this is more general than, but includes, the standard random walk where the random variables take values in $\left\{-1,1\right\}$. Deﬁne the walk $T=\left({T}_{0},\dots ,{T}_{n}\right)$ as the usual piecewise linear function deﬁned by the values ${T}_{i}=\sum _{j=0}^{i}{Y}_{j}$ at the integers $i=0,1,2,\dots ,n$. Note that because the only positive value assumed by the steps ${Y}_{i}$ is $1$, for the random walk $T$ to gain a height $k$, it has to pass through all $k-1$ intermediate integer values.

Theorem 1 (Hitting Time Theorem). Let $k\ge 0$. For a random walk starting at $0$ with i. i. d. integer-valued steps ${Y}_{i}$ with ${Y}_{i}\le 1$ for $i=1,2,3\dots$ then

Proof. The proof proceeds by induction on $n\ge 1$. When $n=1$, both sides are $0$ when $k>1$ and $k=0$, and are equal to $ℙ\left[{Y}_{1}=1\right]$ when $k=1$. This is the base case of the induction.

For the induction step, take $n\ge 2$ and consider the case $k=0$. In this case, the left side requires ${T}_{t}<0$ for all $t, yet ${T}_{0}=0$ so the event is empty and the probability is $0$. The right side is also equal to $0$ so the case $k=0$ is satisﬁed. Thus, we may assume that $k\ge 1$.

So take $n\ge 2$ and consider the case $k>0$. Condition on the ﬁrst step to obtain

For a given $s$ and $m=0,\dots ,n-1$, let ${T}_{m}^{\left(s\right)}={T}_{m+1}-s$. Because the steps are independent, identically distributed random variables, by the random walk Markov property, with ${T}_{0}=0$ and conditionally on ${Y}_{1}=s$ the random walk ${\left\{{T}_{m}^{\left(s\right)}\right\}}_{m=0}^{n-1}$ has the same distribution as the random walk path ${\left\{{T}_{m}\right\}}_{m=0}^{n-1}$.

Figure 1: Illustration of ${\left\{{T}_{m}\right\}}_{m=0}^{n-1}$ and ${\left\{{T}_{m}^{\left(s\right)}\right\}}_{m=0}^{n-1}$ for $s=-1$, shown in red.

That is,

where in the last equality we have used the induction hypothesis which is allowed since $n-1\ge 1$ and $k\ge 1$ with $s\le 1$ so $k-s\ge 0$. Because the steps are independent, identically distributed random variables $ℙ\left[{T}_{n-1}^{\left(s\right)}=k-s\right]=ℙ\left[{T}_{n}=k\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{Y}_{1}=s\right]$, (see Figure 1) substitute into the summation to obtain

Temporarily ignore the $n-1$ in the denominator and consider the summation

$\sum _{s=-\infty }^{1}\left(k-s\right)ℙ\left[{T}_{n}=k\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{Y}_{1}=s\right]ℙ\left[{Y}_{1}=s\right].$

By the properties of conditional expectation

$\begin{array}{cc}& ℙ\left[{T}_{n}=k\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{Y}_{1}=s\right]ℙ\left[{Y}_{1}=s\right]=\\ & ℙ\left[{T}_{n}=k,{Y}_{1}=s\right]=\\ & ℙ\left[{Y}_{1}=s\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=k\right]ℙ\left[{T}_{n}=k\right].& \text{(1)}\end{array}$

Thus,

$\begin{array}{lll}\hfill \sum _{s=-\infty }^{1}\left(k-s\right)ℙ\left[{T}_{n}=k\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{Y}_{1}=s\right]ℙ\left[{Y}_{1}=s\right]& \phantom{\rule{2em}{0ex}}& \hfill \\ \hfill & \phantom{\rule{2em}{0ex}}=\sum _{s=-\infty }^{1}\left(k-s\right)ℙ\left[{Y}_{1}=s\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=k\right]ℙ\left[{T}_{n}=k\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & \phantom{\rule{2em}{0ex}}=ℙ\left[{T}_{n}=k\right]\left(k-𝔼\left[{Y}_{1}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=k\right]\right).\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

By the assumption of independent, identically distributed steps, the conditional expectation $𝔼\left[{Y}_{i}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=k\right]$ is independent of $i$ so that

$\begin{array}{llll}\hfill 𝔼\left[{Y}_{i}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=k\right]& =\frac{1}{n}\sum _{i=1}^{n}𝔼\left[{Y}_{i}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=k\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\frac{1}{n}𝔼\left[\sum _{i=1}^{n}{Y}_{i}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=k\right]=\frac{k}{n},\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

since ${\sum }_{i=1}^{n}{Y}_{i}={T}_{n}=k$ when ${T}_{n}=k$. Recalling the temporarily ignored factor of $\frac{1}{n-1}$ we arrive at

This completes the proof by induction. □

Example. See Figure 2 for an example.

Figure 2: Example of the Hitting Time Theorem with $n=6$, $k=2$, $a=4$, $b=2$.

##### Example Application of the Hitting Time Theorem

The following example is adapted from [3] and [4]. The original problem is:

The Kentucky Derby is on Saturday, and a ﬁeld of $20$ horses is slated to run “the fastest two minutes in sports” in pursuit of the right to be draped with a blanket of roses. But let’s consider, instead, the Lucky Derby, where things are a little more bizarre:

The bugle sounds, and $20$ horses make their way to the starting gate for the ﬁrst annual Lucky Derby. These horses, all trained at the mysterious Riddler Stables, are special. Each second, every Riddler-trained horse takes one step. Each step is exactly one meter long. But what these horses exhibit in precision, they lack in sense of direction. Most of the time, their steps are forward (toward the ﬁnish line) but the rest of the time they are backward (away from the ﬁnish line). As an avid fan of the Lucky Derby, you’ve done exhaustive research on these $20$ competitors. You know that Horse One goes forward $52$ percent of the time, Horse Two $54$ percent of the time, Horse Three $56$ percent, and so on, up to the favorite ﬁlly, Horse Twenty, who steps forward $90$ percent of the time. The horses’ steps are taken independently of one another, and the ﬁnish line is $200$ meters from the starting gate.

Handicap this race and place your bets! In other words, what are the odds (a percentage is ﬁne) that each horse wins?

It is easy to create a simulation for this scenario. A sample script is in the scripts section below. The simulation of $2000$ races suggests that the probability of horse $20$ winning is about $70.48%$ and the probability is about $22.86%$.

Consider a particular horse. For each step, suppose it moves forward with probability $p$ (in this example $p>1∕2$) or backward with probability $1-p$. Recall that reaching an even-numbered position $2k$ (such as $200$ in the problem statement) can only occur in an even number of steps, say $2n$. Then the horse must take $n+k$ forward steps and $n-k$ backward steps. The horse starts at position $0$ and we would like to know the probability that it will reach position $2k$ (where speciﬁcally in the example $k=100$ with $2k=200$) in exactly $2n$ steps. For convenience, adopt the following notation:

• $2k$ is a ﬁxed parameter, the value to be reached;
• $2n$ is a possible number of steps required to reach the value $2k$ where $2n\ge 2k$;
• let $H$ be the random variable for the hitting time, that is, $H=2n$ if ; and
• for $2n\ge 2k$

This probability mass distribution does not seem to have a common name, but it is still possible to compute the statistics of this distribution.

Theorem 2.

1. $P\left(2n;2k,p\right)$ is a probability mass distribution, that is
$\sum _{n=k}^{\infty }P\left(2n;2k,p\right)=1.$

2. $𝔼\left[H\right]=\frac{2k}{2p-1}$, that is
$\sum _{n=k}^{\infty }\left(2n\right)\phantom{\rule{0.3em}{0ex}}P\left(2n;2k,p\right)=\frac{k}{2p-1}.$

3. $Var\left[H\right]=\frac{8kp\left(1-p\right)}{{\left(2p-1\right)}^{3}}$, that is
$\sum _{n=k}^{\infty }{\left(2n-\frac{k}{2p-1}\right)}^{2}P\left(2n;2k,p\right)=\frac{8kp\left(1-p\right)}{{\left(2p-1\right)}^{3}}$

Proof. Starting with

$\sum _{n=k}^{\infty }\frac{k}{n}\left(\genfrac{}{}{0.0pt}{}{2n}{n+k}\right){p}^{n+k}{\left(1-p\right)}^{n-k}$

change the variables $n=j+k$ to rewrite the summation as

${p}^{2k}\sum _{j=0}^{\infty }\frac{k}{j+k}\left(\genfrac{}{}{0.0pt}{}{2\left(j+k\right)}{j}\right){p}^{j}{\left(1-p\right)}^{j}.$

This series resembling a binomial series expansion suggests that the series might have a closed form. In fact, using Maple to ﬁnd a closed form solution, obtain

$\sum _{j=0}^{\infty }\frac{k}{j+k}\left(\genfrac{}{}{0.0pt}{}{2\left(j+k\right)}{j}\right){x}^{j}=\frac{{2}^{2k}}{{\left(1+\sqrt{1-4x}\right)}^{2k}}.$

In turn the closed form for the power series suggests that it might be possible to express the moment-generating function in closed form. Again using Maple to ﬁnd a closed form expression

$\begin{array}{llll}\hfill {F}_{H}\left(t\right)& =𝔼\left[{\mathrm{e}}^{tH}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\sum _{n=k}^{\infty }\frac{k}{n}\left(\genfrac{}{}{0.0pt}{}{2n}{n+k}\right){p}^{n+k}{\left(1-p\right)}^{n-k}{\mathrm{e}}^{2nt}\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\frac{{\left(2p\right)}^{2k}{\mathrm{e}}^{2kt}}{{\left(1+\sqrt{1-4{\mathrm{e}}^{2t}p+4{\mathrm{e}}^{2t}{p}^{2}}\right)}^{2k}}.\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

Then, once more using Maple, it is each to check that

${F}_{H}\left(0\right)=\sum _{n=k}^{\infty }P\left(2n;2k,p\right)=1$

and

$𝔼\left[H\right]={F}_{H}^{\prime }\left(0\right)=\frac{2k}{2p-1}.$

The calculation of the variance is left as an exercise. □

For the horse race example, the mean number and standard deviation of steps required for each horse to reach position $200$ are in Table 1.

 Horse p Mean Variance StdDev 1 0.52 5000. 3120000. 1766.3522 2 0.54 2500. 388125. 622.99679 3 0.56 1666.6667 114074.07 337.74853 4 0.58 1250. 47578.125 218.12410 5 0.6 1000. 24000. 154.91933 6 0.62 833.33333 13634.259 116.76583 7 0.64 714.28571 8396.5015 91.632426 8 0.66 625. 5478.5156 74.016995 9 0.68 555.55556 3731.1385 61.083046 10 0.7 500. 2625. 51.234754 11 0.72 454.54545 1893.3133 43.512220 12 0.74 416.66667 1391.7824 37.306600 13 0.76 384.61538 1037.7788 32.214574 14 0.78 357.14286 781.70554 27.958997 15 0.8 333.33333 592.59259 24.343225 16 0.82 312.5 450.43945 21.223559 17 0.84 294.11765 341.94993 18.491888 18 0.86 277.77778 258.05898 16.064214 19 0.88 263.15789 192.44788 13.872559 20 0.9 250. 140.625 11.858541

Table 1: Statistics for the hitting time distribution for each horse in the race.

The probability distributions seem to be approximately normal in shape, see Figure 3. This is reasonable since the hitting time to level $2k$ is the sum of $2k$ random one-level hitting times, ﬁrst the hitting time from the initial point $0$ to ﬁrst reach $1$, then the independent hitting time to ﬁrst reach $2$ from $1$, and so on. The one-level hitting times are independent since the individual steps of the random walk are independent. Thus, as the sum of many independent random variables, the distribution of the hitting time to $2k$ is approximately normal.

Figure 3: Representative probability mass distributions for horses 20, 16, 12, and 8.

Suppose each horse $i$ ﬁnishes in ${h}_{i}$ steps. Computing the probability that speciﬁc horse $j$ wins the race amounts to ﬁnding the probability that ${h}_{i}>{h}_{j}$ for all $i\ne j$. Because the probability distribution is completely speciﬁed, it is possible to explicitly compute the probability that for example horse $20$ is the winner. In R, ﬁll vectors of length $10,001$ with the probability distribution for each horse. This covers the probability of each horse winning in the minimum possibility of $200$ steps up to $20,000$ steps, about $3$ standard deviations more than the mean number of steps required for the slowest horse $1$ to win. Calculating a cumulative sum along this vector, gives the cumulative probability distribution for each horse. Subtracting the cumulative probability distribution from $1$ is the complementary cumulative probability. At each index or step number ${h}_{j}$, this is the probability that horse reaches the goal in more than ${h}_{j}$ steps. Multiply all of these probabilities from horse $1$ to horse $19$, and then multiply by the probability of horse $20$ winning in ${h}_{j}$ steps. Computing the product of so many small magnitude factors could lead to numerical underﬂow, so instead compute the exponential of the sum and of the respective logarithms. Summing over all possible indices gives the probability of horse $20$ winning over all other horses, namely $0.7041477$

Since the probability distributions are approximately normal, one can also approximate this probability with:

$\begin{array}{llll}\hfill ℙ\left[{h}_{i}>{h}_{j},i\ne j\right]& \approx \underset{-\infty }{\overset{\infty }{\int }}ℙ\left[{h}_{j}=x\right]\prod _{i\ne j}ℙ\left[{h}_{i}>x\right]\phantom{\rule{0.3em}{0ex}}dx\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\underset{-\infty }{\overset{\infty }{\int }}\frac{1}{{\sigma }_{j}}\varphi \left(\frac{x-{\mu }_{j}}{{\sigma }_{j}}\right)\prod _{i\ne j}\left(1-\Phi \left(\left(\frac{x-{\mu }_{i}}{{\sigma }_{i}}\right)\right)\right)\phantom{\rule{0.3em}{0ex}}dx.\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

The probability calculation uses the usual notation: $\varphi$ is the standard normal probability density (pdf), $\Phi$ is the cumulative distribution (cdf) of the standard normal distribution, and ${\mu }_{k}$ and ${\sigma }_{k}$ are the means and standard deviations computed in Table 1.

Analytic integration of the probability is impractical, and even the numerical integration of

${H}_{j}\left(x\right)=\frac{1}{{\sigma }_{j}}\varphi \left(\frac{x-{\mu }_{j}}{{\sigma }_{j}}\right)\prod _{i\ne j}\left(1-\Phi \left(\frac{x-{\mu }_{i}}{{\sigma }_{i}}\right)\right)$

is challenging. Taking ${H}_{20}\left(x\right)$ as an example, the product term

$\prod _{i\ne 20}\left(1-\Phi \left(\frac{x-{\mu }_{i}}{{\sigma }_{i}}\right)\right)$

is approximately $1$ for $x<{\mu }_{19}-3{\sigma }_{19}\approx 221.5402$. Since each factor is less than $1$ and $1-\Phi \left(\left(x-{\mu }_{19}\right)∕{\sigma }_{19}\right)\approx 0$ for $x>{\mu }_{19}+3{\sigma }_{19}\approx 304.7756$, the product decreases to approximately $0$ for $x>304.7756$. The other factor $\frac{1}{{\sigma }_{20}}\varphi \left(\frac{x-{\mu }_{20}}{{\sigma }_{20}}\right)\approx 0$ outside the interval $\left({\mu }_{20}-3{\sigma }_{20},{\mu }_{20}+3{\sigma }_{20}\right)\approx \left(214.4244,285.5756\right)$ and has a maximum value of $\varphi \left(0\right)\approx 0.03364$ at $x=250$. Therefore, ${H}_{20}\left(x\right)$ is approximately $0$ outside the interval $\left(214.4244,285.5756\right)$ and reaches a maximum approximately ${H}_{20}\left(250\right)\approx 0.02633$. Again, computing the product of so many small magnitude factors could lead to numerical underﬂow, so instead the exponential of the sum and diﬀerence of the respective logarithms is computed.

The low variation with an aspect ratio of $\frac{1}{{\sigma }_{20}}\left(\varphi \left(0\right)\right)∕\left(6{\sigma }_{20}\right)\approx 1∕2115$ makes even adaptive integration challenging. The default integration routine of $R$ gives a nonsensical answer of $1.311008$. With the additional R library pracma of advanced numerical analysis routines, the quad function using adaptive Simpson quadrature yields $0.7086704$ and the quadl function using adaptive Lobatto quadrature yields the very similar $0.7078003$. Numerical integration with Scientiﬁc Python using either general purpose integration or Romberg integration gives similar results of $0.70867$.

The relative error between the exact discrete calculation and the approximate calculation with the normal distribution is $0.006422374$.

Therefore we estimate horse $20$ has about a $71%$ chance of winning. Repeating the analysis with the necessary changes, horse $19$ has about a $21%$ chance of winning. One of the other $18$ horses winning has about an $8%$ chance.

#### The Hitting Time Theorem Under Rotation Invariance

Fix $n\ge 1$. Let $\left({Y}_{1},\dots ,{Y}_{n}\right)$ be a sequence of random variables ${Y}_{i}$ taking values in $\left\{\dots ,-2,-1,0,1\right\}$. Deﬁne the walk $T=\left({T}_{0},\dots ,{T}_{n}\right)$ as the usual piecewise linear function deﬁned by the values ${T}_{i}=\sum _{j=0}^{i}{Y}_{j}$ at the integers $i=0,1,2,\dots ,n$. Note also that because the only positive value assumed by the steps ${Y}_{i}$ is $1$, for the random walk $T$ to gain a height $k$, it has to pass through all $k-1$ intermediate integer values. Given $r$ with $0, deﬁne the rotation of $T$ by $r$ as the walk ${T}^{\left(r\right)}=\left({T}_{0}^{\left(r\right)},\dots ,{T}_{n}^{\left(r\right)}\right)$ corresponding to the rotated sequence $\left({Y}_{r+1},\dots ,{Y}_{n},{Y}_{1},\dots ,{Y}_{r}\right)$. Another way to represent the rotation of the sequence is

${T}_{t}^{\left(r\right)}=\left\{\begin{array}{cc}{T}_{t+r}-{T}_{r}\phantom{\rule{1em}{0ex}}\hfill & 0\le t\le n-r;\hfill \\ {T}_{n}-{T}_{r}+{T}_{t+r-n}\phantom{\rule{1em}{0ex}}\hfill & n-r

Note that ${T}_{n}^{\left(r\right)}={T}_{n}+{T}_{n+r-n}-{T}_{r}={T}_{n}$ and the entire walk rotated by $n$ is the original walk, ${T}^{\left(n\right)}=T$. We say that ${T}^{\left(r\right)}$ peaks at $n$ if ${T}_{t}^{\left(r\right)}<{T}_{n}^{\left(r\right)}={T}_{n}$ for all $t. See Table 3 for several examples.

 $j$ 1 2 3 4 5 6 7 8 9 10 ${Y}_{j}$ -1 1 -1 -1 1 1 -1 -1 1 1 ${T}_{j}$ -1 0 -1 -2 -1 0 -1 -2 -1 0 ${Y}^{\left(5\right)}$ 1 -1 -1 1 1 -1 1 -1 -1 1 ${T}_{j}^{\left(5\right)}$ 1 0 -1 0 1 0 1 0 -1 0

Table 2: Example of a rotation of a Bernoulli random walk with ${Y}_{i}=±1$, $n=10$, and $r=5$.

Lemma 3. If ${T}_{n}=k\ge 1$, then exactly $k$ rotations of $T$ peak at $n$.

Proof. Set $M=max\left\{{T}_{0},\dots {T}_{n}\right\}$. If ${T}_{r}\le {T}_{s}$ for some $r$ and $s$ such that $0\le s, then ${T}_{n-r+s}^{\left(r\right)}={T}_{n}+{T}_{s}-{T}_{r}\ge {T}_{n}$, so ${T}^{\left(r\right)}$ can only peak at $n$ if $r={r}_{i}=min\left(t>0\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{t}=i\right)$ for some $i$ satisfying $0={T}_{0}. That is, ${T}^{\left(r\right)}$ can only peak at $n$ if there are no indices prior to $r$ with values greater than ${T}_{r}$. Since the values of ${T}_{j}$ has to pass through all intermediate integer values, $r$ must be ${r}_{i}=min\left(t>0\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{t}=i\right)$. The property of passing through all intermediate values implies that $0<{r}_{1}<{r}_{2}<\cdots <{r}_{M}$. Therefore, for $1\le i\le M$, $max\left\{\underset{t}{\overset{\left({r}_{i}\right)}{T}}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}0\le t\le n-{r}_{i}\right\}=max\left\{{T}_{t}:{r}_{i}\le t\le n\right\}-{T}_{{r}_{i}}=M-i$, and $max\left\{\underset{t}{\overset{{r}_{i}}{T}}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}n-r\le t. It follows that ${T}^{{r}_{i}}$ peaks at $n$ if and only if $M-{T}_{n}. □

Example. This example is a speciﬁc illustration of the proof of the lemma with $n=10$ and the random walk given in Table 3.

 $j$ 1 2 3 4 5 6 7 8 9 10 ${Y}_{j}$ 1 1 -1 1 -1 1 -1 1 1 1 ${T}_{j}$ 1 2 1 2 1 2 1 2 3 4 ${Y}_{j}^{\left(1\right)}$ 1 -1 1 -1 1 -1 1 1 1 1 ${T}_{j}^{\left(1\right)}$ 1 0 1 0 1 0 1 2 3 4 ${Y}_{j}^{\left(2\right)}$ -1 1 -1 1 -1 1 1 1 1 1 ${T}_{j}^{\left(2\right)}$ -1 0 -1 0 -1 0 1 2 3 4 ${Y}_{j}^{\left(3\right)}$ 1 -1 1 -1 1 1 1 1 1 -1 ${T}_{j}^{\left(3\right)}$ 1 0 1 0 1 2 3 4 5 4 ${Y}_{j}^{\left(4\right)}$ -1 1 -1 1 1 1 1 1 -1 1 ${T}_{j}^{\left(4\right)}$ -1 0 -1 0 1 2 3 4 3 4 ${Y}_{j}^{\left(5\right)}$ 1 -1 1 1 1 1 1 -1 1 -1 ${T}_{j}^{\left(5\right)}$ 1 0 1 2 3 4 5 4 5 4 ${Y}_{j}^{\left(6\right)}$ -1 1 1 1 1 1 -1 1 -1 1 ${T}_{j}^{\left(6\right)}$ -1 0 1 2 3 4 3 4 3 4 ${Y}_{j}^{\left(7\right)}$ 1 1 1 1 1 -1 1 -1 1 -1 ${T}_{j}^{\left(7\right)}$ 1 2 3 4 5 4 5 4 5 4 ${Y}_{j}^{\left(8\right)}$ 1 1 1 1 -1 1 -1 1 -1 1 ${T}_{j}^{\left(8\right)}$ 1 2 3 4 3 4 3 4 3 4 ${Y}_{j}^{\left(9\right)}$ 1 1 1 -1 1 -1 1 -1 1 1 ${T}_{j}^{\left(9\right)}$ 1 2 3 2 3 2 3 2 3 4

Table 3: Example of the lemma with $n=10$, counting the rotations which peak at $10$.

For the example, $M=max\left\{{T}_{0},\dots {T}_{10}\right\}=4$. For the example $1={T}_{5}\le {T}_{4}=2$ for $r=5$ and $s=4$ such that $0\le 4=s. Now consider ${T}_{10-5+4}^{\left(5\right)}={T}_{9}^{\left(5\right)}=5={T}_{10}+{T}_{4}-{T}_{5}=4+2-1=5\ge 4$. Therefore, ${T}^{\left(5\right)}$ cannot peak at $n=10$. Table 2 also illustrates that ${T}^{\left(5\right)}$ cannot peak at $n=10$.

Now consider $i=2$ satisfying $0={T}_{0}. Then $2={r}_{2}=min\left(t>0\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{t}=2\right)$. Note that ${T}^{\left(2\right)}$ peaks at $n=10$.

The deﬁnition of ${r}_{i}$ is ${r}_{i}=min\left(t>0\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{t}=i\right)$. For the example, $0<1={r}_{1}<2={r}_{2}<9={r}_{3}<10={r}_{4}$. The next point in the lemma is that for $1\le i\le M$, $max\left\{\underset{t}{\overset{\left({r}_{i}\right)}{T}}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}0\le t\le n-{r}_{i}\right\}=max\left\{{T}_{t}:{r}_{i}\le t\le n\right\}-{T}_{{r}_{i}}=M-i$. For this example, Table 3 gives all values for comparison, recalling that ${T}^{\left(10\right)}=T$.

 ${r}_{i}$ $max\left\{\underset{t}{\overset{\left({r}_{i}\right)}{T}}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}0\le t\le n-{r}_{i}\right\}$ $max\left\{{T}_{t}:{r}_{i}\le t\le n\right\}-{T}_{{r}_{i}}$ $M-i$ 1 3 $4-1=3$ $4-1=3$ 2 2 $4-2=2$ $4-2=2$ 9 1 $4-3=1$ $4-3=1$ 10 0 $4-4=0$ $4-4=0$

Finally, the last point in the lemma is that for $1\le i\le M$, $max\left\{\underset{t}{\overset{\left({r}_{i}\right)}{T}}:n-{r}_{i}\le t. For this example, Table 3 gives all values for comparison, recalling that ${T}^{\left(10\right)}=T$.

Deﬁnition. Call two walks equivalent if one is a rotation of the other. This deﬁnes an equivalence relation on the set of $n$-step walks.

Theorem 4 (Hitting Time Theorem). If the joint probability distribution of $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is invariant under rotations, then

Proof. Consider a single equivalence class, and let $x=\left({x}_{1},\dots ,{x}_{n}\right)$ be the sequence of increments of one member of the class. Every walk in the class ends at the same height $k$, and the class contains either $n$ walks, or $n∕p$ walks for some divisor $p$ of $n$ if $\left({x}_{1},\dots ,{x}_{n}\right)$ happens to be periodic. In either case, the lemma implies that if $k\ge 1$ and the joint probability distribution of $Y=\left({Y}_{1},\dots ,{Y}_{n}\right)$ is invariant under rotations, then

Choose one sequence of increments $x$ from each equivalence class and sum over them to obtain the result. □

Remark. The condition that the joint probability distribution of $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is invariant under rotations here is weaker than the traditional requirement that ${Y}_{i}$ be independent and identically distributed.

Remark. This proof is another version of the Cycle Lemma from the proof of the Ballot Theorem.

#### The Hitting Time Theorem Under Exchangeability

Deﬁnition. A random vector $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is exchangeable if the joint probability distribution of $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is invariant for any permutation of $\left({Y}_{1},\dots ,{Y}_{n}\right)$.

Theorem 5 (Hitting Time Theorem). If the joint probability distribution of $\left({Y}_{1},\dots ,{Y}_{n}\right)$ is exchangeable, then

Remark. The proof is not given here, see [5]. The idea of the proof is that the joint probability distribution of $\left({Y}_{1},\dots ,{Y}_{n}\right)$ conditioned on ${\sum }_{i=1}^{n}{Y}_{i}=k$ is still exchangeable. This implies the conditional expectation $𝔼\left[{Y}_{i}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{T}_{n}=0\right]$ is independent of $i$. Then the conditional expectation step still holds, allowing the completion of the induction argument.

#### Sources

The history and background is adapted from [5]. The proof for i. i. d. random variables is adapted from [5]. The proof under rotation invariance is adapted from the article by Kager, [2]. The comments about exchangeability are adapted from [5].

_______________________________________________________________________________________________

### Algorithms, Scripts, Simulations

#### Algorithm

##### Enumerating all walks

Set the number $a$ of upsteps and the number $b$ of downsteps. Then the total number of steps is the length $l=a+b$. There are then $\left(\genfrac{}{}{0.0pt}{}{a+b}{a-b}\right)$ places to have the downsteps. In a sequence of nested loops systematically place the downsteps for B in an $\left(a+b\right)×\left(\genfrac{}{}{0.0pt}{}{a+b}{a-b}\right)$ matrix of ones. Then cumulatively sum this matrix down the columns to get all possible walks and save the matrix for later plotting or analysis. Note that this algorithm does not scale well because of the inherent growth of the binomial coeﬃcients.

##### Simulating the Lucky Derby Example

Set all parameters for the simulations, using a suﬃcient length of the race for most of the fastest horses to complete the race. Simulate all races in a vectorized manner, using the usual technique of creating uniformly distributed random variates, comparing row-wise to the Bernoulli probability, and cumulatively summing. Then using vectorized match functions, extract the index of the winning horse (breaking ties with minimum index), and then counting the number of times each horse wins with the histogram counter.

##### Computing the Win Probability in the Lucky Derby Example

Using a normal approximation to the ﬁrst hitting time distribution for a Bernoulli random walk, construct the probability density for the $20$th walk winning by multiplying the density and the survival functions, assuming independence. Then using diﬀerent numerical quadrature routines, compute the approximate total probability of the $20$th walker winning.

#### Scripts

R
1speeds <- seq(from=0.52, to=0.90, by=0.02)
2
3nRaces <- 100000
4nHorses <- 20
5lengthRace <- 600
6distance <- 200
7
8randVals <- array( runif(nHorses * lengthRace * nRaces), c(nHorses,
9lengthRace, nRaces))
10stepsSim <- 2*(  randVals <= speeds ) - 1
11racesSim <- apply( stepsSim, c(1,3), cumsum)
12whenFinish <- apply( racesSim, c(1,3), ( function(x) match(distance, x)))
13winnerTime <- apply(whenFinish, 2, function(x) min(which(!is.na(x))))
14winner <- whenFinish[ cbind(winnerTime, 1:nRaces) ]
15hist(winner, breaks=seq(from=0.5, to=nHorses+0.5, by=1), plot=FALSE)
6cAp1x1-1700016:
1P <- function(n, k, p) { (k/n) * dbinom((n+k), 2*n, p) }
2
3speeds <- seq(from=0.52, to=0.90, by=0.02)
4dP <- matrix(0, 20, 10001)
5for (i in seq_along(speeds)) dP[i,] <- P(100:10100, 100, speeds[i])
6pP <- t(apply(dP, 1, cumsum))
7qP <- 1 - pP
8
9lqP  <- log(qP)
10lqPprod <- exp( log(dP[20, ]) + apply(lqP[1:19, ], 2, sum))
11cat("Probability horse 20 wins: ",  sum(lqPprod), "\n" )
6cAp2x1-1700012:
1distance <- 200
2p <- seq(0.52, 0.90, length=20)
3mu <- distance/(2*p-1)
4sigma <- sqrt( (4*distance*p*(1-p))/((2*p-1)^3) )
5
6h20 <- function(x) {
7    (1/sigma[20]) *
8        exp( dnorm((x-mu[20])/sigma[20],
9                   log=TRUE
10                  ) +
11             sum( pnorm((x - mu)/sigma,
12                        lower.tail=FALSE, log.p = TRUE
13                        )
14                ) -
15             pnorm( (x - mu[20])/sigma[20],
16                   lower.tail=FALSE, log.p = TRUE
17                  )
18           )
19}
20
21win20Integrate <- integrate(h20,
22                          mu[20]-3*sigma[20],
23                          mu[20]+3*sigma[20]
24                          )
25cat("integrate: ", win20Integrate\$value, "\n")
26
27library("pracma")
28win20Quad <- quad(h20,
29                mu[20]-3*sigma[20],
30                mu[20]+3*sigma[20]
31                )
32cat("quad: ", win20Quad, "\n")
33
34win20QuadL <- quadl(h20,
35                  mu[20]-3*sigma[20],
36                  mu[20]+3*sigma[20])
37cat("quadl: ", win20QuadL, "\n")
6cAp3x1-1700038:
Octave
1k = 1;
2a = 4;
3b = 2;
4
5steps = ones( a+b, nchoosek(a+b,a-b) );
6
7# Systematically place the downsteps for B
8# using a-b nested loops, here a-b = 2 loops
9c = 0;
10for i = 1:(a+b)-1
11  for j= (i+1):(a+b)
12    c = c + 1;
13    steps(i,c) = -k;
14    steps(j,c) = -k;
15  endfor;
16endfor;
17
18walk = cumsum(steps);
19
20# Put the walk in a matrix whose
21# first column is x-coordinates
22# and first row is 0 to start at origin
23plot_walk = zeros((a+b)+1, nchoosek(a+b,a-b) + 1);
24plot_walk(2:(a+b)+1, 2:nchoosek(a+b,a-b) + 1) = \
25walk;
26plot_walk(:,1) = 0:(a+b);
27
28save plot_walk.dat plot_walk
6cAp4x1-1700029:
SciPy
1import scipy
2
3distance = 200
4p = scipy.linspace(0.52, 0.90, 20)
5mu = distance / (2 * p -1)
6sigma = scipy.sqrt( (4 * distance * p * (1-p)) / ((2 * p - 1) ** 3))
7
8from scipy.stats import norm
9
10def h20(x):
11    p = ( (1 / sigma[20 - 1]) *
12        scipy.exp( scipy.stats.norm.logpdf((x-mu[20-1]) / sigma[20-1] ) +
13             scipy.sum( scipy.stats.norm.logsf((x - mu) / sigma )) -
14             scipy.stats.norm.logsf( (x - mu[20 - 1]) / sigma[20 - 1] )
15           )
16        )
17    return p
18
19import scipy.integrate as integrate
20
21win20Quad = integrate.quad(h20,
22                           mu[20 - 1] - 3 * sigma[20 - 1],
23                           mu[20 - 1] + 3 * sigma[20 - 1])
24print("quad", win20Quad[0])
25
26win20Romberg = integrate.romberg(h20,
27                                 mu[20 - 1] - 3 * sigma[20 - 1],
28                                 mu[20 - 1] + 3 * sigma[20 - 1])
29print("Romberg", win20Quad[0])
6cAp5x1-1700032:

__________________________________________________________________________

### Problems to Work for Understanding

1. Use the Hitting Time Theorem to prove the Ballot Theorem.
2. Using the Reﬂection Principle, prove the probability that an $n$-step random walk taking independent and identically distributed steps $±1$ with probability $1∕2$ stays positive after time $0$, given that it ends at height $k>0$, is $k∕n$.
3. Write a paragraph explaining in detail the diﬀerences and connections among the Hitting Time Theorem, the Positive Walks Theorem, and the Gambler’s Ruin. Provide speciﬁc example using numerical values for the parameters.
4. Consider the case of a $7$ step walk with ${Y}_{i}=±1$ with $4$ steps $+1$ and $3$ steps $-1$. Make a chart of all possible walks, counting those that satisfy the Hitting Time Theorem.
5. Show that $\begin{array}{llll}\hfill ℙ\left[{h}_{i}>{h}_{j},i\ne j\right]& \approx \underset{-\infty }{\overset{\infty }{\int }}ℙ\left[{h}_{j}=x\right]\prod _{i\ne j}ℙ\left[{h}_{i}>x\right]\phantom{\rule{0.3em}{0ex}}dx\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\underset{-\infty }{\overset{\infty }{\int }}\frac{1}{{\sigma }_{j}}\varphi \left(\frac{x-{\mu }_{j}}{{\sigma }_{j}}\right)\prod _{i\ne j}\left(1-\Phi \left(\left(\frac{x-{\mu }_{i}}{{\sigma }_{i}}\right)\right)\right)\phantom{\rule{0.3em}{0ex}}dx.\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

The calculation uses the usual notation: $\varphi$ is the standard normal probability density (pdf), $\Phi$ is the cumulative distribution (cdf) of the standard normal distribution, and ${\mu }_{k}$ and ${\sigma }_{k}$ are the means and standard deviations.

6. Show that
7. Modify the script using the exact probability distribution to calculate the probability of horse $19$, $18$, …, winning At what stage do the calculations become unfeasible?
8. Modify the approximate scripts for the probability of winning the horse race to calculate the probability of horses 18, 17, …winning. At what stage do the calculations become unfeasible? How do the calculations compare to the exact probabilities?
9. If available, modify the scripts for the probability of winning the horse race to calculate the probability using either diﬀerent numerical integration algorithms or inﬁnite intervals or both. How do the results compare to the results in the horse race example?
10. Modify the scripts to display all walks with $4$ steps $+1$ and $3$ steps $-1$ to illustrate the Hitting Time Theorem.
11. Consider

Provide a detailed argument that when $n=1$, both sides are $0$ when $k>1$ and $k=0$, and are equal to $ℙ\left[{Y}_{1}=1\right]$ when $k=1$. This is the base case of the induction.

1. Give an example of a set of random variables with joint probability distribution that is exchangeable, but the set of random variables violates the condition “independent and identically distributed”. Keep the example as small and simple as possible.
2. Give an example of a set of random variables with joint probability distribution that is invariant under rotations, but the set of random variables violates the condition “independent and identically distributed”. Keep the example as small and simple as possible.
3. Give an example of a set of random variables with joint probability distribution that is exchangeable, but the set of random variables violates the condition of invariant under rotations. Keep the example as small and simple as possible.

__________________________________________________________________________

### References

[1]   Geoﬀrey Grimmett and David Stirzaker. Probability and Random Processes. Oxford University Press, 2001.

[2]   Wouter Kager. The hitting time theorem revisited. The American Mathematical Monthly, 118(8):735–737, October 2011. http://dx/doi.org/10.4169/amer.math.mothly.118.08.735.

[3]   Laurent Lessard. The lucky derby. http://www.laurentlessard.com/bookproofs/tag/riddler/, May 2017.

[4]   Oliver Roeder. Who will win the lucky derby? https://ﬁvethirtyeight.com/features/who-will-win-the-lucky-derby/, May 2017.

[5]   Remco van der Hofstad and Michael Keane. An elementary proof of the hitting time theorem. The American Mathematical Monthly, 115(8):753–756, October 2008.

__________________________________________________________________________

### Outside Readings and Links:

__________________________________________________________________________

I check all the information on each page for correctness and typographical errors. Nevertheless, some errors may occur and I would be grateful if you would alert me to such errors. I make every reasonable eﬀort to present current and accurate information for public use, however I do not guarantee the accuracy or timeliness of information on this website. Your use of the information from this website is strictly voluntary and at your risk.

I have checked the links to external sites for usefulness. Links to external websites are provided as a convenience. I do not endorse, control, monitor, or guarantee the information contained in any external website. I don’t guarantee that the links are active at all times. Use the links here with the same caution as you would all information on the Internet. This website reﬂects the thoughts, interests and opinions of its author. They do not explicitly represent oﬃcial positions or policies of my employer.

Information on this website is subject to change without notice.

Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1

Email to Steve Dunbar, sdunbar1 at unl dot edu

Last modiﬁed: Processed from LATEX source on February 5, 2019