Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466
Stochastic Processes and
Advanced Mathematical Finance
__________________________________________________________________________
The Absolute Excess of Heads over Tails
_______________________________________________________________________
Note: These pages are prepared with MathJax. MathJax is an open source JavaScript display engine for mathematics that works in all browsers. See http://mathjax.org for details on supported browsers, accessibility, copy-and-paste, and other features.
_______________________________________________________________________________________________
Mathematically Mature: may contain mathematics beyond calculus with proofs.
_______________________________________________________________________________________________
What does the law of averages have to say about the probability of having a ﬁxed lead of say 20 Heads or more over Tails or 20 Tails or more over Heads at the end of a coin ﬂipping game of some ﬁxed duration? What does the Weak Law of Large Numbers have to say about having a ﬁxed lead? What does the Weak Law have to say about having a proportional lead, say 1%? What does the Central Limit Theorem have to say about the lead?
_______________________________________________________________________________________________
__________________________________________________________________________
__________________________________________________________________________
Probability theory generally has two classes of theorems about the results of coin-tossing games and therefore random walks:
In this section we will ask two related questions about the net fortune in a coin-tossing game:
Using the Central Limit Theorem, we will be able to provide precise answers to each question and then to apply the ideas to interesting questions in gambling and ﬁnance.
Often when using the Central Limit Theorem to approximate a discrete distribution, especially the binomial distribution, we adopt the half-integer correction, also called the continuity correction. The correction arises because the binomial distribution has a discrete distribution while the standard normal distribution has a continuous distribution. For any integer $s$ and real value $h$ with $0\le h<1$ the binomial random variable ${S}_{n}$ has $\mathbb{P}n\left[\left|{S}_{n}\right|\le s\right]=\mathbb{P}n\left[\left|{S}_{n}\right|\le s+h\right]$, yet the corresponding Central Limit Theorem approximation with the standard normal cumulative distribution function, $\mathbb{P}\left[\left|Z\right|\le \left(s+h\right)\u2215\sqrt{n}\right]$, increases as $h$ increases from $0$ to $1$. It is customary to take $h=1\u22152$ to interpolate the diﬀerence. This choice is also justiﬁed by looking at the approximation of the binomial with the normal, see Figure 1.
Symbolically, the half-integer correction to the Central Limit Theorem is
$$\begin{array}{llll}\hfill \mathbb{P}n\left[a\le {S}_{n}\le b\right]& \approx {\int}_{\left(\left(a-1\u22152\right)-np\right)\u2215\sqrt{npq}}^{\left(\left(b+1\u22152\right)-np\right)\u2215\sqrt{npq}}\frac{1}{\sqrt{2\pi}}exp\left(-{u}^{2}\u22152\right)\phantom{\rule{0.3em}{0ex}}du\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\mathbb{P}\left[\frac{\left(a-1\u22152\right)-np}{\sqrt{npq}}\le Z\le \left(b+1\u22152\right)-np\sqrt{npq}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$$for integers $a$ and $b$.
Consider the sequence of independent random variables ${Y}_{i}$ which take values $1$ with probability $1\u22152$ and $-1$ with probability $1\u22152$. This is a mathematical model of a fair coin ﬂip game where a $1$ results from “heads” on the $i$th coin toss and a $-1$ results from “tails”. Let ${H}_{n}$ and ${L}_{n}$ be the number of heads and tails respectively in $n$ ﬂips. Then ${T}_{n}={\sum}_{i=1}^{n}{Y}_{i}={H}_{n}-{L}_{n}$ counts the diﬀerence between the number of heads and tails, an excess of heads if positive, and a “negative excess”, i.e. a deﬁcit, if negative. Rather than the clumsy extended phrase “the number of heads exceeds the number of tails or the number of tails exceeds the number of heads” we can say “the absolute excess of heads $\left|{T}_{n}\right|$.” The value ${T}_{n}$ also represents the net “winnings”, positive or negative, of a gambler in a fair coin ﬂip game.
Corollary 1. Under the assumption that ${Y}_{i}=+1$ with probability $1\u22152$ and ${Y}_{i}=-1$ with probability $1\u22152$, and ${T}_{n}={\sum}_{i=1}^{n}{Y}_{i}$, then for an integer $s$
where $Z$ is a standard normal random variable with mean $0$ and variance $1$.
Proof. Note that $\mu =\mathbb{E}\left[{Y}_{i}\right]=0$ and ${\sigma}^{2}=Var\left[{Y}_{i}\right]=1$.
$$\begin{array}{llll}\hfill \mathbb{P}n\left[\left|{T}_{n}\right|>s\right]& =1-\mathbb{P}n\left[-s\le {T}_{n}\le s\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =1-\mathbb{P}n\left[-s-1\u22152\le {T}_{n}\le s+1\u22152\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =1-\mathbb{P}n\left[\left(-s-1\u22152\right)\u2215\sqrt{n}\le {T}_{n}\u2215\sqrt{n}\le \left(s+1\u22152\right)\u2215\sqrt{n}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & \approx 1-\mathbb{P}\left[\left(-s-1\u22152\right)\u2215\sqrt{n}\le Z\le \left(s+1\u22152\right)\u2215\sqrt{n}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\mathbb{P}\left[\left|Z\right|\ge \left(s+1\u22152\right)\u2215\sqrt{n}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & \phantom{\rule{2em}{0ex}}& \hfill \end{array}$$ □The crucial step occurs at the approximation, and uses the Central Limit Theorem. More precise statements of the Central Limit Theorem such as the Berry-Esséen inequality can turn the approximation into a inequality.
If we take $s$ to be ﬁxed we now have the answer to our ﬁrst question: The probability of an absolute excess of heads over tails greater than a ﬁxed amount in a fair game of duration $n$ approaches $1$ as $n$ increases.
The Central Limit Theorem in the form of the half-integer correction above provides an alternative proof of the Weak Law of Large Numbers for the speciﬁc case of the binomial random variable ${T}_{n}$. In fact,
$$\begin{array}{llll}\hfill \mathbb{P}n\left[\left|\frac{{T}_{n}}{n}\right|>\mathit{\epsilon}\right]& \approx \mathbb{P}\left[\left|Z\right|\ge \frac{\mathit{\epsilon}n+1\u22152}{\sqrt{n}}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\mathbb{P}\left[\left|Z\right|\ge \mathit{\epsilon}\sqrt{n}+\frac{1\u22152}{\sqrt{n}}\right]\to 0\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$$as $n\to \infty $.
Rewriting $\mathbb{P}\left[\left|\frac{{T}_{n}}{n}\right|>\mathit{\epsilon}\right]=\mathbb{P}\left[\left|{T}_{n}\right|>\mathit{\epsilon}n\right]$ this restatement of the Weak Law actually provides the answer to our second question: The probability that the absolute excess of heads over tails is greater than a ﬁxed fraction of the ﬂips in a fair game of duration $n$ approaches $0$ as $n$ increases.
Finally, the half-integer correction gives an estimate on the central probability in a binomial distribution.
as $n\to \infty $.
We can estimate this further:
$$\begin{array}{llll}\hfill \mathbb{P}\left[\left|Z\right|<\left(1\u22152\right)\u2215\sqrt{n}\right]& =\frac{1}{\sqrt{2\pi}}{\int}_{-1\u2215\left(2\sqrt{n}\right)}^{1\u2215\left(2\sqrt{n}\right)}{e}^{-{u}^{2}\u22152}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}du\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\frac{1}{\sqrt{2\pi}}{\int}_{-1\u2215\left(2\sqrt{n}\right)}^{1\u2215\left(2\sqrt{n}\right)}1-{u}^{2}\u22152+{u}^{4}\u22158+\dots \phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}du\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{n}}-\frac{1}{24\sqrt{2\pi}}\frac{1}{{n}^{3\u22152}}+\frac{1}{640\sqrt{2\pi}}\frac{1}{{n}^{5\u22152}}+\dots .\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$$So we see that $\mathbb{P}\left[{T}_{n}=0\right]$ goes to zero at the rate of $1\u2215\sqrt{n}$.
What is the probability that the number of heads exceeds the number of tails by more than $20$ or the number of tails exceeds the number of heads by more than $20$ after $500$ tosses of a fair coin? By the proposition, this is:
This is a reasonably large probability, and is larger than many people would expect.
Figure 2 is a graph of the probability of at least $s$ excess heads in $500$ tosses of a fair coin:
What is the probability that there is “about the same number of heads as tails” in 500 tosses? Here we interpret “about the same” as within $5$, that is, an absolute diﬀerence of 1% or less of the number of tosses. Note that since $500$ is even the diﬀerence in the number of heads and tails cannot be an odd number, so must be either $0$, $2$ or $4$.
so it would be somewhat unusual (in that it occurs in less than 20% of games) to have the number of heads and tails so close.
Suppose you closely follow a stock recommendation source whose methods use technical analysis. You accept every bit of advice from this source about trading stocks. You choose $10$ stocks to buy, sell or hold every day based on the recommendations. Each day for each stock you will gain or lose money based on the advice. Note that it is possible to gain money even if the advice says the stocks will decrease in value, say by short-selling or using put options. How good can this strategy be? We will make this vague question precise by asking “How good does the information from the technical analysis have to be so that the probability of losing money over a year’s time is $1$ in $10,000$?”
The $10$ stocks over $260$ business days in a year means that there $2,600$ daily gains or losses. Denote each daily gain or loss as ${X}_{i}$, if the advice is correct you will gain ${X}_{i}>0$ and if the advice is wrong you will lose ${X}_{i}<0$. We want the total change ${X}_{\text{annual}}={\sum}_{i=1}^{2600}{X}_{i}>0$ and we will measure that by asking that $\mathbb{P}\left[{X}_{\text{annual}}<0\right]$ be small. In the terms of this section, we are interested in the complementary probability of an excess of successes over failures.
We assume that the changes are random variables, identically distributed, independent and the moments of all the random variables are ﬁnite. We will make speciﬁc assumptions about the distribution later, for now these assumptions are suﬃcient to apply the Central Limit Theorem. Then the total change ${X}_{\text{annual}}={\sum}_{i=1}^{2600}{X}_{i}$ is approximately normally distributed with mean $\mu =2600\cdot \mathbb{E}\left[{X}_{1}\right]$ and variance ${\sigma}^{2}=2600\cdot Var\left[{X}_{1}\right]$. Note that here again we are using the uncentered and unscaled version of the Central Limit Theorem. In symbols
We are interested in
By the change of variables $v=\left(u-\mu \right)\u2215\sigma $, we can rewrite the probability as
so that the probability depends only on the ratio $-\mu \u2215\sigma $. We desire that $\Phi \left(-\mu \u2215\sigma \right)=\mathbb{P}\left[{X}_{\text{annual}}<0\right]<1\u221510,000$. Then we can solve for $-\mu \u2215\sigma \approx -3.7$. Since $\mu =2600\cdot \mathbb{E}\left[{X}_{1}\right]$ and ${\sigma}^{2}=2600\cdot Var\left[{X}_{1}\right]$, we calculate that for the total annual change to be a loss we must have $\mathbb{E}\left[{X}_{1}\right]\u2215\sqrt{Var\left[{X}_{1}\right]}\approx \left(3.7\u2215\sqrt{2600}\right)=0.073$.
Now we consider what the requirement $\mathbb{E}\left[{X}_{1}\right]=0.07\cdot \sqrt{Var\left[{X}_{1}\right]}$ means for speciﬁc distributions. If we assume that the individual changes ${X}_{i}$ are normally distributed with a positive mean, then we can use $\mathbb{E}\left[{X}_{1}\right]\u2215\sqrt{Var\left[{X}_{1}\right]}=0.07$ to calculate that $\mathbb{P}\left[{X}_{1}\right]<0\approx 0.47210$, or about $47\%$. Then the probability of a prediction leading to a positive gain is about $53\%$. Alternatively, if we assume that the individual changes ${X}_{i}$ are binomial random variables where the probability of a positive gain is $\mathbb{P}\left[{X}_{1}=1\right]=p$, then $\mathbb{E}\left[{X}_{1}\right]=2p-1$ and $Var\left[{X}_{1}\right]=4p\left(1-p\right)$. We can use $2p-1=\mathbb{E}\left[{X}_{1}\right]=0.07Var\left[{X}_{1}\right]=0.07\cdot \left(4p\left(1-p\right)\right)$ to solve for $p$. The result is $p=0.53483$.
In either case, this means that any given piece of advice only has to have a $53\%$ chance of being correct in order to have a perpetual money-making machine. Compare this with the strategy of using a coin ﬂip to provide the advice. Since we don’t observe any perpetual money-making machines, we conclude that any advice about stock picking must be less than $53\%$ reliable or about the same as ﬂipping a coin.
Now suppose that instead we have a computer algorithm predicting stock movements for all publicly traded stocks, of which there are about $2,000$. Suppose further that we wish to restrict the chance that $\mathbb{P}\left[{X}_{\text{annual}}\right]<1{0}^{-6}$, that is 1 chance in a million. Then we can repeat the analysis to show that the computer algorithm would only need to have $\mathbb{P}\left[{X}_{1}\right]<0\approx 0.49737$, practically indistinguishable from a coin ﬂip, in order to make money. This provides a statistical argument against the utility of technical analysis for stock price prediction. Not losing money is not suﬃcient evidence to distinguish ability in stock-picking from coin-ﬂipping.
This section is adapted from the article “Tossing a Fair Coin” by Leonard Lipkin. The discussion of the continuity correction is adapted from The Continuity Correction in the section The Central Limit Theorem. in the Virtual Laboratories in Probability and Statistics. The third example in this section is adapted from a presentation by Jonathan Kaplan of D.E. Shaw and Co. in summer 2010.
_______________________________________________________________________________________________
The experiment is ﬂipping a coin $n$ times, and repeat the experiment $k$ times. Then compute the proportion for which the excess of heads over tails is more than $s$. Compare this to the theoretical probability from the standard normal distribution.
Octave script for Excess Heads.
Perl PDL script for Excess Heads.
Scientiﬁc Python script for Excess Heads.
__________________________________________________________________________
__________________________________________________________________________
[1] William Feller. An Introduction to Probability Theory and Its Applications, Volume I, volume I. John Wiley and Sons, third edition, 1973. QA 273 F3712.
[2] Emmanuel Lesigne. Heads or Tails: An Introduction to Limit Theorems in Probability, volume 28 of Student Mathematical Library. American Mathematical Society, 2005.
[3] Leonard Lipkin. Tossing a fair coin. The College Mathematics Journal, 34(2):128–133, March 2003.
__________________________________________________________________________
__________________________________________________________________________
I check all the information on each page for correctness and typographical errors. Nevertheless, some errors may occur and I would be grateful if you would alert me to such errors. I make every reasonable eﬀort to present current and accurate information for public use, however I do not guarantee the accuracy or timeliness of information on this website. Your use of the information from this website is strictly voluntary and at your risk.
I have checked the links to external sites for usefulness. Links to external websites are provided as a convenience. I do not endorse, control, monitor, or guarantee the information contained in any external website. I don’t guarantee that the links are active at all times. Use the links here with the same caution as you would all information on the Internet. This website reﬂects the thoughts, interests and opinions of its author. They do not explicitly represent oﬃcial positions or policies of my employer.
Information on this website is subject to change without notice.
Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1
Email to Steve Dunbar, sdunbar1 at unl dot edu
Last modiﬁed: Processed from LATEX source on July 23, 2016