Steven R. Dunbar
Department of Mathematics
203 Avery Hall
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466

Topics in
Probability Theory and Stochastic Processes

__________________________________________________________________________

Binomial Distribution

_______________________________________________________________________


### Rating

Mathematically Mature: may contain mathematics beyond calculus with proofs.

_______________________________________________________________________________________________

### Section Starter Question

Consider a family with 5 children. What is the probability of having all five children be boys? How many children must a couple have for at least a 0.95 probability of at least one girl? What is a proper and general mathematical framework for setting up the answer to these questions and similar questions?

_______________________________________________________________________________________________

### Key Concepts

1. A binomial random variable ${S}_{n}$ counts the number of successes in a sequence of $n$ trials of an experiment.
2. A binomial random variable ${S}_{n}$ takes only integer values between $0$ and $n$ inclusive and
$ℙ\left[{S}_{n}=k\right]=\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{\left(1-p\right)}^{n-k}$

for $k=0,1,2,\dots ,n$.

3. The expectation of a binomial random variable with $n$ trials and probability of success $p$ on each trial is:
$𝔼\left[{S}_{n}\right]=np$

4. The variance of a binomial random variable with $n$ trials and probability of success $p$ on each trial is:
$Var\left[{S}_{n}\right]=npq=np\left(1-p\right)$

__________________________________________________________________________

### Vocabulary

1. An elementary experiment is a physical experiment with two outcomes. An elementary experiment is also called a Bernoulli trial.
2. A composite experiment consists of repeating an elementary experiment $n$ times.
3. The sample space, denoted ${\Omega }_{n}$, is the set of all possible sequences of $n$ $0$s and $1$s representing all possible outcomes of the composite experiment.
4. A random variable is a function from the sample space ${\Omega }_{n}$ to the real numbers $ℝ$.

__________________________________________________________________________

### Mathematical Ideas

#### Sample Space for a Sequence of Experiments

An elementary experiment in this section consists of an experiment with two outcomes. An elementary experiment is also called a Bernoulli trial. Label the outcomes of the elementary experiment $1$, occurring with probability $p$, and $0$, occurring with probability $q$, where $p+q=1$. Often we name $1$ as success and $0$ as failure. For example, a coin toss is a physical experiment with two outcomes, say with “heads” labeled as success and “tails” as failure.

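A composite experiment consists of repeating an elementary experiment $n$ times. The sample space, denoted ${\Omega }_{n}$, is the set of all possible sequences of $n$ 0’s and 1’s representing all possible outcomes of the composite experiment. We denote an element of ${\Omega }_{n}$ as $\omega =\left({\omega }_{1},\dots ,{\omega }_{n}\right)$, where each ${\omega }_{k}=0$ or $1$. That is, ${\Omega }_{n}={\left\{0,1\right\}}^{n}$. We assign a probability measure ${ℙ}_{n}\left[\cdot \right]$ on ${\Omega }_{n}$ by multiplying probabilities of each Bernoulli trial in the composite experiment according to the principle of independence. Thus, for $k=1,\dots ,n$,

${ℙ}_{n}\left[{\omega }_{k}=1\right]=p\quad \text{and}\quad {ℙ}_{n}\left[{\omega }_{k}=0\right]=q,$

and inductively for each $\left({e}_{1},{e}_{2},\dots ,{e}_{n}\right)\in {\left\{1,0\right\}}^{n}$

${ℙ}_{n+1}\left[{\omega }_{n+1}=1\text{ and }\left({\omega }_{1},\dots ,{\omega }_{n}\right)=\left({e}_{1},\dots ,{e}_{n}\right)\right]=p\cdot {ℙ}_{n}\left[\left({\omega }_{1},\dots ,{\omega }_{n}\right)=\left({e}_{1},\dots ,{e}_{n}\right)\right],$

with the corresponding equation, with $q$ in place of $p$, for ${\omega }_{n+1}=0$.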

Additionally, let ${S}_{n}\left(\omega \right)$ be the number of 1’s in $\omega \in {\Omega }_{n}$. Note that ${S}_{n}\left(\omega \right)={\sum }_{k=1}^{n}{\omega }_{k}$. We also say ${S}_{n}\left(\omega \right)$ is the number of successes in the composite experiment. Then

${ℙ}_{n}\left[\omega \right]={p}^{{S}_{n}\left(\omega \right)}{q}^{n-{S}_{n}\left(\omega \right)}.$

We can also define a unified sample space $\Omega$ that is the set of all infinite sequences of 0’s and 1’s. We sometimes write $\Omega ={\left\{0,1\right\}}^{\infty }$. Then ${\Omega }_{n}$ is the projection of the first $n$ entries in $\Omega$.

A random variable is a function from a set called the sample space to the real numbers $ℝ$. As a frequently used special case, for $\omega \in \Omega$ let

${X}_{k}\left(\omega \right)={\omega }_{k},$

then ${X}_{k}$ is an indicator random variable taking on the value $1$ or $0$. ${X}_{k}$ (the dependence on the sequence $\omega$ is usually suppressed) indicates success or failure at trial $k$. Then as above,

${S}_{n}=\sum _{k=1}^{n}{X}_{k}=\sum _{k=1}^{n}{\omega }_{k}$

is a random variable indicating the number of successes in a composite experiment.

#### Binomial Probabilities

Proposition 1. The random variable ${S}_{n}$ takes only integer values between $0$ and $n$ inclusive and

${ℙ}_{n}\left[{S}_{n}=k\right]=\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{q}^{n-k}.$

Remark. The notation ${ℙ}_{n}\left[\cdot \right]$ indicates that we are considering a family of probability measures indexed by $n$ on the sample space $\Omega$.

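Proof. From the inductive definition

${ℙ}_{n}\left[{\omega }_{k}=1\right]=p\quad \text{and}\quad {ℙ}_{n}\left[{\omega }_{k}=0\right]=q,$

and inductively for each $\left({e}_{1},{e}_{2},\dots ,{e}_{n}\right)\in {\left\{1,0\right\}}^{n}$

${ℙ}_{n+1}\left[{\omega }_{n+1}=1\text{ and }\left({\omega }_{1},\dots ,{\omega }_{n}\right)=\left({e}_{1},\dots ,{e}_{n}\right)\right]=p\cdot {ℙ}_{n}\left[\left({\omega }_{1},\dots ,{\omega }_{n}\right)=\left({e}_{1},\dots ,{e}_{n}\right)\right]$

(and likewise with $q$ in place of $p$ for ${\omega }_{n+1}=0$), the probability assigned to an $\omega$ having $k$ 1’s and $n-k$ 0’s is ${p}^{k}{\left(1-p\right)}^{n-k}={p}^{{S}_{n}\left(\omega \right)}{\left(1-p\right)}^{n-{S}_{n}\left(\omega \right)}$. The sample space ${\Omega }_{n}$ has precisely $\left(\genfrac{}{}{0.0pt}{}{n}{k}\right)$ such points. By the additive property of disjoint probabilities,

${ℙ}_{n}\left[{S}_{n}=k\right]=\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{q}^{n-k},$

and the proof is complete. □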
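As a sanity check on Proposition 1, the following Octave sketch enumerates ${\Omega }_{n}$ for a small $n$, assigns each outcome its product probability, and compares the total probability of each event ${S}_{n}=k$ with the binomial formula. The values `n = 5` and `p = 0.3` and all variable names are illustrative assumptions, not part of the original notes.

```octave
% Brute-force check of the binomial probabilities for a small n.
n = 5;        % number of trials (illustrative choice)
p = 0.3;      % success probability (illustrative choice)
q = 1 - p;

% Each row of omega is one outcome in Omega_n = {0,1}^n.
omega = dec2bin(0:2^n-1, n) - '0';
Sn = sum(omega, 2);                  % number of 1's in each outcome
probOmega = p.^Sn .* q.^(n - Sn);    % product probability of each outcome

pmfEnum = zeros(1, n+1);
pmfFormula = zeros(1, n+1);
for k = 0:n
  pmfEnum(k+1) = sum(probOmega(Sn == k));            % add the disjoint outcomes
  pmfFormula(k+1) = nchoosek(n, k) * p^k * q^(n-k);  % formula of Proposition 1
end

disp([pmfEnum; pmfFormula])   % the two rows should agree
disp(sum(pmfEnum))            % total probability, should be 1
```

Enumeration is only practical for small $n$ because ${\Omega }_{n}$ has ${2}^{n}$ points, which is the reason the closed formula matters.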

Proposition 2. If ${X}_{1},{X}_{2},\dots ,{X}_{n}$ are independent, identically distributed random variables with distribution $ℙ\left[{X}_{i}=1\right]=p$ and $ℙ\left[{X}_{i}=0\right]=q$, then the sum ${X}_{1}+\cdots +{X}_{n}$ has the distribution of a binomial random variable ${S}_{n}$ with parameters $n$ and $p$.

Proposition 3.

1. $𝔼\left[{S}_{n}\right]=np$

2. $Var\left[{S}_{n}\right]=npq=np\left(1-p\right)$

Proof. First Proof: By the binomial expansion

${\left(p+q\right)}^{n}=\sum _{k=0}^{n}\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{q}^{n-k}.$

Differentiate with respect to $p$ and multiply both sides of the derivative by $p$:

$np{\left(p+q\right)}^{n-1}=\sum _{k=0}^{n}k\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{q}^{n-k}.$

Now choosing $q=1-p$,

$np=\sum _{k=0}^{n}k\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{\left(1-p\right)}^{n-k}=𝔼\left[{S}_{n}\right].$

For the variance, differentiate the binomial expansion with respect to $p$ twice:

$n\left(n-1\right){\left(p+q\right)}^{n-2}=\sum _{k=0}^{n}k\left(k-1\right)\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k-2}{q}^{n-k}.$

Multiply by ${p}^{2}$, substitute $q=1-p$, and expand:

$n\left(n-1\right){p}^{2}=\sum _{k=0}^{n}{k}^{2}\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{\left(1-p\right)}^{n-k}-\sum _{k=0}^{n}k\left(\genfrac{}{}{0.0pt}{}{n}{k}\right){p}^{k}{\left(1-p\right)}^{n-k}=𝔼\left[{S}_{n}^{2}\right]-𝔼\left[{S}_{n}\right]$

Therefore,

$Var\left[{S}_{n}\right]=𝔼\left[{S}_{n}^{2}\right]-{\left(𝔼\left[{S}_{n}\right]\right)}^{2}=n\left(n-1\right){p}^{2}+np-{n}^{2}{p}^{2}=np\left(1-p\right)$

Proof. Second proof: Use that the expectation of a sum is the sum of the expectations, and apply it to the representation ${S}_{n}={X}_{1}+\cdots +{X}_{n}$ of Proposition 2 with $𝔼\left[{X}_{i}\right]=p$.

Similarly, use that the variance of a sum of independent random variables is the sum of the variances, applied to ${S}_{n}={X}_{1}+\cdots +{X}_{n}$ with $Var\left[{X}_{i}\right]=𝔼\left[{X}_{i}^{2}\right]-{\left(𝔼\left[{X}_{i}\right]\right)}^{2}=p-{p}^{2}=p\left(1-p\right)$. □
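The identities in Proposition 3 are easy to confirm numerically. The Octave sketch below sums directly against the binomial probabilities and compares the results with $np$ and $np\left(1-p\right)$, along with the intermediate identity $𝔼\left[{S}_{n}\left({S}_{n}-1\right)\right]=n\left(n-1\right){p}^{2}$ from the first proof. The parameter values and variable names are illustrative assumptions.

```octave
% Numerical check of the mean and variance of a binomial random variable.
n = 12;       % number of trials (illustrative choice)
p = 0.35;     % success probability (illustrative choice)
q = 1 - p;

k = 0:n;
pmf = arrayfun(@(j) nchoosek(n, j), k) .* p.^k .* q.^(n - k);

meanSn = sum(k .* pmf);                   % E[S_n]
factorialMoment = sum(k .* (k-1) .* pmf); % E[S_n (S_n - 1)]
secondMoment = sum(k.^2 .* pmf);          % E[S_n^2]
varSn = secondMoment - meanSn^2;          % Var[S_n]

printf("mean            %g   (np = %g)\n", meanSn, n*p);
printf("E[S(S-1)]       %g   (n(n-1)p^2 = %g)\n", factorialMoment, n*(n-1)*p^2);
printf("variance        %g   (npq = %g)\n", varSn, n*p*q);
```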

#### Examples

Example. The following example appeared in the January 20, 2017 “Riddler” puzzler on the website fivethirtyeight.com.

You and I find ourselves indoors one rainy afternoon, with nothing but some loose change in the couch cushions to entertain us. We decide that we’ll take turns flipping a coin, and that the winner will be whoever flips 10 heads first. The winner gets to keep all the change in the couch! Predictably, an enormous argument erupts: We both want to be the one to go first.

What is the first flipper’s advantage? In other words, what percentage of the time does the first flipper win this game?

First solve an easier version of the puzzle where the first person to flip a head wins. Let the person who flips first be A, and let ${P}_{A}$ be the probability that A wins by first obtaining a head. Then add the probabilities for the disjoint events that the sequence of flips is H, or TTH, or TTTTH, and so forth:

$\begin{aligned}{P}_{A}&=\frac{1}{2}+{\left(\frac{1}{2}\right)}^{2}\left(\frac{1}{2}\right)+{\left(\frac{1}{2}\right)}^{4}\left(\frac{1}{2}\right)+\cdots \\ &=\frac{1}{2}\left(1+\frac{1}{4}+{\left(\frac{1}{4}\right)}^{2}+\cdots \right)\\ &=\frac{1}{2}\cdot \frac{1}{1-1/4}=\frac{1}{2}\cdot \frac{4}{3}=\frac{2}{3}.\end{aligned}$

Another way to do this problem is to use first-step analysis from Markov chain theory. The probability ${P}_{A}$ of the first player winning is the probability of winning on the first flip plus the probability of both players each losing their first flip, at which point the game is essentially starting over,

${P}_{A}=\frac{1}{2}+\frac{1}{4}{P}_{A}.$

Solving, $\frac{3}{4}{P}_{A}=\frac{1}{2}$ or

${P}_{A}=\frac{1}{2}\cdot \frac{4}{3}=\frac{2}{3}.$
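Both derivations of ${P}_{A}=2/3$ can be checked with a few lines of Octave. The sketch below computes a partial sum of the series, solves the first-step equation as a linear equation, and estimates the same probability by simulation. The seed, the number of trials, the cap of 64 flips per game, and the variable names are assumptions made for illustration.

```octave
% Series: P_A = sum over i >= 0 of (1/2)^(2i) * (1/2), truncated at i = 30.
i = 0:30;
partialSums = cumsum((1/2).^(2*i) * (1/2));
seriesValue = partialSums(end);

% First-step analysis: P_A = 1/2 + (1/4) P_A, so (1 - 1/4) P_A = 1/2.
firstStep = (1/2) / (1 - 1/4);

% Simulation: A flips on odd-numbered flips, so A wins when the first head
% appears at an odd position.
rand("state", 2017);          % fixed seed for reproducibility (assumed value)
trials = 20000;
maxFlips = 64;                % a game longer than this is vanishingly unlikely
flips = double(rand(trials, maxFlips) < 0.5);   % 1 = head, 0 = tail
[~, firstHead] = max(flips, [], 2);             % index of the first head in each row
winsA = any(flips, 2) & (mod(firstHead, 2) == 1);
empirical = mean(winsA);

printf("series %.6f, first-step %.6f, simulation %.4f\n", ...
       seriesValue, firstStep, empirical);
```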

Now extend the reasoning of the first approach to the case where the first player to get 10 heads wins. The first case for A to win is for A to get $9$ heads in flips $1,3,5,\dots ,17$ and the $10$th head on flip $19$, and for player B to get anywhere from $0$ to $9$ heads on flips $2,4,6,\dots ,18$. These events have probability $\left(\genfrac{}{}{0.0pt}{}{9}{9}\right){\left(\frac{1}{2}\right)}^{9}\cdot \frac{1}{2}$ and cumulative binomial probability $\sum _{k=0}^{9}\left(\genfrac{}{}{0.0pt}{}{9}{k}\right){\left(\frac{1}{2}\right)}^{9}$ respectively. The next disjoint case is for A to get $9$ heads in flips $1,3,5,\dots ,19$ and the $10$th head on flip $21$, and for player B to get anywhere from $0$ to $9$ heads on flips $2,4,6,\dots ,20$. These events have probability $\left(\genfrac{}{}{0.0pt}{}{10}{9}\right){\left(\frac{1}{2}\right)}^{10}\cdot \frac{1}{2}$ and cumulative binomial probability $\sum _{k=0}^{9}\left(\genfrac{}{}{0.0pt}{}{10}{k}\right){\left(\frac{1}{2}\right)}^{10}$ respectively. In general, for each $j\ge 9$ the disjoint case is for A to get $9$ heads in the $j$ flips $1,3,5,\dots ,2j-1$ and the $10$th head on flip $2j+1$, and for player B to get anywhere from $0$ to $9$ heads on the $j$ flips $2,4,6,\dots ,2j$. These events have probability $\left(\genfrac{}{}{0.0pt}{}{j}{9}\right){\left(\frac{1}{2}\right)}^{j}\cdot \frac{1}{2}$ and cumulative binomial probability $\sum _{k=0}^{9}\left(\genfrac{}{}{0.0pt}{}{j}{k}\right){\left(\frac{1}{2}\right)}^{j}$ respectively.

Multiplying the independent probabilities for A and B in each case and adding over all these disjoint cases gives

${P}_{A}=\sum _{j=9}^{\infty }\left(\genfrac{}{}{0.0pt}{}{j}{9}\right){\left(\frac{1}{2}\right)}^{j+1}\sum _{k=0}^{9}\left(\genfrac{}{}{0.0pt}{}{j}{k}\right){\left(\frac{1}{2}\right)}^{j}$

There does not seem to be an exact analytic or closed form expression for this probability as in the case of winning with a single head, so we need to approximate it. In the case of winning with 10 heads, ${P}_{A}\approx 0.53278$.
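The same double sum works for any required number of heads $m$ by replacing $9$ with $m-1$, which gives a useful consistency check: for $m=1$ it should reproduce the value $2/3$ found earlier, and for $m=10$ the value quoted above. The Octave sketch below evaluates the truncated sum using only core functions. The truncation point `jMax = 100` and the variable names are assumptions; the neglected tail is far below the displayed precision.

```octave
% Truncated evaluation of
%   P_A(m) = sum_{j >= m-1} C(j, m-1) (1/2)^(j+1) * sum_{k=0}^{m-1} C(j, k) (1/2)^j
headsNeeded = [1 10];      % cases to evaluate
jMax = 100;                % truncation of the infinite sum (assumed cutoff)
PA = zeros(size(headsNeeded));

for idx = 1:numel(headsNeeded)
  m = headsNeeded(idx);
  total = 0;
  for j = (m-1):jMax
    % A: exactly m-1 heads in A's first j flips, then a head on A's next flip.
    probA = nchoosek(j, m-1) * (1/2)^(j+1);
    % B: at most m-1 heads in B's first j flips.
    probB = 0;
    for k = 0:(m-1)
      probB = probB + nchoosek(j, k) * (1/2)^j;
    end
    total = total + probA * probB;
  end
  PA(idx) = total;
  printf("m = %2d heads: P_A is approximately %.5f\n", m, PA(idx));
end
```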

Example. The following example appeared in the August 3, 2018 “Riddler” puzzler on the website fivethirtyeight.com.

I may flip a potentially infinite number of times, always needing to flip a series of $N$ heads in a row to win, where $N$ is $T+1$ and $T$ is the number of cumulative tails tossed. I win when I flip the required number of heads in a row.

What are my chances of winning this game?

Some winning sequences of flips are H, THH, TTHHH, THTHHH. From here the winning sequences get more numerous and more complicated. For winning with $4$ heads in a row, the sequences range from the shortest, TTTHHHH, to the longest, THTHHTHHHH. Instead of considering the probability of winning directly, consider the complementary probability of losing.

Let ${P}_{T}$ be the probability of losing given that we have just flipped our $T$th tail. Two possibilities can happen from here. First, we could flip the $T+1$ heads in a row needed to win. Second, the complementary event is that we flip another tail before getting the necessary heads to win. Then the probability ${P}_{T}$ of losing is

${P}_{T}=\left(1-\frac{1}{{2}^{T+1}}\right){P}_{T+1}.$

In other words, if we don’t flip the required $T+1$ heads in a row, we are in the situation where we lose with probability ${P}_{T+1}$. Iterating this relation from $0$ through $T$ gives the product

${P}_{0}=\left(\prod _{k=0}^{T}\left(1-\frac{1}{{2}^{k+1}}\right)\right){P}_{T+1}.$

The probability of winning this game is $1-{P}_{0}$. This can be obtained by multiplying all the chances throughout the game that we don’t win, and subtracting from $1$:

$1-\prod _{T=0}^{\infty }\left(1-\frac{1}{{2}^{T+1}}\right).$

This expression is approximately $0.711$, so there is about a $71.1\%$ chance of winning. This value is easily approximated and verified with a script; see the exercises.
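Independently of the product formula, the value can be checked by playing the game many times. The Octave sketch below is a plain Monte Carlo simulation of the rules as stated in the puzzle. The seed, the number of trials, the cutoff `maxTails` after which a game is counted as lost, and the variable names are all assumptions made for illustration.

```octave
% Monte Carlo simulation of the game: after T cumulative tails,
% T + 1 heads in a row are needed to win.
rand("state", 2018);    % fixed seed for reproducibility (assumed value)
trials = 20000;
maxTails = 40;          % treat the game as lost after this many tails (assumed
                        % cutoff; winning later has negligible probability)
wins = 0;

for t = 1:trials
  tails = 0;            % cumulative number of tails so far
  headRun = 0;          % current run of consecutive heads
  while tails <= maxTails
    if rand() < 0.5     % head
      headRun = headRun + 1;
      if headRun == tails + 1
        wins = wins + 1;
        break;
      end
    else                % tail: the requirement goes up and the run resets
      tails = tails + 1;
      headRun = 0;
    end
  end
end

empirical = wins / trials;
printf("Empirical probability of winning: %.4f\n", empirical);
```

With 20000 trials the standard error is about $0.003$, so the estimate should land close to $0.711$ without matching it exactly.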

The winning probability is the value $1-\varphi \left(1/2\right)$, where $\varphi$ is the Euler function

$\varphi \left(q\right)=\prod _{k=1}^{\infty }\left(1-{q}^{k}\right).$

This function is a basic example of a relation between combinatorics and analysis. The coefficient $p\left(k\right)$ of ${q}^{k}$ in the formal power series expansion of $1/\varphi \left(q\right)$ is the number of partitions of $k$.
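To make the connection with partitions concrete, the following Octave sketch expands $1/\varphi \left(q\right)={\prod }_{k\ge 1}1/\left(1-{q}^{k}\right)$ as a power series truncated at degree $K$. Multiplying a series by $1/\left(1-{q}^{k}\right)$ amounts to the in-place update "coefficient of ${q}^{d}$ plus coefficient of ${q}^{d-k}$", which is the standard counting recursion for partitions. The truncation degree `K = 10` and the variable names are illustrative assumptions.

```octave
% Coefficients of 1/phi(q) = prod_{k>=1} 1/(1 - q^k) up to degree K.
K = 10;                      % truncation degree (illustrative choice)
c = zeros(1, K+1);
c(1) = 1;                    % c(d+1) holds the coefficient of q^d

for k = 1:K                  % factors with k > K do not affect degrees <= K
  for d = k:K
    % Multiply the current series by 1/(1 - q^k).
    c(d+1) = c(d+1) + c(d+1-k);
  end
end

disp(c)   % partition numbers p(0), p(1), ..., p(K): 1 1 2 3 5 7 11 15 22 30 42
```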

#### Sources

This section is adapted from: Heads or Tails, by Emmanuel Lesigne, Student Mathematical Library Volume 28, American Mathematical Society, Providence, 2005, Section 1.2 and Chapter 4 [3]. The first example is heavily adapted from the weekly “Riddler” column of January 20, 2017 from the website fivethirtyeight.com. The second example is adapted from the weekly “Riddler” column of August 3, 2018, “Where on Earth is the Riddler?”, from the same website.

_______________________________________________________________________________________________

### Algorithms, Scripts, Simulations

#### Algorithm

The following Octave code is inefficient in the sense that it generates far more coin flips per trial than it needs. However, code that generates exactly the number of flips needed on each trial would probably take more lines, so the simpler, less efficient approach is used here.

#### Scripts

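    pkg load statistics   % provides binopdf and binocdf in Octave

    p = 0.5;              % probability of heads
    n = 500;              % flips generated per trial, far more than needed
    trials = 2000;

    victory = 10;         % number of heads needed to win

    headsTails = ( rand(n,trials) <= p );
    % The source listing omits the next four lines; they are reconstructed
    % here from the variable names used below. A takes the odd-numbered
    % flips and B the even-numbered flips.
    headsTailsA = headsTails(1:2:n, :);
    headsTailsB = headsTails(2:2:n, :);
    totalHeadsA = cumsum(double(headsTailsA));
    totalHeadsB = cumsum(double(headsTailsB));

    winsA = zeros(1,trials);

    for j = 1:trials
        % A wins if A reaches the victory count in no more of A's own flips than B needs.
        winsA(1,j) = ( min(find(totalHeadsA(:,j) == victory)) <= min(find(totalHeadsB(:,j) == victory)) );
    endfor
    empirical = sum(winsA)/trials;

    nRange = [9:40];      % truncate the infinite sum at j = 40
    A = binopdf(9,nRange,1/2) * (1/2);
    B = binocdf(9,nRange,1/2);
    analytic = dot(A,B);

    disp("The empirical probability is:")
    disp(empirical)
    disp("The approximation to the analytic probability is:")
    disp(analytic)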

_______________________________________________________________________________________________

### Problems to Work for Understanding

1. Solve the example problem for the cases of winning with $2,3,4,\dots ,9$ heads.
2. Write a simulation of the coin-flipping game of the example. Experimentally determine the probability of A winning in the cases of winning with $1,2,3,\dots ,10$ heads.
3. Draw a graph of the probability of A winning versus the number of heads required to win.
4. Write a script to approximately evaluate
$1-\prod _{T=0}^{\infty }\left(1-\frac{1}{{2}^{T+1}}\right)\approx 0.711.$

__________________________________________________________________________

### References

[1]   Leo Breiman. Probability. SIAM, 1992.

[2]   William Feller. An Introduction to Probability Theory and Its Applications, volume I. John Wiley and Sons, third edition, 1973.

[3]   Emmanuel Lesigne. Heads or Tails: An Introduction to Limit Theorems in Probability, volume 28 of Student Mathematical Library. American Mathematical Society, 2005.

__________________________________________________________________________

1. Virtual Laboratories in Probability and Statistics > Binomial.
2. Weisstein, Eric W. “Binomial Distribution.” From MathWorld–A Wolfram Web Resource.
