Steven R. Dunbar
Department of Mathematics
203 Avery Hall
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466

Topics in
Probability Theory and Stochastic Processes
Steven R. Dunbar

__________________________________________________________________________

Local Limit Theorems

_______________________________________________________________________

Note: These pages are prepared with MathJax. MathJax is an open source JavaScript display engine for mathematics that works in all browsers. See http://mathjax.org for details on supported browsers, accessibility, copy-and-paste, and other features.

_______________________________________________________________________________________________

Rating

Mathematicians Only: prolonged scenes of intense rigor.

_______________________________________________________________________________________________

Section Starter Question

Consider a binomial probability value for a large value of the binomial parameter $n$. How would you approximate this probability value? Would you expect this approximation to be uniformly good for all values of $k$? Explain why or why not.

_______________________________________________________________________________________________

Key Concepts

1. This section gives two estimates of binomial probability values for a large value of the binomial parameter $n$ in terms of the normal density. Each estimate is uniform in $k$. These estimates are two forms of the Local Limit Theorem.

__________________________________________________________________________

Vocabulary

1. A local limit theorem describes how the probability mass function of a sum of independent discrete random variables at a particular value approaches the normal density.

__________________________________________________________________________

Mathematical Ideas

Introduction

A local limit theorem describes how the probability mass function of a sum of independent discrete random variables approaches the normal density. We observe that the histogram of a sum of independent Bernoulli random variables resembles the normal density. From the Central Limit Theorem, we see that in standard units the area under one bar of a binomial histogram may be approximated by the corresponding area under a standard normal density. Theorems that compare the probability of a discrete random variable, viewed as the area under a bar of its histogram, to the area under a normal density are often called local limit theorems. This is illustrated in Figure 1.
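This comparison is easy to check numerically. The following is a minimal sketch, with illustrative parameters $n=100$, $p=1∕2$, $k=55$, comparing one bar of the binomial histogram with the area under the matching normal density over that bar:

```python
# Compare one binomial bar P[S_n = k] with the area under the normal
# density over [k - 1/2, k + 1/2] in actual (not standard) units.
from math import comb, erf, sqrt

n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

def binom_pmf(k):
    # exact binomial probability mass at k
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x):
    # standard normal cumulative distribution function
    return 0.5 * (1 + erf(x / sqrt(2)))

k = 55
bar = binom_pmf(k)
area = normal_cdf((k + 0.5 - mu) / sigma) - normal_cdf((k - 0.5 - mu) / sigma)
print(bar, area)
```

The two printed values agree to several decimal places, which is the local phenomenon the theorems below make precise.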

In a way, Pascal laid the foundation for the local limit theorem when he formulated the binomial probability distribution for a Bernoulli random variable with $p=1∕2=q$. James Bernoulli generalized the distribution to the case where $p\ne 1∕2$. De Moivre proved the ﬁrst real local limit theorem for the case $p=1∕2=q$ in essentially the form of Lemma 9 in The de Moivre-Laplace Central Limit Theorem. Laplace provided a correct proof for the case with $p\ne 1∕2$. De Moivre then used the local limit theorem to add up the probabilities that ${S}_{n}$ is in an interval of length of order $\sqrt{n}$ to prove the Central Limit Theorem. See Lemma 10 and following in The de Moivre-Laplace Central Limit Theorem. Khintchine, Lyapunov, and Lindeberg proved much more general versions of the Central Limit Theorem using characteristic functions and Fourier transform methods. Historically, the original Local Limit Theorem was overshadowed by the Central Limit Theorem, and forgotten until its revival by Gnedenko in 1948, [1].

First Form of Local Limit Theorem

Recall that ${X}_{i}$ is a Bernoulli random variable taking on the value $1$ or $0$ with probability $p$ or $1-p$ respectively. Then

${S}_{n}=\sum _{i=1}^{n}{X}_{i}$

is a binomial random variable indicating the number of successes in a composite experiment of $n$ independent trials.

In this section, we study the size of ${ℙ}_{n}\left[{S}_{n}=k\right]$ as $n$ approaches inﬁnity. We will give an estimate uniform in $k$ that is a form of the local limit theorem.

Theorem 1 (Local Limit Theorem, First Form).

${ℙ}_{n}\left[{S}_{n}=k\right]=\frac{1}{\sqrt{2\pi p\left(1-p\right)n}}\left(exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)+o\left(1\right)\right).$

uniformly for all $k\in ℤ$.

Remark. Note that this is stronger than Lemma 9, the de Moivre-Laplace Binomial Point Mass Limit in the section on the de Moivre-Laplace Central Limit Theorem. It is stronger in that here the estimate is uniform for all $k\in ℤ$ instead of just an interval of order $\sqrt{n}$ around the mean.
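Both aspects of the theorem can be observed numerically: the estimate is accurate near the mean, and far from the mean both sides are negligible, so the uniform statement costs nothing there. A sketch with illustrative parameters:

```python
# Compare the exact binomial probability with the First Form estimate
# (n, p, and the values of k are chosen only for illustration).
from math import comb, exp, pi, sqrt

n, p = 500, 0.3
c = 1 / sqrt(2 * pi * p * (1 - p) * n)

def exact(k):
    # exact binomial probability mass at k
    return comb(n, k) * p**k * (1 - p)**(n - k)

def estimate(k):
    # normal-density estimate from the First Form
    return c * exp(-(k - n * p)**2 / (2 * n * p * (1 - p)))

for k in (140, 150, 160, 300):
    print(k, exact(k), estimate(k))
```

Near the mean $np=150$ the two columns agree closely; at $k=300$ both are vanishingly small.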

Proof.

1. Take $t=\frac{7}{12}$. Note that $\frac{1}{2}<\frac{7}{12}<\frac{2}{3}$.

(a) Proposition 1 (the Optimization Extension of the Central Limit Theorem) in The Moderate Deviations Result says
${ℙ}_{n}\left[{S}_{n}=k\right]=\frac{1}{\sqrt{2\pi p\left(1-p\right)n}}exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)\left(1+{\delta }_{n}\left(k\right)\right),$

where

$\underset{n\to \infty }{lim}\underset{|k-np|<{n}^{7∕12}}{max}|{\delta }_{n}\left(k\right)|=0.$

(b) Rewriting this, we have $\begin{array}{llll}\hfill {ℙ}_{n}\left[{S}_{n}=k\right]& =\frac{1}{\sqrt{2\pi p\left(1-p\right)n}}exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & \phantom{\rule{2em}{0ex}}+\frac{1}{\sqrt{2\pi p\left(1-p\right)}}exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)\frac{{\delta }_{n}\left(k\right)}{\sqrt{n}},\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

where the last term is ${o}_{u}\left({n}^{-1∕2}\right)$ uniformly in $k$ when $|k-np|<{n}^{7∕12}$. Since

$\frac{1}{\sqrt{2\pi p\left(1-p\right)}}exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)\le 1$

we can simplify this to

$\begin{array}{llll}\hfill {ℙ}_{n}\left[{S}_{n}=k\right]& =\frac{1}{\sqrt{2\pi p\left(1-p\right)n}}exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)+\frac{{\delta }_{n}\left(k\right)}{\sqrt{n}}.\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$
2. Theorem 8.1 (the Moderate Deviations Theorem) says that if we take ${a}_{n}=\frac{{n}^{1∕12}}{\sqrt{p\left(1-p\right)}}$, then $\underset{n\to \infty }{lim}{a}_{n}=+\infty$ and $\underset{n\to \infty }{lim}{a}_{n}{n}^{-1∕6}=0$. Thus, $\begin{array}{llll}\hfill {ℙ}_{n}\left[{S}_{n}\ge np+{n}^{7∕12}\right]& ={ℙ}_{n}\left[\frac{{S}_{n}}{n}-p\ge {n}^{-5∕12}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & ={ℙ}_{n}\left[\frac{{S}_{n}}{n}-p\ge \sqrt{p\left(1-p\right)}\cdot \frac{{n}^{1∕12}∕\sqrt{p\left(1-p\right)}}{{n}^{1∕2}}\right]\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & \sim \frac{\sqrt{p\left(1-p\right)}}{{n}^{1∕12}\sqrt{2\pi }}\cdot exp\left(\frac{-{n}^{1∕6}}{2p\left(1-p\right)}\right)\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

where the last asymptotic limit comes from the Moderate Deviations Theorem on $|k-np|<{n}^{7∕12}$.

3. The asymptotic limit implies a weaker estimate:
${ℙ}_{n}\left[{S}_{n}\ge np+{n}^{7∕12}\right]=o\left({n}^{-1∕2}\right).$

To see this, note that

$\frac{{ℙ}_{n}\left[{S}_{n}\ge np+{n}^{7∕12}\right]}{{n}^{-1∕2}}=\frac{{ℙ}_{n}\left[{S}_{n}\ge np+{n}^{7∕12}\right]}{\frac{\sqrt{p\left(1-p\right)}}{{n}^{1∕12}\sqrt{2\pi }}\cdot exp\left(\frac{-{n}^{1∕6}}{2p\left(1-p\right)}\right)}\cdot \left(\frac{\frac{\sqrt{p\left(1-p\right)}}{{n}^{1∕12}\sqrt{2\pi }}\cdot exp\left(\frac{-{n}^{1∕6}}{2p\left(1-p\right)}\right)}{{n}^{-1∕2}}\right).$

Note that the ﬁrst factor goes to $1$ by step 2. The second factor is a constant multiple of ${n}^{5∕12}exp\left(\frac{-{n}^{1∕6}}{2p\left(1-p\right)}\right)$, which goes to $0$ because the exponential decays faster than any power of $n$ grows. This estimate is uniform in $k$ for $k-np\ge {n}^{7∕12}$.

4. Note that for $k-np\ge {n}^{7∕12}$ we have ${\left(k-np\right)}^{2}\ge {n}^{14∕12}$, so
$exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)\le exp\left(\frac{-{n}^{14∕12}}{2np\left(1-p\right)}\right)=exp\left(\frac{-{n}^{1∕6}}{2p\left(1-p\right)}\right)$

so $exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)=o\left(1\right)$ uniformly for such $k$.

5. Using step 4, we see that
$\frac{1}{\sqrt{2\pi p\left(1-p\right)n}}exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)=o\left({n}^{-1∕2}\right)$

uniformly in $k$ with $k-np\ge {n}^{7∕12}$.

6. Make step 3 look the same as step 1(b) by putting step 5 into step 3 without disturbing the estimate: for $k-np\ge {n}^{7∕12}$, steps 3 and 5 show that both ${ℙ}_{n}\left[{S}_{n}=k\right]$ and the normal-density term are $o\left({n}^{-1∕2}\right)$, so
${ℙ}_{n}\left[{S}_{n}=k\right]=\frac{1}{\sqrt{2\pi p\left(1-p\right)n}}exp\left(\frac{-{\left(k-np\right)}^{2}}{2np\left(1-p\right)}\right)+o\left({n}^{-1∕2}\right).$

Finally, factor out $\frac{1}{\sqrt{2\pi p\left(1-p\right)n}}$ and the proof is ﬁnished. □
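The decay used in step 3 of the proof can be seen numerically. A minimal sketch, assuming $p=1∕2$ for concreteness, evaluates ${n}^{5∕12}exp\left(-{n}^{1∕6}∕\left(2p\left(1-p\right)\right)\right)$ for increasing $n$; the decay is slow at first, since ${n}^{1∕6}$ grows slowly, but eventually overwhelming:

```python
# n**(5/12) * exp(-n**(1/6) / (2 p (1 - p))) -> 0 as n -> infinity,
# illustrated at p = 1/2 for a few widely spaced values of n.
from math import exp

p = 0.5
vals = [n**(5 / 12) * exp(-n**(1 / 6) / (2 * p * (1 - p)))
        for n in (10**6, 10**12, 10**18)]
print(vals)
```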

The Second Form of the Local Limit Theorem

Recall that ${Y}_{1},{Y}_{2},\dots$ is a sequence of independent random variables, each taking the value $1$ with probability $p$, from “heads” on the $i$th coin toss, and the value $-1$ with probability $1-p$, from “tails”. Let ${H}_{n}$ and ${L}_{n}$ be the number of heads and tails respectively in $n$ ﬂips. Then ${T}_{n}={\sum }_{i=1}^{n}{Y}_{i}={H}_{n}-{L}_{n}=2{S}_{n}-n$ counts the diﬀerence between the number of heads and tails, an excess of heads if positive. The second form of the local limit theorem is useful for estimating the probability that ${T}_{n}$ takes a value close to its average $\left(2p-1\right)n$.

Remark. The following version of Stirling’s Formula follows from the statement of the First Form of the Local Limit Theorem. However, note that the proof of the Local Limit Theorem uses the Moderate Deviations Theorem. The proof of the Moderate Deviations Theorem uses the Optimization Extension of the de Moivre-Laplace Central Limit Theorem. Step 1 of the proof of the Optimization Extension uses Stirling’s Formula. So this is not a new proof of Stirling’s Formula for binomial coeﬃcients; it is a long way to derive the asymptotics for binomial coeﬃcients from the usual Stirling Formula.

Corollary 1.

$\left(\genfrac{}{}{0.0pt}{}{n}{k}\right)=\sqrt{\frac{2}{\pi }}\frac{{2}^{n}}{\sqrt{n}}exp\left(-\frac{2}{n}{\left(k-n∕2\right)}^{2}+o\left(1\right)\right)$

uniformly for $k\in ℤ$.

Proof. Use $p=1∕2$ in the First Form of the Local Limit Theorem. □
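On a logarithmic scale the corollary can be checked directly against exact binomial coeﬃcients. A sketch with illustrative values of $n$ and $k$ (working with logarithms avoids overflow for large $n$):

```python
# Compare ln C(n, k) with the logarithm of the corollary's approximation
# sqrt(2/pi) * 2**n / sqrt(n) * exp(-(2/n) (k - n/2)**2).
from math import comb, log, pi, sqrt

n = 1000
pairs = {}
for k in (480, 500, 520):
    exact = log(comb(n, k))
    approx = (log(sqrt(2 / pi)) + n * log(2) - 0.5 * log(n)
              - (2 / n) * (k - n / 2)**2)
    pairs[k] = (exact, approx)
    print(k, exact, approx)
```

The logarithms agree to within a small absolute error, consistent with the additive $o\left(1\right)$ in the exponent.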

Theorem 2 (Local Limit Theorem, Second Form). Set ${T}_{n}=2{S}_{n}-n$, then

${ℙ}_{n}\left[{T}_{n}=k\right]=\sqrt{\frac{2}{\pi }}{\left(\frac{p}{1-p}\right)}^{k∕2}\frac{{\left(2\sqrt{p\left(1-p\right)}\right)}^{n}}{\sqrt{n}}\left(exp\left(\frac{-{k}^{2}}{2n}\right)+o\left(1\right)\right)$

uniformly for $k\in ℤ$ such that $n+k$ is even.

Proof.

Remark. Note that the First Form and the Second Form of the Local Limit Theorem say the same thing in the symmetric case $p=1∕2$.
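Since ${T}_{n}=k$ exactly when ${S}_{n}=\left(n+k\right)∕2$, the Second Form can be checked against the exact binomial probability. A sketch with illustrative parameters (note $n+k$ must be even):

```python
# Compare exact P[T_n = k] = P[S_n = (n + k)/2] with the Second Form
# estimate (parameters chosen only for illustration).
from math import comb, exp, pi, sqrt

n, p = 1000, 0.3
q = 1 - p
k = 10                       # n + k is even
exact = comb(n, (n + k) // 2) * p**((n + k) // 2) * q**((n - k) // 2)
front = sqrt(2 / pi) * (p / q)**(k / 2) * (2 * sqrt(p * q))**n / sqrt(n)
approx = front * exp(-k**2 / (2 * n))
print(exact, approx)
```

Even though both probabilities are astronomically small here (the mean of ${T}_{n}$ is $\left(2p-1\right)n=-400$), the estimate tracks the exact value closely, illustrating the uniformity in $k$.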

Corollary 2. Let $K$ be a ﬁxed ﬁnite subset of $ℤ$ that contains at least one even number and one odd number. Then

$\underset{n\to \infty }{lim}\frac{1}{n}ln\left({ℙ}_{n}\left[{T}_{n}\in K\right]\right)=ln\left(2\sqrt{p\left(1-p\right)}\right).$

Proof. The Second Form of the Local Limit Theorem says that the probability that ${T}_{n}$ is in a ﬁxed ﬁnite subset of $ℤ$ decreases exponentially as $n$ approaches inﬁnity. □
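The logarithmic rate can be observed directly. A sketch with illustrative parameters; only the elements $k$ of $K$ with $n+k$ even contribute to the sum:

```python
# (1/n) ln P[T_n in K] approaches ln(2 sqrt(p(1-p))) as n grows.
from math import comb, log, sqrt

n, p = 1000, 0.3
q = 1 - p
K = (-1, 0, 1)               # contains even and odd integers
prob = sum(comb(n, (n + k) // 2) * p**((n + k) // 2) * q**((n - k) // 2)
           for k in K if (n + k) % 2 == 0)
rate = log(prob) / n
print(rate, log(2 * sqrt(p * q)))
```

The two printed rates already agree to about two decimal places at $n=1000$; the gap shrinks like the $\frac{1}{n}ln$ of the polynomial prefactor.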

Discussion and Extensions

As the Second Form of the Local Limit Theorem indicates, stating a Local Limit Theorem for sums of random variables more general than Bernoulli random variables takes some care. For example, if the summands ${X}_{1},{X}_{2},\dots$ are all even valued, there is no way the sum ${X}_{1}+\cdots +{X}_{n}$ can be odd, so a Local Limit Theorem will require at least an additional hypothesis, while the Central Limit Theorem will still hold.

Deﬁnition. We say that $h$ is the maximal span of a density $f$ if $h$ is the largest integer such that the support of $f$ is contained in the aﬃne subset $\left\{b+kh:k=\dots ,-2,-1,0,1,2,\dots \phantom{\rule{0.3em}{0ex}}\right\}$ of $ℤ$ for some $b$.
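For a density supported on a finite set of integers, the maximal span is the greatest common divisor of the differences between support points. A hypothetical helper illustrating this (the function name and interface are not from the text, and at least two support points are assumed):

```python
# Maximal span of a density with the given finite, sorted integer support:
# the gcd of the differences from the smallest support point b.
from functools import reduce
from math import gcd

def maximal_span(support):
    b = support[0]
    return reduce(gcd, (s - b for s in support[1:]))

print(maximal_span([0, 2, 4]))   # 2
print(maximal_span([1, 4, 7]))   # 3
print(maximal_span([0, 2, 3]))   # 1
```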

Let $\varphi \left(x\right)=\frac{1}{\sqrt{2\pi }}{e}^{-{x}^{2}∕2}$ be the standard normal density.

Theorem 3 (Gnedenko’s Local Limit Theorem). Let ${X}_{1},{X}_{2},\dots$ be independent random variables with identical density $f$ with ﬁnite mean $\mu$ and ﬁnite variance ${\sigma }^{2}$ and maximal span equal to $1$. Let ${S}_{n}={X}_{1}+\cdots +{X}_{n}$. Then

$\sqrt{n{\sigma }^{2}}\left|ℙ\left[{S}_{n}=k\right]-\frac{1}{\sqrt{n{\sigma }^{2}}}\varphi \left(\left(k-n\mu \right)∕\sqrt{n{\sigma }^{2}}\right)\right|\to 0$

uniformly in $k$ as $n\to \infty$.

Proof. This was proved by B. V. Gnedenko in 1948, [1], using characteristic functions. □
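Gnedenko’s theorem can be checked by exact convolution for a simple density with maximal span $1$, here the illustrative uniform density on $\left\{0,1,2\right\}$ with $\mu =1$ and ${\sigma }^{2}=2∕3$:

```python
# sup over k of | sqrt(n sigma^2) P[S_n = k] - phi((k - n mu)/sqrt(n sigma^2)) |
# is small for large n, for the uniform density on {0, 1, 2}.
from math import exp, pi, sqrt

f = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
mu, var = 1.0, 2 / 3

def convolve(g, h):
    # exact convolution of two integer-supported densities
    out = {}
    for a, pa in g.items():
        for b, pb in h.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

def phi(x):
    # standard normal density
    return exp(-x**2 / 2) / sqrt(2 * pi)

n = 200
dist = dict(f)
for _ in range(n - 1):
    dist = convolve(dist, f)

s = sqrt(n * var)
err = max(abs(s * pk - phi((k - n * mu) / s)) for k, pk in dist.items())
print(err)
```

The printed supremum is already small at $n=200$, consistent with the uniform convergence in the theorem.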

McDonald, [3], has more technical local limit theorems and further references to other generalizations.

The following example shows that any extension of the Local Limit Theorem to nonidentical random variables is complicated by the same problem of periodicity that already appears in the Second Form of the Local Limit Theorem above.

Example. Suppose the densities ${f}_{m}$ are ${f}_{m}\left(0\right)=1∕2$, ${f}_{m}\left(1\right)=1∕{2}^{m}$, ${f}_{m}\left(2\right)=1∕2-1∕{2}^{m}$. The ﬁrst random variable will be either $0$ or $1$, each with probability $1∕2$. For $m>1$, the random variables will “essentially” diﬀer by $2$. That is, the maximal span is “essentially” $2$, but Gnedenko’s theorem will not hold. In fact, consider the histogram of the convolution of the distributions ${f}_{1},\dots ,{f}_{20}$ in Figure 2. This histogram is the distribution of the sum of the ﬁrst $20$ of the random variables.

Note that the distribution overall resembles the normal distribution, as expected from the Lindeberg Central Limit Theorem. However, closer inspection shows that the values of the distribution at symmetric distances around the mean $20$ diﬀer. That is, the value of the distribution at $21$ is greater than the value of the distribution at $19$. This suggests that at least the uniform approach to the normal distribution at integer values will not hold.
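The convolution in the example is small enough to compute exactly. The following sketch builds the distribution of the sum of the twenty random variables and prints the values near the center, so the asymmetry at nearby integer points can be inspected directly:

```python
# Convolve the densities f_1, ..., f_20 of the example and inspect values
# at integer points around the center of the distribution.
def f(m):
    # f_m(0) = 1/2, f_m(1) = 1/2**m, f_m(2) = 1/2 - 1/2**m
    return {0: 0.5, 1: 0.5**m, 2: 0.5 - 0.5**m}

def convolve(g, h):
    # exact convolution of two integer-supported densities
    out = {}
    for a, pa in g.items():
        for b, pb in h.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

dist = f(1)
for m in range(2, 21):
    dist = convolve(dist, f(m))

print(dist[19], dist[20], dist[21])
```

The values at $19$ and $21$ are visibly unequal, which is the failure of symmetric, uniform convergence at integer points discussed above.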

Sources

This section is adapted from: Heads or Tails, by Emmanuel Lesigne, Student Mathematical Library Volume 28, American Mathematical Society, Providence, 2005, Chapter 9, [2]. The historical remarks and the generalizations of the Local Limit Theorem are adapted from [3].

_______________________________________________________________________________________________

Algorithms, Scripts, Simulations

Algorithm

The experiment is to ﬂip a coin $n$ times and to repeat the experiment $k$ times. Then check the empirical probability of a speciﬁc value against the estimate from the Local Limit Theorem, and compare the empirical logarithmic rate of growth to the predicted rate.

Scripts

R
p <- 0.5
n <- 10000
k <- 1000
upperintvalue <- 10
lowerintvalue <- 0

winLose <- 2 * (array( 0+(runif(n*k) <= p), dim=c(n,k))) - 1
# 0+ coerces Boolean to numeric
totals <- colSums(winLose)
# -n..2..n binomial rv sample, size k

stddev <- sqrt(p*(1-p)*n)
prob <- sum( 0+(totals == upperintvalue) )/k

f1 <- sqrt(2/pi)
f2 <- (p/(1-p))^(upperintvalue/2)
f3 <- ((2*sqrt(p*(1-p)))^n)/sqrt(n)
f4 <- exp(-upperintvalue^2/(2*n))
theoretical <- f1*f2*f3*f4
cat(sprintf("Empirical probability: %f \n", prob ))
cat(sprintf("Local Limit Theorem estimate: %f \n", theoretical))

set <- (totals >= lowerintvalue) & (totals <= upperintvalue)
logintervalprobrate <- (1/n) * log( sum(set)/k )
theoreticallograte <- log( 2*sqrt(p*(1-p)) )
cat(sprintf("Empirical log probability rate: %f \n", logintervalprobrate))
cat(sprintf("Theoretical log rate: %f \n", theoreticallograte))
Octave
p = 0.5;
n = 10000;
k = 1000;

upperintvalue = 10;
lowerintvalue = 0;

winLose = 2 * (rand(n,k) <= p) - 1;
totals = sum(winLose);
# -n..2..n binomial rv sample, size k

stddev = sqrt(p*(1-p)*n);
prob = sum( totals == upperintvalue )/k;

f1 = sqrt(2/pi);
f2 = (p/(1-p))^(upperintvalue/2);
f3 = ((2*sqrt(p*(1-p)))^n)/sqrt(n);
f4 = exp(-upperintvalue^2/(2*n));
theoretical = f1*f2*f3*f4;

disp("Empirical probability:"), disp( prob )
disp("Local Limit Theorem estimate:"), disp( theoretical )

set = ( totals >= lowerintvalue ) & ( totals <= upperintvalue );
logintervalprobrate = ( 1/n ) * log( sum(set)/k );
theoreticallograte = log( 2*sqrt(p*(1-p)) );

disp("Empirical log probability rate:"), disp( logintervalprobrate )
disp("Theoretical log rate:"), disp( theoreticallograte )
Perl
use PDL;
use PDL::NiceSlice;
use PDL::Constants qw(PI);

$p = 0.5;
$n = 10000;
$k = 1000;

$upperintvalue = 10;
$lowerintvalue = 0;

# note order of dims!!
$winLose = 2 * ( random( $k, $n ) <= $p ) - 1;
# -n..2..n binomial r.v. sample, size k
# note transpose, PDL likes x (row) direction for implicitly threaded operations
$totals = $winLose->transpose->sumover;

$stddev = sqrt( $p * ( 1 - $p ) * $n );
$prob   = ( ( $totals == $upperintvalue )->sumover ) / $k;

$f1 = sqrt( 2 / PI );
$f2 = ( $p / ( 1 - $p ) )**( $upperintvalue / 2 );
$f3 = ( ( 2 * sqrt( $p * ( 1 - $p ) ) )**$n ) / sqrt($n);
$f4 = exp( -$upperintvalue**2 / ( 2 * $n ) );
$theoretical = $f1 * $f2 * $f3 * $f4;

print "Empirical probability: ",        $prob,        "\n";
print "Local Limit Theorem estimate: ", $theoretical, "\n";

$set = ( $totals >= $lowerintvalue ) & ( $totals <= $upperintvalue );
$logintervalprobrate = ( 1 / $n ) * log( sum($set) / $k );
$theoreticallograte  = log( 2 * sqrt( $p * ( 1 - $p ) ) );

print "Empirical log probability rate: ", $logintervalprobrate, "\n";
print "Theoretical log rate: ",           $theoreticallograte,  "\n";
SciPy
import scipy

p = 0.5
n = 10000
k = 1000

upperintvalue = 10
lowerintvalue = 0

winLose = 2 * (scipy.random.random((n,k)) <= p) - 1
# Note Booleans True for Heads and False for Tails
totals = scipy.sum(winLose, axis = 0)
# -n..2..n binomial r.v. sample, size k
# Note how Booleans act as 0 (False) and 1 (True)

stddev = scipy.sqrt( p * ( 1 - p ) * n )

prob = (scipy.sum( totals == upperintvalue )).astype('float')/k
# Note the casting of integer type to float to get float

f1          = scipy.sqrt( 2 / scipy.pi )
f2          = ( p / ( 1 - p ) )**( upperintvalue / 2 )
f3          = ( ( 2 * scipy.sqrt( p * ( 1 - p ) ) )**n ) / scipy.sqrt(n)
f4          = scipy.exp( -( upperintvalue**2 / ( 2.0 * n ) ) )
# Note the use of 2.0 to force integer arith to floating point
theoretical = f1 * f2 * f3 * f4

print "Empirical probability: ", prob
print "Local Limit Theorem estimate: ", theoretical

set = ( totals >= lowerintvalue ) & ( totals <= upperintvalue )
logintervalprobrate = ( 1.0 / n ) * scipy.log( (scipy.sum(set)).astype('float') / k )
# Note the use of 1.0 to force floating point division
theoreticallograte = scipy.log( 2 * scipy.sqrt( p * ( 1 - p ) ) )
print "Empirical log probability rate: ", logintervalprobrate
print "Theoretical log rate: ", theoreticallograte

__________________________________________________________________________

Problems to Work for Understanding

1. Modify the value of $p$ in a script, say to $p=0.51$, and investigate the convergence and the rate of convergence to the normal density function.
2. Modify the value of $n$ in a script to a sequence of values and investigate the rate of convergence of the logarithmic rate.

__________________________________________________________________________

References

[1]   B. V. Gnedenko. On the local limit theorem in the theory of probability. Uspekhi Mat. Nauk, 3:187–194, 1948.

[2]   Emmanuel Lesigne. Heads or Tails: An Introduction to Limit Theorems in Probability, volume 28 of Student Mathematical Library. American Mathematical Society, 2005.

[3]   D. R. McDonald. The local limit theorem: A historical perspective. Journal of the Iranian Statistical Society, 4(2):73–86, 2005.

