Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466

Topics in
Probability Theory and Stochastic Processes
Steven R. Dunbar

__________________________________________________________________________

Series with Random Signs

_______________________________________________________________________

Note: These pages are prepared with MathJax. MathJax is an open source JavaScript display engine for mathematics that works in all browsers. See http://mathjax.org for details on supported browsers, accessibility, copy-and-paste, and other features.

_______________________________________________________________________________________________

Rating

Mathematicians Only: prolonged scenes of intense rigor.

_______________________________________________________________________________________________

Section Starter Question

What is the difference between conditional convergence and absolute convergence of a series? Give an example of a conditionally convergent series and an absolutely convergent series.

_______________________________________________________________________________________________

Key Concepts

  1. The harmonic series \(\sum_{n=1}^{\infty} \frac{1}{n}\) diverges and the alternating harmonic series \(\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\) converges to \(\ln 2\).
  2. For independent random variables \(X_1, X_2, \dots\), the sequence of partial sums \(\sum_{j=1}^{n} X_j\) converges in probability if and only if it converges almost surely.
  3. Let \(Y_k\) be a random variable taking on the value \(+1\) with probability \(\frac{1}{2}\) or \(-1\) with probability \(\frac{1}{2}\). If \(\sum_{k=1}^{\infty} a_k^2\) converges, then \(\sum_{k=1}^{\infty} Y_k a_k\) converges almost surely.

__________________________________________________________________________

Vocabulary

  1. The harmonic series \(\sum_{n=1}^{\infty} \frac{1}{n}\) diverges and the alternating harmonic series \(\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\) converges to \(\ln 2\).
  2. Let \(Y_k\) be a random variable taking on the value \(+1\) with probability \(\frac{1}{2}\) or \(-1\) with probability \(\frac{1}{2}\). The random harmonic series is
    \[\sum_{n=1}^{\infty} \frac{Y_n}{n}.\]

__________________________________________________________________________

Mathematical Ideas

Alternating Harmonic Series and Rearrangements

The harmonic series \(\sum_{n=1}^{\infty} \frac{1}{n}\) diverges and the alternating harmonic series \(\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\) converges to \(\ln 2\).
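Both statements are easy to check numerically. A brief Python sketch (the scripts later in this section are in R):

```python
import math

# Partial sums of the harmonic and alternating harmonic series.
n = 100000
harmonic = sum(1.0 / k for k in range(1, n + 1))
alternating = sum((-1) ** (k + 1) / k for k in range(1, n + 1))

# The harmonic partial sum grows like ln(n) + gamma (the Euler-Mascheroni
# constant, about 0.5772), while the alternating partial sum approaches ln 2.
print(harmonic - math.log(n))   # close to 0.5772
print(alternating)              # close to ln 2 = 0.6931...
```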

As an aside, from a theorem due to Riemann: if \(\alpha\) and \(\beta\) are extended real numbers with \(\alpha \le \beta\), then there exists a rearrangement of the alternating harmonic series \(\sum_{n=1}^{\infty} a_n\) with partial sums \(s_n\) such that

\[\liminf_{n\to\infty} s_n = \alpha, \qquad \limsup_{n\to\infty} s_n = \beta.\]

For example, consider the rearrangement

\[1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots\]

in which two positive odd-denominator terms are always followed by one negative even-denominator term. Since

\[\frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k} = \frac{8k-3}{32k^3 - 32k^2 + 6k} > 0,\]

then \(s_3 < s_6 < s_9 < \cdots\), where \(s_n\) is the \(n\)th partial sum of the rearranged series. Then \(\limsup s_n > \frac{5}{6}\). In fact, using symbolic algebra software,

\[\sum_{k=1}^{\infty} \frac{8k-3}{32k^3 - 32k^2 + 6k} = \frac{3\ln 2}{2} \approx 1.03972.\]
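The grouped form of the rearrangement makes this sum easy to check numerically; a brief Python sketch:

```python
import math

# Partial sums of the grouped rearrangement: each group is
# 1/(4k-3) + 1/(4k-1) - 1/(2k) = (8k-3)/(32k^3 - 32k^2 + 6k).
K = 200000
s = sum((8 * k - 3) / (32 * k**3 - 32 * k**2 + 6 * k) for k in range(1, K + 1))

print(s)   # approaches 3*ln(2)/2 = 1.03972...
```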

The dichotomy between the divergence of the harmonic series and the convergence of the alternating harmonic series along with the potential finite sum of rearrangements of the alternating harmonic series suggests consideration of random harmonic series where coin flips determine each sign.

Random Series

Let \(Y_k\) be a random variable taking on the value \(+1\) with probability \(\frac{1}{2}\) or \(-1\) with probability \(\frac{1}{2}\). The random harmonic series is

\[\sum_{n=1}^{\infty} \frac{Y_n}{n}.\]

The convergence of the random harmonic series

\[\sum_{n=1}^{\infty} \frac{Y_n}{n}\]

is a tail event. Therefore, the probability of convergence is either 0 or 1. More generally, the question is to characterize the sequences \((a_n)\) such that

\[\sum_{n=1}^{\infty} Y_n a_n\]

converges almost surely. According to M. Kac, H. Steinhaus and N. Wiener posed this question independently in 1922. However, H. Rademacher had already provided the answer in a 1922 paper. Even more generally, consider the sequence of partial sums \(S_n = \sum_{k=1}^{n} X_k\) of independent random variables. The convergence is again a tail event. The basic question is to characterize series such that

\[\sum_{n=1}^{\infty} X_n\]

converges almost surely.

Lemma 1. Let \(S_1, S_2, \dots, S_N\) be successive sums of independent random variables such that \(\sup_{j\le N} \mathbb{P}\left[|S_N - S_j| > \alpha\right] = c < 1\). Then

\[\mathbb{P}\left[\sup_{j\le N} |S_j| > 2\alpha\right] \le \frac{1}{1-c}\,\mathbb{P}\left[|S_N| > \alpha\right]. \tag{1}\]

Remark. Breiman credits this lemma to Skorokhod. This inequality is like other inequalities in probability theory, for example, Kolmogorov’s Inequality for independent random variables such that \(\mathbb{E}\left[X_n\right] = 0\) and \(\mathbb{E}\left[X_n^2\right] < \infty\):

\[\mathbb{P}\left[\max_{k\le n}|S_k| > \epsilon\right] \le \frac{1}{\epsilon^2}\sum_{k=1}^{n}\mathbb{E}\left[X_k^2\right],\]

see [2, Theorem 5.3.1, page 118]. It is also like submartingale inequalities, [2, Theorem 9.4.1, page 331]. If \(X_n\) is a submartingale, then for each real \(\lambda\)

\[\lambda\,\mathbb{P}\left[\max_{1\le k\le n} X_k \ge \lambda\right] \le \int_{\{\max_{1\le k\le n} X_k \ge \lambda\}} X_n\,d\mathbb{P} \le \mathbb{E}\left[X_n^+\right].\]

Under the condition \(\mathbb{E}\left[X_n\right] = 0\), the sequence of sums is a martingale, so the submartingale inequality would apply, but \(\mathbb{P}\left[|S_N| > \alpha\right]\) is not immediately comparable to \(\int_{\{\max_{1\le k\le n} X_k \ge \lambda\}} X_n\,d\mathbb{P}\), so the resemblance is confined to the term \(\lambda\,\mathbb{P}\left[\max_{1\le k\le n} X_k \ge \lambda\right]\) and the lemma is a different inequality. Breiman [1, Proposition 5.13] has a slightly weaker submartingale inequality proved with the Optional Sampling Theorem: for \(\lambda > 0\)

\[\lambda\,\mathbb{P}\left[\max_{1\le k\le n} X_k \ge \lambda\right] \le \mathbb{E}\left[|X_n|\right].\]

Proof.

  1. For each sample point \(\omega\), let \(j^*(\omega)\) be the first \(j\) such that \(|S_j| > 2\alpha\). Then
    \[\mathbb{P}\left[|S_N| > \alpha,\ \sup_{j\le N}|S_j| > 2\alpha\right] = \sum_{j=1}^{N}\mathbb{P}\left[|S_N| > \alpha,\ j^* = j\right] \ge \sum_{j=1}^{N}\mathbb{P}\left[|S_N - S_j| \le \alpha,\ j^* = j\right].\]

    (See the problems for a justification of the last inequality.)

  2. The set \(\{j^* = j\}\) is in \(\sigma(X_1,\dots,X_j)\) and \(S_N - S_j\) is measurable with respect to \(\sigma(X_{j+1},\dots,X_N)\), so the last sum on the right above is
    \[\sum_{j=1}^{N}\mathbb{P}\left[|S_N - S_j| \le \alpha\right]\mathbb{P}\left[j^* = j\right].\]

  3. Use \(\mathbb{P}\left[|S_N - S_j| \le \alpha\right] \ge 1 - c\) to get
    \[\mathbb{P}\left[|S_N| > \alpha,\ \sup_{j\le N}|S_j| > 2\alpha\right] \ge (1-c)\sum_{j=1}^{N}\mathbb{P}\left[j^* = j\right] = (1-c)\,\mathbb{P}\left[\sup_{j\le N}|S_j| > 2\alpha\right].\]
  4. Finally
    \[\mathbb{P}\left[|S_N| > \alpha\right] \ge \mathbb{P}\left[|S_N| > \alpha,\ \sup_{j\le N}|S_j| > 2\alpha\right]\]

    so

    \[\mathbb{P}\left[|S_N| > \alpha\right] \ge (1-c)\,\mathbb{P}\left[\sup_{j\le N}|S_j| > 2\alpha\right].\]

    Dividing by \(1-c\) completes the proof. □
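Lemma 1 can be sanity-checked by simulation. The sketch below (in Python, with normally distributed increments as an illustrative assumption; the lemma needs only independence) estimates both sides of inequality (1) by Monte Carlo and computes \(c\) exactly from the normal distribution:

```python
import math
import random

random.seed(7)

# Monte Carlo sanity check of the inequality in Lemma 1, with N = 20
# partial sums of standard normal increments and alpha = 5.
N, alpha, trials = 20, 5.0, 20000

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# c = sup_{j <= N} P[|S_N - S_j| > alpha]; here S_N - S_j ~ Normal(0, N - j),
# and the j = N term contributes probability 0, so it is omitted.
c = max(2.0 * (1.0 - std_normal_cdf(alpha / math.sqrt(N - j)))
        for j in range(1, N))

hits_sup, hits_end = 0, 0
for _ in range(trials):
    s, s_max = 0.0, 0.0
    for _ in range(N):
        s += random.gauss(0.0, 1.0)
        s_max = max(s_max, abs(s))
    hits_sup += (s_max > 2 * alpha)
    hits_end += (abs(s) > alpha)

lhs = hits_sup / trials               # estimates P[sup_j |S_j| > 2*alpha]
rhs = (hits_end / trials) / (1 - c)   # estimates P[|S_N| > alpha] / (1 - c)
print(lhs, "<=", rhs)
```

For these parameters the inequality holds with considerable slack, as the proof’s crude union bound suggests it should.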

Theorem 2. For independent random variables \(X_1, X_2, \dots\), the sequence of partial sums \(\sum_{j=1}^{n} X_j\) converges in probability if and only if it converges almost surely.

Remark. Chung credits this theorem to P. Lévy.

Proof.

Since almost sure convergence implies convergence in probability, one direction of the implication is automatic.
  1. For the other direction, assume the sequence of partial sums converges in probability but does not converge almost surely.
  2. If a sequence \(s_n\) of numbers does not converge, then there exists an \(\epsilon > 0\) such that for every \(m\)
    \[\sup_{n>m}|s_n - s_m| > \epsilon,\]

    so if \(\sum_{k=1}^{n} X_k\) diverges with positive probability, then there exist an \(\epsilon > 0\) and a \(\delta > 0\) such that for every fixed \(m\),

    \[\mathbb{P}\left[\sup_{n>m}\left|\sum_{k=m+1}^{n} X_k\right| > \epsilon\right] \ge \delta.\]

  3. Using (1) from Lemma 1,
    \[\mathbb{P}\left[\sup_{m<n\le N}\left|\sum_{k=m+1}^{n} X_k\right| > \epsilon\right] \le \frac{1}{1 - C_{m,N}}\,\mathbb{P}\left[\left|\sum_{k=m+1}^{N} X_k\right| > \frac{\epsilon}{2}\right]\]

    where

    \[C_{m,N} = \sup_{m<n\le N}\mathbb{P}\left[\left|\sum_{k=n+1}^{N} X_k\right| > \frac{\epsilon}{2}\right].\]

  4. If \(\sum_{k=1}^{N} X_k\) is convergent in probability, then
    \[\sum_{k=1}^{N} X_k - \sum_{k=1}^{m} X_k = \sum_{k=m+1}^{N} X_k \to 0\]

    in probability. That is, as \(m, N \to \infty\),

    \[\mathbb{P}\left[\left|\sum_{k=m+1}^{N} X_k\right| > \frac{\epsilon}{2}\right] \to 0.\]

  5. Therefore \(C_{m,N} \to 0\). Taking \(N \to \infty\),
    \[\lim_{m\to\infty}\mathbb{P}\left[\sup_{n>m}\left|\sum_{k=m+1}^{n} X_k\right| > \epsilon\right] = 0.\]

    This contradicts step 2 and proves the other direction of the implication. □

Corollary 1. For independent random variables \(X_1, X_2, \dots\) with \(\mathbb{E}\left[X_k\right] = 0\) for all \(k\) and \(\sum_{k=1}^{\infty}\mathbb{E}\left[X_k^2\right] < \infty\), the series \(\sum_{k=1}^{\infty} X_k\) converges almost surely.

Remark. This corollary provides the convergence criterion for the general random signs for series problem stated at the start of this section.

Proof. Using the independence,

\[\mathbb{E}\left[\left(\sum_{k=m+1}^{n} X_k\right)^2\right] = \sum_{k=m+1}^{n}\mathbb{E}\left[X_k^2\right] \to 0 \quad \text{as } m, n \to \infty,\]

so the sequence of partial sums is Cauchy in second moment and therefore converges in second moment. Then by [1, equation 2.46], convergence in second moment implies convergence in probability. By Theorem 2, convergence in probability with these conditions implies almost sure convergence. □

Corollary 2. If \(\sum_{k=1}^{\infty} a_k^2\) converges, then \(\sum_{k=1}^{\infty} Y_k a_k\) converges almost surely.
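Corollary 2 with \(a_k = 1/k\) is the random harmonic series. A simulated sample path (a Python sketch; the section’s own scripts are in R) shows the partial sums settling down, consistent with almost sure convergence:

```python
import random

random.seed(42)

# One sample path of the random harmonic series.  The tail beyond n = 1000
# has standard deviation sqrt(sum_{k>1000} 1/k^2), roughly 0.03, so
# successive partial sums change very little.
signs = [1 if random.random() < 0.5 else -1 for _ in range(4000)]
partial = []
s = 0.0
for k, y in enumerate(signs, start=1):
    s += y / k
    partial.append(s)

print(partial[999], partial[3999])   # nearly equal
```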

The following is a partial converse of Theorem 2:

Theorem 3. Let \(X_1, X_2, \dots\) be independent random variables such that \(\mathbb{E}\left[X_k\right] = 0\) and \(|X_k| \le \alpha < \infty\) for all \(k\). Then the almost sure convergence of \(\sum_{k=1}^{n} X_k\) implies \(\sum_{k=1}^{\infty}\mathbb{E}\left[X_k^2\right] < \infty\).

Proof.

  1. For any \(\lambda > 0\), define \(n(\lambda)(\omega)\) by
    \[n(\lambda)(\omega) = \text{the first } n \text{ such that } \left|\sum_{k=1}^{n} X_k\right| > \lambda, \qquad n(\lambda)(\omega) = \infty \text{ if no such } n \text{ exists},\]

    so that \(n(\lambda)\) is an extended-integer-valued random variable.

  2. For any integer \(N\) and \(j < N\), consider
    \[\int_{\{n(\lambda)=j\}}\left(\sum_{k=1}^{N} X_k\right)^2 d\mathbb{P}.\]

  3. Since \(\{n(\lambda)=j\} \in \sigma(X_1,\dots,X_j)\), then by independence and since \(\mathbb{E}\left[X_k\right] = 0\) for all \(k\),
    \[\int_{\{n(\lambda)=j\}}\left(\sum_{k=1}^{N} X_k\right)^2 d\mathbb{P} = \int_{\{n(\lambda)=j\}}\left(\sum_{k=1}^{j} X_k\right)^2 d\mathbb{P} + \sum_{k=j+1}^{N}\int_{\{n(\lambda)=j\}} X_k^2\,d\mathbb{P}.\]

  4. On the set \(\{n(\lambda)=j\}\), \(\left|\sum_{k=1}^{j-1} X_k\right| \le \lambda\). Since \(|X_j| \le \alpha\), \(\left|\sum_{k=1}^{j} X_k\right| \le \lambda + \alpha\).
  5. By independence, for \(k > j\)
    \[\int_{\{n(\lambda)=j\}} X_k^2\,d\mathbb{P} = \mathbb{P}\left[n(\lambda)=j\right]\mathbb{E}\left[X_k^2\right].\]

  6. Using these two estimates,
    \[\int_{\{n(\lambda)=j\}}\left(\sum_{k=1}^{N} X_k\right)^2 d\mathbb{P} \le (\lambda+\alpha)^2\,\mathbb{P}\left[n(\lambda)=j\right] + \mathbb{P}\left[n(\lambda)=j\right]\sum_{k=1}^{N}\mathbb{E}\left[X_k^2\right].\]

  7. Sum this expression from \(j = 1\) to \(j = N\) to get
    \[\int_{\{n(\lambda)\le N\}}\left(\sum_{k=1}^{N} X_k\right)^2 d\mathbb{P} \le (\lambda+\alpha)^2\,\mathbb{P}\left[n(\lambda)\le N\right] + \mathbb{P}\left[n(\lambda)\le N\right]\sum_{k=1}^{N}\mathbb{E}\left[X_k^2\right].\]

  8. Also
    \[\int_{\{n(\lambda)>N\}}\left(\sum_{k=1}^{N} X_k\right)^2 d\mathbb{P} \le \lambda^2\,\mathbb{P}\left[n(\lambda)>N\right] \le (\lambda+\alpha)^2\,\mathbb{P}\left[n(\lambda)>N\right].\]

  9. Add this to the previous inequality; the left sides now integrate over the whole space and equal \(\sum_{k=1}^{N}\mathbb{E}\left[X_k^2\right]\) by independence, so
    \[\sum_{k=1}^{N}\mathbb{E}\left[X_k^2\right] \le (\lambda+\alpha)^2 + \mathbb{P}\left[n(\lambda)\le N\right]\sum_{k=1}^{N}\mathbb{E}\left[X_k^2\right]\]

    or

    \[\mathbb{P}\left[n(\lambda)>N\right]\sum_{k=1}^{N}\mathbb{E}\left[X_k^2\right] \le (\lambda+\alpha)^2.\]

  10. Letting \(N \to \infty\),
    \[\mathbb{P}\left[n(\lambda)=\infty\right]\sum_{k=1}^{\infty}\mathbb{E}\left[X_k^2\right] \le (\lambda+\alpha)^2.\]

  11. But since \(\sum_{k=1}^{\infty} X_k\) converges almost surely, there must exist a \(\lambda\) such that
    \[\mathbb{P}\left[\sup_{n}\left|\sum_{k=1}^{n} X_k\right| \le \lambda\right] > 0,\]

    implying \(\mathbb{P}\left[n(\lambda)=\infty\right] > 0\) and therefore \(\sum_{k=1}^{\infty}\mathbb{E}\left[X_k^2\right] < \infty\). □
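In the contrapositive, Theorem 3 says that for uniformly bounded, mean-zero, independent summands, divergence of \(\sum \mathbb{E}\left[X_k^2\right]\) rules out almost sure convergence. A hedged Python sketch with \(X_k = Y_k/\sqrt{k}\) (so \(\sum \mathbb{E}\left[X_k^2\right] = \sum 1/k\) diverges) shows the variance of the partial sums growing like \(H_n \approx \ln n\) rather than stabilizing:

```python
import math
import random

random.seed(3)

# X_k = Y_k / sqrt(k): bounded by 1, mean zero, but sum E[X_k^2] = sum 1/k
# diverges.  The partial sums S_n then have Var(S_n) = H_n, the n-th
# harmonic number, which grows like ln n.
paths, n = 2000, 1000
sums = []
for _ in range(paths):
    s = 0.0
    for k in range(1, n + 1):
        y = 1 if random.random() < 0.5 else -1
        s += y / math.sqrt(k)
    sums.append(s)

mean = sum(sums) / paths
sample_var = sum((s - mean) ** 2 for s in sums) / (paths - 1)
harmonic_n = sum(1.0 / k for k in range(1, n + 1))   # exact Var(S_n)
print(sample_var, harmonic_n)   # sample variance near H_1000 = 7.485...
```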

Analytic Proof

This subsection uses the analytic framework of the section Analytic Model, making essential use of Rademacher functions.

Theorem 4. If \(\sum_{k=1}^{\infty} a_k^2\) converges, then \(\sum_{k=1}^{\infty} Y_k a_k\) converges with probability 1.

Remark. The probability of the convergence of

\[\sum_{n=1}^{\infty} Y_n a_n\]

is equivalent to finding the measure on \([0,1)\) of the set of \(t\)’s for which

\[\sum_{n=1}^{\infty} a_n r_n(t)\]

converges. The proof below adapts the proof by Kac, which in turn follows Paley and Zygmund. The proof depends on two well-known theorems from analysis and function theory.
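The Rademacher functions can be realized concretely from binary digits. Assuming the standard convention \(r_k(t) = 1 - 2b_k(t)\), where \(b_k(t)\) is the \(k\)th binary digit of \(t\), the orthogonality used below can be checked numerically in a short Python sketch:

```python
# A concrete realization of the Rademacher functions (an assumed but
# standard convention): r_k(t) = +1 if the k-th binary digit of t is 0,
# and -1 if it is 1; then r_k(t) = r_1(2^(k-1) t) with r_1 = +1 on [0,1/2).
def rademacher(k, t):
    digit = int(t * 2 ** k) % 2
    return 1 - 2 * digit

# Check orthonormality on midpoints of a fine dyadic grid; the grid
# average is exact for r_j with j well below log2(m).
m = 1 << 12
grid = [(i + 0.5) / m for i in range(m)]
inner_12 = sum(rademacher(1, t) * rademacher(2, t) for t in grid) / m
inner_33 = sum(rademacher(3, t) ** 2 for t in grid) / m
print(inner_12, inner_33)   # 0.0 and 1.0
```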

Theorem 5 (Riesz-Fischer Theorem). Assume:

  1. \(\sum_{k=1}^{\infty} a_k^2\) converges.
  2. The sequence of functions \(\phi_1(t), \phi_2(t), \dots\) is an orthonormal sequence on \([0,1)\), that is,
    \[\int_0^1 \phi_j(t)\,\phi_k(t)\,dt = \delta_{jk}.\]

Then: There exists a function \(f(t) \in L^2[0,1]\) such that

\[\lim_{n\to\infty}\int_0^1\left(f(t) - \sum_{k=1}^{n} a_k\phi_k(t)\right)^2 dt = 0.\]

Remark. Since this is a purely analytic theorem independent of probability, this important theorem is just stated without proof. Advanced analysis texts will have a statement and proof, for example [8, page 89].

Theorem 6. Assume:

  1. \(\int_0^1 |f(t)|\,dt < \infty\),

  2. Sequences \(\alpha_m\) and \(\beta_m\) satisfy \(\alpha_m < t_0 < \beta_m\) for all \(m\) with
    \[\lim_{m\to\infty}\alpha_m = t_0 = \lim_{m\to\infty}\beta_m.\]

Then: For almost every \(t_0 \in [0,1)\)

\[\lim_{m\to\infty}\frac{1}{\beta_m - \alpha_m}\int_{\alpha_m}^{\beta_m} f(t)\,dt = f(t_0).\]

Remark. This theorem is a form of the Lebesgue differentiation theorem: the indefinite integral of an integrable function is differentiable almost everywhere, with derivative equal to the integrand. It is an advanced form of the Fundamental Theorem of Calculus. Most advanced analysis texts will have a statement and proof, for example [8, Theorem 8.17, page 176] or [6, Proposition 4.13, page 85; Theorem 5.13, page 106].

Proof of Theorem 4.

  1. By the Lemma on Orthogonality of Rademacher Functions in Analytic Model,
    \[\int_0^1 r_j(t) r_k(t)\,dt = 0, \qquad j \ne k.\]

  2. Therefore by the Riesz-Fischer Theorem, there exists a function \(f(t) \in L^2[0,1]\) such that
    \[\lim_{n\to\infty}\int_0^1\left(f(t) - \sum_{k=1}^{n} a_k r_k(t)\right)^2 dt = 0.\]

  3. Note first that \(f(t) \in L^2[0,1]\) implies that \(\int_0^1 |f(t)|\,dt < \infty\).
  4. Now, except for the possibility that \(t_0\) is a dyadic rational (a countable set, therefore a set of measure 0), define \(k_m\) and consequently define \(\alpha_m\) and \(\beta_m\) by
    \[\alpha_m = \frac{k_m}{2^m} < t_0 < \frac{k_m+1}{2^m} = \beta_m.\]

  5. Then with the Cauchy-Schwarz Inequality,
    \[\int_{\alpha_m}^{\beta_m}\left|f(t) - \sum_{k=1}^{n} a_k r_k(t)\right| dt \le (\beta_m - \alpha_m)^{1/2}\left(\int_0^1\left(f(t) - \sum_{k=1}^{n} a_k r_k(t)\right)^2 dt\right)^{1/2}.\]

  6. Therefore, letting \(n \to \infty\),
    \[\int_{\alpha_m}^{\beta_m} f(t)\,dt = \sum_{k=1}^{\infty} a_k\int_{\alpha_m}^{\beta_m} r_k(t)\,dt.\]

  7. Note that
    \[\int_{\alpha_m}^{\beta_m} r_k(t)\,dt = 0, \qquad k > m\]

    and

    \[\int_{\alpha_m}^{\beta_m} r_k(t)\,dt = (\beta_m - \alpha_m)\,r_k(t_0), \qquad k \le m,\]

    so

    \[\frac{\int_{\alpha_m}^{\beta_m} f(t)\,dt}{\beta_m - \alpha_m} = \sum_{k=1}^{m} a_k r_k(t_0).\]

    Then by Theorem 6, \(\sum_{k=1}^{m} a_k r_k(t_0)\) converges to \(f(t_0)\) for almost every \(t_0\). □
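Theorem 4 can be watched in action at a single point. Assuming the binary-digit realization of the Rademacher functions, the point \(t_0 = 1/3 = 0.0101\ldots_2\) gives \(r_k(1/3) = +1, -1, +1, \ldots\), so with \(a_k = 1/k\) the series \(\sum_k a_k r_k(1/3)\) is exactly the alternating harmonic series:

```python
import math
from fractions import Fraction

# Digit realization r_k(t) = 1 - 2*(k-th binary digit of t), an assumed
# but standard convention.  Exact rational arithmetic avoids floating-point
# trouble with large powers of 2.
def rademacher(k, t):
    digit = int(t * 2 ** k) % 2
    return 1 - 2 * digit

t = Fraction(1, 3)
n = 5000
s = sum(rademacher(k, t) / k for k in range(1, n + 1))
print(s)   # close to ln 2 = 0.6931...
```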

Theorem 7 (Converse). If \(\sum_{k=1}^{\infty} a_k^2\) diverges, then \(\sum_{k=1}^{\infty} Y_k a_k\) diverges with probability 1.

Remark. Again the probability that \(\sum_{k=1}^{\infty} Y_k a_k\) converges is the measure of the set of \(t\)’s such that

\[\sum_{k=1}^{\infty} a_k r_k(t) \tag{2}\]

converges.

Remark. Extend the definition of \(r_k(t)\) to be periodic with period 1, so \(r_k(t+1) = r_k(t)\). Then \(r_k(t) = r_1(2^{k-1}t)\). The first claim is that if \(t\) is a point of convergence, then so is \(t + 1/2^l\) for \(l = 0, 1, 2, \dots\). The claim follows by noting that replacing \(t\) by \(t + 1/2^l\) only changes a finite number of terms of the series. Another way to say this is to note that the indicator function of the convergence set has arbitrarily small periods. A lemma of analysis below says that a function with arbitrarily small periods must be constant almost everywhere. In this case the constant is either 0 or 1. Note that the indicator function of the set of convergence of (2) is measurable since the set of convergence of a series of measurable functions is measurable.

Lemma 8. If a bounded measurable function \(\phi(t)\) is periodic with arbitrarily small periods, then \(\phi(t)\) is constant almost everywhere.

Proof.

  1. Set \(I = \int_0^1 \phi(t)\,dt\). If \(\phi\) has period \(1/2^m\) (as the indicator function above does, for every \(m\)), then the integrals over the subintervals \([k/2^m, (k+1)/2^m]\) are all equal, so
    \[I = \sum_{k=0}^{2^m-1}\int_{k/2^m}^{(k+1)/2^m}\phi(t)\,dt = 2^m\int_{k/2^m}^{(k+1)/2^m}\phi(t)\,dt\]
    for each \(k\).
  2. Let \(t_0\) be such that \(k/2^m < t_0 < (k+1)/2^m\). From Theorem 6, for almost every \(t_0\)
    \[\lim_{m\to\infty} 2^m\int_{k/2^m}^{(k+1)/2^m}\phi(t)\,dt = \phi(t_0).\]

  3. Thus \(\phi(t_0) = I\) for almost every \(t_0\). If \(\phi(t)\) is not assumed to be bounded, then apply the argument to \(e^{i\phi(t)}\).

Proof of Converse.

  1. To prove that \(\sum_{k=1}^{\infty} Y_k a_k\) diverges, assume without loss of generality that \(\lim_{k\to\infty} a_k = 0\), since otherwise the terms do not tend to 0 and the series diverges everywhere. Also, for purposes of proof by contradiction, assume that the measure of the set of convergence is positive.
  2. Then there is a measurable function \(g(t)\) such that
    \[\lim_{n\to\infty}\sum_{k=1}^{n} a_k r_k(t) = g(t)\]

    almost everywhere. Then equivalently

    \[\lim_{n\to\infty}\exp\left(i\xi\sum_{k=1}^{n} a_k r_k(t)\right) = \exp(i\xi g(t))\]

    almost everywhere.

  3. By Lebesgue’s Theorem on Dominated Convergence,
    \[\lim_{n\to\infty}\int_0^1 \exp\left(i\xi\sum_{k=1}^{n} a_k r_k(t)\right)dt = \int_0^1 \exp(i\xi g(t))\,dt.\]

  4. But
    \[\int_0^1 \exp\left(i\xi\sum_{k=1}^{n} a_k r_k(t)\right)dt = \prod_{k=1}^{n}\cos(\xi a_k).\]

  5. The assumptions \(\sum_{k=1}^{\infty} a_k^2 = \infty\) and \(a_k \to 0\) imply (see the problems for a proof)
    \[\lim_{n\to\infty}\prod_{k=1}^{n}\cos(\xi a_k) = 0\]
    for \(\xi \ne 0\).

  6. Therefore
    \[\int_0^1 \exp(i\xi g(t))\,dt = 0\]

    for every real \(\xi \ne 0\).

  7. Now take a sequence \(\xi_n \to 0\) with \(\xi_n \ne 0\) for all \(n\) (for example, \(\xi_n = 1/n\)). Then \(\lim_{n\to\infty}\xi_n g(t) = 0\) for almost every \(t\), and so \(\lim_{n\to\infty} e^{i\xi_n g(t)} = 1\) for almost every \(t\).
  8. Again by Lebesgue’s Theorem on Dominated Convergence,
    \[\lim_{n\to\infty}\int_0^1 e^{i\xi_n g(t)}\,dt = 1,\]

    so \(0 = 1\), the desired contradiction.

  9. That is, \(\sum_{k=1}^{\infty} Y_k a_k\) diverges almost everywhere. □
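The key analytic fact in step 5 is easy to check numerically; a Python sketch with the divergent choice \(a_k = 1/\sqrt{k}\) and \(\xi = 1\):

```python
import math

# With a_k = 1/sqrt(k): sum a_k^2 = sum 1/k diverges and a_k -> 0, so the
# product prod_{k<=n} cos(xi * a_k) should tend to 0 (here xi = 1).  Each
# factor cos(1/sqrt(k)) lies in (0, 1), so the product decreases to 0.
def cos_product(n, xi=1.0):
    p = 1.0
    for k in range(1, n + 1):
        p *= math.cos(xi / math.sqrt(k))
    return p

p100, p10000 = cos_product(100), cos_product(10000)
print(p100, p10000)   # shrinking toward 0
```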

Distribution of the Random Harmonic Sum

Let

\[S_n = \sum_{k=1}^{n}\frac{Y_k}{k}\]

be the partial sums of the random harmonic series, which converges almost surely to a random variable \(S\) by Rademacher’s theorem. The probability distribution of the \(k\)th summand is

\[\mu_k = \tfrac{1}{2}\left(\delta_{1/k} + \delta_{-1/k}\right)\]

where \(\delta_x\) is the Dirac delta or point-mass distribution at \(x\). Using the fact that the distribution of a sum of independent random variables is the convolution of the distributions of the summands, the distribution of \(S_n\) is

\[\mu_1 \ast \mu_2 \ast \cdots \ast \mu_n\]

where \(\ast\) indicates convolution. Now take the characteristic function (or Fourier transform) of the distribution of \(S_n\). Using the fact that the characteristic function of a convolution of two distributions is the product of the characteristic functions,

\[\phi_{S_n}(x) = \prod_{k=1}^{n}\phi_{\mu_k}(x) = \prod_{k=1}^{n}\cos\frac{x}{k}.\]

This product converges uniformly on bounded subsets as \(n \to \infty\), and using a continuity theorem for characteristic functions (for example, [1, Theorem 8.28, page 171] or [2, Theorem 6.3.2, page 161]) the sequence of probability measures converges in distribution to a probability measure \(\mu\), the distribution of the random variable \(S\). That is,

\[f(t) = \mathcal{F}^{-1}\left[\prod_{k=1}^{\infty}\cos\frac{x}{k}\right](t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\prod_{k=1}^{\infty}\cos\frac{x}{k}\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\cos(tx) - i\sin(tx)\right)\prod_{k=1}^{\infty}\cos\frac{x}{k}\,dx = \frac{1}{\pi}\int_0^{\infty}\cos(tx)\prod_{k=1}^{\infty}\cos\frac{x}{k}\,dx.\]

There does not seem to be a closed form for this integral representation of \(f(t)\), so use numerical integration to evaluate an approximation of the function; see Figure 1. See the Algorithms and Scripts section for details.



Figure 1: The distribution of random sums.

Note that the distribution of the sums is symmetric about 0, as expected. The distribution is flat near the maximum and numerically close to \(\frac{1}{4}\). This suggests that

\[\int_0^{\infty}\prod_{k=1}^{\infty}\cos\frac{x}{k}\,dx = \frac{\pi}{4},\]

although there seems to be no proof of this integral.
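A rough numerical check is consistent with the conjectured value. The following Python sketch truncates both the infinite product and the range of integration; both truncations are assumptions chosen so that the neglected parts are small (the integrand is negligible beyond \(x \approx 20\), and the omitted factors \(\cos(x/k)\) for \(k > 1000\) are very close to 1 on the integration range):

```python
import math

# Truncated product prod_{k=1}^{nterms} cos(x/k), integrated over [0, 30]
# by the composite trapezoid rule.
nterms, upper, steps = 1000, 30.0, 2000

def prodcos(x):
    p = 1.0
    for k in range(1, nterms + 1):
        p *= math.cos(x / k)
    return p

h = upper / steps
integral = 0.5 * (prodcos(0.0) + prodcos(upper)) * h
integral += h * sum(prodcos(i * h) for i in range(1, steps))

print(integral, math.pi / 4)   # close agreement
```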

Simulations of the Distribution of the Random Harmonic Sum

Simulation of the distribution of the random sums is easy. Figure 2 is a probability histogram of 150,000 sums of \(\sum_{k=1}^{1000} Y_k/k\). See the Algorithms and Scripts section for details. The values of \(Y_k\) are \(\pm 1\) with equal probabilities \(\frac{1}{2}\). The probability histogram resembles the numerically computed distribution.




Figure 2: Simulation of the distribution of sums.
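The simulation can also be checked directly against the characteristic function: since \(S_n\) is symmetric, \(\mathbb{E}\left[e^{ixS_n}\right] = \mathbb{E}\left[\cos(x S_n)\right] = \prod_{k=1}^{n}\cos(x/k)\). A hedged Python sketch of this comparison (the section’s own scripts are in R):

```python
import math
import random

random.seed(1)

# Compare the empirical mean of cos(x * S_n) over many simulated sums with
# the exact characteristic function prod_{k<=n} cos(x/k), at x = 1, n = 50.
n, trials, x = 50, 40000, 1.0

acc = 0.0
for _ in range(trials):
    s = sum((1 if random.random() < 0.5 else -1) / k for k in range(1, n + 1))
    acc += math.cos(x * s)
empirical = acc / trials

exact = 1.0
for k in range(1, n + 1):
    exact *= math.cos(x / k)

print(empirical, exact)   # agree to about two decimal places
```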

Sources

The aside about rearrangements is adapted from Principles of Mathematical Analysis by W. Rudin, McGraw-Hill, 1964, pages 66-67. The specific example of the odd-odd-even rearrangement is from the article “A Class of Simple Rearrangements of the Alternating Harmonic Series” by Maxim Gilula in The American Mathematical Monthly. The purely probabilistic lemmas and theorems are adapted from Probability by Leo Breiman, Addison-Wesley, 1968, pages 45-48. The analytic proof is from Statistical Independence in Probability, Analysis and Number Theory by M. Kac. The numerical example of the distribution of sums and the simulation of the sums are adapted from the article “Cosine Sums, Fourier Transforms, and Random Series” by Kent Morrison in The American Mathematical Monthly.

_______________________________________________________________________________________________

Algorithms, Scripts, Simulations

Algorithm

Scripts

R
library(pracma)

n <- 500
xmax <- 4
npts <- 201

# Truncated infinite product of cosines: prod_{k=1}^n cos(x/k)
prodcos <- Vectorize(function(x) {cumprod(cos(x/(1:n)))[n]})

xaxis <- seq(-xmax, xmax, length=npts)

dist <- c()

# Fourier inversion: f(t) = (1/pi) * integral_0^Inf cos(t*x) prod cos(x/k) dx
for (omega in xaxis) {
    invcosker <- function(x) {cos(omega*x)}
    integrand <- function(x) {invcosker(x) * prodcos(x)}
    dist <- c(dist, (1/pi) * quadinf(integrand, 0, Inf)$Q)
}

plot(xaxis, dist, type="l")
R
k <- 150000 # the number of trials, set to be columns
n <- 1000   # the number of terms in the series, set as rows
p <- 0.5    # the probability of heads

# Matrix of random +/-1 signs, one column per trial
flips <- 2 * matrix((runif(n * k) <= p), n, k) - 1
harser <- 1/seq(n)
ranseq <- harser * flips
ranser <- apply(ranseq, 2, sum)

# Probability histogram of the 150,000 simulated sums
f <- hist(ranser, breaks = "FD", plot = TRUE, probability = TRUE)

plot(f, freq=FALSE, xlab="", main="")

__________________________________________________________________________

Problems to Work for Understanding

  1. In step 1 of the proof of Lemma 1, show that
    \[\mathbb{P}\left[|S_N| > \alpha,\ j^* = j\right] \ge \mathbb{P}\left[|S_N - S_j| \le \alpha,\ j^* = j\right].\]

  2. Give a careful and detailed proof of the corollary with the main conclusion about random series: for independent random variables \(X_1, X_2, \dots\) with \(\mathbb{E}\left[X_k\right] = 0\) for all \(k\) and \(\sum_{k=1}^{\infty}\mathbb{E}\left[X_k^2\right] < \infty\), the series \(\sum_{k=1}^{\infty} X_k\) converges almost surely.
  3. Show that \(\sum_{k=1}^{\infty} a_k^2 = \infty\) and \(a_k \to 0\) imply
    \[\lim_{n\to\infty}\prod_{k=1}^{n}\cos(\xi a_k) = 0\]
    for \(\xi \ne 0\).

__________________________________________________________________________

Reading Suggestion:

References

[1]   Leo Breiman. Probability. Addison Wesley, 1968.

[2]   Kai Lai Chung. A Course in Probability Theory. Academic Press, 1974.

[3]   Maxim Gilula. A class of simple rearrangements of the alternating harmonic series. The American Mathematical Monthly, 125(3):245–256, 2018.

[4]   Mark Kac. Statistical Independence in Probability, Analysis and Number Theory, volume 12 of The Carus Mathematical Monographs. Mathematical Association of America, 1959.

[5]   Kent Morrison. Cosine sums, Fourier transforms, and random series. The American Mathematical Monthly, pages 716–724, October 1995.

[6]   H. L. Royden. Real Analysis. Macmillan, 1970.

[7]   Walter Rudin. Principles of Mathematical Analysis. McGraw-Hill, 1964.

[8]   Walter Rudin. Real and Complex Analysis. McGraw-Hill, 1974.

__________________________________________________________________________

Outside Readings and Links:

__________________________________________________________________________

I check all the information on each page for correctness and typographical errors. Nevertheless, some errors may occur and I would be grateful if you would alert me to such errors. I make every reasonable effort to present current and accurate information for public use, however I do not guarantee the accuracy or timeliness of information on this website. Your use of the information from this website is strictly voluntary and at your risk.

I have checked the links to external sites for usefulness. Links to external websites are provided as a convenience. I do not endorse, control, monitor, or guarantee the information contained in any external website. I don’t guarantee that the links are active at all times. Use the links here with the same caution as you would all information on the Internet. This website reflects the thoughts, interests and opinions of its author. They do not explicitly represent official positions or policies of my employer.

Information on this website is subject to change without notice.

Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1

Email to Steve Dunbar, sdunbar1 at unl dot edu

Last modified: Processed from LaTeX source on June 26, 2018