
Notes 4: Laws of large numbers

Math 733-734: Theory of Probability Lecturer: Sebastien Roch

References: [Fel71, Sections V.5, VII.7], [Dur10, Sections 2.2-2.4].

1 Easy laws

Let $X_1, X_2, \ldots$ be a sequence of RVs. Throughout we let $S_n = \sum_{k \le n} X_k$.

We begin with a straightforward application of Chebyshev's inequality.

THM 4.1 ($L^2$ weak law of large numbers) Let $X_1, X_2, \ldots$ be uncorrelated RVs, i.e., $E[X_i X_j] = E[X_i]E[X_j]$ for $i \ne j$, with $E[X_i] = \mu < +\infty$ and $\mathrm{Var}[X_i] \le C < +\infty$. Then $n^{-1} S_n \to_{L^2} \mu$ and, as a result, $n^{-1} S_n \to_P \mu$.

Proof: Note that
$$\mathrm{Var}[S_n] = E[(S_n - E[S_n])^2] = E\left[\left(\sum_i (X_i - E[X_i])\right)^2\right] = \sum_{i,j} E[(X_i - E[X_i])(X_j - E[X_j])] = \sum_i \mathrm{Var}[X_i],$$
since, for $i \ne j$,
$$E[(X_i - E[X_i])(X_j - E[X_j])] = E[X_i X_j] - E[X_i]E[X_j] = 0.$$
Hence
$$\mathrm{Var}[n^{-1} S_n] \le n^{-2}(nC) = n^{-1} C \to 0,$$
that is, $n^{-1} S_n \to_{L^2} \mu$, and the convergence in probability follows from Chebyshev's inequality.
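As a numerical sanity check (an addition, not part of the notes), the following Python sketch estimates $\mathrm{Var}[n^{-1}S_n]$ for IID uniform$(0,1)$ summands, which are in particular uncorrelated, and compares it with the $C/n$ bound from the proof. The distribution is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 1.0 / 12.0  # Var of uniform(0,1); plays the role of the bound C in THM 4.1

for n in [10, 100, 1000, 10000]:
    # 1000 independent copies of the sample mean n^{-1} S_n
    means = rng.uniform(0.0, 1.0, size=(1000, n)).mean(axis=1)
    print(f"n={n:6d}  Var[S_n/n] ~ {means.var():.2e}   bound C/n = {C / n:.2e}")
```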

With a stronger assumption, we get an easy strong law.

THM 4.2 (Strong Law in $L^4$) If the $X_i$'s are IID with $E[X_i^4] < +\infty$ and $E[X_i] = \mu$, then $n^{-1} S_n \to \mu$ a.s.

Proof: Assume w.l.o.g. that $\mu = 0$. (Otherwise translate all $X_i$'s by $\mu$.) Then
$$E[S_n^4] = E\left[\sum_{i,j,k,l} X_i X_j X_k X_l\right] = n E[X_1^4] + 3n(n-1)\left(E[X_1^2]\right)^2 = O(n^2),$$
where we used that $E[X_i^3 X_j] = 0$ by independence and the fact that $\mu = 0$. (Note that $E[X_1^2] \le 1 + E[X_1^4]$.) Markov's inequality then implies that for all $\varepsilon > 0$
$$P[|S_n| > n\varepsilon] \le \frac{E[S_n^4]}{n^4 \varepsilon^4} = O(n^{-2}),$$
which is summable, and (BC1) concludes the proof.
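The key estimate $E[S_n^4] = O(n^2)$ can likewise be checked by simulation (again an addition, with uniform$(-1,1)$ summands chosen for convenience; here $E[X_1^2] = 1/3$, so the ratio $E[S_n^4]/n^2$ should approach $3 \cdot (1/3)^2 = 1/3$):

```python
import numpy as np

rng = np.random.default_rng(6)

# Estimate E[S_n^4] for centered uniform(-1,1) summands; THM 4.2's proof says O(n^2).
for n in [10, 100, 1000]:
    S = rng.uniform(-1.0, 1.0, size=(10000, n)).sum(axis=1)
    m4 = np.mean(S**4)
    print(f"n={n:5d}  E[S_n^4] ~ {m4:12.1f}   ratio to n^2: {m4 / n**2:.3f}")
```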

The law of large numbers has interesting implications, for instance:

EX 4.3 (A high-dimensional cube is almost the boundary of a ball) Let $X_1, X_2, \ldots$ be IID uniform on $(-1, 1)$. Let $Y_i = X_i^2$ and note that $E[Y_i] = 1/3$, $\mathrm{Var}[Y_i] \le E[Y_i^2] \le 1$, and $E[Y_i^4] \le 1 < +\infty$. Then
$$\frac{X_1^2 + \cdots + X_n^2}{n} \to \frac{1}{3},$$
both in probability and almost surely. In particular, this implies for $\varepsilon > 0$
$$P\left[(1-\varepsilon)\sqrt{n/3} < \|X^{(n)}\|_2 < (1+\varepsilon)\sqrt{n/3}\right] \to 1,$$
where $X^{(n)} = (X_1, \ldots, X_n)$. I.e., most of the cube is close to the boundary of a ball of radius $\sqrt{n/3}$.
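To visualize EX 4.3 numerically (an added sketch, not from the notes): sample points uniformly from the cube $(-1,1)^n$ and record the fraction whose Euclidean norm lies within 5% of $\sqrt{n/3}$. The sample sizes and tolerance $\varepsilon = 0.05$ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
eps = 0.05

for n in [10, 100, 1000, 10000]:
    X = rng.uniform(-1.0, 1.0, size=(1000, n))  # 1000 points in the cube (-1,1)^n
    norms = np.linalg.norm(X, axis=1)
    r = np.sqrt(n / 3.0)
    frac = np.mean((norms > (1 - eps) * r) & (norms < (1 + eps) * r))
    print(f"n={n:6d}  fraction with ||X|| within 5% of sqrt(n/3): {frac:.3f}")
```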

2 Weak laws

In the case of IID sequences we get the following.

THM 4.4 (Weak law of large numbers) Let $(X_n)_n$ be IID. A necessary and sufficient condition for the existence of constants $(\mu_n)_n$ such that
$$\frac{S_n}{n} - \mu_n \to_P 0$$
is $n\, P[|X_1| > n] \to 0$. In that case, the choice
$$\mu_n = E[X_1 \mathbf{1}_{|X_1| \le n}]$$
works.
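A numerical illustration of the condition (an addition, not from the notes): for the standard Cauchy distribution, $n P[|X_1| > n] \to 2/\pi \ne 0$, so no centering sequence works, while for an exponential the condition clearly holds. Both distributions are my choice for illustration.

```python
import numpy as np
from scipy import stats

# Check the WLLN condition n * P[|X_1| > n] for two distributions.
for n in [10, 100, 1000, 10000]:
    cauchy = n * 2 * stats.cauchy.sf(n)  # P[|X| > n] = 2*sf(n) by symmetry
    expo = n * stats.expon.sf(n)         # exponential(1) tail
    print(f"n={n:6d}  Cauchy: {cauchy:.4f} (-> 2/pi ~ 0.6366)   Exp: {expo:.2e}")
```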

(In particular, the WLLN does not apply for $\alpha = 0$.) Also, we can compute $\mu_n$ in Theorem 4.4. For $\alpha = 1$, note that (by the change of variables above)
$$\mu_n = E[X_1 \mathbf{1}_{X_1 \le n}] = e + \int_e^n \left( \frac{e}{x \log x} - \frac{e}{n \log n} \right) dx \sim e \log\log n.$$
Note, in particular, that $\mu_n$ may not have a limit.
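To see this drift concretely, here is a small quadrature check (an addition). It assumes the tail $P[X_1 > x] = e/(x \log x)$ for $x \ge e$, which is inferred from the fragments of the omitted example above and should be treated as a hypothesis; under it, the substitution $t = \log x$ reduces the main integral to $\int_1^{\log n} e/t \, dt$.

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical tail from the (omitted) example: P[X_1 > x] = e/(x log x), x >= e.
# Then mu_n = e + int_e^n e/(x log x) dx - (boundary term ~ e/log n).
e = np.e
for n in [1e2, 1e4, 1e8, 1e16]:
    integral, _ = quad(lambda t: e / t, 1.0, np.log(n))  # t = log x substitution
    mu_n = e + integral - e / np.log(n)                  # boundary term n*e/(n log n)
    print(f"log10(n)={np.log10(n):4.0f}  mu_n = {mu_n:7.3f}   e*loglog n = {e * np.log(np.log(n)):7.3f}")
```

The centering sequence keeps growing (like $e \log\log n$), so it indeed has no finite limit.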

2.1 Truncation

To prove sufficiency, we use truncation. In particular, we give a weak law for triangular arrays which does not require a second moment—a result of independent interest.

THM 4.8 (Weak law for triangular arrays) For each $n$, let $(X_{n,k})_{k \le n}$ be independent. Let $b_n$ with $b_n \to +\infty$ and let $X'_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}$. Suppose that

  1. $\sum_{k=1}^n P[|X_{n,k}| > b_n] \to 0$, and
  2. $b_n^{-2} \sum_{k=1}^n \mathrm{Var}[X'_{n,k}] \to 0$.

If we let $S_n = \sum_{k=1}^n X_{n,k}$ and $a_n = \sum_{k=1}^n E[X'_{n,k}]$ then
$$\frac{S_n - a_n}{b_n} \to_P 0.$$

Proof: Let $S'_n = \sum_{k=1}^n X'_{n,k}$. Clearly
$$P\left[\left|\frac{S_n - a_n}{b_n}\right| > \varepsilon\right] \le P[S_n \ne S'_n] + P\left[\left|\frac{S'_n - a_n}{b_n}\right| > \varepsilon\right].$$
For the first term, by a union bound,
$$P[S'_n \ne S_n] \le \sum_{k=1}^n P[|X_{n,k}| > b_n] \to 0.$$
For the second term, we use Chebyshev's inequality:
$$P\left[\left|\frac{S'_n - a_n}{b_n}\right| > \varepsilon\right] \le \frac{\mathrm{Var}[S'_n]}{\varepsilon^2 b_n^2} = \frac{1}{\varepsilon^2 b_n^2} \sum_{k=1}^n \mathrm{Var}[X'_{n,k}] \to 0.$$

Proof: (of sufficiency in Theorem 4.4) We apply Theorem 4.8 with $X_{n,k} = X_k$ and $b_n = n$; the first condition of Theorem 4.8 is then exactly the assumption $n P[|X_1| > n] \to 0$. Note that $a_n = n\mu_n$. Moreover,
$$n^{-2} \sum_{k=1}^n \mathrm{Var}[X'_{n,k}] = n^{-1}\, \mathrm{Var}[X'_{n,1}] \le n^{-1} E[(X'_{n,1})^2] = n^{-1} \int_0^\infty 2y\, P[|X'_{n,1}| > y]\, dy$$
$$= n^{-1} \int_0^n 2y\, \big[P[|X_1| > y] - P[|X_1| > n]\big]\, dy \le \frac{2}{n} \int_0^n y\, P[|X_1| > y]\, dy \to 0,$$
since we are "averaging" a function going to 0. Details in [Dur10]. The other direction is proved in the appendix.

3 Strong laws

Recall:

DEF 4.9 (Tail σ-algebra) Let $X_1, X_2, \ldots$ be RVs on $(\Omega, \mathcal{F}, P)$. Define
$$\mathcal{T}_n = \sigma(X_{n+1}, X_{n+2}, \ldots), \qquad \mathcal{T} = \bigcap_{n \ge 1} \mathcal{T}_n.$$
By a previous lemma, $\mathcal{T}$ is a σ-algebra. It is called the tail σ-algebra of the sequence $(X_n)_n$.

THM 4.10 (Kolmogorov’s 0 - 1 law) Let (Xn)n be a sequence of independent RVs with tail σ-algebra T. Then T is P-trivial, i.e., for all A ∈ T we have P[A] = 0 or 1. In particular, if Z ∈ mT then there is z ∈ [−∞, +∞] such that

P[Z = z] = 1.

EX 4.11 Let $X_1, X_2, \ldots$ be independent. Then
$$\limsup_n\, n^{-1} S_n \quad \text{and} \quad \liminf_n\, n^{-1} S_n$$
are almost surely constant.

  2. Subsequence. (Recall from the truncation step that $Y_i = X_i \mathbf{1}_{|X_i| \le i}$ and $T_n = \sum_{i \le n} Y_i$.) For $\alpha > 1$, let $k(n) = [\alpha^n]$. By Chebyshev's inequality, for $\varepsilon > 0$,
$$\sum_{n=1}^{+\infty} P\left[|T_{k(n)} - E[T_{k(n)}]| > \varepsilon k(n)\right] \le \frac{1}{\varepsilon^2} \sum_{n=1}^{+\infty} \frac{\mathrm{Var}[T_{k(n)}]}{k(n)^2} = \frac{1}{\varepsilon^2} \sum_{n=1}^{+\infty} \frac{1}{k(n)^2} \sum_{i=1}^{k(n)} \mathrm{Var}[Y_i]$$
$$= \frac{1}{\varepsilon^2} \sum_{i=1}^{+\infty} \mathrm{Var}[Y_i] \sum_{n : k(n) \ge i} \frac{1}{k(n)^2} \le \frac{C}{\varepsilon^2} \sum_{i=1}^{+\infty} \frac{\mathrm{Var}[Y_i]}{i^2} < +\infty,$$
where the next-to-last step follows from the sum of a geometric series ($\sum_{n : k(n) \ge i} k(n)^{-2} \le C i^{-2}$) and the last step follows from the next lemma, proved later:

LEM 4.13 We have
$$\sum_{i=1}^{+\infty} \frac{\mathrm{Var}[Y_i]}{i^2} \le C' E|X_1| < +\infty.$$

By (BC1), since $\varepsilon$ is arbitrary, $(T_{k(n)} - E[T_{k(n)}])/k(n) \to 0$ a.s.; by (DOM), $E[Y_k] \to \mu$, and therefore
$$\frac{T_{k(n)}}{k(n)} \to \mu, \quad \text{a.s.}$$

  3. Sandwiching. To use a sandwiching argument, we need a monotone sequence. Note that the assumption of the theorem applies to both $X_1^+$ and $X_1^-$ and the result is linear, so that we can assume w.l.o.g. that $X_1 \ge 0$. Then for $k(n) \le m < k(n+1)$,
$$\frac{T_{k(n)}}{k(n+1)} \le \frac{T_m}{m} \le \frac{T_{k(n+1)}}{k(n)},$$
and using $k(n+1)/k(n) \to \alpha$ we get
$$\frac{1}{\alpha} E[X_1] \le \liminf_m \frac{T_m}{m} \le \limsup_m \frac{T_m}{m} \le \alpha E[X_1].$$

Since $\alpha > 1$ is arbitrary, we are done. But it remains to prove the lemma:

Proof: By Fubini's theorem,
$$\sum_{i=1}^{+\infty} \frac{\mathrm{Var}[Y_i]}{i^2} \le \sum_{i=1}^{+\infty} \frac{E[Y_i^2]}{i^2} = \sum_{i=1}^{+\infty} \frac{1}{i^2} \int_0^\infty 2y\, P[|Y_i| > y]\, dy = \sum_{i=1}^{+\infty} \frac{1}{i^2} \int_0^\infty \mathbf{1}_{\{y \le i\}}\, 2y\, P[|Y_i| > y]\, dy$$
$$\le \int_0^\infty 2y \left( \sum_{i=1}^{+\infty} \frac{1}{i^2}\, \mathbf{1}_{\{y \le i\}} \right) P[|X_1| > y]\, dy \le \int_0^\infty C'\, P[|X_1| > y]\, dy = C' E|X_1|,$$
where we used $P[|Y_i| > y] \le P[|X_1| > y]$, and the second-to-last inequality follows by comparing the sum with an integral: $2y \sum_{i \ge y} i^{-2} \le C'$.

In the infinite case:

THM 4.14 (SLLN: Infinite mean case) Let $X_1, X_2, \ldots$ be IID with $E[X_1^+] = +\infty$ and $E[X_1^-] < +\infty$. Then
$$\frac{S_n}{n} \to +\infty, \quad \text{a.s.}$$

Proof: Let $M > 0$ and $X_i^M = X_i \wedge M$. Since $E|X_i^M| < +\infty$, the SLLN applies to $S_n^M = \sum_{i \le n} X_i^M$. Then
$$\liminf_n \frac{S_n}{n} \ge \liminf_n \frac{S_n^M}{n} = E[X_i^M] \uparrow +\infty,$$
as $M \to +\infty$ by (MON) applied to the positive part.
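A simulation of the infinite-mean case (an addition): positive RVs with tail $P[X_1 > x] = x^{-1/2}$ for $x \ge 1$ have $E[X_1] = +\infty$, and the running averages $S_n/n$ visibly diverge. The tail exponent $1/2$ is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(2)

# Positive RVs with P[X > x] = x^{-1/2} for x >= 1, so E[X] = +infinity.
# Inverse-transform sampling: X = U^{-2} with U uniform on (0,1].
n = 10**6
U = 1.0 - rng.uniform(size=n)  # uniform on (0,1], avoids division by zero
X = U ** (-2.0)
S = np.cumsum(X)
for m in [10**2, 10**3, 10**4, 10**5, 10**6]:
    print(f"n={m:8d}  S_n/n = {S[m - 1] / m:12.1f}")
```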

3.2 Applications

An important application of the SLLN:

THM 4.15 (Glivenko-Cantelli) Let $(X_n)_n$ be IID with distribution function $F$ and, for $x \in \mathbb{R}$,
$$F_n(x) = \frac{1}{n} \sum_{k \le n} \mathbf{1}_{\{X_k \le x\}}.$$
Then
$$\sup_{x \in \mathbb{R}} |F_n(x) - F(x)| \to 0, \quad \text{a.s.}$$
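An added illustration (not from the notes): for standard normal draws, the distance $\sup_x |F_n(x) - F(x)|$, computed at the jump points of $F_n$ where the supremum is attained, shrinks as $n$ grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
for n in [100, 1000, 10000, 100000]:
    X = np.sort(rng.standard_normal(n))
    F = stats.norm.cdf(X)
    # sup over x is attained at the jumps of the empirical CDF F_n
    upper = np.max(np.arange(1, n + 1) / n - F)
    lower = np.max(F - np.arange(0, n) / n)
    print(f"n={n:7d}  sup|F_n - F| ~ {max(upper, lower):.4f}")
```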

Proof: For the first one, at least one of $|X_1| > t/2$ or $|\tilde{X}_1| > t/2$ must be satisfied. For the second one, the following events are enough:
$$\{X_1 > t + m,\ \tilde{X}_1 \le m\} \cup \{X_1 < -t - m,\ \tilde{X}_1 \ge -m\},$$
and note that $P[X_1 \ge -m] \ge P[X_1 \ge m] \ge 1/2$.

LEM 4.19 Let $\{Y_k\}_{k \le n}$ be independent and symmetric with $S_n = \sum_{k=1}^n Y_k$ and $M_n$ equal to the first term among $\{Y_k\}_{k \le n}$ with greatest absolute value. Then
$$P[|S_n| \ge t] \ge \frac{1}{2}\, P[|M_n| \ge t]. \qquad (3)$$
Moreover, if the $Y_k$'s have a common distribution $F$ then
$$P[|S_n| \ge t] \ge \frac{1}{2}\left(1 - \exp\left(-n[1 - F(t) + F(-t)]\right)\right). \qquad (4)$$

Proof: We start with the second one. Note that
$$P[|M_n| < t] \le (F(t) - F(-t))^n \le \exp\left(-n[1 - F(t) + F(-t)]\right).$$
Plug the latter into the first statement. For the first one, note that by symmetry we can drop the absolute values. Then
$$P[S_n \ge t] = P[M_n + (S_n - M_n) \ge t] \ge P[M_n \ge t,\ (S_n - M_n) \ge 0]. \qquad (5)$$
By symmetry, the four combinations $(\pm M_n, \pm(S_n - M_n))$ have the same distribution. Indeed, $M_n$ and $S_n - M_n$ are not independent, but their signs are, because $M_n$ is defined through its absolute value and $S_n - M_n$ is the sum of the other variables. Hence
$$P[M_n \ge t] \le P[M_n \ge t,\ (S_n - M_n) \ge 0] + P[M_n \ge t,\ (S_n - M_n) \le 0],$$
and the two terms on the RHS are equal. Plugging this back into (5), we are done.
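A quick Monte Carlo spot-check of (3) (an addition): with heavy-tailed symmetric summands, e.g. standard Cauchy, both sides of (3) are non-negligible and the inequality is visible. The parameters $n$, $t$, and the Cauchy choice are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps, t = 20, 200000, 50.0

# Symmetric, heavy-tailed summands: standard Cauchy.
Y = rng.standard_cauchy((reps, n))
S = Y.sum(axis=1)
M = Y[np.arange(reps), np.abs(Y).argmax(axis=1)]  # first term of greatest |.|

lhs = (np.abs(S) >= t).mean()
rhs = 0.5 * (np.abs(M) >= t).mean()
print(f"P[|S_n| >= t] ~ {lhs:.4f}   (1/2) P[|M_n| >= t] ~ {rhs:.4f}")
```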

Going back to the proof of necessity:

Proof: (of necessity in Theorem 4.4) Assume that there is $\mu_n$ such that for all $\varepsilon > 0$,
$$P[|S_n - n\mu_n| \ge \varepsilon n] \to 0.$$
Note that
$$S_n^\circ = (S_n - n\mu_n)^\circ = \sum_{k \le n} X_k^\circ.$$
Therefore, by (1), assuming w.l.o.g. $m \ge 0$,
$$P[|S_n - n\mu_n| \ge \varepsilon n] \ge \frac{1}{2}\, P[|S_n^\circ| \ge 2\varepsilon n] \ge \frac{1}{4}\left(1 - \exp\left(-n\, P[|X_1^\circ| \ge 2n\varepsilon]\right)\right)$$
$$\ge \frac{1}{4}\left(1 - \exp\left(-\frac{n}{2}\, P[|X_1| \ge 2n\varepsilon + m]\right)\right) \ge \frac{1}{4}\left(1 - \exp\left(-\frac{n}{2}\, P[|X_1| \ge n]\right)\right),$$
for $\varepsilon$ small enough and $n$ large enough. Since the LHS goes to $0$, it follows that $n\, P[|X_1| \ge n] \to 0$, and we are done.

B St-Petersburg paradox

EX 4.20 (St-Petersburg paradox) Consider an IID sequence with
$$P[X_1 = 2^j] = 2^{-j}, \quad \forall j \ge 1.$$
Clearly $E[X_1] = +\infty$. Note that
$$P[|X_1| \ge n] = \Theta(n^{-1})$$
(indeed it is a geometric series and the sum is dominated by the first term), and therefore we cannot apply the WLLN. Instead we apply the WLLN for triangular arrays to a properly normalized sum. We take $X_{n,k} = X_k$ and $b_n = n \log_2 n$. We check the two conditions. First,
$$\sum_{k=1}^n P[|X_{n,k}| > b_n] = \Theta\left(\frac{n}{n \log_2 n}\right) \to 0.$$
To check the second one, let $X'_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}$ and note that
$$E[(X'_{n,k})^2] = \sum_{j=1}^{\log_2 n + \log_2 \log_2 n} 2^{2j}\, 2^{-j} \le 2 \cdot 2^{\log_2 n + \log_2 \log_2 n} = 2n \log_2 n.$$
So
$$\frac{1}{b_n^2} \sum_{k=1}^n E[(X'_{n,k})^2] \le \frac{2n^2 \log_2 n}{n^2 (\log_2 n)^2} = \frac{2}{\log_2 n} \to 0.$$
Theorem 4.8 therefore gives $(S_n - a_n)/b_n \to_P 0$, where $a_n = n E[X'_{n,1}] = n \lfloor \log_2 b_n \rfloor \sim n \log_2 n$, so that
$$\frac{S_n}{n \log_2 n} \to_P 1.$$
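To see the normalization at work, a short simulation (an addition): sample St-Petersburg payoffs via a geometric number of coin flips and compare $S_n$ with $n \log_2 n$. Convergence is in probability only and quite slow, so the printed ratios fluctuate around 1.

```python
import numpy as np

rng = np.random.default_rng(5)

# St-Petersburg payoffs: X = 2^J with P[J = j] = 2^{-j}, j >= 1,
# i.e., J is geometric with success probability 1/2.
n = 10**6
X = 2.0 ** rng.geometric(0.5, size=n)
S = np.cumsum(X)
for m in [10**3, 10**4, 10**5, 10**6]:
    print(f"n={m:8d}  S_n/(n log2 n) = {S[m - 1] / (m * np.log2(m)):.3f}")
```

Repeated runs show occasional large upward excursions, the signature of the heavy tail; this is consistent with the convergence holding in probability but not almost surely.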