Math 733-734: Theory of Probability
Lecturer: Sebastien Roch

References: [Fel71, Sections V.5, VII.7], [Dur10, Sections 2.2-2.4].
Let $X_1, X_2, \ldots$ be a sequence of RVs. Throughout we let $S_n = \sum_{k \le n} X_k$. We begin with a straightforward application of Chebyshev's inequality.
THM 4.1 ($L^2$ weak law of large numbers) Let $X_1, X_2, \ldots$ be uncorrelated RVs, i.e., $E[X_i X_j] = E[X_i]E[X_j]$ for $i \neq j$, with $E[X_i] = \mu < +\infty$ and $\mathrm{Var}[X_i] \le C < +\infty$. Then $n^{-1} S_n \to_{L^2} \mu$ and, as a result, $n^{-1} S_n \to_P \mu$.
Proof: Note that
$$\mathrm{Var}[S_n] = E[(S_n - E[S_n])^2] = E\Big[\Big(\sum_i (X_i - E[X_i])\Big)^2\Big] = \sum_{i,j} E[(X_i - E[X_i])(X_j - E[X_j])] = \sum_i \mathrm{Var}[X_i],$$
since, for $i \neq j$,
$$E[(X_i - E[X_i])(X_j - E[X_j])] = E[X_i X_j] - E[X_i]E[X_j] = 0.$$
Hence
$$\mathrm{Var}[n^{-1} S_n] \le n^{-2}(nC) = n^{-1} C \to 0,$$
that is, $n^{-1} S_n \to_{L^2} \mu$, and the convergence in probability follows from Chebyshev.
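As a quick numerical illustration, here is a minimal Monte Carlo sketch (not part of the original notes; it assumes Python with NumPy, and the Exp(1) distribution, the tolerance $\varepsilon = 0.1$, and the sample sizes are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(0)

    # THM 4.1 illustration: X_i IID Exp(1), so mu = 1 and Var[X_i] = 1 <= C.
    # Estimate P[|S_n/n - mu| > eps] by Monte Carlo for growing n.
    mu, eps, reps = 1.0, 0.1, 500
    for n in [10, 100, 1000, 10000]:
        means = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
        print(n, np.mean(np.abs(means - mu) > eps))

The empirical exceedance probabilities decay with $n$, consistent with $\mathrm{Var}[n^{-1} S_n] \le C/n \to 0$.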
With a stronger assumption, we get an easy strong law.
THM 4.2 (Strong law in $L^4$) If the $X_i$'s are IID with $E[X_i^4] < +\infty$ and $E[X_i] = \mu$, then $n^{-1} S_n \to \mu$ a.s.
Proof: Assume w.l.o.g. that $\mu = 0$. (Otherwise translate all $X_i$'s by $\mu$.) Then
$$E[S_n^4] = E\Big[\sum_{i,j,k,l} X_i X_j X_k X_l\Big] = n E[X_1^4] + 3n(n-1)(E[X_1^2])^2 = O(n^2),$$
where we used that $E[X_i^3 X_j] = 0$ for $i \neq j$ by independence and the fact that $\mu = 0$; all other mixed terms vanish similarly, leaving only the $n$ diagonal terms $X_i^4$ and the $3n(n-1)$ pairings of the form $X_i^2 X_j^2$. (Note that $E[X_1^2] \le 1 + E[X_1^4]$.) Markov's inequality then implies that for all $\varepsilon > 0$,
$$P[|S_n| > n\varepsilon] \le \frac{E[S_n^4]}{n^4 \varepsilon^4} = O(n^{-2}),$$
which is summable, and (BC1) concludes the proof.
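The moment bound $E[S_n^4] = O(n^2)$ can also be checked empirically. A minimal sketch (assuming NumPy; the uniform$(-1,1)$ distribution and the sample counts are illustrative choices):

    import numpy as np

    rng = np.random.default_rng(1)

    # THM 4.2 illustration: X_i IID uniform on (-1,1), so mu = 0,
    # E[X_1^2] = 1/3 and E[X_1^4] = 1/5. Estimate E[S_n^4]/n^2 by Monte Carlo.
    reps = 5000
    for n in [10, 100, 1000]:
        S = rng.uniform(-1.0, 1.0, size=(reps, n)).sum(axis=1)
        print(n, np.mean(S**4) / n**2)

The ratio stabilizes near $3(E[X_1^2])^2 = 3(1/3)^2 = 1/3$, matching the expansion in the proof.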
The law of large numbers has interesting implications, for instance:
EX 4.3 (A high-dimensional cube is almost the boundary of a ball) Let $X_1, X_2, \ldots$ be IID uniform on $(-1,1)$. Let $Y_i = X_i^2$ and note that $E[Y_i] = 1/3$, $\mathrm{Var}[Y_i] \le E[Y_i^2] \le 1$, and $E[Y_i^4] \le 1 < +\infty$. Then
$$\frac{X_1^2 + \cdots + X_n^2}{n} \to \frac{1}{3},$$
both in probability and almost surely. In particular, this implies that for $\varepsilon > 0$, with probability tending to $1$,
$$(1 - \varepsilon)\sqrt{\frac{n}{3}} < \|X^{(n)}\|_2 < (1 + \varepsilon)\sqrt{\frac{n}{3}},$$
where $X^{(n)} = (X_1, \ldots, X_n)$. I.e., most of the cube is close to the boundary of a ball of radius $\sqrt{n/3}$.
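A minimal simulation of this concentration (assuming NumPy; the dimensions and the 500 samples per dimension are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(2)

    # EX 4.3 illustration: uniform points in (-1,1)^n have Euclidean norm
    # concentrating around sqrt(n/3).
    for n in [10, 100, 1000, 10000]:
        X = rng.uniform(-1.0, 1.0, size=(500, n))
        ratios = np.linalg.norm(X, axis=1) / np.sqrt(n / 3)
        print(n, ratios.min(), ratios.max())

As $n$ grows, the normalized norms cluster tightly around $1$; nearly all the mass of the cube sits near the sphere of radius $\sqrt{n/3}$.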
In the case of IID sequences we get the following.
THM 4.4 (Weak law of large numbers) Let $(X_n)_n$ be IID. A necessary and sufficient condition for the existence of constants $(\mu_n)_n$ such that
$$\frac{S_n}{n} - \mu_n \to_P 0,$$
is $n\,P[|X_1| > n] \to 0$. In that case, the choice
$$\mu_n = E[X_1 \mathbf{1}_{|X_1| \le n}]$$
works.
(In particular, the WLLN does not apply for $\alpha = 0$.) Also, we can compute $\mu_n$ in Theorem 4.4. For $\alpha = 1$, note that (by the change of variables above)
$$\mu_n = E[X_1 \mathbf{1}_{X_1 \le n}] = e + \int_e^n \frac{dx}{x \log x} - \frac{n}{n \log n} \sim \log\log n.$$
Note, in particular, that $\mu_n$ may not have a limit.
To prove sufficiency, we use truncation. In particular, we give a weak law for triangular arrays which does not require a second moment—a result of independent interest.
THM 4.8 (Weak law for triangular arrays) For each $n$, let $(X_{n,k})_{k \le n}$ be independent. Let $b_n$ with $b_n \to +\infty$ and let $X'_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}$. Suppose that
1. $\sum_{k=1}^n P[|X_{n,k}| > b_n] \to 0$, and
2. $b_n^{-2} \sum_{k=1}^n \mathrm{Var}[X'_{n,k}] \to 0$.
If we let $S_n = \sum_{k=1}^n X_{n,k}$ and $a_n = \sum_{k=1}^n E[X'_{n,k}]$, then
$$\frac{S_n - a_n}{b_n} \to_P 0.$$
Proof: Let $S'_n = \sum_{k=1}^n X'_{n,k}$. Clearly,
$$P\left[\left|\frac{S_n - a_n}{b_n}\right| > \varepsilon\right] \le P[S_n \neq S'_n] + P\left[\left|\frac{S'_n - a_n}{b_n}\right| > \varepsilon\right].$$
For the first term, by a union bound,
$$P[S'_n \neq S_n] \le \sum_{k=1}^n P[|X_{n,k}| > b_n] \to 0.$$
For the second term, since $E[S'_n] = a_n$, Chebyshev's inequality and independence give
$$P\left[\left|\frac{S'_n - a_n}{b_n}\right| > \varepsilon\right] \le \frac{\mathrm{Var}[S'_n]}{\varepsilon^2 b_n^2} = \frac{1}{\varepsilon^2 b_n^2} \sum_{k=1}^n \mathrm{Var}[X'_{n,k}] \to 0.$$
Proof: (of sufficiency in Theorem 4.4) We apply Theorem 4.8 with $X_{n,k} = X_k$ and $b_n = n$. Note that $a_n = n\mu_n$, and the first condition holds because $\sum_{k=1}^n P[|X_{n,k}| > n] = n\,P[|X_1| > n] \to 0$ by assumption. As for the second condition, $n^{-2} \sum_{k=1}^n \mathrm{Var}[X'_{n,k}] = n^{-1} \mathrm{Var}[X'_{n,1}]$ and
$$n^{-1} \mathrm{Var}[X'_{n,1}] \le n^{-1} E[(X'_{n,1})^2] = n^{-1} \int_0^{+\infty} 2y\,P[|X'_{n,1}| > y]\,dy = n^{-1} \int_0^n 2y\,\big[P[|X_1| > y] - P[|X_1| > n]\big]\,dy \le \frac{2}{n} \int_0^n y\,P[|X_1| > y]\,dy \to 0,$$
since we are "averaging" a function going to $0$: by assumption $y\,P[|X_1| > y] \to 0$ as $y \to +\infty$, and Cesàro averages of a function tending to $0$ also tend to $0$. Details in [Dur10]. The other direction is proved below.
Recall:
DEF 4.9 (Tail σ-algebra) Let $X_1, X_2, \ldots$ be RVs on $(\Omega, \mathcal{F}, P)$. Define
$$\mathcal{T}_n = \sigma(X_{n+1}, X_{n+2}, \ldots), \qquad \mathcal{T} = \bigcap_{n \ge 1} \mathcal{T}_n.$$
By a previous lemma, $\mathcal{T}$ is a σ-algebra. It is called the tail σ-algebra of the sequence $(X_n)_n$.
THM 4.10 (Kolmogorov's 0-1 law) Let $(X_n)_n$ be a sequence of independent RVs with tail σ-algebra $\mathcal{T}$. Then $\mathcal{T}$ is $P$-trivial, i.e., for all $A \in \mathcal{T}$ we have $P[A] = 0$ or $1$. In particular, if $Z \in m\mathcal{T}$ then there is $z \in [-\infty, +\infty]$ such that $P[Z = z] = 1$.
EX 4.11 Let $X_1, X_2, \ldots$ be independent. Then
$$\limsup_n\, n^{-1} S_n \quad \text{and} \quad \liminf_n\, n^{-1} S_n$$
are each almost surely constant, since both are measurable with respect to the tail σ-algebra (changing finitely many of the $X_n$'s does not affect them).
$$\sum_{n=1}^{+\infty} P[|T_{k(n)} - E[T_{k(n)}]| > \varepsilon k(n)] \le \frac{1}{\varepsilon^2} \sum_{n=1}^{+\infty} \frac{\mathrm{Var}[T_{k(n)}]}{k(n)^2} = \frac{1}{\varepsilon^2} \sum_{n=1}^{+\infty} \frac{1}{k(n)^2} \sum_{i=1}^{k(n)} \mathrm{Var}[Y_i] = \frac{1}{\varepsilon^2} \sum_{i=1}^{+\infty} \mathrm{Var}[Y_i] \sum_{n:\,k(n) \ge i} \frac{1}{k(n)^2} \le \frac{C}{\varepsilon^2} \sum_{i=1}^{+\infty} \frac{\mathrm{Var}[Y_i]}{i^2} < +\infty,$$
where the next-to-last inequality follows from the sum of a geometric series and the last step follows from the next lemma—proved later:
LEM 4.13 We have
$$\sum_{i=1}^{+\infty} \frac{\mathrm{Var}[Y_i]}{i^2} < +\infty.$$
By (DOM) and (BC1), since $\varepsilon$ is arbitrary, we have $E[Y_k] \to \mu$ and
$$\frac{T_{k(n)}}{k(n)} \to \mu, \quad \text{a.s.}$$
For $k(n) \le m \le k(n+1)$, by monotonicity of $T_m$ (recall the variables are nonnegative here),
$$\frac{T_{k(n)}}{k(n+1)} \le \frac{T_m}{m} \le \frac{T_{k(n+1)}}{k(n)},$$
and using $k(n+1)/k(n) \to \alpha$ we get
$$\frac{1}{\alpha}\, E[X_1] \le \liminf_m \frac{T_m}{m} \le \limsup_m \frac{T_m}{m} \le \alpha\, E[X_1].$$
Since $\alpha > 1$ is arbitrary, we are done. But it remains to prove the lemma:
Proof: (of Lemma 4.13) By Fubini's theorem, and using that $|Y_i| \le i$ and $P[|Y_i| > y] \le P[|X_1| > y]$,
$$\sum_{i=1}^{+\infty} \frac{\mathrm{Var}[Y_i]}{i^2} \le \sum_{i=1}^{+\infty} \frac{E[Y_i^2]}{i^2} = \sum_{i=1}^{+\infty} \frac{1}{i^2} \int_0^{+\infty} 2y\,P[|Y_i| > y]\,dy = \sum_{i=1}^{+\infty} \frac{1}{i^2} \int_0^{+\infty} \mathbf{1}_{\{y \le i\}}\, 2y\,P[|Y_i| > y]\,dy \le \int_0^{+\infty} 2y \Big(\sum_{i=1}^{+\infty} \frac{\mathbf{1}_{\{y \le i\}}}{i^2}\Big) P[|X_1| > y]\,dy \le \int_0^{+\infty} C'\,P[|X_1| > y]\,dy = C'\,E|X_1| < +\infty,$$
where the second-to-last inequality follows by integrating: $\sum_{i \ge y} i^{-2} = O(1/y)$ for $y \ge 1$, so $2y \sum_{i \ge y} i^{-2} \le C'$ (the contribution from $y \le 1$ is bounded as well). In the infinite case:
THM 4.14 (SLLN: infinite mean case) Let $X_1, X_2, \ldots$ be IID with $E[X_1^+] = +\infty$ and $E[X_1^-] < +\infty$. Then
$$\frac{S_n}{n} \to +\infty, \quad \text{a.s.}$$
Proof: Let $M > 0$ and $X_i^M = X_i \wedge M$. Since $E|X_i^M| < +\infty$, the SLLN applies to $S_n^M = \sum_{i \le n} X_i^M$. Then
$$\liminf_n \frac{S_n}{n} \ge \liminf_n \frac{S_n^M}{n} = E[X_1^M] \uparrow +\infty,$$
as $M \to +\infty$, by (MON) applied to the positive part.
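A minimal simulation of the infinite-mean case (assuming NumPy; the choice $X_i = 1/U_i$ with $U_i$ uniform on $(0,1)$, which has $E[X_1^+] = +\infty$ and $E[X_1^-] = 0$, is an illustrative example, not from the notes):

    import numpy as np

    rng = np.random.default_rng(3)

    # THM 4.14 illustration: X_i = 1/U_i has infinite mean, so S_n/n
    # should drift to +infinity along a single sample path.
    n = 10**6
    S = np.cumsum(1.0 / rng.uniform(size=n))
    for k in [10**2, 10**3, 10**4, 10**5, 10**6]:
        print(k, S[k - 1] / k)

The running averages keep growing (for this particular distribution, roughly like $\log n$).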
An important application of the SLLN:
THM 4.15 (Glivenko-Cantelli) Let $(X_n)_n$ be IID with distribution function $F$ and, for $x \in \mathbb{R}$, let
$$F_n(x) = \frac{1}{n} \sum_{k \le n} \mathbf{1}\{X_k \le x\}.$$
Then $\sup_{x \in \mathbb{R}} |F_n(x) - F(x)| \to 0$, a.s.
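A minimal simulation of Glivenko-Cantelli (assuming NumPy; the standard normal choice for $F$ is illustrative; the supremum is attained at the sample points, so it can be computed exactly from the sorted sample):

    import numpy as np
    from math import erf, sqrt

    rng = np.random.default_rng(4)

    def F(x):  # standard normal CDF
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))

    # THM 4.15 illustration: sup_x |F_n(x) - F(x)| for IID N(0,1) samples.
    for n in [100, 1000, 10000]:
        xs = np.sort(rng.standard_normal(n))
        Fx = np.array([F(x) for x in xs])
        k = np.arange(1, n + 1)
        dn = max(np.max(k / n - Fx), np.max(Fx - (k - 1) / n))
        print(n, dn)

The sup-distance decreases toward $0$, in fact at rate about $1/\sqrt{n}$ (by the DKW inequality), though the theorem itself only asserts a.s. convergence.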
Proof: For the first one, if $|X_1 - \tilde{X}_1| > t$, then at least one of $|X_1| > t/2$ or $|\tilde{X}_1| > t/2$ must be satisfied. For the second one, it is enough that one of the events in
$$\{X_1 > t + m,\ \tilde{X}_1 \le m\} \cup \{X_1 < -t - m,\ \tilde{X}_1 \ge -m\}$$
occurs, and note that $P[X_1 \ge -m] \ge P[X_1 \ge m] \ge 1/2$ (recall that $m \ge 0$ is a median of $X_1$).
LEM 4.19 Let $\{Y_k\}_{k \le n}$ be independent and symmetric with $S_n = \sum_{k=1}^n Y_k$ and $M_n$ equal to the first term among $\{Y_k\}_{k \le n}$ with greatest absolute value. Then
$$P[|S_n| \ge t] \ge \frac{1}{2}\, P[|M_n| \ge t]. \qquad (3)$$
Moreover, if the $Y_k$'s have a common distribution $F$ then
$$P[|S_n| \ge t] \ge \frac{1}{2}\left(1 - \exp(-n[1 - F(t) + F(-t)])\right). \qquad (4)$$
Proof: We start with the second one. Note that
$$P[|M_n| < t] \le (F(t) - F(-t))^n \le \exp(-n[1 - F(t) + F(-t)]),$$
using $1 - x \le e^{-x}$. Plugging the latter into the first statement (3) gives (4). For the first one, note that by symmetry it suffices to bound $P[S_n \ge t]$; the same bound for the left tails then gives (3). Then
$$P[S_n \ge t] = P[M_n + (S_n - M_n) \ge t] \ge P[M_n \ge t,\ (S_n - M_n) \ge 0]. \qquad (5)$$
By symmetry, the four combinations $(\pm M_n, \pm(S_n - M_n))$ have the same distribution. Indeed, $M_n$ and $S_n - M_n$ are not independent, but their signs are: $M_n$ is determined up to sign by the absolute values of the $Y_k$'s, and $S_n - M_n$ is the sum of the other variables. Hence,
$$P[M_n \ge t] \le P[M_n \ge t,\ (S_n - M_n) \ge 0] + P[M_n \ge t,\ (S_n - M_n) \le 0],$$
and the two terms on the RHS are equal. Plugging this back into (5), we are done.
Going back to the proof of necessity:
Proof: (of necessity in Theorem 4.4) Assume that there is $\mu_n$ such that for all $\varepsilon > 0$,
$$P[|S_n - n\mu_n| \ge \varepsilon n] \to 0.$$
Note that
$$S_n^\circ = (S_n - n\mu_n)^\circ = \sum_{k \le n} X_k^\circ.$$
Therefore, by (1), (4), and (2), assuming w.l.o.g. $m \ge 0$,
$$P[|S_n - n\mu_n| \ge \varepsilon n] \ge \frac{1}{2}\, P[|S_n^\circ| \ge 2\varepsilon n] \ge \frac{1}{4}\left(1 - \exp\left(-n\,P[|X_1^\circ| \ge 2n\varepsilon]\right)\right) \ge \frac{1}{4}\left(1 - \exp\left(-\frac{1}{2}\, n\,P[|X_1| \ge 2n\varepsilon + m]\right)\right) \ge \frac{1}{4}\left(1 - \exp\left(-\frac{1}{2}\, n\,P[|X_1| \ge n]\right)\right),$$
for $\varepsilon$ small enough and $n$ large enough (so that $2n\varepsilon + m \le n$). Since the LHS goes to $0$, we must have $n\,P[|X_1| \ge n] \to 0$, and we are done.
EX 4.20 (St. Petersburg paradox) Consider an IID sequence with
$$P[X_1 = 2^j] = 2^{-j}, \quad \forall j \ge 1.$$
Clearly $E[X_1] = +\infty$. Note that
$$P[|X_1| \ge n] = \Theta\left(\frac{1}{n}\right)$$
(indeed it is a geometric series and the sum is dominated by the first term), so $n\,P[|X_1| \ge n]$ does not go to $0$ and we cannot apply the WLLN (Theorem 4.4). Instead we apply the WLLN for triangular arrays to a properly normalized sum. We take $X_{n,k} = X_k$ and $b_n = n \log_2 n$. We check the two conditions. First,
$$\sum_{k=1}^n P[|X_{n,k}| > b_n] = \Theta\left(\frac{n}{n \log_2 n}\right) = \Theta\left(\frac{1}{\log_2 n}\right) \to 0.$$
To check the second one, let $X'_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}$ and note
$$E[(X'_{n,k})^2] = \sum_{j=1}^{\log_2 n + \log_2 \log_2 n} 2^{2j}\, 2^{-j} \le 2 \cdot 2^{\log_2 n + \log_2 \log_2 n} = 2n \log_2 n.$$
So
$$\frac{1}{b_n^2} \sum_{k=1}^n E[(X'_{n,k})^2] \le \frac{2n^2 \log_2 n}{n^2 (\log_2 n)^2} = \frac{2}{\log_2 n} \to 0.$$
Theorem 4.8 then gives $(S_n - a_n)/b_n \to_P 0$; since $a_n = n\,E[X'_{n,1}] = n \sum_{j=1}^{\log_2 n + \log_2 \log_2 n} 2^j\, 2^{-j} \sim n \log_2 n$, it follows that $S_n/(n \log_2 n) \to_P 1$.
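A minimal simulation of the normalized St. Petersburg sums (assuming NumPy; we sample via geometric variables, since $P[X_1 = 2^j] = 2^{-j}$ means the exponent $j$ is geometric with parameter $1/2$):

    import numpy as np

    rng = np.random.default_rng(5)

    # EX 4.20 illustration: check S_n / (n log2 n) for growing n.
    for n in [10**3, 10**4, 10**5, 10**6]:
        j = rng.geometric(0.5, size=n)   # P[j = k] = 2^(-k), k >= 1
        S = np.exp2(j).sum()
        print(n, S / (n * np.log2(n)))

The normalized sums hover around $1$, in line with $S_n/(n \log_2 n) \to_P 1$; the convergence is in probability only and visibly slow (occasional huge payoffs push single runs well above $1$).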