


















Abstract

In Chapter 7 of Bierens (2004) the Wold decomposition was motivated by claiming that every zero-mean covariance stationary process $X_t$ can be written as $X_t = \sum_{j=1}^{\infty} \beta_j X_{t-j} + U_t$, where $E[U_t X_{t-j}] = 0$ for all $j \geq 1$, and $\sum_{j=1}^{\infty} \beta_j X_{t-j}$ is the projection of $X_t$ on its past. However, in general this claim is incorrect. In this note I will give a more general (and hopefully correct) proof of the Wold decomposition.
The fundamental projection theorem states that:

Theorem 1. Given a sub-Hilbert space $S$ of a Hilbert space $H$ and an element $y \in H$, there exists a unique element $\hat{y} \in S$ such that $\|y - \hat{y}\| = \inf_{z \in S} \|y - z\|$. Moreover, the residual $u = y - \hat{y}$ is orthogonal to every $z \in S$: $\langle u, z \rangle = 0$.

Proof: See for example Bierens (2004, Th. 7.A.3, p. 202).
This result is the basis for the famous Wold (1938) decomposition of covariance stationary time series, which in turn is the basis for time series analysis.
∗ Thanks to Peter Boswijk (University of Amsterdam) for pointing out an error in a
previous version of this note. Moreover, the queries of the students in my graduate time
series courses have led to substantial improvements of the proof of the Wold decomposition.
The proof of the Wold decomposition in Anderson (1994) is more transparent than the original proof by Wold (1938). However, rather than following Anderson's proof, in this note I will first derive a general Wold decomposition for a regular sequence¹ in a general Hilbert space, and then specialize this result to the Wold decomposition for covariance stationary time series.
First, we need to define sub-Hilbert spaces spanned by a sequence in a
Hilbert space, as follows.
Let $\{x_k\}_{k=1}^{\infty}$ be a sequence of elements of a Hilbert space $H$, and let
$$M_m = \text{span}(\{x_j\}_{j=1}^{m})$$
be the space spanned by $x_1, \dots, x_m$, i.e., $M_m$ consists of all linear combinations of $x_1, \dots, x_m$. Then
Lemma 1. $M_m$ is a Hilbert space.

Proof: Without loss of generality we may assume that the $m \times m$ matrix $\Sigma_m$ with elements $\langle x_i, x_j \rangle$, $i, j = 1, \dots, m$, is non-singular, as otherwise we can remove one or more $x_j$'s from the list $\{x_j\}_{j=1}^{m}$ and still span the same space. For example, suppose that $\text{rank}(\Sigma_m) = m - 1$, and let $c = (c_1, \dots, c_m)'$ be the eigenvector corresponding to the zero eigenvalue. Then $\|\sum_{j=1}^{m} c_j x_j\|^2 = c'\Sigma_m c = 0$, hence $\sum_{j=1}^{m} c_j x_j = 0$ (the latter being the zero element of $M_m$). Since at least one component of $c$ is non-zero, for example $c_i$, we can write
$$x_i = \begin{cases} -\sum_{j=2}^{m} (c_j/c_1)\, x_j & \text{if } i = 1, \\ -\sum_{j=1}^{m-1} (c_j/c_m)\, x_j & \text{if } i = m, \\ -\sum_{j=1}^{i-1} (c_j/c_i)\, x_j - \sum_{j=i+1}^{m} (c_j/c_i)\, x_j & \text{if } 1 < i < m, \end{cases}$$
so that
$$M_m = \begin{cases} \text{span}(\{x_j\}_{j=2}^{m}) & \text{if } i = 1, \\ \text{span}(\{x_j\}_{j=1}^{m-1}) & \text{if } i = m, \\ \text{span}(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_m) & \text{if } 1 < i < m. \end{cases}$$
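This rank-reduction step can be checked numerically in a concrete inner-product space. The sketch below (with illustrative vectors in $\mathbb{R}^3$ of my own choosing, not taken from the note) builds a singular Gram matrix $\Sigma_m$, extracts the eigenvector $c$ of the zero eigenvalue, and recovers one $x_i$ from the remaining vectors:

```python
import numpy as np

# Three vectors with x3 = x1 + x2, so the Gram matrix Sigma_m is singular
x1 = np.array([1.0, 0.0, 2.0])
x2 = np.array([0.0, 1.0, -1.0])
x3 = x1 + x2
X = np.stack([x1, x2, x3])          # rows are the x_j's

gram = X @ X.T                      # Sigma_m with entries <x_i, x_j>
eigvals, eigvecs = np.linalg.eigh(gram)
c = eigvecs[:, 0]                   # eigenvector of the (numerically) zero eigenvalue

assert abs(eigvals[0]) < 1e-8       # rank(Sigma_m) = m - 1
# c' Sigma_m c = 0  implies  sum_j c_j x_j = 0
assert np.allclose(c @ X, 0.0)
# hence x3 = -(c1/c3) x1 - (c2/c3) x2: the span is unchanged without x3
x3_rebuilt = -(c[0] / c[2]) * x1 - (c[1] / c[2]) * x2
assert np.allclose(x3_rebuilt, x3)
```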
Now let $z_n = \sum_{j=1}^{m} \beta_{j,n} x_j$ be a Cauchy sequence in $M_m$, and denote $\beta_n = (\beta_{1,n}, \dots, \beta_{m,n})'$. Then for each $j$, $\beta_{j,n}$ is a Cauchy sequence in $\mathbb{R}$ because
$$0 = \lim_{\min(n_1, n_2) \to \infty} \|z_{n_1} - z_{n_2}\|^2 = \lim_{\min(n_1, n_2) \to \infty} \Big\|\sum_{j=1}^{m} (\beta_{j,n_1} - \beta_{j,n_2})\, x_j\Big\|^2$$

¹ See Definition 4 below.
assumption implies that for each $n$ we can select a $z_n \in M_n$ such that
$$\lim_{n \to \infty} \|\hat{z} - z_n\| = 0. \quad (1)$$
Let $\|z - \hat{z}\| = \delta$ and $\|z - \hat{z}_n\| = \delta_n$, and note that $\delta_n \geq \delta$. Since
$$\delta_n^2 = \|z - \hat{z}_n\|^2 \leq \|z - z_n\|^2 = \|z - \hat{z} + \hat{z} - z_n\|^2 = \|z - \hat{z}\|^2 + \|\hat{z} - z_n\|^2 = \delta^2 + \|\hat{z} - z_n\|^2,$$
it follows from (1) that
$$\lim_{n \to \infty} \delta_n = \delta. \quad (2)$$
Recall that $z = \hat{z} + u$, where $\langle u, x \rangle = 0$ for all $x \in M_\infty$. Hence
$$\|\hat{z} - \hat{z}_n\|^2 = \|z - \hat{z}_n - u\|^2 = \|z - \hat{z}_n\|^2 + \|u\|^2 - 2\langle z - \hat{z}_n, u \rangle = \|z - \hat{z}_n\|^2 + \|u\|^2 - 2\langle z, u \rangle = \delta_n^2 - \delta^2, \quad (3)$$
where $\langle z - \hat{z}_n, u \rangle = \langle z, u \rangle$ because $\hat{z}_n \in M_\infty$, so $\langle \hat{z}_n, u \rangle = 0$, and where the last equality follows from
$$\langle z, u \rangle - \langle u, u \rangle = \langle \hat{z}, u \rangle = 0 \quad (4)$$
and $\langle u, u \rangle = \|u\|^2 = \delta^2$. The theorem now follows from (2) and (3). Q.E.D.
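In finite dimensions, where the projection theorem reduces to least squares, the identity (3) is just Pythagoras for nested subspaces. A minimal numerical sketch (arbitrary test vectors of my own, not from the note):

```python
import numpy as np

rng = np.random.default_rng(0)

def project(z, B):
    """Orthogonal projection of z on the column space of B (least squares)."""
    coef, *_ = np.linalg.lstsq(B, z, rcond=None)
    return B @ coef

z = rng.normal(size=6)
M_inf = rng.normal(size=(6, 4))     # the "full" subspace (4 columns)
M_n = M_inf[:, :2]                  # a nested subspace (first 2 columns)

z_hat = project(z, M_inf)           # projection on M_infinity; residual u = z - z_hat
z_hat_n = project(z, M_n)           # projection on the sub-space M_n
delta2 = np.sum((z - z_hat) ** 2)       # delta^2
delta2_n = np.sum((z - z_hat_n) ** 2)   # delta_n^2

# (3): ||z_hat - z_hat_n||^2 = delta_n^2 - delta^2, and delta_n >= delta
assert np.isclose(np.sum((z_hat - z_hat_n) ** 2), delta2_n - delta2)
assert delta2_n >= delta2
```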
Remark 1. Although each projection $\hat{z}_n$ is a linear combination of $x_1, \dots, x_n$, in general the result of Theorem 2 does not imply that there exists a sequence $\{\theta_j\}_{j=1}^{\infty}$ such that $\hat{z} = \sum_{j=1}^{\infty} \theta_j x_j$.
As a counterexample, consider the Hilbert space $R_0$ of zero-mean random variables with finite second moments, endowed with the inner product $\langle X, Y \rangle = E[XY]$ and associated norm and metric. Let
$$X_t = V_t - V_{t-1},$$
where the $V_t$'s are independent $N(0,1)$ distributed. This is clearly a zero-mean covariance stationary process, with covariance function $\gamma(0) = 2$, $\gamma(1) = -1$, $\gamma(m) = 0$ for $m \geq 2$. Hence $X_t \in R_0$ for all $t$.

For given $t$, let
$$M_{-\infty}^{t-1} = \text{span}\left(\{X_{t-m}\}_{m=1}^{\infty}\right), \qquad M_{t-n}^{t-1} = \text{span}\left(X_{t-1}, \dots, X_{t-n}\right).$$
The projection $\hat{X}_{t,n}$ of $X_t$ on $M_{t-n}^{t-1}$ takes the form
$$\hat{X}_{t,n} = \sum_{j=1}^{n} \theta_{n,j} X_{t-j},$$
where the coefficients $\theta_{n,j}$ are the solutions of the normal equations
$$\gamma(m) = \sum_{k=1}^{n} \gamma(|k - m|)\, \theta_{n,k}, \quad m = 1, \dots, n,$$
hence for $n \geq 3$,
$$\begin{aligned} -1 &= 2\theta_{n,1} - \theta_{n,2} \\ 0 &= -\theta_{n,1} + 2\theta_{n,2} - \theta_{n,3} \\ 0 &= -\theta_{n,2} + 2\theta_{n,3} - \theta_{n,4} \\ &\;\;\vdots \\ 0 &= -\theta_{n,n-2} + 2\theta_{n,n-1} - \theta_{n,n} \\ 0 &= -\theta_{n,n-1} + 2\theta_{n,n}. \end{aligned}$$
The solutions of these normal equations are
$$\theta_{n,j} = \frac{j}{n+1} - 1, \quad j = 1, \dots, n,$$
hence
$$\hat{X}_{t,n} = \sum_{j=1}^{n} \left(\frac{j}{n+1} - 1\right) X_{t-j}. \quad (5)$$
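The closed-form solution $\theta_{n,j} = j/(n+1) - 1$ can be verified by solving the normal equations numerically for the covariance function $\gamma(0) = 2$, $\gamma(1) = -1$, $\gamma(m) = 0$ for $m \geq 2$. A sketch:

```python
import numpy as np

n = 10
gamma = np.zeros(n + 1)
gamma[0], gamma[1] = 2.0, -1.0      # covariance function of X_t = V_t - V_{t-1}

# Normal equations: gamma(m) = sum_k gamma(|k - m|) theta_{n,k}, m = 1..n
G = np.array([[gamma[abs(k - m)] for k in range(1, n + 1)]
              for m in range(1, n + 1)])
rhs = gamma[1:n + 1]                # (gamma(1), ..., gamma(n)) = (-1, 0, ..., 0)
theta = np.linalg.solve(G, rhs)

j = np.arange(1, n + 1)
assert np.allclose(theta, j / (n + 1) - 1)   # matches the closed form
```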
Next, let $\hat{X}_t$ be the projection of $X_t$ on $M_{-\infty}^{t-1}$, and suppose that there exists a sequence $\{\theta_j\}_{j=1}^{\infty}$ such that $\hat{X}_t = \sum_{j=1}^{\infty} \theta_j X_{t-j}$. Note that the latter is merely a short-hand notation for
$$\lim_{n \to \infty} \Big\|\hat{X}_t - \sum_{j=1}^{n} \theta_j X_{t-j}\Big\| = 0.$$
If so, it follows from Theorem 2 and (5) that
$$0 = \lim_{n \to \infty} \Big\|\sum_{j=1}^{n} \theta_j X_{t-j} - \sum_{j=1}^{n} \left(\frac{j}{n+1} - 1\right) X_{t-j}\Big\|$$
2 Projections on the span of an orthonormal sequence
On the other hand,
Theorem 3. If a sequence $\{x_j\}_{j=1}^{\infty}$ in a Hilbert space $H$ is orthonormal, i.e.,
$$\langle x_i, x_j \rangle = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases}$$
then any projection $\hat{z}$ of $z \in H$ on $\text{span}(\{x_j\}_{j=1}^{\infty})$ takes the form $\hat{z} = \sum_{j=1}^{\infty} \theta_j x_j$ (in the sense that $\lim_{n \to \infty} \|\hat{z} - \sum_{j=1}^{n} \theta_j x_j\| = 0$), where $\theta_j = \langle z, x_j \rangle$ and $\sum_{j=1}^{\infty} \theta_j^2 < \infty$.
Proof: Let $\hat{z}_n$ be the projection of $z$ on $\text{span}(\{x_j\}_{j=1}^{n})$. Then
$$\|z - \hat{z}_n\|^2 = \min_{c_1, \dots, c_n} \Big\|z - \sum_{j=1}^{n} c_j x_j\Big\|^2 = \min_{c_1, \dots, c_n} \Big(\|z\|^2 - 2\sum_{j=1}^{n} c_j \langle z, x_j \rangle + \sum_{i=1}^{n}\sum_{j=1}^{n} c_i c_j \langle x_i, x_j \rangle\Big) = \min_{c_1, \dots, c_n} \Big(\|z\|^2 - 2\sum_{j=1}^{n} c_j \langle z, x_j \rangle + \sum_{j=1}^{n} c_j^2\Big), \quad (9)$$
hence
$$\hat{z}_n = \sum_{j=1}^{n} \theta_j x_j, \quad \text{where } \theta_j = \langle z, x_j \rangle. \quad (10)$$
Moreover, denoting $u_n = z - \hat{z}_n$, it follows from (9) and (10) that
$$\|u_n\|^2 = \Big\|z - \sum_{j=1}^{n} \theta_j x_j\Big\|^2 = \|z\|^2 - 2\sum_{j=1}^{n} \theta_j \langle z, x_j \rangle + \sum_{j=1}^{n}\sum_{i=1}^{n} \theta_j \theta_i \langle x_j, x_i \rangle = \|z\|^2 - \sum_{j=1}^{n} \theta_j^2 \geq 0, \quad (11)$$
so that $\sum_{j=1}^{n} \theta_j^2 \leq \|z\|^2$ for all $n$ and thus $\sum_{j=1}^{\infty} \theta_j^2 < \infty$. Finally, it follows from Theorem 2 that
$$\lim_{n \to \infty} \Big\|\hat{z} - \sum_{j=1}^{n} \theta_j x_j\Big\| = \lim_{n \to \infty} \|\hat{z} - \hat{z}_n\| = 0. \quad \text{Q.E.D.}$$
3 The general Wold decomposition
Let $S_1, S_2$ be a pair of subspaces of a Hilbert space $H$. We say that:

Definition 2. $S_1$ and $S_2$ are orthogonal, denoted by $S_1 \perp S_2$, if for each $x_1 \in S_1$ and each $x_2 \in S_2$, $\langle x_1, x_2 \rangle = 0$.

Lemma 4. Let $S_1$ and $S_2$ be sub-Hilbert spaces satisfying $S_1 \perp S_2$. Then
$$\text{span}(S_1, S_2) = \{y = x_1 + x_2 : x_1 \in S_1,\; x_2 \in S_2\}$$
is a Hilbert space.
Proof: Let $y_n$ be a Cauchy sequence in $\text{span}(S_1, S_2)$. Then $y_n = x_{1,n} + x_{2,n}$, where $x_{1,n} \in S_1$ and $x_{2,n} \in S_2$. Since $x_{1,n} - x_{1,m} \in S_1$ and $x_{2,n} - x_{2,m} \in S_2$ it follows from the orthogonality condition $S_1 \perp S_2$ that
$$\|y_n - y_m\|^2 = \|(x_{1,n} - x_{1,m}) + (x_{2,n} - x_{2,m})\|^2 = \|x_{1,n} - x_{1,m}\|^2 + \|x_{2,n} - x_{2,m}\|^2 + 2\langle x_{1,n} - x_{1,m},\, x_{2,n} - x_{2,m}\rangle = \|x_{1,n} - x_{1,m}\|^2 + \|x_{2,n} - x_{2,m}\|^2,$$
hence $\lim_{\min(n,m) \to \infty} \|y_n - y_m\| = 0$ implies that $\lim_{\min(n,m) \to \infty} \|x_{1,n} - x_{1,m}\| = 0$ and $\lim_{\min(n,m) \to \infty} \|x_{2,n} - x_{2,m}\| = 0$. Because $S_1$ and $S_2$ are Hilbert spaces there exist an $x_1 \in S_1$ and an $x_2 \in S_2$ such that $\lim_{n \to \infty} \|x_{1,n} - x_1\| = 0$ and $\lim_{n \to \infty} \|x_{2,n} - x_2\| = 0$, hence $\lim_{n \to \infty} \|y_n - y\| = 0$, where $y = x_1 + x_2 \in \text{span}(S_1, S_2)$. Q.E.D.
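The key step of this proof is that orthogonality makes the cross term vanish, so squared norms add. A two-line numerical check, with $S_1$ and $S_2$ spanned by disjoint coordinate axes of $\mathbb{R}^4$ (an illustrative choice of mine):

```python
import numpy as np

rng = np.random.default_rng(2)
# S1 = span(e1, e2), S2 = span(e3, e4) in R^4: orthogonal subspaces
x1 = np.concatenate([rng.normal(size=2), np.zeros(2)])   # element of S1
x2 = np.concatenate([np.zeros(2), rng.normal(size=2)])   # element of S2
y = x1 + x2

# <x1, x2> = 0, so the norm of the sum splits (the step used in Lemma 4)
assert np.isclose(x1 @ x2, 0.0)
assert np.isclose(y @ y, x1 @ x1 + x2 @ x2)
```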
in the sense that $\lim_{n \to \infty} \|x - w - \sum_{k=1}^{n} \alpha_k e_k\| = 0$, where $\{e_k\}_{k=1}^{\infty}$ is an orthonormal sequence in $M_\infty$, $\alpha_k = \langle x, e_k \rangle$, $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$, and
$$w \in S_\infty \cap U_\infty^{\perp}, \quad (13)$$
with $S_\infty = \cap_{n=1}^{\infty} \text{span}(\{x_k\}_{k=n}^{\infty})$ and $U_\infty^{\perp}$ the orthogonal complement of $U_\infty = \text{span}(\{e_k\}_{k=1}^{\infty})$. Note that (13) implies that $w$ is orthogonal to all the $e_k$'s: $\langle e_k, w \rangle = 0$ for $k = 1, 2, 3, \dots$.
Proof: Denote
$$S_n = \text{span}(\{x_k\}_{k=n}^{\infty}).$$
Note that $M_\infty = S_1$. Project each $x_k$ on $S_{k+1}$, so that $x_k = \hat{x}_k + u_k$ with projection $\hat{x}_k \in S_{k+1}$ and residual $u_k$. Recall that by the regularity condition, $\|u_k\| > 0$, hence $e_k = u_k/\|u_k\|$ is well defined. It is not hard to verify that the residuals $u_k$ are orthogonal, so that the $e_k$'s are orthonormal, and that $U_\infty \subset M_\infty$. It follows now from Theorem 3 that (12) holds with $\alpha_k = \langle x, e_k \rangle$, $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$, and $w \in U_\infty^{\perp}$, where the latter follows from the fact that $w$ is the residual of the projection of $x$ on $U_\infty$. Therefore, the actual content of Theorem 4 is that $w \in S_\infty$.
The theorem under review will be proved in six steps.

Step 1. As before, let $M_n = \text{span}(\{x_k\}_{k=1}^{n})$. I will first show that
$$M_n \subset \text{span}\big(U_n,\; U_n^{\perp} \cap S_2\big) \quad (14)$$
[cf. Lemma 4], where $U_n = \text{span}(e_1, \dots, e_n) = \text{span}(u_1, \dots, u_n)$ and $U_n^{\perp}$ is the orthogonal complement of $U_n$.
Proof. Let $z \in M_n$ be arbitrary. Recall that $z$ takes the form $z = \sum_{k=1}^{n} c_k x_k$. Substituting $x_k = \hat{x}_k + u_k = \hat{x}_k + \|u_k\| e_k$ we can write $z$ as
$$z = \sum_{k=1}^{n} c_k (\hat{x}_k + u_k) = \sum_{k=1}^{n} c_k u_k + \sum_{k=1}^{n} c_k \hat{x}_k = \sum_{k=1}^{n} c_k \|u_k\| e_k + \sum_{k=1}^{n} c_k \hat{x}_k = \sum_{k=1}^{n} c_k \|u_k\| e_k + z_2,$$
where
$$z_2 = \sum_{k=1}^{n} c_k \hat{x}_k.$$
Note that
$$z_2 = \sum_{k=1}^{n} c_k \hat{x}_k \in S_2 \quad (15)$$
because $\hat{x}_k \in S_{k+1} \subset S_2$ for $k = 1, 2, \dots, n$.

Next, project $z_2$ on $U_n$. This projection takes the form
$$\hat{p}_n = \sum_{k=1}^{n} d_k e_k, \quad \text{where } d_k = \langle z_2, e_k \rangle,$$
with residual
$$w_{n+1} \in U_n^{\perp}. \quad (16)$$
However, $e_1$ is orthogonal to any element of $S_2$, and $z_2 \in S_2$. Therefore, $d_1 = \langle z_2, e_1 \rangle = 0$ and thus
$$\hat{p}_n = \sum_{k=2}^{n} d_k e_k \in \text{span}(\{e_k\}_{k=2}^{n}) \subset S_2,$$
where the latter follows from $e_k \in S_k \subset S_2$ for $k = 2, 3, \dots, n$. Because $w_{n+1} = z_2 - \hat{p}_n$ where both terms are elements of $S_2$, it follows that
$$w_{n+1} \in S_2. \quad (17)$$
Combining (16) and (17) now yields
$$w_{n+1} \in U_n^{\perp} \cap S_2.$$
Thus, denoting $\alpha_1 = c_1 \|u_1\|$, $\alpha_k = c_k \|u_k\| + d_k$ for $k = 2, 3, \dots, n$, we can write
$$z = \sum_{k=1}^{n} \alpha_k e_k + w_{n+1}, \quad \text{where } w_{n+1} \in U_n^{\perp} \cap S_2,$$
hence $z \in \text{span}(U_n, U_n^{\perp} \cap S_2)$. This proves (14).
Step 2. Next, it will be shown that
$$\text{span}\big(U_n,\; U_n^{\perp} \cap S_2\big) = \text{span}\big(U_n,\; U_n^{\perp} \cap S_{n+1}\big), \quad (18)$$

which implies (20). However, $S_{n+1,m} \subset S_{2,m}$ and therefore
$$U_n^{\perp} \cap S_{n+1,m} \subset U_n^{\perp} \cap S_{2,m}. \quad (21)$$
Combining (20) and (21) now yields
$$U_n^{\perp} \cap S_{2,m} = U_n^{\perp} \cap S_{n+1,m} \quad \text{for } m > n,$$
which in its turn implies that
$$U_n^{\perp} \cap \cup_{m=n+1}^{\infty} S_{2,m} = U_n^{\perp} \cap \cup_{m=n+1}^{\infty} S_{n+1,m}. \quad (22)$$
Finally, note that $S_2 = \cup_{m=n+1}^{\infty} S_{2,m}$ and $S_{n+1} = \cup_{m=n+1}^{\infty} S_{n+1,m}$, hence it follows from (22) that (18) holds.
Step 3. Denote $R_n = \text{span}\big(U_n,\; U_n^{\perp} \cap S_{n+1}\big)$. Then
$$M_\infty = \cup_{n=1}^{\infty} R_n. \quad (23)$$
Proof. It follows from (19) that $M_n \subset R_n$, hence
$$\cup_{n=1}^{\infty} M_n \subset \cup_{n=1}^{\infty} R_n. \quad (24)$$
However, we also have $R_n \subset M_\infty$, as is not hard to verify, hence
$$\cup_{n=1}^{\infty} R_n \subset M_\infty. \quad (25)$$
Thus, the result (23) follows from (24) and (25).
Step 4. For an $x \in M_\infty$, let $\hat{x}_n$ be the projection of $x$ on $R_n$. Then
$$\hat{x}_n = \sum_{j=1}^{n} \alpha_j e_j + w_{n+1}, \quad (26)$$
where $\alpha_j = \langle x, e_j \rangle$ and $w_{n+1}$ is the projection of $x$ on $U_n^{\perp} \cap S_{n+1}$. Moreover,
$$\sum_{j=1}^{\infty} \alpha_j^2 < \infty. \quad (27)$$
Furthermore,
$$\lim_{n \to \infty} \Big\|x - \sum_{j=1}^{n} \alpha_j e_j - w_{n+1}\Big\| = 0. \quad (28)$$
Proof. By the definition of $R_n$ and Lemma 4, $\hat{x}_n = \sum_{j=1}^{n} \theta_j e_j + w$ for some constants $\theta_j$ and a $w \in U_n^{\perp} \cap S_{n+1}$. To determine the $\theta_j$'s and $w$, note that
$$\Big\|x - \sum_{j=1}^{n} \theta_j e_j - w\Big\|^2 = \|x - w\|^2 - 2\sum_{j=1}^{n} \theta_j \langle e_j, x \rangle + 2\sum_{j=1}^{n} \theta_j \langle e_j, w \rangle + \Big\|\sum_{j=1}^{n} \theta_j e_j\Big\|^2 = \|x - w\|^2 - 2\sum_{j=1}^{n} \theta_j \langle e_j, x \rangle + \sum_{j=1}^{n} \theta_j^2,$$
because $w \in U_n^{\perp} \cap S_{n+1} \subset U_n^{\perp}$ implies $\langle e_j, w \rangle = 0$ and
$$\Big\|\sum_{j=1}^{n} \theta_j e_j\Big\|^2 = \sum_{j=1}^{n}\sum_{i=1}^{n} \theta_j \theta_i \langle e_j, e_i \rangle = \sum_{j=1}^{n} \theta_j^2 \langle e_j, e_j \rangle = \sum_{j=1}^{n} \theta_j^2.$$
Thus
$$\|x - \hat{x}_n\|^2 = \inf_{\theta_1, \dots, \theta_n,\, w \in U_n^{\perp} \cap S_{n+1}} \Big\|x - \sum_{j=1}^{n} \theta_j e_j - w\Big\|^2 = \inf_{\theta_1, \dots, \theta_n,\, w \in U_n^{\perp} \cap S_{n+1}} \Big(\|x - w\|^2 - 2\sum_{j=1}^{n} \theta_j \langle e_j, x \rangle + \sum_{j=1}^{n} \theta_j^2\Big) = \inf_{w \in U_n^{\perp} \cap S_{n+1}} \|x - w\|^2 - \sum_{j=1}^{n} \alpha_j^2 = \|x - w_{n+1}\|^2 - \sum_{j=1}^{n} \alpha_j^2, \quad (29)$$
where $\alpha_j = \langle x, e_j \rangle$ and $w_{n+1}$ is the projection of $x$ on $U_n^{\perp} \cap S_{n+1}$.
as $\min(m, n) \to \infty$. Therefore, there exists a $w \in U_k^{\perp} \cap S_{k+1}$ such that (32) holds. Since $k$ was arbitrary we now have $w \in \cap_{k=1}^{\infty} U_k^{\perp} = U_\infty^{\perp}$ and $w \in \cap_{k=1}^{\infty} S_{k+1} = S_\infty$, hence
$$w \in U_\infty^{\perp} \cap S_\infty.$$
This completes the proof of Step 6.

The theorem now follows from (27), (31), (32) and the fact that $w \in U_\infty^{\perp} \cap S_\infty \subset U_\infty^{\perp}$, which implies that $\langle w, e_k \rangle = 0$ for $k = 1, 2, 3, \dots$. Q.E.D.
4 The Wold decomposition for covariance stationary time series
In the case of the Hilbert space $R_0$ of zero-mean random variables with finite second moments, with inner product $\langle X, Y \rangle = E[XY]$ and associated norm and metric, the results of Theorem 4 translate as follows:

Theorem 5. Let $X_t$ be a regular univariate zero-mean covariance stationary time series process. Then $X_t$ can be written as
$$X_t = \sum_{j=0}^{\infty} \alpha_j U_{t-j} + W_t \quad \text{a.s.}, \quad (33)$$
where $U_t$ is a zero-mean uncorrelated process with variance 1,
$$\alpha_j = E[X_t U_{t-j}], \quad \sum_{j=0}^{\infty} \alpha_j^2 < \infty, \quad (34)$$
and $W_t$ is a zero-mean covariance stationary process satisfying
$$W_t \in \mathcal{U}_t^{\perp} \cap S_{-\infty}, \quad (35)$$
where $S_{-\infty} = \cap_n \text{span}(\{X_{n-k}\}_{k=1}^{\infty})$ and $\mathcal{U}_t^{\perp}$ is the orthogonal complement of $\mathcal{U}_t = \text{span}(\{U_{t-k}\}_{k=0}^{\infty})$. The result (35) implies that
$$W_t \in \text{span}(\{W_{t-m}\}_{m=1}^{\infty}), \quad (36)$$
which in its turn implies that $W_t$ is perfectly predictable from its past values $W_{t-1}, W_{t-2}, W_{t-3}, \dots$. In other words, $W_t$ is a deterministic process. Moreover, (35) implies that
$$E[W_t U_{t-m}] = 0 \quad (37)$$
for all leads and lags $m$.
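As a numerical illustration of (33)–(34) (my own example, not from the note): for the invertible moving average $X_t = U_t + 0.5\,U_{t-1}$ with $U_t$ i.i.d. $N(0,1)$, the $U_t$'s are the Wold innovations, so $\alpha_0 = 1$, $\alpha_1 = 0.5$, $\alpha_j = 0$ for $j \geq 2$, and $W_t = 0$. Sample cross-moments recover the $\alpha_j$'s:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 200_000
U = rng.normal(size=T)              # the (unit-variance) innovations
X = U.copy()
X[1:] += 0.5 * U[:-1]               # X_t = U_t + 0.5 U_{t-1}

# alpha_j = E[X_t U_{t-j}], cf. (34); estimate by sample averages
alpha = [np.mean(X[j:] * U[:T - j]) for j in range(3)]

assert abs(alpha[0] - 1.0) < 0.02   # alpha_0 = 1
assert abs(alpha[1] - 0.5) < 0.02   # alpha_1 = 0.5
assert abs(alpha[2]) < 0.02         # alpha_j = 0 for j >= 2
```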
Proof: Recall that $U_t = \tilde{U}_t / \sqrt{E\big[\tilde{U}_t^2\big]}$, where $\tilde{U}_t = X_t - \hat{X}_t$ with $\hat{X}_t$ the projection of $X_t$ on $\text{span}(\{X_{t-j}\}_{j=1}^{\infty})$. The uncorrelatedness of the $\tilde{U}_t$'s follows from Theorem 4, but we still need to show that $E[\tilde{U}_t] = 0$ and $E[\tilde{U}_t^2] = \sigma^2$ for all $t$.
Proof of $E[\tilde{U}_t] = 0$. Let $\hat{X}_{t,n}$ be the projection of $X_t$ on $\text{span}(\{X_{t-j}\}_{j=1}^{n})$. Then $\hat{X}_{t,n}$ takes the form
$$\hat{X}_{t,n} = \sum_{j=1}^{n} \beta_{j,n} X_{t-j},$$
where the $\beta_{j,n}$'s do not depend on $t$. The latter follows from the fact that the $\beta_{j,n}$'s are the solutions of the normal equations
$$\sum_{j=1}^{n} \beta_{j,n}\, \gamma(i - j) = \gamma(i), \quad i = 1, 2, \dots, n,$$
where $\gamma(i) = E[X_t X_{t-i}]$ is the covariance function of $X_t$. Hence $E[\hat{X}_{t,n}] = 0$.
It follows from Theorem 2 that
$$\lim_{n \to \infty} E\Big[\big(\hat{X}_{t,n} - \hat{X}_t\big)^2\Big] = \lim_{n \to \infty} \big\|\hat{X}_{t,n} - \hat{X}_t\big\|^2 = 0,$$
so that by Liapounov's inequality and $E[\hat{X}_{t,n}] = 0$,
$$\lim_{n \to \infty} \big|E[\hat{X}_t]\big| = \lim_{n \to \infty} \big|E[\hat{X}_t - \hat{X}_{t,n}]\big| \leq \lim_{n \to \infty} E\big[\big|\hat{X}_t - \hat{X}_{t,n}\big|\big] \leq \sqrt{\lim_{n \to \infty} E\Big[\big(\hat{X}_{t,n} - \hat{X}_t\big)^2\Big]} = 0.$$
Thus $E[\hat{X}_t] = 0$ and therefore $E[\tilde{U}_t] = E[X_t - \hat{X}_t] = 0$.
Proof of (34), (35) and (37). The result of Theorem 4 can now be translated as
$$\lim_{n \to \infty} \Big\|X_t - \sum_{j=0}^{n} \alpha_j U_{t-j} - W_t\Big\| = 0, \quad (40)$$
where $U_t$ is a zero-mean uncorrelated covariance stationary process with unit variance, and $\alpha_k = \langle X_t, U_{t-k} \rangle = E[X_t U_{t-k}]$ with $\sum_{k=0}^{\infty} \alpha_k^2 < \infty$.
We still need to prove that the $\alpha_k$'s do not depend on $t$, as follows. Recall from the proof of $E[\tilde{U}_t^2] = \sigma^2$ that $\tilde{U}_{t,n} = X_t - \sum_{j=1}^{n} \beta_{j,n} X_{t-j}$, so that
$$E\big[X_{t+k}\, \tilde{U}_{t,n}\big] = \gamma(k) - \sum_{j=1}^{n} \beta_{j,n}\, \gamma(k + j),$$
which does not depend on $t$. Moreover, by the Cauchy–Schwarz inequality and (39),
$$\lim_{n \to \infty} \Big|E\big[X_{t+k}\big(\tilde{U}_{t,n} - \tilde{U}_t\big)\big]\Big|^2 \leq \gamma(0) \lim_{n \to \infty} E\Big[\big(\tilde{U}_{t,n} - \tilde{U}_t\big)^2\Big] = 0.$$
Thus $E[X_{t+k} \tilde{U}_t] = \lim_{n \to \infty} E[X_{t+k} \tilde{U}_{t,n}]$. Since the latter does not depend on $t$, neither does $\alpha_k = E[X_{t+k} U_t] = E\big[X_{t+k}\, \tilde{U}_t / \|\tilde{U}_t\|\big]$.

The results (35) and (37) follow straightforwardly from Theorem 4.
Proof of (33). The result (40) implies, by Chebyshev's inequality, that
$$X_t = \operatorname{plim}_{n \to \infty} \sum_{j=0}^{n} \alpha_j U_{t-j} + W_t. \quad (41)$$
Recall that convergence in probability as $n \to \infty$ is equivalent to a.s. convergence along a further subsequence $k_m$ of an arbitrary subsequence of $n$. See for example Bierens (2004, Theorem 6.B.3, p. 168). Thus for such a subsequence $k_m$,
$$\sum_{j=0}^{k_m} \alpha_j U_{t-j} \to X_t - W_t \quad \text{a.s.} \quad (42)$$
as $m \to \infty$, and the same holds for any further subsequence of $k_m$. Without loss of generality we may choose $k_0 = 0$. Then for each $n > 0$ we can find an $m_n$ such that
$$k_{m_n - 1} < n \leq k_{m_n}. \quad (43)$$
Moreover, (42) implies that
$$\sum_{j=0}^{k_{m_n}} \alpha_j U_{t-j} \to X_t - W_t \quad \text{a.s. as } n \to \infty. \quad (44)$$
Due to (43),
$$\sum_{n=1}^{\infty} E\Bigg[\Bigg(\sum_{j=0}^{k_{m_n}} \alpha_j U_{t-j} - \sum_{j=0}^{n} \alpha_j U_{t-j}\Bigg)^2\Bigg] = \sum_{n=1}^{\infty} E\Bigg[\Bigg(\sum_{j=n+1}^{k_{m_n}} \alpha_j U_{t-j}\Bigg)^2\Bigg] \leq \sum_{n=1}^{\infty} \sum_{j=k_{m_n - 1}+1}^{k_{m_n}} \alpha_j^2 \leq \sum_{j=1}^{\infty} \alpha_j^2 < \infty,$$
so that by Chebyshev's inequality, for arbitrary $\varepsilon > 0$,
$$\sum_{n=1}^{\infty} P\Bigg(\Bigg|\sum_{j=0}^{k_{m_n}} \alpha_j U_{t-j} - \sum_{j=0}^{n} \alpha_j U_{t-j}\Bigg| > \varepsilon\Bigg) < \infty.$$
This result implies, by the Borel–Cantelli lemma,² that
$$\sum_{j=0}^{k_{m_n}} \alpha_j U_{t-j} - \sum_{j=0}^{n} \alpha_j U_{t-j} \to 0 \quad \text{a.s. as } n \to \infty. \quad (45)$$
Combining (44) and (45) it follows now that
$$\sum_{j=0}^{n} \alpha_j U_{t-j} \to X_t - W_t \quad \text{a.s. as } n \to \infty. \quad (46)$$
Since $\sum_{j=0}^{\infty} \alpha_j U_{t-j}$ is defined as $\lim_{n \to \infty} \sum_{j=0}^{n} \alpha_j U_{t-j}$, (33) is equivalent to (46).

² See for example Bierens (2004, Theorem 6.B.2, p. 168).