






Advanced Statistical Inference, Study Notes (Prof. Rempala, Medical College of Georgia, Spring 2009)
Let {θ̂n} be a sequence of estimators of θ based on a sequence of samples {X = (X1, ..., Xn) : n = 1, 2, ...}. Suppose that as n → ∞, θ̂n is asymptotically normal (AN) in the sense that
\[
[V_n(\theta)]^{-1/2}\,(\hat\theta_n - \theta) \to_d N_k(0, I_k),
\]
where, for each n, Vn(θ) is a k × k positive definite matrix depending on θ.
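For instance, if X1, ..., Xn are i.i.d. with mean θ and finite variance σ²(θ), then by the CLT the sample mean X̄ is AN in this sense with k = 1 and Vn(θ) = σ²(θ)/n.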
If θ is one-dimensional (k = 1), then Vn(θ) is the asymptotic variance as well as the amse (asymptotic mean squared error) of θ̂n (text §2.5.2).
When k > 1, Vn(θ) is called the asymptotic covariance matrix of θ̂n and can be used as a measure of asymptotic performance of estimators.
If θ̂jn is AN with asymptotic covariance matrix Vjn(θ), j = 1, 2, and
\[
V_{1n}(\theta) \le V_{2n}(\theta)
\]
(in the sense that V2n(θ) − V1n(θ) is nonnegative definite) for all θ ∈ Θ, then θ̂1n is said to be asymptotically more efficient than θ̂2n.
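For example, for i.i.d. samples from N(θ, 1), the sample mean X̄ is AN with V1n(θ) = 1/n, while the sample median is AN with V2n(θ) = π/(2n); since π/2 > 1, X̄ is asymptotically more efficient than the median in this model.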
If θ̂n is AN, it is asymptotically unbiased. If Vn(θ) = Var(θ̂n), then, under some regularity conditions, it follows (Theorem 3.3 in text) that we have the following information inequality:
\[
V_n(\theta) \ge [I_n(\theta)]^{-1},
\]
where, for every n, In(θ) is the Fisher information matrix for X of size n. The information inequality may lead to an optimal estimator.
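For example, for i.i.d. N(θ, σ²) samples with σ² known, In(θ) = n/σ², so the inequality requires Vn(θ) ≥ σ²/n; the sample mean X̄ attains this bound exactly.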
Unfortunately, when Vn(θ) is an asymptotic covariance matrix, the information inequality may not hold (even in the limiting sense), even if the regularity conditions are satisfied.
Example 14.2.1 (Hodges). Let X1, ..., Xn be i.i.d. from N(θ, 1), θ ∈ R. Then In(θ) = n. For a fixed constant t, define
\[
\hat\theta_n =
\begin{cases}
\bar X, & |\bar X| \ge n^{-1/4}, \\
t\bar X, & |\bar X| < n^{-1/4}.
\end{cases}
\]
By Proposition 3.2, all conditions in Theorem 3.3 are satisfied. It can be shown (exercise) that θ̂n is AN with Vn(θ) = V(θ)/n, where V(θ) = 1 if θ ≠ 0 and V(θ) = t² if θ = 0. If t² < 1, the information inequality does not hold when θ = 0.
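The superefficiency at θ = 0 is easy to see in a small simulation. The following Python sketch is only an illustration (the sample size, number of replications, and the value t = 0.5 are arbitrary choices, not taken from the example); n·Var(θ̂n) should settle near V(0) = t² = 0.25 when θ = 0 and near V(θ) = 1 otherwise.

```python
# Monte Carlo check of the Hodges estimator's asymptotic variance.
# Illustration only: n, the number of replications, and t = 0.5 are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)

def hodges(x, t):
    """Hodges estimator: xbar if |xbar| >= n^(-1/4), otherwise t * xbar."""
    xbar = x.mean()
    return xbar if abs(xbar) >= len(x) ** (-0.25) else t * xbar

n, reps, t = 10_000, 2_000, 0.5
for theta in (0.0, 1.0):
    est = np.array([hodges(rng.normal(theta, 1.0, n), t) for _ in range(reps)])
    # n * Var(theta_hat) estimates V(theta): about t^2 = 0.25 at theta = 0, about 1 otherwise.
    print(f"theta = {theta}: n * Var = {n * est.var():.3f}")
```

At θ = 0 the simulated variance beats the bound [In(θ)]^{-1} = 1/n whenever t² < 1, which is exactly the failure of the information inequality noted above.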
However, the following result, due to Le Cam (1953), shows that, for i.i.d. Xi’s, the information inequality holds except for θ in a set of Lebesgue measure 0.
Theorem 14.2.1. Let X1, ..., Xn be i.i.d. from a p.d.f. fθ w.r.t. a σ-finite measure ν on (R, B), where θ ∈ Θ and Θ is an open set in Rk. Suppose that for every x in the range of X1, fθ(x) is twice continuously differentiable in θ and satisfies
\[
\frac{\partial}{\partial\theta} \int \psi_\theta(x)\, d\nu = \int \frac{\partial}{\partial\theta}\, \psi_\theta(x)\, d\nu
\]
for ψθ(x) = fθ(x) and ψθ(x) = ∂fθ(x)/∂θ; the Fisher information matrix
\[
I_1(\theta) = E\left\{ \frac{\partial}{\partial\theta} \log f_\theta(X_1) \left[ \frac{\partial}{\partial\theta} \log f_\theta(X_1) \right]^{\top} \right\}
\]
is positive definite; and for any given θ ∈ Θ, there exists a positive number cθ and a positive function hθ such that E[hθ(X1)] < ∞ and
\[
\sup_{\gamma:\, \|\gamma - \theta\| < c_\theta} \left\| \frac{\partial^2 \log f_\gamma(x)}{\partial\gamma\,\partial\gamma^{\top}} \right\| \le h_\theta(x)
\]
for all x in the range of X1, where ‖A‖ = [tr(A⊤A)]^{1/2} for any matrix A. If θ̂n is an estimator of θ (based on X1, ..., Xn) and is AN with Vn(θ) = V(θ)/n, then there is a Θ0 ⊂ Θ with Lebesgue measure 0 such that the information inequality holds if θ ∉ Θ0.
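In Example 14.2.1, for instance, the information inequality fails only at θ = 0 (when t² < 1), so the exceptional set there is Θ0 = {0}, which indeed has Lebesgue measure 0.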
Recall that an AN estimator θ̂n is called asymptotically efficient if Vn(θ) = [In(θ)]^{-1}. Suppose now that the parameter of interest is ϑ = g(θ), where g is a known differentiable function from Rk to Rp, p ≤ k, and ϑ is estimated by ϑ̂n = g(θ̂n); if θ̂n is AN with asymptotic covariance matrix Vn(θ), then by the delta method ϑ̂n is AN with asymptotic covariance matrix [∇g(θ)]⊤Vn(θ)∇g(θ). Thus, the information inequality becomes
\[
[\nabla g(\theta)]^{\top} V_n(\theta)\, \nabla g(\theta) \ge [\tilde I_n(\vartheta)]^{-1},
\]
where Ĩn(ϑ) is the Fisher information matrix about ϑ contained in X. If p = k and g is one-to-one, then
\[
[\tilde I_n(\vartheta)]^{-1} = [\nabla g(\theta)]^{\top} [I_n(\theta)]^{-1}\, \nabla g(\theta)
\]
and, therefore, ϑ̂n is asymptotically efficient if and only if θ̂n is asymptotically efficient. For this reason, in the case of p < k, ϑ̂n is considered to be asymptotically efficient if and only if θ̂n is asymptotically efficient, and we can focus on the estimation of θ only.
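As a one-dimensional illustration: for i.i.d. N(θ, 1) data with ϑ = g(θ) = e^θ (so k = p = 1 and g is one-to-one), we get [Ĩn(ϑ)]^{-1} = [g′(θ)]²[In(θ)]^{-1} = e^{2θ}/n, which is exactly the asymptotic variance that the delta method gives for ϑ̂n = e^{X̄}; hence ϑ̂n attains the bound if and only if X̄ does.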
It turns out that under some regularity conditions, a root of the likelihood equation (RLE), which is a candidate for an MLE, is asymptotically efficient.
Theorem 14.4.1. Assume the conditions of Le Cam's theorem (Theorem 14.2.1). (i) There is a sequence of estimators {θ̂n} such that
\[
P\bigl( s_n(\hat\theta_n) = 0 \bigr) \to 1 \quad\text{and}\quad \hat\theta_n \to_p \theta,
\]
where sn(γ) = ∂ log ℓ(γ)/∂γ is the score function and ℓ(γ) denotes the likelihood function. (ii) Any consistent sequence θ̃n of RLE's is asymptotically efficient.
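As a numerical illustration of part (ii) (my own sketch, not part of the notes: the Cauchy location model, the sample size, and the scipy-based search for the root are all assumptions made here), consider i.i.d. observations from the Cauchy density fθ(x) = 1/[π(1 + (x − θ)²)], for which I1(θ) = 1/2. An asymptotically efficient estimator should then satisfy n·Var(θ̃n) ≈ [I1(θ)]^{-1} = 2. Since the Cauchy likelihood equation can have several roots, the consistent root is located here by maximizing the log-likelihood near the sample median.

```python
# Monte Carlo sketch: an RLE for the Cauchy(theta, 1) location model.
# I_1(theta) = 1/2, so n * Var of an efficient estimator should be near 2.
# Illustration only: the model, sample size, and bounded search interval are arbitrary choices.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)

def rle_cauchy(x):
    """Consistent root of the likelihood equation, found by maximizing the
    Cauchy log-likelihood in a neighborhood of the sample median."""
    def neg_loglik(g):
        # Cauchy negative log-likelihood, up to an additive constant
        return np.log1p((x - g) ** 2).sum()
    med = np.median(x)
    return minimize_scalar(neg_loglik, bounds=(med - 1.0, med + 1.0), method="bounded").x

theta, n, reps = 0.0, 500, 1_000
est = np.array([rle_cauchy(theta + rng.standard_cauchy(n)) for _ in range(reps)])
print(f"n * Var = {n * est.var():.2f}   (information bound: 1 / I_1(theta) = 2)")
```

The printed value should be close to 2, in line with the asymptotic efficiency asserted by the theorem.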
Remark 14.4.1.
Note that
\[
E\left[ \frac{\|\nabla s_n(\gamma_*) - \nabla s_n(\theta)\|}{n} \right]
\le E\left[ \max_{\gamma \in B_n(c)} \frac{\|\nabla s_n(\gamma) - \nabla s_n(\theta)\|}{n} \right]
\le E\left[ \max_{\gamma \in B_n(c)} \left\| \frac{\partial^2 \log f_\gamma(X_1)}{\partial\gamma\,\partial\gamma^{\top}} - \frac{\partial^2 \log f_\theta(X_1)}{\partial\theta\,\partial\theta^{\top}} \right\| \right] \to 0,
\]
which follows from (a) ∂² log fγ(x)/∂γ∂γ⊤ is continuous in a neighborhood of θ for any fixed x; (b) Bn(c) shrinks to {θ}; and (c) for sufficiently large n,
\[
\max_{\gamma \in B_n(c)} \left\| \frac{\partial^2 \log f_\gamma(X_1)}{\partial\gamma\,\partial\gamma^{\top}} - \frac{\partial^2 \log f_\theta(X_1)}{\partial\theta\,\partial\theta^{\top}} \right\| \le 2 h_\theta(X_1)
\]
under the regularity condition. By the SLLN (text Theorem 1.13) and Proposition 3.1, n^{-1}∇sn(θ) →a.s. −I1(θ) (i.e., ‖n^{-1}∇sn(θ) + I1(θ)‖ →a.s. 0).
These results, together with (14.2), imply that
\[
\log \ell(\gamma) - \log \ell(\theta) = c\lambda^{\top} [I_n(\theta)]^{-1/2} s_n(\theta) - [1 + o_p(1)]\, c^2/2. \tag{14.4}
\]
Note that maxλ {λ⊤[In(θ)]^{-1/2}sn(θ)} = ‖[In(θ)]^{-1/2}sn(θ)‖ (the maximum being over unit vectors λ). Hence, (14.1) follows from (14.4) and
\[
P\bigl( \|[I_n(\theta)]^{-1/2} s_n(\theta)\| < c/4 \bigr)
\ge 1 - (4/c)^2\, E\|[I_n(\theta)]^{-1/2} s_n(\theta)\|^2
= 1 - k(4/c)^2 = 1 - 16k/c^2,
\]
which holds by Chebyshev's inequality and the fact that E‖[In(θ)]^{-1/2}sn(θ)‖² = tr(Ik) = k, since Var(sn(θ)) = In(θ).
This completes the proof of (i). (ii) Let Aε = {γ : ‖γ − θ‖ ≤ ε} for ε > 0. Since Θ is open, Aε ⊂ Θ for sufficiently small ε. Let {θ̃n} be a sequence of consistent RLE's, i.e., P(sn(θ̃n) = 0 and θ̃n ∈ Aε) → 1 for any ε > 0. Hence, we can focus on the set on which sn(θ̃n) = 0 and θ̃n ∈ Aε. Using the mean-value theorem for vector-valued functions, we obtain
\[
-s_n(\theta) = \left[ \int_0^1 \nabla s_n\bigl(\theta + t(\tilde\theta_n - \theta)\bigr)\, dt \right] (\tilde\theta_n - \theta).
\]