Lecture 6: UMVUEs and the Cramér-Rao Lower Bound
TRANSCRIPT
MATH 667-01 Statistical Inference
University of Louisville
September 17, 2019 (last modified: 9/19/2019)
Introduction
We start by discussing uniform minimum variance unbiased estimators.¹
We then review correlation.²
We discuss and prove the Cramér-Rao Inequality and some corollaries.³

¹ CB: Section 7.3; HMC: Section 7.1
² CB: Section 4.5; HMC: Section 2.5
³ CB: Section 7.3; HMC: Section 6.2
Best Unbiased Estimator (UMVUE)
In this lecture, we evaluate an estimator W of a parameter θ based on the squared error loss function.

If we consider only unbiased estimators, then E_θ[(W − θ)²] = Var_θ[W].

Definition L6.1:⁴ An estimator W* is a best unbiased estimator of τ(θ) if it satisfies E_θ[W*] = τ(θ) for all θ and, for any other unbiased estimator W with E_θ[W] = τ(θ), we have Var_θ[W*] ≤ Var_θ[W] for all θ.

W* is also called a uniform minimum variance unbiased estimator (UMVUE) of τ(θ).

⁴ CB: Definition 7.3.7 on p.334; HMC: Definition 7.1.1 on p.413
Best Unbiased Estimator (UMVUE)
Example L6.1: Let X₁, . . . , Xₙ be iid Poisson(λ). Both X̄ and S² are unbiased estimators of λ since E[X₁] = Var[X₁] = λ, so that E[X̄] = E[S²] = λ. For what values of λ is the variance of X̄ smaller than the variance of S²?

(Hint: Var[S²] = (1/n)μ₄ − μ₂²(n − 3)/(n(n − 1)), where μⱼ is the jth central moment.)
Best Unbiased Estimator (UMVUE)
Answer to Example L6.1: We know Var[X̄] = Var[X₁]/n = λ/n.

We can compute E[X] = λ, E[X²] = λ² + λ, E[X³] = λ³ + 3λ² + λ, and E[X⁴] = λ⁴ + 6λ³ + 7λ² + λ.

So, with μ = λ, we have

  E[(X − μ)⁴] = E[X⁴] − 4E[X³]μ + 6E[X²]μ² − 4E[X]μ³ + μ⁴
              = E[X⁴] − 4E[X³]μ + 6E[X²]μ² − 3μ⁴
              = 3λ² + λ.

Using the hint, we see that

  Var[S²] = (1/n)(3λ² + λ) − λ²(n − 3)/(n(n − 1))
          = (1/n)[λ + (2n/(n − 1))λ²],

so Var[X̄] < Var[S²] for all λ > 0.
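A quick Monte Carlo sanity check of this comparison (not part of the original lecture; the sample size n, rate λ, and replication count below are arbitrary illustration choices):

```python
import numpy as np

# Monte Carlo check that Var[X̄] < Var[S²] for iid Poisson(λ) data,
# and that both match the formulas λ/n and λ/n + 2λ²/(n−1).
# n, lam, and reps are arbitrary illustration choices.
rng = np.random.default_rng(0)
n, lam, reps = 20, 2.5, 200_000

x = rng.poisson(lam, size=(reps, n))
xbar = x.mean(axis=1)           # sample means across replications
s2 = x.var(axis=1, ddof=1)      # sample variances across replications

print(xbar.var(), lam / n)                       # both ≈ 0.125
print(s2.var(), lam / n + 2 * lam**2 / (n - 1))  # both ≈ 0.783
```

The empirical variance of X̄ lands on λ/n, while that of S² lands on the strictly larger λ/n + 2λ²/(n − 1).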
Regularity Assumptions
Let X₁, . . . , Xₙ be a random sample with common pdf f(x; θ) and cdf F(x; θ) for θ ∈ Θ, and let W(x) = W(x₁, . . . , xₙ) be a function. Here are some regularity assumptions⁵ that will be used for several upcoming theorems.

(R0) The cdfs are distinct.
(R1) The pdfs have common support for all θ.
(R2) The true parameter value θ₀ is an interior point of Θ.
(R3) The pdf f(x; θ) is twice differentiable as a function of θ.
(R4) The integral ∫ f(x; θ) dx can be differentiated twice under the integral sign as a function of θ.
(R5) The integral ∫ w(x)f(x; θ) dx can be differentiated under the integral sign as a function of θ.
(R6) The pdf f(x; θ) is three times differentiable as a function of θ, and for all θ ∈ Θ, there exist c ∈ ℝ and a function M(x) such that |∂³/∂θ³ ln f(x; θ)| ≤ M(x), with E_{θ₀}|M(X)| < ∞, for all θ₀ − c < θ < θ₀ + c and all x in the support of X.

⁵ Slightly different conditions appear in HMC: pp. 356, 362, 368 and CB: p. 516.
Cramer-Rao Lower Bound
Theorem L6.1:⁶ If X is a random variable with pdf f(x; θ) which satisfies regularity assumptions (R1), (R3), and (R4), then

  E_θ[∂/∂θ ln f(X; θ)] = 0

and

  E_θ[(∂/∂θ ln f(X; θ))²] = −E_θ[∂²/∂θ² ln f(X; θ)].

The quantity I(θ) = E_θ[(∂/∂θ ln f(X; θ))²] is called the information number or Fisher information.

⁶ CB: Lemma 7.3.11 on p.338; HMC: p.363
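Not part of the slides: the two identities in Theorem L6.1 can be checked numerically for a concrete model. The sketch below uses Poisson(λ), whose score is ∂/∂λ ln f(x; λ) = x/λ − 1 and whose second derivative is −x/λ²; λ and the truncation point K are arbitrary choices.

```python
import math

# Check Theorem L6.1 for f(x; λ) = Poisson(λ): E[score] = 0 and
# E[score²] = −E[∂²/∂λ² ln f(X; λ)] (= 1/λ, the Fisher information).
# lam and the truncation point K are arbitrary illustration choices;
# the pmf tail beyond K is negligible for this lam.
lam, K = 2.5, 100

pmf = [math.exp(-lam)]                     # P(X = 0)
for x in range(1, K):
    pmf.append(pmf[-1] * lam / x)          # P(X = x) built recursively

score = [x / lam - 1 for x in range(K)]    # ∂/∂λ ln f(x; λ)
second = [-x / lam**2 for x in range(K)]   # ∂²/∂λ² ln f(x; λ)

e_score = sum(p * s for p, s in zip(pmf, score))
e_score2 = sum(p * s * s for p, s in zip(pmf, score))
e_second = sum(p * d for p, d in zip(pmf, second))

print(e_score)              # ≈ 0
print(e_score2, -e_second)  # both ≈ 1/λ = 0.4
```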
Cramer-Rao Lower Bound
Proof of Theorem L6.1: First, we see that

  E_θ[∂/∂θ ln f(X; θ)] = ∫ {∂/∂θ ln f(x; θ)} f(x; θ) dx
                       = ∫ [{∂/∂θ f(x; θ)} / f(x; θ)] f(x; θ) dx
                       = ∫ ∂/∂θ f(x; θ) dx
                       = d/dθ ∫ f(x; θ) dx = d/dθ 1 = 0.
Cramer-Rao Lower Bound
Proof of Theorem L6.1 continued: Note that

  ∂²/∂θ² [ln f(x; θ)] = ∂/∂θ {(∂/∂θ f(x; θ)) / f(x; θ)}
                      = (∂²/∂θ² f(x; θ)) / f(x; θ) − (∂/∂θ f(x; θ))² / (f(x; θ))².

Then, we have

  E[(∂²/∂θ² f(X; θ)) / f(X; θ)] = ∫ ∂²/∂θ² f(x; θ) dx
                                = ∫ ∂/∂θ {∂/∂θ f(x; θ)} dx
Cramer-Rao Lower Bound
Proof of Theorem L6.1 continued:

  = ∫ ∂/∂θ {∂/∂θ f(x; θ)} dx
  = ∫ ∂/∂θ {[(∂/∂θ f(x; θ)) / f(x; θ)] f(x; θ)} dx
  = ∫ ∂/∂θ {(∂/∂θ ln f(x; θ)) f(x; θ)} dx
  = d/dθ ∫ {(∂/∂θ ln f(x; θ)) f(x; θ)} dx
  = d/dθ ∫ {[(∂/∂θ f(x; θ)) / f(x; θ)] f(x; θ)} dx
  = d/dθ ∫ ∂/∂θ f(x; θ) dx
  = d/dθ {d/dθ ∫ f(x; θ) dx} = d/dθ {d/dθ 1} = 0
Cramer-Rao Lower Bound
Proof of Theorem L6.1 continued: So, it follows that

  E[∂²/∂θ² ln f(X; θ)] = E[(∂²/∂θ² f(X; θ)) / f(X; θ)] − E[(∂/∂θ f(X; θ))² / (f(X; θ))²]
                       = 0 − E[((∂/∂θ f(X; θ)) / f(X; θ))²]
                       = −E[(∂/∂θ ln f(X; θ))²].
Review: Correlation
Suppose E[X] = μ_X, E[Y] = μ_Y, Var[X] = σ²_X, and Var[Y] = σ²_Y, and assume 0 < σ²_X < ∞ and 0 < σ²_Y < ∞.

Definition L6.2:⁷ The correlation of X and Y is the number defined by

  ρ_XY = Cov[X, Y] / (σ_X σ_Y).

The value ρ_XY is also called the correlation coefficient.

Theorem L6.2:⁸ For any random variables X and Y,

(a) −1 ≤ ρ_XY ≤ 1.
(b) |ρ_XY| = 1 if and only if there exist numbers a ≠ 0 and b such that P(Y = aX + b) = 1. If ρ_XY = 1 then a > 0, and if ρ_XY = −1 then a < 0.

⁷ CB: Definition 4.5.2 on p.169; HMC: Definition 2.5.2 on p.126
⁸ CB: Theorem 4.5.7 on p.172; HMC: (a) Theorem 2.5.1 on p.127
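As a quick illustration of Theorem L6.2(b) (not part of the slides; the coefficients a = 3, b = 2 and the sample below are arbitrary choices), an exactly linear relationship yields a sample correlation of ±1 depending on the sign of the slope:

```python
import numpy as np

# Illustrate Theorem L6.2(b): if Y = aX + b with a > 0, then ρ_XY = 1,
# and flipping the slope's sign gives ρ_XY = −1.
# a = 3, b = 2, and the sample are arbitrary illustration choices.
rng = np.random.default_rng(4)
x = rng.normal(size=100_000)
y = 3 * x + 2

print(np.corrcoef(x, y)[0, 1])   # 1.0 (up to floating-point error)
print(np.corrcoef(x, -y)[0, 1])  # -1.0, since the slope is now negative
```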
Cramer-Rao Lower Bound
Theorem L6.3:⁹ Let X₁, . . . , Xₙ be a random sample from a population with pdf f(x; θ). Assume that regularity conditions (R1) and (R3)–(R5) hold. Let Y = W(X₁, . . . , Xₙ) be a statistic such that E[Y] = k(θ). Then we have

  Var[Y] ≥ {k′(θ)}² / (n I(θ)).

The inequality is referred to as the Cramér-Rao inequality.

⁹ CB: p.335; HMC: Theorem 6.2.1 on p.365
Cramer-Rao Lower Bound
Proof of Theorem L6.3: Let Dᵢ = ∂/∂θ ln f(Xᵢ; θ) so that

  D = ∂/∂θ {ln ∏ᵢ₌₁ⁿ f(Xᵢ; θ)} = ∑ᵢ₌₁ⁿ ∂/∂θ ln f(Xᵢ; θ) = ∑ᵢ₌₁ⁿ Dᵢ.

Since Theorem L6.2(a) implies {Cov[Y, D]}² ≤ Var[Y] Var[D], it follows that

  Var[Y] ≥ {Cov[Y, D]}² / Var[D].

Since E[D] = ∑ᵢ₌₁ⁿ E[Dᵢ] = 0 by Theorem L6.1, we have Cov[Y, D] = E[Y D].

Note that we can write D = ∑ᵢ₌₁ⁿ (∂/∂θ f(Xᵢ; θ)) / f(Xᵢ; θ).
Cramer-Rao Lower Bound
Proof of Theorem L6.3 continued: Differentiating

  k(θ) = ∫···∫ w(x) ∏ᵢ₌₁ⁿ f(xᵢ; θ) dx₁ ··· dxₙ,

we obtain

  k′(θ) = ∫···∫ w(x) ∂/∂θ {∏ᵢ₌₁ⁿ f(xᵢ; θ)} dx₁ ··· dxₙ
        = ∫···∫ w(x) ∑ᵢ₌₁ⁿ {∂/∂θ f(xᵢ; θ) ∏_{j≠i} f(xⱼ; θ)} dx₁ ··· dxₙ
        = ∫···∫ w(x) ∑ᵢ₌₁ⁿ {(∂/∂θ f(xᵢ; θ)) / f(xᵢ; θ)} f(x; θ) dx₁ ··· dxₙ
        = E[Y D],

where f(x; θ) = ∏ᵢ₌₁ⁿ f(xᵢ; θ) denotes the joint pdf.
Cramer-Rao Lower Bound
Proof of Theorem L6.3 continued: Furthermore, we have

  Var[D] = E[D²] = E[(∑ᵢ₌₁ⁿ Dᵢ)²]
         = ∑ᵢ₌₁ⁿ ∑ⱼ₌₁ⁿ E[Dᵢ Dⱼ]
         = ∑ᵢ₌₁ⁿ E[Dᵢ²] + ∑ᵢ₌₁ⁿ ∑_{j≠i} E[Dᵢ Dⱼ]
         = ∑ᵢ₌₁ⁿ E[Dᵢ²] + ∑ᵢ₌₁ⁿ ∑_{j≠i} E[Dᵢ] E[Dⱼ]   (by independence)
         = n I(θ) + 0.

So, we have Var[Y] ≥ {Cov[Y, D]}² / Var[D] = {k′(θ)}² / (n I(θ)).
Cramer-Rao Lower Bound
Example L6.2: Let X₁, . . . , Xₙ be iid Poisson(λ). Find the Cramér-Rao lower bound on the variance of unbiased estimators of λ. Also, find the MLE and show that it attains the Cramér-Rao lower bound.

Answer to Example L6.2: Since

  ∂²/∂λ² ln f(x; λ) = ∂²/∂λ² [ln{λˣ e^{−λ} (x!)⁻¹}] = ∂²/∂λ² [x ln λ − λ − ln(x!)] = −x/λ²,

we have

  E[∂²/∂λ² ln f(X; λ)] = E[−X/λ²] = −(1/λ²) E[X] = −(1/λ²) λ = −1/λ.

By Theorem L6.1,

  I(λ) = E[(∂/∂λ ln f(X; λ))²] = −E[∂²/∂λ² ln f(X; λ)] = 1/λ.
Cramer-Rao Lower Bound
Answer to Example L6.2 continued: So the Cramér-Rao lower bound for an unbiased estimator of λ in the iid case is

  (d/dλ E_λ[W(X)])² / (n E_λ[(∂/∂λ ln f(X; λ))²]) = 1 / (n (1/λ)) = λ/n.

The MLE of λ is λ̂ = X̄, and Var[X̄] = Var[X₁]/n = λ/n, so it attains the CRLB.
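A simulation can corroborate this (not from the slides; n, λ, and the replication count are arbitrary choices): the empirical variance of the MLE X̄ should sit right at the bound λ/n.

```python
import numpy as np

# Empirical variance of the Poisson MLE λ̂ = X̄ versus the CRLB λ/n.
# n, lam, and reps are arbitrary illustration choices.
rng = np.random.default_rng(1)
n, lam, reps = 30, 4.0, 100_000

mle = rng.poisson(lam, size=(reps, n)).mean(axis=1)  # MLE in each replication
print(mle.var())   # empirical variance of the MLE, ≈ 0.133
print(lam / n)     # CRLB = λ/n ≈ 0.133
```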
Cramer-Rao Lower Bound
Example L6.3: Let X₁, . . . , Xₙ be iid Normal(μ, σ²) random variables. Find the Cramér-Rao lower bound on unbiased estimators of σ². Does S² attain the CRLB?

Answer to Example L6.3: Since

  ∂²/∂(σ²)² ln f(x; μ, σ²) = ∂²/∂(σ²)² [−(1/2) ln(2π) − (1/2) ln(σ²) − (x − μ)²/(2σ²)]
                           = 1/(2σ⁴) − (x − μ)²/σ⁶,

Theorem L6.1 implies that

  E[(∂/∂(σ²) ln f(X; μ, σ²))²] = −E[∂²/∂(σ²)² ln f(X; μ, σ²)] = −E[1/(2σ⁴) − (X − μ)²/σ⁶].
Cramer-Rao Lower Bound
Answer to Example L6.3 continued:

  −E[1/(2σ⁴) − (X − μ)²/σ⁶] = −1/(2σ⁴) + E[(X − μ)²]/σ⁶ = −1/(2σ⁴) + σ²/σ⁶ = 1/(2σ⁴).

Thus, the CRLB is

  1 / (n E[(∂/∂(σ²) ln f(X; μ, σ²))²]) = 2σ⁴/n.

So, S² does not attain the CRLB since

  Var[S²] = 2σ⁴/(n − 1) = (n/(n − 1)) (2σ⁴/n) > 2σ⁴/n = CRLB.
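This gap can be seen numerically (not from the slides; n, σ², and the replication count are arbitrary choices): the empirical variance of S² matches 2σ⁴/(n − 1) and exceeds the bound 2σ⁴/n.

```python
import numpy as np

# Empirical Var[S²] for iid Normal(0, σ²) data versus the CRLB 2σ⁴/n.
# n, sigma2, and reps are arbitrary illustration choices.
rng = np.random.default_rng(2)
n, sigma2, reps = 15, 2.0, 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2 = x.var(axis=1, ddof=1)         # sample variance in each replication
print(s2.var())                    # ≈ 2σ⁴/(n − 1) ≈ 0.571
print(2 * sigma2**2 / n)           # CRLB ≈ 0.533
```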
Cramer-Rao Lower Bound
Example L6.4: Let X₁, . . . , Xₙ be iid Uniform(0, θ) random variables. Find the Cramér-Rao lower bound on the variance of unbiased estimators of θ. Also, for Y = max{X₁, . . . , Xₙ}, show that ((n+1)/n) Y is an unbiased estimator which has a smaller variance than the Cramér-Rao lower bound. (This is possible because the support depends on θ, so the regularity conditions behind Theorem L6.3 fail here.)

Answer to Example L6.4: Since ∂/∂θ ln f(x; θ) = ∂/∂θ [ln(1/θ)] = −1/θ, the CRLB is

  1 / (n(−θ⁻¹)²) = θ²/n.

Since the cdf of Y is F(y) = P(Y ≤ y) = ∏ᵢ₌₁ⁿ P(Xᵢ ≤ y) = (y/θ)ⁿ for 0 < y < θ, the pdf of Y is

  f(y) = F′(y) = (n y^{n−1} / θⁿ) I₍₀,θ₎(y).
Cramer-Rao Lower Bound
Answer to Example L6.4 continued: ((n+1)/n) Y is unbiased since

  E[((n+1)/n) Y] = ((n+1)/n) ∫₀^θ y (n y^{n−1}/θⁿ) dy = ((n+1)/θⁿ) ∫₀^θ yⁿ dy
                 = ((n+1)/θⁿ) [y^{n+1}/(n+1)]₀^θ = ((n+1)/θⁿ) (θ^{n+1}/(n+1)) = θ.

Similarly,

  E[(((n+1)/n) Y)²] = ((n+1)²/(n θⁿ)) ∫₀^θ y^{n+1} dy = ((n+1)²/(n θⁿ)) [y^{n+2}/(n+2)]₀^θ
                    = ((n+1)²/(n θⁿ)) (θ^{n+2}/(n+2)) = ((n+1)²/(n(n+2))) θ².

So,

  Var[((n+1)/n) Y] = ((n+1)²/(n(n+2))) θ² − θ² = θ²/(n(n+2)).

It is now seen that

  Var[((n+1)/n) Y] = (1/(n+2)) (θ²/n) < θ²/n = CRLB.
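A final simulation (not from the slides; n, θ, and the replication count are arbitrary choices) shows this estimator is unbiased yet beats the nominal bound θ²/n:

```python
import numpy as np

# Empirical behavior of the unbiased estimator ((n+1)/n)·max(Xᵢ) for
# Uniform(0, θ) data, versus the inapplicable CRLB θ²/n.
# n, theta, and reps are arbitrary illustration choices.
rng = np.random.default_rng(3)
n, theta, reps = 10, 5.0, 200_000

est = (n + 1) / n * rng.uniform(0, theta, size=(reps, n)).max(axis=1)
print(est.mean())       # ≈ θ = 5 (unbiased)
print(est.var())        # ≈ θ²/(n(n+2)) ≈ 0.208
print(theta**2 / n)     # nominal "CRLB" = 2.5, which the variance beats
```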