Chapter 3: Estimation and Decision Theory
TRANSCRIPT
David Shiung
Introduction of Estimation and Decision (1/1)
Suppose we want to estimate a parameter vector $\theta = (\theta_1, \ldots, \theta_n)^T$. Suppose further that our measurements are corrupted by random measurement errors that we call noise.

What is a good estimator for $\theta$ that is based only on the acquired data?

This class of problems falls within the realm of estimation theory. A second class of problems, closely related to estimation, involves making decisions in a random environment.
Parameter Estimation (1/3)

The problem of estimating the parameters μ and σ² is a problem of parameter estimation.

θ estimation:
1. θ: an unknown scalar that we wish to estimate.
2. $x_1, \ldots, x_n$: n measurements (observations), with
   $x_i = \theta + \epsilon_i$, $i = 1, \ldots, n$,
   where $\epsilon_i$ is the value of the measurement noise on the ith observation.
3. A reasonable estimate of θ is the sample mean of the measurements:
   $\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Parameter Estimation (2/3)

The estimate of the mean μ:
1. $x_1^{(1)}, \ldots, x_n^{(1)}$: the values a normal r.v. takes in n trials.
2. An estimate of μ, the mean of the pdf: $\hat{\mu}^{(1)} = \frac{1}{n}\sum_{i=1}^{n} x_i^{(1)}$
3. The estimate of μ based on a second set $x_1^{(2)}, \ldots, x_n^{(2)}$: $\hat{\mu}^{(2)} = \frac{1}{n}\sum_{i=1}^{n} x_i^{(2)}$
4. $\hat{\mu}^{(1)}$ and $\hat{\mu}^{(2)}$ are probably different.
Parameter Estimation (3/3)
5
We should call the estimate of a particular value of a r.v. as
an estimator based on the n sample values x1,…,xn
1. Each measurement can be viewed as an observation on a
generic r.v. X with pdf
2. Xi , i=1,…..,n, n i.i.d. observations on X , and each has
3. The estimate is a particular estimator :
4. The estimator in above equation is often used to estimate
E[X]
)(xfX
)()( iXiX xfxfi
n
i
iXn 1
1ˆ
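A minimal simulation of this sample-mean estimator (the Gaussian source and its parameters are illustrative assumptions, not from the slides); it also shows the point of slide (2/3): two data sets give two different estimates of the same μ.

```python
import random

def sample_mean(xs):
    """Sample-mean estimator: mu_hat = (1/n) * sum(x_i)."""
    return sum(xs) / len(xs)

random.seed(0)
# Two independent sets of n = 1000 draws from an assumed N(5, 2^2) source.
set1 = [random.gauss(5.0, 2.0) for _ in range(1000)]
set2 = [random.gauss(5.0, 2.0) for _ in range(1000)]

mu1 = sample_mean(set1)  # estimate from the first data set
mu2 = sample_mean(set2)  # estimate from the second data set
# Both estimates are close to mu = 5, but (almost surely) different.
print(mu1, mu2)
```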
Definition (1/3)

Definition: An estimator $\hat{\theta}$ is a function of the observation vector $\mathbf{X} = (X_1, \ldots, X_n)^T$ that estimates θ.

Definition: An estimator $\hat{\theta}$ for θ is said to be unbiased if and only if $E[\hat{\theta}] = \theta$. The bias in estimating θ with $\hat{\theta}$ is $E[\hat{\theta}] - \theta$.

Definition: Let $\hat{\theta}_n$ be an estimator computed from the n samples $X_1, \ldots, X_n$, for every $n \geq 1$. Then $\hat{\theta}_n$ is said to be consistent if
$\lim_{n \to \infty} \Pr[|\hat{\theta}_n - \theta| \geq \epsilon] = 0$ for every $\epsilon > 0$.
Definition (2/3)

Definition: An estimator $\hat{\theta}$ is called a minimum mean-square error estimator if
$E[(\hat{\theta} - \theta)^2] \leq E[(\hat{\theta}' - \theta)^2]$,
where $\hat{\theta}'$ is any other estimator.
Definition (3/3)

What is the relationship between unbiasedness and consistency?

Definition of unbiasedness: $E[\hat{\theta}] - \theta = 0$

Definition of consistency: $\lim_{n \to \infty} \Pr[|\hat{\theta}_n - \theta| \geq \epsilon] = 0$

We denote the set of all estimators satisfying consistency as A and the set of all estimators satisfying unbiasedness as B. There are four possible relations between the set A and the set B. The relation between the set A and the set B belongs to the third type: the two sets overlap, but neither contains the other (an estimator can be consistent yet biased, and unbiased yet inconsistent).
Estimation of E[X] (1/1)

Estimate E[X]:
1. X is a r.v. with pdf $f_X(x)$ and finite variance $\sigma^2$.
2. Repeat the experiment n times, with $x_i$ denoting the ith outcome.
3. The $x_i$ are drawn independently from $f_X(x)$, and $f_{X_i}(x) = f_X(x)$, $i = 1, \ldots, n$.
4. The sample-mean estimator is $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i$.
Unbiasedness (1/1)

Since $E[X_i] = \mu$, $i = 1, \ldots, n$, we have
$E[\hat{\mu}] = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}(n\mu) = \mu$
Consistency (1/2)

The Chebyshev inequality:
$\Pr[|\hat{\mu} - \mu| \geq \epsilon] \leq \frac{\mathrm{Var}[\hat{\mu}]}{\epsilon^2}$  (A)

The variance of $\hat{\mu}$ is obtained as
$\mathrm{Var}[\hat{\mu}] = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right)^2\right] = E\left[\frac{1}{n^2}\sum_{i=1}^{n}(X_i - \mu)^2 + \frac{1}{n^2}\sum_{i \neq j}(X_i - \mu)(X_j - \mu)\right] = \frac{\sigma^2}{n}$  (B)

since the cross terms have zero expectation by the independence of the $X_i$.
Consistency (2/2)

With (A) and (B), we obtain
$\Pr[|\hat{\mu} - \mu| \geq \epsilon] \leq \frac{\sigma^2}{n\epsilon^2}$
$\lim_{n \to \infty} \Pr[|\hat{\mu} - \mu| \geq \epsilon] = 0$
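The $1/n$ decay of $\mathrm{Var}[\hat{\mu}]$ in (B) can be checked with a small Monte Carlo sketch (the N(0,1) source and the trial counts are arbitrary choices):

```python
import random

def var_of_sample_mean(n, trials=2000, mu=0.0, sigma=1.0):
    """Empirical variance of mu_hat over many repetitions, each using n samples."""
    means = []
    for _ in range(trials):
        xs = [random.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(xs) / n)
    m = sum(means) / trials
    return sum((v - m) ** 2 for v in means) / trials

random.seed(1)
v10 = var_of_sample_mean(10)    # ~ sigma^2 / 10  = 0.1
v100 = var_of_sample_mean(100)  # ~ sigma^2 / 100 = 0.01
print(v10, v100)  # the variance shrinks roughly tenfold, as (B) predicts
```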
Estimation of Var[X] (1/1)

X: a random variable with pdf $f_X(x)$, mean μ, and variance σ².
$X_1, \ldots, X_n$: n i.i.d. observations.

Estimator:
$\hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \hat{\mu})^2$, where $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i$
Unbiasedness (1/1)

Unbiasedness of $\hat{\sigma}^2$:
$(n-1)E[\hat{\sigma}^2] = E\left[\sum_{i=1}^{n}(X_i - \hat{\mu})^2\right] = E\left[\sum_{i=1}^{n} X_i^2 - n\hat{\mu}^2\right] = n(\mu^2 + \sigma^2) - n\left(\mu^2 + \frac{\sigma^2}{n}\right) = (n-1)\sigma^2$

Hence $E[\hat{\sigma}^2] = \sigma^2$, so the estimator is unbiased.
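The $1/(n-1)$ factor is what makes $\hat{\sigma}^2$ unbiased; dividing by $n$ instead gives an estimator that is biased low by the factor $(n-1)/n$. A quick numerical comparison (assumed Gaussian data with arbitrary parameters):

```python
import random

def var_unbiased(xs):
    """sigma2_hat = (1/(n-1)) * sum((x_i - mu_hat)^2) -- unbiased."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

def var_biased(xs):
    """Same sum divided by n -- biased low by a factor (n-1)/n."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

random.seed(2)
n, trials = 5, 20000
u = b = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 2.0) for _ in range(n)]  # sigma^2 = 4
    u += var_unbiased(xs)
    b += var_biased(xs)
print(u / trials, b / trials)  # ~4.0 versus ~4.0 * (n-1)/n = 3.2
```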
Consistency (1/2)

Consistency of $\hat{\sigma}^2$:
1. Expanding $\mathrm{Var}[\hat{\sigma}^2] = E[(\hat{\sigma}^2)^2] - (E[\hat{\sigma}^2])^2$ term by term (the cross terms are handled as in the mean case) gives, for large n,
$\mathrm{Var}[\hat{\sigma}^2] \approx \frac{m_4 - \sigma^4}{n}$,
where $m_4 = E[(X - \mu)^4]$ is the fourth central moment of X.
Consistency (2/2)

Consistency of $\hat{\sigma}^2$:
2. With the Chebyshev inequality,
$\Pr[|\hat{\sigma}^2 - \sigma^2| \geq \epsilon] \leq \frac{\mathrm{Var}[\hat{\sigma}^2]}{\epsilon^2} \approx \frac{m_4 - \sigma^4}{n\epsilon^2} \to 0$,
assuming that $m_4$ exists.
Exercise 1: Consider the estimation of the mean and variance of a random noise source X. Assume the random noise source is Gaussian distributed with mean 0 and variance 1. Design an estimator that is based on n observations of X. What are your estimates of its mean and variance if you choose n = 1, 10, 100, 1000, and 10000? Is your estimator unbiased? Consistent?
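A possible starting point for Exercise 1 (Python rather than Matlab; it uses the sample-mean and $1/(n-1)$ variance estimators from the preceding slides on an N(0,1) source):

```python
import random

def estimate_mean_and_var(n, rng):
    """Sample mean and unbiased (1/(n-1)) sample variance from n observations."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mu_hat = sum(xs) / n
    # The variance estimator needs at least two samples.
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / (n - 1) if n > 1 else float("nan")
    return mu_hat, var_hat

rng = random.Random(3)
for n in (1, 10, 100, 1000, 10000):
    mu_hat, var_hat = estimate_mean_and_var(n, rng)
    print(n, mu_hat, var_hat)  # estimates approach mu = 0 and sigma^2 = 1 as n grows
```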
Confidence Intervals (1/13)

Make n observations on a Gaussian r.v. X with mean μ and variance σ². These observations are drawn from n i.i.d. r.v.'s. $\hat{\mu}$ is an unbiased estimator of the true mean μ. Since it involves the sum of independent Gaussian r.v.'s, it also obeys the Gaussian probability law, with mean μ and variance $\sigma^2/n$. (Why?)

Normalization of $\hat{\mu}$ into Y ~ N(0, 1) (Why?):
$Y = \frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}}$
Confidence Intervals (2/13)

Compute the probability of events of the form $\{a \leq Y \leq b\}$ whose probability is 95 percent:
$\Pr\{a \leq Y \leq b\} = 0.95$

Set $a = -b$; then there exists a number $b_{0.95}$ such that
$\Pr[-b_{0.95} \leq Y \leq b_{0.95}] = 0.95$

From the table of Q(x) and using linear interpolation we find that $b_{0.95} = 1.96$.
Confidence Intervals (3/13)

$[-1.96 \leq Y \leq 1.96] = \left[\hat{\mu} - 1.96\frac{\sigma}{\sqrt{n}} \leq \mu \leq \hat{\mu} + 1.96\frac{\sigma}{\sqrt{n}}\right]$  (D)

The interval of (D) is called the 95 percent confidence interval for the mean. For every value of $\hat{\mu}$ that we observe, we shall usually obtain a different interval. However, in 95 out of 100 cases, the interval so generated will cover the true mean.

Exercise 2: Please use Matlab to find $b_{0.95}$. If you estimate the mean μ 100 times, how many estimates lie in the interval of (D)?
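For Exercise 2, a Python sketch (the standard library's NormalDist stands in for the Q(x) table; the source parameters and n are arbitrary choices):

```python
import random
from statistics import NormalDist

# b_0.95 satisfies Pr[-b <= Y <= b] = 0.95 for Y ~ N(0,1),
# i.e. Phi(b) = 0.975, so b = Phi^{-1}(0.975).
b95 = NormalDist().inv_cdf(0.975)
print(b95)  # ~1.96, matching the linear interpolation from the Q(x) table

# Estimate mu 100 times and count how often interval (D) covers the true mean.
random.seed(4)
mu, sigma, n = 0.0, 1.0, 25
covered = 0
for _ in range(100):
    mu_hat = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    half = b95 * sigma / n ** 0.5
    if mu_hat - half <= mu <= mu_hat + half:
        covered += 1
print(covered)  # close to 95 out of 100
```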
Confidence Intervals (4/13)

Example: Let X be a Gaussian r.v. with σ² = 9. The mean is estimated from 10 independent observations on X and found to be 3.5. Find the 95 percent confidence interval for μ.

Solution: A 95 percent confidence interval is given by
$\left[\hat{\mu} - 1.96\frac{\sigma}{\sqrt{n}},\ \hat{\mu} + 1.96\frac{\sigma}{\sqrt{n}}\right]$.
With σ = 3, n = 10, and $\hat{\mu} = 3.5$, we find that the 95 percent confidence interval is [1.64, 5.36]; i.e., the probability of the event $\{1.64 \leq \mu \leq 5.36\}$ is 0.95.
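A quick check of the arithmetic in this example:

```python
mu_hat, sigma, n = 3.5, 3.0, 10
half = 1.96 * sigma / n ** 0.5       # 1.96 * 3 / sqrt(10) ~ 1.86
lo, hi = mu_hat - half, mu_hat + half
print(round(lo, 2), round(hi, 2))    # 1.64 5.36
```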
Confidence Intervals (5/13)
22
The previous discussion assumed that Var(X)= σ2 was known.
If σ is unknown, we can’t use equation (D) to obtain a
confidence interval. Is there a way to obtain a confidence
interval even if μ is unknown? The answer is yes and the
solution is as follows.
We first define a r.v. Z and say that it has a Student t(n)
distribution with degrees of freedom n. The PDF of t(n) is
given by
where .)2/(
]2/)1[()(
,)/1(
)(),()(
2/)1(2
nn
nnk
nz
nknzfzf
ntZ
Confidence Intervals (6/13)

The gamma function is defined as
$\Gamma(\alpha) = \int_0^{\infty} y^{\alpha - 1} e^{-y}\,dy$, with $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$.

If α is a positive integer then $\Gamma(\alpha + 1) = \alpha!$
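Both identities can be checked with Python's math.gamma:

```python
import math

# Gamma(n+1) = n! for positive integer n:
print(math.gamma(5))  # Gamma(5) = 4! = 24.0

# The recurrence Gamma(alpha+1) = alpha * Gamma(alpha) at a non-integer point:
print(math.gamma(3.5), 2.5 * math.gamma(2.5))  # equal, by the recurrence
```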
Confidence Intervals (7/13)

The Student t r.v.:
$T = \frac{\hat{\mu} - \mu}{A(n)} \sim t(n-1)$, where $A(n) = \left[\frac{1}{n(n-1)}\sum_{i=1}^{n}(X_i - \hat{\mu})^2\right]^{1/2}$

We can find numbers $-t_\zeta$ and $t_\zeta$ such that
$\Pr[-t_\zeta \leq T \leq t_\zeta] = \int_{-t_\zeta}^{t_\zeta} f_t(z, n-1)\,dz = 1 - 2\zeta$
Confidence Intervals (8/13)

[Figure 6.2-1: The number $t_\zeta$. The area under $f_t(z, n)$ between $-t_\zeta$ and $t_\zeta$ is $1 - 2\zeta$.]
Confidence Intervals (9/13)

The numbers $\pm t_\zeta$ are called the ζ percent levels of t, which cut off ζ percent of the area under $f_t(z, n-1)$ in each tail:
$\{-t_\zeta \leq T \leq t_\zeta\} = \{\hat{\mu} - t_\zeta A(n) \leq \mu \leq \hat{\mu} + t_\zeta A(n)\}$

With the CDF
$F_t(t, n) = \int_{-\infty}^{t} f_t(z, n)\,dz$,
we have
$\Pr[-t_\zeta \leq T \leq t_\zeta] = \int_{-t_\zeta}^{t_\zeta} f_t(z, n)\,dz = 2F_t(t_\zeta, n) - 1$,
so that
$F_t(t_\zeta, n) = \frac{1}{2}(1 + \Pr[-t_\zeta \leq T \leq t_\zeta])$  (E)
Confidence Intervals (10/13)

We can find $t_\zeta$ and further find the confidence interval
$[\hat{\mu} - t_\zeta A(n),\ \hat{\mu} + t_\zeta A(n)]$
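The levels $t_\zeta$ can also be found without a table, directly from the pdf of t(n) and equation (E). A minimal numerical sketch (midpoint integration and bisection are arbitrary implementation choices, not from the slides):

```python
import math

def f_t(z, n):
    """Student t(n) pdf: k(n) * (1 + z^2/n)^(-(n+1)/2)."""
    k = math.gamma((n + 1) / 2) / (math.sqrt(math.pi * n) * math.gamma(n / 2))
    return k * (1.0 + z * z / n) ** (-(n + 1) / 2)

def F_t(t, n, steps=4000):
    """CDF for t >= 0, by symmetry: F_t(t, n) = 1/2 + integral_0^t f_t(z, n) dz."""
    h = t / steps
    return 0.5 + h * sum(f_t((i + 0.5) * h, n) for i in range(steps))

def t_level(zeta, n):
    """Solve F_t(t_zeta, n) = 1 - zeta for t_zeta by bisection (cf. equation (E))."""
    target = 1.0 - zeta
    a, b = 0.0, 50.0
    for _ in range(40):
        mid = (a + b) / 2.0
        if F_t(mid, n) < target:
            a = mid
        else:
            b = mid
    return (a + b) / 2.0

t05 = t_level(0.05, 20)
print(round(t05, 3))  # ~1.725, matching the table entry for n = 20
```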
Confidence Intervals (11/13)

Example: Twenty-one independent observations are made on a Gaussian r.v. X. Call these observations $X_1, \ldots, X_{21}$. Based on the data, the realization of $\hat{\mu}$ is 3.5 and the realization of
$A(n) = \left[\frac{1}{n(n-1)}\sum_{i=1}^{n}(X_i - \hat{\mu})^2\right]^{1/2}$
is 0.45. A 90 percent confidence interval on μ is desired.
Confidence Intervals (12/13)

Solution: Since $\Pr[-t_{0.05} \leq T \leq t_{0.05}] = 0.9$, we obtain from equation (E), $F_t(t_{0.05}, 20) = 0.95$. From the table of the Student t distribution, for n = 20, we obtain $t_{0.05} = 1.725$. The corresponding confidence interval is
$[3.5 - 1.725(0.45),\ 3.5 + 1.725(0.45)] = [2.72, 4.28]$
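A quick check of the arithmetic in this solution:

```python
mu_hat, A, t05 = 3.5, 0.45, 1.725
lo, hi = mu_hat - t05 * A, mu_hat + t05 * A
print(round(lo, 2), round(hi, 2))  # 2.72 4.28
```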
Confidence Intervals (13/13)

Table 6.2-1 Student t distribution. The entry in row n and column p is the value t with
$F_t(t, n) = \int_{-\infty}^{t} f_t(z, n)\,dz = p$.

 n    0.6    0.75   0.9    0.95   0.975   0.99    0.995   0.9995
 1    0.325  1.000  3.078  6.314  12.706  31.821  63.657  636.619
 2    0.289  0.816  1.886  2.920   4.303   6.965   9.925   31.598
 3    0.277  0.765  1.638  2.353   3.182   4.541   5.841   12.924
 4    0.271  0.741  1.533  2.132   2.776   3.747   4.604    8.610
 5    0.267  0.727  1.476  2.015   2.571   3.365   4.032    6.869
 6    0.265  0.718  1.440  1.943   2.447   3.143   3.707    5.959
 7    0.263  0.711  1.415  1.895   2.365   2.998   3.499    5.408
 8    0.262  0.706  1.397  1.860   2.306   2.896   3.355    5.041
 9    0.261  0.703  1.383  1.833   2.262   2.821   3.250    4.781
10    0.260  0.700  1.372  1.812   2.228   2.764   3.169    4.587
11    0.259  0.697  1.363  1.796   2.201   2.718   3.106    4.437
12    0.259  0.695  1.356  1.782   2.179   2.681   3.055    4.318
13    0.258  0.694  1.350  1.771   2.160   2.650   3.012    4.221
14    0.258  0.692  1.345  1.761   2.145   2.624   2.977    4.140
15    0.258  0.691  1.341  1.753   2.131   2.602   2.947    4.073
16    0.257  0.690  1.337  1.746   2.120   2.583   2.921    4.015
17    0.257  0.689  1.333  1.740   2.110   2.567   2.898    3.965
18    0.257  0.688  1.330  1.734   2.101   2.552   2.878    3.922
19    0.257  0.688  1.328  1.729   2.093   2.539   2.861    3.883
20    0.257  0.687  1.325  1.725   2.086   2.528   2.845    3.850