Least Squares Estimate Additional Notes 1


Page 1: Least Squares Estimate, Additional Notes

Page 2: Introduction

• The quality of an estimate can be judged using the expected value and the covariance matrix of the estimated parameters.

• The following theorem is an important tool to determine the quality of the least squares estimate.

Page 3: Assumptions

• Assume that there exists a true parameter vector θ0 so that the data satisfy:

  $y(k) = \varphi^T(k)\,\theta_0 + e(k)$

• The stochastic process e(k) is zero-mean white noise, with variance σ2 at every step.

• Intuition: The assumption says that the true system can be represented by the chosen model, up to some errors that are well-behaved in a statistical sense.

Page 4: Theorem

• Part 1: The solution of the least-squares problem is an unbiased estimate of θ0, that is:

  $E[\hat{\theta}] = \theta_0$

• Part 2: The covariance matrix P of the estimated parameter vector using least squares is given by

  $P = \operatorname{cov}(\hat{\theta}) = \sigma^2 (\Phi^T \Phi)^{-1} \in \mathbb{R}^{n \times n}$
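Both parts of the theorem can be checked numerically. The sketch below is not part of the original notes; the regressor matrix, true parameter vector, and noise level are arbitrary choices. It simulates many datasets from the assumed model and compares the sample mean and covariance of the least-squares estimates with the theoretical values.

```python
import numpy as np

# Monte Carlo check of the theorem (illustrative sketch; Phi, theta0,
# and sigma are arbitrary assumptions, not values from the notes).
rng = np.random.default_rng(0)

theta0 = np.array([0.7, 0.5])    # assumed true parameter vector
sigma = 0.1                      # standard deviation of the white noise e(k)
Phi = rng.normal(size=(50, 2))   # fixed regressor matrix

estimates = []
for _ in range(20000):
    # Generate data satisfying the model assumption y = Phi @ theta0 + e.
    y = Phi @ theta0 + sigma * rng.normal(size=50)
    theta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    estimates.append(theta_hat)
estimates = np.array(estimates)

# Part 1: the sample mean of the estimates should be close to theta0.
print(estimates.mean(axis=0))
# Part 2: the sample covariance should be close to sigma^2 (Phi^T Phi)^{-1}.
P_theory = sigma**2 * np.linalg.inv(Phi.T @ Phi)
print(np.cov(estimates.T))
```

With 20000 repetitions the sample mean and covariance agree with the theorem to within simulation noise.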

Page 5: Remark on Part 1

Part 1 of the theorem says that the solution makes (statistical) sense on average.

Page 6: Remarks on Part 2

• In the covariance matrix

  $P = \operatorname{cov}(\hat{\theta}) = \sigma^2 (\Phi^T \Phi)^{-1}$,

  the i-th main-diagonal element $P_{ii}$ is the variance of the i-th parameter $\hat{\theta}_i$. The lower the variance of the estimated parameter, the higher the confidence that $\hat{\theta}_i$ is close to the true value $\theta_i$.

• Also, it can be seen that smaller errors e(k) (i.e. a smaller variance σ2) yield a smaller parameter covariance.

Page 7: Remarks on Part 2, continued

• The calculation of the covariance matrix P assumes that we know the variance σ2 of the measurement error e, which is not known beforehand. A valid estimate of σ2 is given by

  $\hat{\sigma}^2 = \frac{V(\hat{\theta}_{LS})}{N - n} = \frac{e^T e}{N - n}$

  where N is the number of I/O data samples and n is the number of parameters to be estimated.

• Recall that the cost function evaluated at $\hat{\theta}_{LS}$ is

  $V(\hat{\theta}_{LS}) = e^T e = (Y - \Phi\hat{\theta}_{LS})^T (Y - \Phi\hat{\theta}_{LS}) = Y^T Y - Y^T \Phi (\Phi^T \Phi)^{-1} \Phi^T Y$

Page 8: Confidence intervals for the parameters

• If the residuals e are normally distributed, the confidence interval of the parameters can be computed. The confidence limits give a rough idea of the significance of the estimates.

• For example, when a parameter that is expected to lie between 0 and 1 turns out to be 0.4 with a 95% confidence region of ±0.005, it is safe to say that this is a good estimate of the parameter. However, when the 95% confidence region around the parameter is ±2, the estimate is meaningless.

Page 9: Confidence intervals for the parameters

For any Gaussian distribution N(µ, σ2), the probability of drawing a sample in the interval [µ−1.96σ, µ+1.96σ] is found (from the Gaussian formula) to be 0.95.

Page 10: Confidence intervals for the parameters

• For N ≫ n, $\hat{\theta}_i$ is distributed according to the Gaussian distribution $N(\theta_i, P_{ii})$. Therefore:

  $\Pr\!\left(\hat{\theta}_i - 1.96\sqrt{P_{ii}} \le \theta_i \le \hat{\theta}_i + 1.96\sqrt{P_{ii}}\right) = 0.95$

• Hence, the 95% confidence limits of the true parameter θi are

  $\hat{\theta}_i \pm 1.96\sqrt{P_{ii}}$

• Note that this interval is valid if the off-diagonal elements of P are small compared to the main-diagonal elements. In practice, this requirement can be relaxed, and the given interval is sufficient for deciding whether an estimate should be accepted or rejected.
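The confidence limits translate directly into a short computation. A minimal sketch, using the parameter estimates and covariance matrix from the worked example later in these notes:

```python
import numpy as np

# Sketch: 95% confidence limits theta_hat_i +/- 1.96*sqrt(P_ii),
# with numbers taken from the worked example in these notes.
theta_hat = np.array([0.718, 0.565])
P = 1e-5 * np.array([[0.345, -0.2415],
                     [-0.2415, 0.6044]])   # cov(theta_hat)

half_width = 1.96 * np.sqrt(np.diag(P))    # half-width of each 95% interval
for th, hw in zip(theta_hat, half_width):
    print(f"{th:.3f} +/- {hw:.4f}")
```

Only the diagonal of P enters the interval, which is why the formula requires the off-diagonal terms to be comparatively small.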

Page 11: Example

• Consider the following model and the input-output data:

  $y(k) = a\,y(k-1) + b\,u(k-1) + e(k)$

  k    | 0 | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8
  u(k) | 1 | 1    | 1    | 1    | 0    | 0    | 0    | 0    | *
  y(k) | 0 | 0.57 | 0.97 | 1.26 | 1.47 | 1.06 | 0.76 | 0.54 | 0.39

• Use the least squares method to find the unknown parameters a and b, along with their confidence intervals.

Page 12: Solution

• First, we form the vector y and the matrix Φ as

  $y = \begin{bmatrix} y(1) \\ y(2) \\ \vdots \\ y(8) \end{bmatrix} = \begin{bmatrix} 0.57 \\ 0.97 \\ 1.26 \\ 1.47 \\ 1.06 \\ 0.76 \\ 0.54 \\ 0.39 \end{bmatrix}, \quad \Phi = \begin{bmatrix} y(0) & u(0) \\ y(1) & u(1) \\ \vdots & \vdots \\ y(7) & u(7) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0.57 & 1 \\ 0.97 & 1 \\ 1.26 & 1 \\ 1.47 & 0 \\ 1.06 & 0 \\ 0.76 & 0 \\ 0.54 & 0 \end{bmatrix}$

Page 13: Solution, continued

• We calculate the least squares estimate of a and b as

  $\hat{\theta} = \begin{bmatrix} \hat{a} \\ \hat{b} \end{bmatrix} = (\Phi^T \Phi)^{-1} \Phi^T y = \begin{bmatrix} 0.718 \\ 0.565 \end{bmatrix}$

• Hence, the model of the least squares fit is

  $y(k) = 0.718\,y(k-1) + 0.565\,u(k-1)$
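The fit can be reproduced with a few lines of NumPy; this sketch is not part of the original notes but uses exactly the data and formula above.

```python
import numpy as np

# Reproducing the example's least-squares fit.
y_data = np.array([0.0, 0.57, 0.97, 1.26, 1.47, 1.06, 0.76, 0.54, 0.39])  # y(0)..y(8)
u_data = np.array([1.0, 1, 1, 1, 0, 0, 0, 0])                             # u(0)..u(7)

y = y_data[1:]                                # left-hand sides y(1)..y(8)
Phi = np.column_stack([y_data[:-1], u_data])  # rows [y(k-1), u(k-1)]

# theta_hat = (Phi^T Phi)^{-1} Phi^T y
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
a_hat, b_hat = theta_hat
print(round(float(a_hat), 3), round(float(b_hat), 3))   # 0.718 0.565
```

Solving the 2x2 normal equations with `np.linalg.solve` is equivalent to the matrix inverse written in the slides, but numerically better behaved.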

Page 14: Solution, continued

• The residual errors e can be calculated as:

  $e = y - \Phi\hat{\theta} = \begin{bmatrix} 0.005 & -0.0042 & -0.0013 & 0.0005 & 0.0047 & -0.0009 & -0.0056 & 0.0024 \end{bmatrix}^T$

• The number of data points is N = 8, and the number of unknown parameters is n = 2. Hence, an estimate of the noise variance is

  $\hat{\sigma}^2 = \frac{V(\hat{\theta})}{N - n} = \frac{e^T e}{8 - 2} = 1.74 \times 10^{-5}$
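The residuals and the noise-variance estimate can likewise be computed numerically. This sketch repeats the fit from the example and then applies the estimator from Page 7:

```python
import numpy as np

# Residuals and noise-variance estimate for the example (sketch).
y_data = np.array([0.0, 0.57, 0.97, 1.26, 1.47, 1.06, 0.76, 0.54, 0.39])
u_data = np.array([1.0, 1, 1, 1, 0, 0, 0, 0])
y = y_data[1:]
Phi = np.column_stack([y_data[:-1], u_data])
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

e = y - Phi @ theta_hat          # residual errors
N, n = Phi.shape                 # N = 8 samples, n = 2 parameters
sigma2_hat = (e @ e) / (N - n)   # sigma_hat^2 = e^T e / (N - n)
print(sigma2_hat)                # approximately 1.74e-5
```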

Page 15: Solution, continued

• The covariance matrix of the estimated parameters is

  $P = \operatorname{cov}(\hat{\theta}) = \hat{\sigma}^2 (\Phi^T \Phi)^{-1} = \begin{bmatrix} 0.345 & -0.2415 \\ -0.2415 & 0.6044 \end{bmatrix} \times 10^{-5}$

• The relatively large magnitudes of the off-diagonal terms in P indicate that there is a strong interaction between the estimates. This is typical in dynamic systems.

Page 16: Solution, continued

• The 95% confidence limits are roughly given by

  $\hat{a} = 0.718 \pm 1.96\sqrt{0.345 \times 10^{-5}} = 0.718 \pm 0.0036$

  $\hat{b} = 0.565 \pm 1.96\sqrt{0.6044 \times 10^{-5}} = 0.565 \pm 0.0048$

• The confidence regions are so small that the estimates can be accepted without much reservation.

Page 17: Visualization of the confidence interval

• To visualize the confidence intervals, imagine that we repeat the previous experiment 1000 times.

• In each experiment, we use 8 input-output data points and least squares to find an estimate of the parameters.

• The empirical distributions (histograms) of the estimated parameters are shown on the next slide.
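The repeated experiment can be sketched in code. The simulation below assumes that the fitted values 0.718 and 0.565 play the role of the true parameters and that the noise variance equals the estimate 1.74e-5; the spread of the 1000 re-estimated parameters should then be close to the square roots of the diagonal of P.

```python
import numpy as np

# Sketch of the repeated experiment: treat the fitted values as the true
# parameters and the estimated noise variance as the true one (assumptions),
# regenerate the 8 outputs 1000 times, and re-estimate (a, b) each time.
rng = np.random.default_rng(0)
a0, b0 = 0.718, 0.565
sigma = np.sqrt(1.74e-5)                 # assumed noise standard deviation
u = np.array([1.0, 1, 1, 1, 0, 0, 0, 0])

estimates = []
for _ in range(1000):
    y = np.zeros(9)
    for k in range(1, 9):                # y(k) = a0 y(k-1) + b0 u(k-1) + e(k)
        y[k] = a0 * y[k-1] + b0 * u[k-1] + sigma * rng.normal()
    Phi = np.column_stack([y[:-1], u])
    estimates.append(np.linalg.solve(Phi.T @ Phi, Phi.T @ y[1:]))
estimates = np.array(estimates)

# The empirical spread should roughly match sqrt(P_ii) from the notes.
print(estimates.std(axis=0))
```

The histogram of `estimates` is what the next slide visualizes; its width is described by the confidence interval.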

Page 18: Visualization of the confidence interval

The confidence interval describes roughly the width of the histogram.

Page 19: Acknowledgement

Most of the material in this course is due to:

(1) Prof. Abdel-Latif El-Shafei, Department of Electrical Power Engineering, Cairo University.

(2) Dr Lucian Busoniu, Technical University of Cluj-Napoca, Romania. http://busoniu.net/teaching/sysid2014/

(3) Prof. Xia Hong, University of Reading, England. http://www.personal.reading.ac.uk/~sis01xh/lecturenotes.html