Regression Estimators
Simple Linear Regression: Maximum Likelihood Estimation
January 20, 2010
Tiejun (Ty) Tong
Department of Applied Mathematics
Simple Linear Regression
A simple linear regression model is defined as
$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i,$$
where $Y_i$ are the response values, $x_i$ are the predictor values, $\beta_0$ is the intercept, $\beta_1$ is the slope, and the $\epsilon_i$ are i.i.d. random variables from $N(0, \sigma^2)$.
For ease of notation, denote $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$, $S_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2$, and $S_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(Y_i - \bar{Y})$.
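To make the notation concrete, here is a minimal NumPy sketch, not part of the original slides, that simulates data from the model and computes these summary quantities; the sample size and the true values (beta0 = 2, beta1 = 0.5, sigma = 1) are illustrative choices of our own.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate data from Y_i = beta0 + beta1 * x_i + eps_i
    # (illustrative true values, not from the slides)
    n = 50
    beta0, beta1, sigma = 2.0, 0.5, 1.0
    x = rng.uniform(0, 10, size=n)
    Y = beta0 + beta1 * x + rng.normal(0, sigma, size=n)

    # Summary quantities defined above
    x_bar = x.mean()                          # x-bar
    Y_bar = Y.mean()                          # Y-bar
    Sxx = np.sum((x - x_bar) ** 2)            # S_xx
    Sxy = np.sum((x - x_bar) * (Y - Y_bar))   # S_xy
    print(x_bar, Y_bar, Sxx, Sxy)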
Least Squares Estimation
The LS estimates of $\beta_0$ and $\beta_1$ are defined to be the values of $\beta_0$ and $\beta_1$ such that the line $\beta_0 + \beta_1 x$ minimizes the residual sum of squares (RSS):
$$(\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{c,\,d} \sum_{i=1}^{n} \big(Y_i - (c + d x_i)\big)^2.$$
The LS estimators of $\beta_0$ and $\beta_1$ are
$$\hat{\beta}_1 = S_{xy}/S_{xx}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{x}.$$
Given $\hat{\beta}_0$ and $\hat{\beta}_1$, the fitted linear regression model is
$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x.$$
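The closed-form solution is easy to verify numerically. Below is a sketch, assuming simulated data as in the earlier sketch, that computes the LS estimators and cross-checks them against NumPy's np.polyfit.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50
    beta0, beta1, sigma = 2.0, 0.5, 1.0    # illustrative true values
    x = rng.uniform(0, 10, size=n)
    Y = beta0 + beta1 * x + rng.normal(0, sigma, size=n)

    Sxx = np.sum((x - x.mean()) ** 2)
    Sxy = np.sum((x - x.mean()) * (Y - Y.mean()))

    beta1_hat = Sxy / Sxx                         # slope: S_xy / S_xx
    beta0_hat = Y.mean() - beta1_hat * x.mean()   # intercept: Y-bar minus beta1_hat * x-bar

    # Independent cross-check with NumPy's degree-1 polynomial fit
    slope, intercept = np.polyfit(x, Y, deg=1)
    print(beta1_hat, slope)       # should agree
    print(beta0_hat, intercept)   # should agree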
Least Squares Estimation
The difference between the observed value $Y_i$ and the fitted value $\hat{Y}_i$ is called a residual. We denote it as
$$e_i = Y_i - \hat{Y}_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i), \quad i = 1, \ldots, n.$$
An unbiased estimator of $\sigma^2$ is given as
$$\hat{\sigma}^2 = \frac{\mathrm{RSS}}{n-2} = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2.$$
The coefficient of determination, denoted by $r^2$, is given by
$$r^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{SST}} = 1 - \frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}.$$
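A short sketch, under the same illustrative simulation setup as above, that computes the residuals, the unbiased variance estimate $\mathrm{RSS}/(n-2)$, and $r^2$:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 50
    x = rng.uniform(0, 10, size=n)
    Y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)   # illustrative true model

    Sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (Y - Y.mean())) / Sxx
    b0 = Y.mean() - b1 * x.mean()

    Y_fit = b0 + b1 * x                  # fitted values Y_i-hat
    e = Y - Y_fit                        # residuals e_i
    RSS = np.sum(e ** 2)                 # residual sum of squares
    SST = np.sum((Y - Y.mean()) ** 2)    # total sum of squares

    sigma2_hat = RSS / (n - 2)           # unbiased estimator of sigma^2
    r2 = 1 - RSS / SST                   # coefficient of determination
    print(sigma2_hat, r2)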
Maximum Likelihood Estimation
The least squares method can be used to estimate $\beta_0$ and $\beta_1$ regardless of the distributional form of the error term (normal or non-normal errors).
For inference problems such as hypothesis testing and confidence interval construction, however, we need to assume that the distribution of the errors is known.
For a simple linear regression model, we assume that
$$\epsilon_i \overset{\mathrm{i.i.d.}}{\sim} N(0, \sigma^2), \quad i = 1, \ldots, n.$$
Thus, for fixed design points $x_i$, the observations $Y_i$ are independent random variables with distribution
$$Y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2), \quad i = 1, \ldots, n.$$
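To make this distributional assumption concrete, the following sketch, with illustrative parameter values of our own choosing, draws $Y_i$ from $N(\beta_0 + \beta_1 x_i, \sigma^2)$ and evaluates the corresponding log-likelihood with scipy.stats:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(3)
    n = 50
    beta0, beta1, sigma = 2.0, 0.5, 1.0        # illustrative true values
    x = rng.uniform(0, 10, size=n)             # fixed design points
    Y = rng.normal(beta0 + beta1 * x, sigma)   # Y_i ~ N(beta0 + beta1*x_i, sigma^2)

    def log_likelihood(b0, b1, s2, x, Y):
        # Sum over i of log N(Y_i; b0 + b1*x_i, s2)
        return np.sum(norm.logpdf(Y, loc=b0 + b1 * x, scale=np.sqrt(s2)))

    print(log_likelihood(beta0, beta1, sigma ** 2, x, Y))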
Maximum Likelihood Estimation
Under this normal model, the log-likelihood of $(\beta_0, \beta_1, \sigma^2)$ is
$$\ell(\beta_0, \beta_1, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(Y_i - \beta_0 - \beta_1 x_i)^2.$$
Taking the first partial derivatives of the log-likelihood function with respect to $\beta_0$, $\beta_1$, and $\sigma^2$, and setting them to zero, we have
$$\sum_{i=1}^{n}(Y_i - \beta_0 - \beta_1 x_i) = 0,$$
$$\sum_{i=1}^{n} x_i (Y_i - \beta_0 - \beta_1 x_i) = 0,$$
$$\sum_{i=1}^{n}(Y_i - \beta_0 - \beta_1 x_i)^2 = n\sigma^2.$$
Solving the above equations leads to
$$\hat{\beta}_{1,\mathrm{ML}} = S_{xy}/S_{xx}, \qquad \hat{\beta}_{0,\mathrm{ML}} = \bar{Y} - \hat{\beta}_{1,\mathrm{ML}}\bar{x}, \qquad \hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^{n} e_i^2.$$
Note that the ML estimators of $\beta_0$ and $\beta_1$ are identical to the LS estimators of $\beta_0$ and $\beta_1$.
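One way to check the closed-form result is to maximize the likelihood numerically. The sketch below, on illustrative data, minimizes the negative log-likelihood with scipy.optimize.minimize (parameterizing $\log\sigma^2$ so the variance stays positive) and compares against the closed-form ML solutions.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(4)
    n = 100
    x = rng.uniform(0, 10, size=n)
    Y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)   # illustrative true model

    def neg_log_lik(theta):
        b0, b1, log_s2 = theta                       # log(sigma^2) keeps s2 > 0
        s2 = np.exp(log_s2)
        resid = Y - b0 - b1 * x
        return 0.5 * n * np.log(2 * np.pi * s2) + 0.5 * np.sum(resid ** 2) / s2

    start = np.array([Y.mean(), 0.0, np.log(Y.var())])
    res = minimize(neg_log_lik, x0=start, method="Nelder-Mead")
    b0_ml, b1_ml, s2_ml = res.x[0], res.x[1], np.exp(res.x[2])

    # Closed-form ML solutions from the slide
    Sxx = np.sum((x - x.mean()) ** 2)
    Sxy = np.sum((x - x.mean()) * (Y - Y.mean()))
    b1_cf = Sxy / Sxx
    b0_cf = Y.mean() - b1_cf * x.mean()
    e = Y - b0_cf - b1_cf * x
    s2_cf = np.sum(e ** 2) / n                       # divisor n, not n - 2

    print(b0_ml, b0_cf)   # should agree up to optimizer tolerance
    print(b1_ml, b1_cf)
    print(s2_ml, s2_cf)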
Properties of $\hat{\beta}_0$ and $\hat{\beta}_1$
First, $\hat{\beta}_0$ and $\hat{\beta}_1$ can be represented as linear combinations of the observations $Y_i$:
$$\hat{\beta}_1 = \frac{1}{S_{xx}}\sum_{i=1}^{n}(x_i - \bar{x})(Y_i - \bar{Y}) = \sum_{i=1}^{n} c_i Y_i,$$
$$\hat{\beta}_0 = \frac{1}{n}\sum_{i=1}^{n} Y_i - \sum_{i=1}^{n} c_i \bar{x}\, Y_i = \sum_{i=1}^{n}\Big(\frac{1}{n} - c_i \bar{x}\Big) Y_i,$$
where $c_i = (x_i - \bar{x})/S_{xx}$.
Second, $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased estimators of $\beta_0$ and $\beta_1$, respectively. For example,
$$E(\hat{\beta}_1) = \sum_{i=1}^{n} c_i(\beta_0 + \beta_1 x_i) = \beta_0 \sum_{i=1}^{n} c_i + \beta_1 \sum_{i=1}^{n} c_i x_i = \beta_1,$$
where $\sum_{i=1}^{n} c_i = 0$ and $\sum_{i=1}^{n} c_i x_i = 1$.
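Both claims admit a quick numerical check. The sketch below (illustrative values, fixed design) verifies $\sum c_i = 0$ and $\sum c_i x_i = 1$, and approximates the expectations of the estimators by Monte Carlo over repeated data sets.

    import numpy as np

    rng = np.random.default_rng(5)
    n, beta0, beta1, sigma = 30, 2.0, 0.5, 1.0   # illustrative values
    x = rng.uniform(0, 10, size=n)               # fixed design points

    # The weights c_i depend only on the design, not on Y
    c = (x - x.mean()) / np.sum((x - x.mean()) ** 2)
    print(np.sum(c), np.sum(c * x))              # approx. 0 and 1, as in the proof

    # Monte Carlo check of unbiasedness: average the estimates over many data sets
    reps = 20000
    b0_hats, b1_hats = np.empty(reps), np.empty(reps)
    for r in range(reps):
        Y = beta0 + beta1 * x + rng.normal(0, sigma, size=n)
        b1_hats[r] = np.sum(c * Y)               # beta1_hat = sum_i c_i Y_i
        b0_hats[r] = Y.mean() - b1_hats[r] * x.mean()

    print(b1_hats.mean(), b0_hats.mean())        # approx. beta1 = 0.5 and beta0 = 2.0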
Properties of $\hat{\beta}_0$ and $\hat{\beta}_1$
The variances of $\hat{\beta}_0$ and $\hat{\beta}_1$ are
$$\mathrm{Var}(\hat{\beta}_1) = \sum_{i=1}^{n} c_i^2\,\mathrm{Var}(Y_i) = \frac{\sigma^2}{S_{xx}},$$
$$\mathrm{Var}(\hat{\beta}_0) = \mathrm{Var}(\bar{Y}) + \bar{x}^2\,\mathrm{Var}(\hat{\beta}_1) = \sigma^2\Big(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\Big),$$
where $\sum_{i=1}^{n} c_i^2 = 1/S_{xx}$ and the covariance of $\bar{Y}$ and $\hat{\beta}_1$ is zero.
Lastly, it can be shown that $\hat{\beta}_0$ and $\hat{\beta}_1$ are the Best Linear Unbiased Estimators (BLUE) of $\beta_0$ and $\beta_1$, where "best" means minimum variance. This result is known as the Gauss-Markov theorem.
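A Monte Carlo sketch, again with illustrative values and a fixed design, comparing the empirical variances of the estimators against these closed-form expressions:

    import numpy as np

    rng = np.random.default_rng(6)
    n, beta0, beta1, sigma = 30, 2.0, 0.5, 1.0   # illustrative values
    x = rng.uniform(0, 10, size=n)               # fixed design
    Sxx = np.sum((x - x.mean()) ** 2)
    c = (x - x.mean()) / Sxx

    reps = 20000
    b0_hats, b1_hats = np.empty(reps), np.empty(reps)
    for r in range(reps):
        Y = beta0 + beta1 * x + rng.normal(0, sigma, size=n)
        b1_hats[r] = np.sum(c * Y)
        b0_hats[r] = Y.mean() - b1_hats[r] * x.mean()

    # Empirical variances vs. the closed-form expressions
    print(b1_hats.var(), sigma ** 2 / Sxx)
    print(b0_hats.var(), sigma ** 2 * (1 / n + x.mean() ** 2 / Sxx))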
Distributions of the Estimators
Theorem: Let $Z_1, \ldots, Z_n$ be mutually independent random variables with $Z_i \sim N(\mu_i, \sigma_i^2)$. Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ be fixed constants. Then
$$Z = \sum_{i=1}^{n}(a_i Z_i + b_i) \sim N\Big(\sum_{i=1}^{n}(a_i \mu_i + b_i), \; \sum_{i=1}^{n} a_i^2 \sigma_i^2\Big).$$
The distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$ are
$$\hat{\beta}_0 \sim N\Big(\beta_0, \; \sigma^2\Big(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\Big)\Big), \qquad \hat{\beta}_1 \sim N\Big(\beta_1, \; \frac{\sigma^2}{S_{xx}}\Big).$$
Furthermore, $(\hat{\beta}_0, \hat{\beta}_1)$ and $\hat{\sigma}^2$ (the unbiased estimator) are independent, and
$$\frac{(n-2)\,\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}.$$
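These distributional results can be checked by simulation. The sketch below (illustrative values) simulates repeated data sets, forms $(n-2)\hat{\sigma}^2/\sigma^2$, and compares it with the $\chi^2_{n-2}$ distribution, including a Kolmogorov-Smirnov test from scipy.stats.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    n, beta0, beta1, sigma = 30, 2.0, 0.5, 1.0   # illustrative values
    x = rng.uniform(0, 10, size=n)
    Sxx = np.sum((x - x.mean()) ** 2)

    reps = 10000
    stat = np.empty(reps)
    for r in range(reps):
        Y = beta0 + beta1 * x + rng.normal(0, sigma, size=n)
        b1 = np.sum((x - x.mean()) * (Y - Y.mean())) / Sxx
        b0 = Y.mean() - b1 * x.mean()
        e = Y - b0 - b1 * x
        s2_hat = np.sum(e ** 2) / (n - 2)        # unbiased estimator of sigma^2
        stat[r] = (n - 2) * s2_hat / sigma ** 2  # should be chi-square, n-2 df

    # A chi-square(n-2) variable has mean n-2 and variance 2(n-2)
    print(stat.mean(), n - 2)
    print(stat.var(), 2 * (n - 2))

    # Kolmogorov-Smirnov test against chi-square with n-2 degrees of freedom
    print(stats.kstest(stat, "chi2", args=(n - 2,)))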