econometrics the simple regression modeldocentes.fe.unl.pt/~azevedoj/web...
TRANSCRIPT
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
EconometricsThe Simple Regression Model
Joao Valle e Azevedo
Faculdade de EconomiaUniversidade Nova de Lisboa
Spring Semester
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 1 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Objectives
Objectives
Given the model
y = β0 + β1x + u
Where y is earnings and x is years of education
Or y is sales and x is spending in advertising
Or y is the market value of an apartment and x is its area
Our objectives for the moment will be:
Estimate the unknown parameters β0 and β1
Assess how much x explains y
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 2 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Assumptions
The Gauss-Markov Assumptions
Assumption SLR.1 (Linearity in Parameters)
y = β0 + β1x + u
I y is the dependent variable (or explained variable, or response variable,or the regressand)
I x is the independent variable (or explanatory variable, or controlvariable, or the regressor)
I u is the error term or disturbance (y is allowed to vary for the samelevel of x)
This is a population equation. It is linear in the (unknown)parameters.
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 3 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Assumptions
The Gauss-Markov Assumptions
Assumption SLR.2 (Random Sampling) Random sample of size n,{(xi ,yi ): i=1,2,...,n} such that:
yi = β0 + β1xi + ui , i = 1, 2, ..., n
Assumption SLR.3 (Sample Variation in the ExplanatoryVariable)The sample outcomes on x, namely {xi : i=1,2,...,n}, are not all thesame value
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 4 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Assumptions
The Gauss-Markov Assumptions
Assumption SLR.4 (Zero Conditional Mean) The error u has anexpected value of zero given any value of the explanatory variable
E (u|x) = 0
Crucial Assumption!Remember the previous examples:
Earnings = β0 + β1Education + u
CrimeRate = β0 + β1Policeman + u
In these cases SLR.4 most likely fail
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 5 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Assumptions
The Gauss-Markov Assumptions
Assumption SLR.4’ (Zero Mean) The error u has an expected valueof zero given any value of the explanatory variable
E (u) = 0
Not really necessary given that SLR.4 implies SLR.4’
I This is not a restrictive assumption since we can always use β0 so thatSLR.4’ holds
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 6 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Assumptions
Zero Conditional Mean
E (u|x) = 0⇒ E (y |x) = β0 + β1x
..
x1 x2
E(y|x) = b0 + b1x
y
f(y)
Figure: For any x, the distribution of y is centered about E(y |x)Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 7 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Assumptions
Graph of yi=β0+β1xi+ui
.
..
.
y4
y1
y2
y3
x1 x2 x3 x4
}
}
{
{
u1
u2
u3
u4
x
y xxyE 10)|(
Figure: Population regression line, data points and the associated error terms
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 8 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Estimators
OLS Estimators
Want to estimate the population parameters from a sample
Let {(xi ,yi ): i=1,2,...,n} denote a random sample of size n from thepopulation
For each observation in this sample:
yi = β0 + β1xi + ui , i = 1, 2, ..., n
To derive the OLS estimates we need to realize that our mainassumption of E(u|x)=E(u)=0 also implies:
Cov(x , u) = E (xu) = 0
Remember from basic probability that Cov(x,y) = E(xy)-E(x)E(y)
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 9 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Estimators
OLS EstimatorsTwo Restrictions:
E (u|x) = E (u) = 0
Cov(x,u) = E (xu) = 0
Since u = y − β0 − β1x :I Two Moment Conditions:
E (y − β0 − β1x) = 0
E [x(y − β0 − β1x)] = 0I Sample versions:
n−1n∑
i=1
(yi − β0 − β1xi ) = 0
n−1n∑
i=1
xi (yi − β0 − β1xi ) = 0
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 10 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Estimators
OLS Estimators
We have two equations and two unknowns β0, β1
Pick β0 and β1 so that the sample moments match the populationmoments
From the first sample moment condition and given the definitionof a sample mean and properties of summation:
n−1n∑
i=1
(yi − β0 − β1xi ) = 0
⇔ y = β0 + β1x
⇔ β0 = y − β1x
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 11 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Estimators
OLS EstimatorsFrom the second sample moment condition:
n−1n∑
i=1
xi (yi − β0 − β1xi ) = 0
Plug-in β0 ⇒n∑
i=1
xi (yi − (y − β1x)− β1xi ) = 0
This simplifies to⇒n∑
i=1
xi (yi − y) = β1
n∑i=1
xi (xi − x)
Rearranging⇒n∑
i=1
(xi − x)(yi − y) = β1
n∑i=1
(xi − x)2
Thus, β1 =
∑ni=1(xi − x)(yi − y)∑n
i=1(xi − x)2
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 12 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Estimators
OLS Estimators
β1 =
∑ni=1(xi − x)(yi − y)∑n
i=1(xi − x)2
Need x to vary in our sample so that the slope parameter iswell-defined!
n∑i=1
(xi − x)2 > 0
Sample covariance between x and y divided by the sample variance ofx
If x and y are positively correlated, the slope will be positive
If x and y are negatively correlated, the slope will be negative
I It is an estimator if we treat (xi , yi ) as random variablesI It is an estimate once we substitute (xi , yi ) by the actual observations
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 13 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Estimators
Example
ˆwagei = 245.073 + 52.513educi
n = 11064, Data from the Employment Survey (INE), 2003
wage is net monthly wage and educ measures the number of years ofschooling completed
So, we estimate a positive relation between these two variables:
I An additional year of schooling leads to an estimated expected increaseof 52.513 euros in the wage on average, ceteris paribus
I Caution about the interpretation of these estimates!
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 14 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Definitions
Some Definitions
Intuitively, OLS is fitting a line through the sample points such that theSSR is as small as possible, hence the term least squares
The residual, u, is an estimate of the error term, u, and is the differencebetween the fitted line (sample regression function) and the sample point
Fitted Valueyi = β0 + β1xi
Residualui = yi − yi = yi − β0 − β1xi
Sum of Squared Residuals (SSR)
n∑i=1
ui2 =
n∑i=1
(yi − β0 − β1xi )2
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 15 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Definitions
Graph of yi=β0+β1xi
.
..
.
y4
y1
y2
y3
x1 x2 x3 x4
}
}
{
{
û1
û2
û3
û4
x
y
xy 10ˆˆˆ
Figure: Sample regression line, sample data points and the associated estimatederror terms - the residuals
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 16 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Definitions
Alternative approach to derive OLS
We want to choose our parameters such that we minimize the SSR:
SSR =n∑
i=1
ui2 =
n∑i=1
(yi − β0 − β1xi )2
δRSS
δβ0
= −2n∑
i=1
(yi − β0 − β1xi ) = 0
δRSS
δβ1
= −2n∑
i=1
xi (yi − β0 − β1xi ) = 0
These equations are equivalent to the same equations we had before:thus, same solution!
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 17 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
More Terminology
More TerminologyThink of each observation as being made up of an explained part and anunexplained part:
yi = yi + uiTotal Sum of Squares (SST)
n∑i=1
(yi − y)2
Explained Sum of Squares (SSE)
n∑i=1
(yi − y)2
Residual Sum of Squares (SSR)
n∑i=1
ui2
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 18 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
More Terminology
SST = SSE + SSR
SST =n∑
i=1
(yi − y)2 =n∑
i=1
[(yi − yi ) + (yi − y)]2
=n∑
i=1
[ui + (yi − y)]2
=n∑
i=1
ui2 + 2
n∑i=1
ui (yi − y) +n∑
i=1
(yi − y)2
=n∑
i=1
ui2 +
n∑i=1
(yi − y)2
= SSR + SSE
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 19 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
More Terminology
Goodness of Fit
How well does our sample regression line fit our sample data?
I If SSR is very small relative to SST (or SSE very large relative to SST),then a lot of the variation in the dependent variable y is explained bythe model
Can compute the fraction of the total sum of squares (SST) that isexplained by the model: the R-squared of the regression
R2 =SSE
SST= 1− SSR
SST
0 ≤ R2 ≤ 1
I The lower the SSR, the higher will be the R2
I OLS maximises the R2 (minimises SSR)
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 20 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
More Terminology
Goodness of Fit - Example (Cont.)
ˆwagei = 245.073 + 52.513educi
n = 11064, Data from the Employment Survey (INE), 2003
R2 = 0.262
A lot of variation in the dependent variable (wage) is left unexplainedby the model
This is not related to the fact that educ is most probably correlatedwith u
Can have low R2 even if all the assumptions hold true
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 21 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Properties of OLS EstimatorsThought experiment:
I Get a sample of size n from the population many times (infinite times)I Compute the OLS estimatesI Is the average of these estimates equal to the true parameter values?I Under SLR.1 to SLR.4 it turns out that the answer is positive
β1 =
∑ni=1(xi − x)(yi − y)∑n
i=1(xi − x)2
β0 = y − β1x
I These are estimators if we treat (xi , yi ) as random variables
Theorem
Under these assumptions:
I E(β0)=β0 and E(β1)=β1
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 22 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Proof of Unbiasedness of OLS
β1 =
∑ni=1(xi − x)(yi − y)∑n
i=1(xi − x)2
=
∑ni=1(xi − x)(yi − y)
SSTx
where:
SST =n∑
i=1
(xi − x)2
Note that:n∑
i=1
(xi − x) = 0 and
n∑i=1
(xi − x)xi =n∑
i=1
(xi − x)2
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 23 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Proof of Unbiasedness of OLS (Cont.)Focusing on the numerator:
n∑i=1
(xi − x)(yi − y) =n∑
i=1
(xi − x)yi
=n∑
i=1
(xi − x)(β0 + β1xi + ui )
= β1SSTx +n∑
i=1
(xi − x)ui
Putting together:
β1 =β1SSTx +
∑ni=1(xi − x)ui
SSTx
= β1 +
∑ni=1(xi − x)ui
SSTx
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 24 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Proof of Unbiasedness of OLS (Cont.)
β1 = β1 +
∑ni=1(xi − x)ui
SSTx
Now take expectations conditional on the values of the x’s, that is,treat the x ’s as constants. To avoid heavy notation that isn’t doneexplicitly
E (β1) = β1 +
∑ni=1(xi − x)E (ui )
SSTx
= β1 +
∑ni=1(xi − x)0
SSTx
= β1
And if the conditional expectation is a constant, the unconditinalexpectation is also that constant...
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 25 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Proof of Unbiasedness of OLS (Cont.)
β0 = y − β1x
Now, it’s really easy to prove unbiasedness of β0. Try it yourself!
Unbiasedness Summary
I The OLS estimators of β0 and β1 are unbiased
I However, in a given sample we may be close or far from the trueparameter, unbiasedness is a minimal requirement
I Unbiasedness depends on the 4 assumptions: if any assumption fails,then OLS is not necessarily unbiased
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 26 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Variance of the OLS Estimators
We know already that the sampling distribution of our estimators iscentered around the true parameter
But, is the distribution concentrated around the true parameter?
To answer this question, we add an additional assumption:
I Assumption SLR.5 (Homoskedasticity) The error u has the samevariance given any value of the explanatory variable.
Var(u|x) = σ2
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 27 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Variance of the OLS Estimators
Remember that:
σ2 = Var(u|x) = E (u2|x)− [E (u|x)]2
= E (u2|x),given that E (u|x)=0
= E (u2) = Var(u)
I So, σ2 is also the unconditional variance, called the variance of the errorI σ, the square root of the error variance, is called the standard deviation
of the error
Can write:
E (y |x) = β0 + β1x and Var(y |x) = σ2
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 28 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Homoskedastic Case
..
x1 x2
E(y|x) = b0 + b1x
y
f(y)
Figure: How spread out is the distribution of the estimator
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 29 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Heteroskedastic Case
.
xx1 x2
f(y|x)
x3
..
E(y|x) = b0 + b1x
Figure: How spread out is the distribution of the estimator
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 30 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Variance of the OLS Estimators (Cont.)
β1 = β1 +
∑ni=1(xi − x)ui
SSTx,where SSTx =
n∑i=1
(xi − x)2
Var(β1) = Var
(β1 +
1
SSTx
n∑i=1
(xi − x)ui
)
=
(1
SSTx
)2
Var
( n∑i=1
(xi − x)ui
)
=
(1
SSTx
)2 n∑i=1
(xi − x)2Var(ui )
=
(1
SSTx
)2 n∑i=1
(xi − x)2σ2 = σ2
(1
SSTx
)2
SSTx
This is a conditional variance!
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 31 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Variance of the OLS Estimators (Cont.)
Var(β1) =σ2
SSTx,where SSTx =
n∑i=1
(xi − x)2
The larger the error variance, σ2, the larger the variance of the slopeestimator
The larger the variability in the xi , the smaller the variance of theslope estimate
SSTx tends to increase with the sample size n, so variance tends todecrease with the sample size
Can also show:
Var(β0) =n−1σ2
∑ni=1 x
2i∑n
i=1(xi − x)2
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 32 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Variance of the OLS
Theorem
Under Assumptions SLR.1 through SLR.5
Var(β1) =σ2∑n
i=1(xi − x)2
Var(β0) =n−1σ2
∑ni=1 x
2i∑n
i=1(xi − x)2
conditional on the sample values {x1, x2, ...}
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 33 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Properties of OLS
Estimating the Error Variance
In practice, the error variance σ2 is unknown since we don’t observethe errors, we must estimate it...
σ2 =
∑ni=1 ui
2
n − 2=
SSR
n − 2
Can be show that this is an unbiased estimator of the error variance
The so-called standard error of the regression (SER) is given by:
σ =√σ2
Recall that:
se(β1) =σ
[∑n
i=1(xi − x)2]1/2,and similarly for β0
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 34 / 35
SLR Model Gauss-Markov OLS Estimation More OLS Terminology Unbiasedness Variance Example
Example
Example (Cont.)
ˆwagei = 245.073 + 52.513educi
n = 11064, Data from the Employment Survey (INE), 2003
R2 = 0.262
σ2 = 155366 and σ = 394.164
se(β0) = 7.575 and se(β1) = 0.837
Joao Valle e Azevedo (FEUNL) Econometrics Lisbon, February 2011 35 / 35