simple linear regression estimation and properties
TRANSCRIPT
Simple Linear Regression: Estimation and Properties
Outline
• Review of the Reading
• Estimate parameters using OLS
• Other features of OLS
– Numerical Properties of OLS
– Assumptions of OLS
– Goodness of Fit
Checking Understanding
• What is the best estimate of E(Y)?
• How would we find E(Y|Xi)?
• Y = B1 + B2X + u
– What is B1?
– What is B2?
– What is u?
Checking Understanding
• What is a z-score?
• What is the mean of z(x)?
• What is the standard deviation of z(x)?

z(x) = (x − x̄) / sx
Checking Understanding
• Correlation:

r = Σ(zx · zy) / (n − 1)
Checking Understanding
• The regression line in z-scores:

zy = m · zx
Checking Understanding
• The regression line in z-scores can also be written as:

zy = r · zx
Checking Understanding
• Remember:

m = cov(X, Y) / var(X)
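These identities can be checked numerically. A minimal sketch in Python, using made-up data (not from the lecture): z-scores have mean 0 and standard deviation 1, the correlation is r = Σ(zx · zy)/(n − 1), and the slope m = cov(X, Y)/var(X) reduces to r once both variables are standardized.

```python
import math

# Hypothetical sample data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)

def mean(v):
    return sum(v) / len(v)

def sd(v):
    # Sample standard deviation (n - 1 in the denominator)
    m = mean(v)
    return math.sqrt(sum((vi - m) ** 2 for vi in v) / (len(v) - 1))

# z-scores: z(x) = (x - x̄) / s_x
zx = [(xi - mean(x)) / sd(x) for xi in x]
zy = [(yi - mean(y)) / sd(y) for yi in y]

# Correlation: r = Σ(zx * zy) / (n - 1)
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Slope in original units: m = cov(X, Y) / var(X)
cov_xy = sum((xi - mean(x)) * (yi - mean(y)) for xi, yi in zip(x, y)) / (n - 1)
m = cov_xy / sd(x) ** 2

# Expressed in z-score units, the slope collapses to r
assert abs(m * sd(x) / sd(y) - r) < 1e-12
print(round(r, 4))  # the correlation for this sample
```
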
And What is Covariance?
• Cov(X,Y) = E[(X − E[X])(Y − E[Y])]
• Cov(X,Y) = E[XY] − E[X]E[Y]
• Covariance is positive if X and Y tend to be both below their means or both above their means. It is negative if X is above its mean while Y is below its mean, or vice versa.

σxy = cov(X, Y) = E[(X − µx)(Y − µy)]
And What is Covariance?
• But covariance has units: the sign is easy to interpret, but the number itself is hard to interpret.
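To make the two formulas concrete, here is a short Python check on made-up numbers: both expressions give the same covariance, and rescaling X by a factor of 1000 rescales the covariance by the same factor, which is exactly the units problem the slide describes.

```python
# Hypothetical data, chosen only for illustration.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]
n = len(x)

mx = sum(x) / n
my = sum(y) / n

# Cov(X,Y) = E[(X - E[X])(Y - E[Y])]   (population form: divide by n)
cov1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
# Cov(X,Y) = E[XY] - E[X]E[Y]
cov2 = sum(xi * yi for xi, yi in zip(x, y)) / n - mx * my
assert abs(cov1 - cov2) < 1e-12

# Covariance has units: measuring X in different units rescales it
x_scaled = [xi * 1000 for xi in x]
mx2 = sum(x_scaled) / n
cov_scaled = sum((xi - mx2) * (yi - my) for xi, yi in zip(x_scaled, y)) / n
assert abs(cov_scaled - 1000 * cov1) < 1e-9
```
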
Total Population of Money Spent and the Number of Votes

[Scatterplot, “Effect of Money on Votes”: Number of Votes (0 to 50,000) plotted against Amount Spent in millions (0 to 10)]
What we can see from the graph
• We can see the average value of Y for each value of X
– These are the conditional expected values E(Y|X)
• If we join the conditional values of Y given each value of X, we get the Population Regression Line
Population Regression Function and the Linear Model
• E(Y|Xi) = f(Xi)
– The expected value of the distribution of Y, given Xi, is functionally related to Xi
• E(Y|Xi) = B1 + B2Xi
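A quick simulation makes this concrete (with hypothetical parameters B1 = 5 and B2 = 2, chosen only for illustration): for each value of X, the average of many draws of Y = B1 + B2X + u lands on the population regression line.

```python
import random

random.seed(0)
B1, B2 = 5.0, 2.0  # hypothetical population parameters

# For each value of X, draw many Y's from Y = B1 + B2*X + u, with E[u] = 0
xs = [0, 3, 5, 8, 10]
cond_means = {}
for xv in xs:
    ys = [B1 + B2 * xv + random.gauss(0, 1) for _ in range(20000)]
    cond_means[xv] = sum(ys) / len(ys)

# The conditional means E(Y|X) trace out the line B1 + B2*X
for xv in xs:
    assert abs(cond_means[xv] - (B1 + B2 * xv)) < 0.05
```
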
Two interpretations of linearity
• Linear in Variables
– Which of the following is linear in variables, and why?
• E(Y|Xi) = B1 + B2Xi²
• E(Y|Xi) = B1 + B2Xi
• Linear in Parameters
– Which of the following is linear in parameters, and why?
• E(Y|Xi) = B1 + B2Xi²
• E(Y|Xi) = B1 + B2²Xi
• Why Should We Care?
– Linear regression requires linearity in parameters only
Straight Line
Y = B1 + B2Xi

Quadratic
Y = B1 + B2X + B3X²
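The distinction matters because a model can be nonlinear in the variables yet still estimable by OLS. A sketch with made-up coefficients: generating noiseless data from E(Y|X) = B1 + B2X², then treating Z = X² as the regressor, recovers the parameters with the ordinary simple-regression formulas.

```python
# E(Y|X) = B1 + B2*X^2 is nonlinear in X but linear in the parameters,
# so simple OLS applies after transforming the regressor to Z = X^2.
# Hypothetical noiseless data generated with B1 = 2, B2 = 3.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 3.0 * xi ** 2 for xi in x]

z = [xi ** 2 for xi in x]  # transformed regressor
n = len(z)
mz = sum(z) / n
my = sum(y) / n

# Ordinary simple-regression slope and intercept, on Z instead of X
b2 = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
      / sum((zi - mz) ** 2 for zi in z))
b1 = my - b2 * mz
assert abs(b1 - 2.0) < 1e-9 and abs(b2 - 3.0) < 1e-9
```
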
Adding in the Stochastic Term
• Yi = E(Y|Xi) + ui
• Systematic Component: E(Y|Xi)
• Stochastic Disturbance: ui
The Sample Regression Function (SRF)
• Because of sampling fluctuation, any sample will only approximate our true Population Regression Function
• Stochastic form of the SRF: Yi = b1 + b2Xi + ûi, where b1 and b2 are the sample estimates of B1 and B2, and ûi is the residual
Primary Goal in Regression Analysis
• We want to estimate the PRF:
Yi = B1 + B2Xi + ui
• On the basis of the SRF
One method
• Choose the Sample Regression Function such that the sum of the residuals is as small as possible
Illustration and Problem

[Scatterplot of Y against X with four residuals marked: u1 = 10, u2 = −2, u3 = 2, u4 = −10. They sum to zero even though the individual errors are large.]
Alternative Method
• Ordinary Least Squares (OLS) is a method of finding the linear model which minimizes the sum of the squared errors
– Example: (10)² + (−2)² + (2)² + (−10)² = 208
• This method yields the best linear unbiased estimator
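A short Python illustration (hypothetical numbers, not the lecture's data) of why the squared errors are the right criterion: every line forced through the sample means has residuals that sum to zero, including wildly wrong ones, but their sums of squared errors differ sharply.

```python
# Hypothetical data to show why minimizing the *sum* of residuals fails.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 6.0]
mx = sum(x) / len(x)
my = sum(y) / len(y)

def residual_sum(b2):
    b1 = my - b2 * mx  # force the line through the sample means
    return sum(yi - b1 - b2 * xi for xi, yi in zip(x, y))

def sse(b2):
    b1 = my - b2 * mx
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))

# A terrible slope and the OLS slope both give a residual sum of zero...
assert abs(residual_sum(-5.0)) < 1e-9
assert abs(residual_sum(1.4)) < 1e-9
# ...but their sums of squared errors differ enormously, which is what OLS uses
assert sse(-5.0) > sse(1.4)
```
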
Good Spot for a break
Minimizing the Sum of Squares
• Our goal is to minimize the sum of the squared errors:
Σui² = Σ(Yi − B1 − B2Xi)²
• Since we have two unknowns, B1 and B2, we need to take the partial derivatives of this expression with respect to each
Partial Derivatives for B’s
• We start with our original equation:
Σui² = Σ(Yi − B1 − B2Xi)²
• Now we take the partial derivatives
– First equation is the partial derivative with respect to B1:
∂(Σui²)/∂B1 = −2Σ(Yi − B1 − B2Xi)
– Second equation is with respect to B2:
∂(Σui²)/∂B2 = −2ΣXi(Yi − B1 − B2Xi)
Set Equal to Zero
• Setting the last set of equations equal to zero:
−2Σ(Yi − B1 − B2Xi) = 0
−2ΣXi(Yi − B1 − B2Xi) = 0
The Normal Equations
• Divide both equations by −2, multiply through, then separate the summation terms and rearrange to get the normal equations:
ΣYi = nB1 + B2ΣXi
ΣXiYi = B1ΣXi + B2ΣXi²
Solving the Equations
• We have two equations with two unknowns, for which we can use algebra
• Multiply the first equation by ΣXi and the second by n:
ΣXiΣYi = nB1ΣXi + B2(ΣXi)²
nΣXiYi = nB1ΣXi + nB2ΣXi²
• Subtract the first equation from the second and rearrange:
B2 = (nΣXiYi − ΣXiΣYi) / (nΣXi² − (ΣXi)²)
Last step
• Multiply the numerator and denominator by 1/n, and recall that (1/n)ΣXi = X̄ and (1/n)ΣYi = Ȳ
• End up with:
B2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
We can now solve for B1
• If we go back to the first normal equation, ΣYi = nB1 + B2ΣXi, and divide through by n:
B1 = Ȳ − B2X̄
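The derivation can be verified numerically. A minimal sketch on made-up data: the raw-sums form and the deviation form of B2 agree, and B1 follows from the first normal equation.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 4.0, 8.0, 9.0, 11.0]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi ** 2 for xi in x)

# Raw-sums form: B2 = (n ΣXiYi - ΣXi ΣYi) / (n ΣXi² - (ΣXi)²)
b2_raw = (n * sxy - sx * sy) / (n * sxx - sx ** 2)

# Deviation form: B2 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)²
mx, my = sx / n, sy / n
b2_dev = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
assert abs(b2_raw - b2_dev) < 1e-12

# B1 from the first normal equation: B1 = Ȳ - B2 X̄
b1 = my - b2_dev * mx
print(round(b1, 4), round(b2_dev, 4))
```
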
What Does B2 Mean?
• The equation for B2 may not seem to make intuitive sense at first
• But if we break it down into pieces, we can begin to see the logic
In sum…
• If the changes in Y are EQUAL to the changes in X, then B2 = 1
• If the changes in Y are LARGER than the changes in X, then B2 > 1
• If the changes in Y are SMALLER than the changes in X, then B2 < 1
Let’s Do An Example!
Calculating B1 and B2
• Mean of X is 4
• Mean of Y is 12.71429
Which Looks Like…This!

[Chart, “Regression of Y on X”: Y axis from 0 to 30, X axis from 0 to 8]
Practice Problem
• We have a sample of the amount of money each candidate spent in a state (in millions) and the percentage of the vote they received
• Calculate the regression line and interpret
Data

State   % vote   Money spent
CA      40       10
FL      35       12
GA      15       4
MO      20       6
OH      40       11
VT      25       8
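One way to check your answer: apply the OLS formulas from the lecture to the table above in Python, regressing % vote on money spent.

```python
# Data from the practice problem: CA, FL, GA, MO, OH, VT
money = [10.0, 12.0, 4.0, 6.0, 11.0, 8.0]   # X: amount spent, in millions
vote  = [40.0, 35.0, 15.0, 20.0, 40.0, 25.0]  # Y: % of the vote received
n = len(money)

mx = sum(money) / n
my = sum(vote) / n

# B2 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)²,  B1 = Ȳ - B2 X̄
b2 = (sum((xi - mx) * (yi - my) for xi, yi in zip(money, vote))
      / sum((xi - mx) ** 2 for xi in money))
b1 = my - b2 * mx

print(round(b1, 2), round(b2, 2))  # → 1.88 3.21
# Interpretation: each additional million spent is associated with roughly
# 3.2 more percentage points of the vote.
```
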
Numerical Properties of OLS
• Those properties that result from the method of OLS
– Expressed in terms of observable quantities of X and Y
– Point estimators for the B’s
– Sample regression line passes through the sample means of Y and X
– Sum of the residuals is zero
– Residuals are uncorrelated with the predicted Yi
– Residuals are uncorrelated with Xi
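These properties are mechanical consequences of the normal equations, so they can be checked on any dataset. A sketch with made-up numbers:

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 2.5, 4.0, 4.5, 6.5, 7.0]
n = len(x)
mx = sum(x) / n
my = sum(y) / n

# OLS estimates
b2 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
      / sum((xi - mx) ** 2 for xi in x))
b1 = my - b2 * mx

yhat = [b1 + b2 * xi for xi in x]
resid = [yi - yh for yi, yh in zip(y, yhat)]

assert abs(sum(resid)) < 1e-9                                   # residuals sum to zero
assert abs(sum(ui * xi for ui, xi in zip(resid, x))) < 1e-9     # uncorrelated with Xi
assert abs(sum(ui * yh for ui, yh in zip(resid, yhat))) < 1e-9  # uncorrelated with predicted Yi
assert abs((b1 + b2 * mx) - my) < 1e-9                          # line passes through (X̄, Ȳ)
```
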
Assumptions of Classical Linear Regression
• A1: Linear regression model: linear in parameters
• A2: X values are fixed in repeated sampling
• A3: Zero mean value of the disturbance term ui
• A4: Homoskedasticity, or equal variance of ui
More Assumptions
• A5: No autocorrelation between disturbances
• A6: Zero covariance between ui and Xi
• A7: Number of observations n is greater than the number of parameters to be estimated
• A8: Variability in X values
More Assumptions
• A9: Regression model is correctly specified
– The correct variables are included
– We have the correct functional form
– Correct assumptions about the probability distributions of Yi, Xi and ui
• A10: With multiple regression, we add the assumption of no perfect multicollinearity
How “good” does it fit?
• To measure “reduction in errors” we need a benchmark for comparison.
• The mean of the dependent variable is a relevant and tractable benchmark for comparing predictions.
• The mean of Y represents our “best guess” at the value of Yi absent other information.
Sums of Squares
• This gives us the following 'sum-of-squares' measures:
TSS = Σ(Yi − Ȳ)²   (Total)
ESS = Σ(Ŷi − Ȳ)²   (Explained)
USS = Σ(Yi − Ŷi)²   (Unexplained)
• Total Variation = Explained Variation + Unexplained Variation: TSS = ESS + USS
How well does our model perform?
• R squared statistic
– = (TSS − USS)/TSS
– = ESS/TSS
• Bounded between 0 and 1
• Higher values indicate a better fit
• Lower values mean more of the variance is unexplained than explained
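Putting the goodness-of-fit pieces together in Python on made-up data: TSS decomposes into ESS + USS, and R² comes out the same whichever formula is used.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(x)
mx = sum(x) / n
my = sum(y) / n

# OLS fit
b2 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
      / sum((xi - mx) ** 2 for xi in x))
b1 = my - b2 * mx
yhat = [b1 + b2 * xi for xi in x]

tss = sum((yi - my) ** 2 for yi in y)                  # total variation
ess = sum((yh - my) ** 2 for yh in yhat)               # explained variation
uss = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))   # unexplained variation

assert abs(tss - (ess + uss)) < 1e-9       # TSS = ESS + USS
r2 = ess / tss
assert abs(r2 - (tss - uss) / tss) < 1e-9  # both R² formulas agree
assert 0.0 <= r2 <= 1.0
print(round(r2, 3))
```
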