Simple Linear Regression Estimation and Properties


Post on 24-Feb-2022


TRANSCRIPT

Page 1: Simple Linear Regression Estimation and Properties

Simple Linear Regression: Estimation and Properties

Page 2: Simple Linear Regression Estimation and Properties

Outline
• Review of the Reading
• Estimate parameters using OLS
• Other features of OLS
– Numerical Properties of OLS
– Assumptions of OLS
– Goodness of Fit

Page 3: Simple Linear Regression Estimation and Properties

Checking Understanding
• What is the best estimate of E(Y)?
• How would we find E(Y|Xi)?
• Y = B1 + B2X + u
– What is B1?
– What is B2?
– What is u?

Page 4: Simple Linear Regression Estimation and Properties

Checking Understanding
• What is a z-score? $z(x) = \frac{x - \bar{x}}{\sigma_x}$
• What is the mean of z(x)?
• What is the standard deviation of z(x)?
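The two questions above can be checked numerically. The following is a minimal sketch, assuming the sample standard deviation (n - 1 denominator) is used for the z-score; the data values and the helper name `z_scores` are made up for illustration.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: subtract the sample mean, divide by the sample SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]

x = [4, 6, 8, 10, 12]  # made-up illustrative data
zx = z_scores(x)

# The standardized values have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(round(mean(zx), 10), round(stdev(zx), 10))
```

Whatever the original units of x, the standardized variable is unit-free, which is what makes z-scores comparable across variables.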

Page 5: Simple Linear Regression Estimation and Properties

Checking Understanding
• What is a z-score? $z(x) = \frac{x - \bar{x}}{\sigma_x}$
• Correlation: $r = \frac{\sum z_x z_y}{n - 1}$

Page 6: Simple Linear Regression Estimation and Properties

Checking Understanding
• Correlation: $r = \frac{\sum z_x z_y}{n - 1}$
• The regression line in z-scores: $z_y = m z_x$

Page 7: Simple Linear Regression Estimation and Properties

Checking Understanding
• Correlation: $r = \frac{\sum z_x z_y}{n - 1}$
• The regression line in z-scores: $z_y = m z_x$
• Can also be written as: $z_y = r z_x$

Page 8: Simple Linear Regression Estimation and Properties

Checking Understanding
• Correlation: $r = \frac{\sum z_x z_y}{n - 1}$
• The regression line in z-scores: $z_y = m z_x$
• Can also be written as: $z_y = r z_x$
• Remember: $m = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}$
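To see why the slope in z-score units equals r, the sketch below computes r from z-scores, the raw slope m = cov(X,Y)/var(X), and the same slope formula applied to the standardized variables. The data are made up for illustration, and sample (n - 1) versions of covariance and variance are assumed throughout.

```python
from statistics import mean, stdev

x = [1, 2, 3, 4, 5]  # made-up illustrative data
y = [2, 4, 5, 4, 5]
n = len(x)

xbar, ybar = mean(x), mean(y)
sx, sy = stdev(x), stdev(y)

# z-scores of each variable
zx = [(v - xbar) / sx for v in x]
zy = [(v - ybar) / sy for v in y]

# Correlation as the "average" product of z-scores
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Slope in the original units: cov(X, Y) / var(X)
cov_xy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)
m = cov_xy / sx ** 2

# The same slope formula applied to the z-scores gives r,
# because var(zx) = 1 and cov(zx, zy) = r.
m_z = (sum(a * b for a, b in zip(zx, zy)) / (n - 1)) / stdev(zx) ** 2

print(m, r, m_z)
```

The raw slope and the correlation are linked by m = r * sy / sx, so standardizing both variables (sx = sy = 1) leaves only r.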

Page 9: Simple Linear Regression Estimation and Properties

And What is Covariance?

• Cov(X,Y) = E[(X - E[X])(Y - E[Y])]
• Cov(X,Y) = E[XY] - E[X]E[Y]
• Covariance is positive when X and Y tend to fall on the same side of their means (both below or both above). It is negative when X tends to be above its mean while Y is below its mean, or vice versa.

$\sigma_{xy} = \mathrm{cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$

$\sigma_{xy} = \mathrm{cov}(X, Y) = E[(X - \bar{x})(Y - \bar{y})]$

Page 10: Simple Linear Regression Estimation and Properties

And What is Covariance?

• Cov(X,Y) = E[(X - E[X])(Y - E[Y])]
• Cov(X,Y) = E[XY] - E[X]E[Y]
• Covariance is positive when X and Y tend to fall on the same side of their means (both below or both above). It is negative when X tends to be above its mean while Y is below its mean, or vice versa.
• But it has units. It is easy to interpret the sign, but hard to interpret the number.

$\sigma_{xy} = \mathrm{cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$

$\sigma_{xy} = \mathrm{cov}(X, Y) = E[(X - \bar{x})(Y - \bar{y})]$
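A small numerical sketch of the points above: the two covariance formulas agree, and the magnitude depends on the units. The spending and vote figures are invented for illustration, and each pair is treated as equally likely (a population-style average, not the n - 1 sample version).

```python
from statistics import mean

def cov(xs, ys):
    """Covariance as E[(X - E[X])(Y - E[Y])] over equally likely pairs."""
    mx, my = mean(xs), mean(ys)
    return mean([(a - mx) * (b - my) for a, b in zip(xs, ys)])

def cov_shortcut(xs, ys):
    """Equivalent form: E[XY] - E[X]E[Y]."""
    return mean([a * b for a, b in zip(xs, ys)]) - mean(xs) * mean(ys)

spend_millions = [3, 5, 8, 10]            # hypothetical spending, in millions
votes = [12000, 20000, 30000, 41000]      # hypothetical vote counts
spend_dollars = [s * 1_000_000 for s in spend_millions]

c_millions = cov(spend_millions, votes)
c_dollars = cov(spend_dollars, votes)

# Same data, same relationship, but the number changes by a factor of one million.
print(c_millions, cov_shortcut(spend_millions, votes), c_dollars)
```

The sign is the same in both units, which is why the sign is easy to read and the magnitude is not; dividing by the two standard deviations (the correlation) removes the units.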

Page 11: Simple Linear Regression Estimation and Properties

Total Population of Money Spent and the Number of Votes

[Figure: "Effect of Money on Votes". x-axis: Amount Spent (in millions), 0 to 10; y-axis: Number of Votes, 0 to 50,000.]

Page 12: Simple Linear Regression Estimation and Properties

What we can see from the graph

• We can see the average value of Y for each value of X
– These are the conditional expected values E(Y|X)
• If we join the conditional expected values of Y for each value of X, we get the Population Regression Line

Page 13: Simple Linear Regression Estimation and Properties

Population Regression Function and the Linear Model

• $E(Y|X_i) = f(X_i)$
– The expected value of the distribution of Y, given Xi, is functionally related to Xi
• $E(Y|X_i) = B_1 + B_2 X_i$

Page 14: Simple Linear Regression Estimation and Properties

Two interpretations of linearity
• Linear in Variables
– Which of the following is linear in variables and why?
• $E(Y|X_i) = B_1 + B_2 X_i^2$
• $E(Y|X_i) = B_1 + B_2 X_i$
• Linear in Parameters
– Which of the following is linear in parameters and why?
• $E(Y|X_i) = B_1 + B_2 X_i^2$
• $E(Y|X_i) = B_1 + B_2^2 X_i$
• Why Should We Care?
– Linear regression requires linearity in parameters only
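One way to see why linearity in parameters is what matters: a model such as E(Y|X) = B1 + B2·X² is nonlinear in X but can still be fitted with ordinary least squares by treating W = X² as the regressor. The sketch below assumes the standard simple-regression OLS formulas; the helper `ols` and the noise-free data are made up for illustration.

```python
def ols(xs, ys):
    """Simple OLS: returns (intercept, slope) for ys regressed on xs."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    intercept = ybar - slope * xbar
    return intercept, slope

x = [0, 1, 2, 3, 4]
y = [2 + 3 * v ** 2 for v in x]   # made-up data generated from Y = 2 + 3 X^2

w = [v ** 2 for v in x]           # transform the variable, not the parameters
b1, b2 = ols(w, y)
print(b1, b2)
```

A model such as B1 + B2²·X cannot be handled this way, because no transformation of the data makes it linear in the unknown B2.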

Page 15: Simple Linear Regression Estimation and Properties

Straight Line

Y=B1+B2Xi

Page 16: Simple Linear Regression Estimation and Properties

Quadratic

$Y = B_1 + B_2 X + B_3 X^2$

Page 17: Simple Linear Regression Estimation and Properties

Adding in the Stochastic Term

• Yi=E(Y|Xi) + ui

• Systematic Component: E(Y|Xi)
• Stochastic Disturbance: ui

Page 18: Simple Linear Regression Estimation and Properties

The Sample Regression Function (SRF)

• Because of sampling fluctuation, any sample will only approximate our true Population Regression Function

• Stochastic form of the SRF:

Page 19: Simple Linear Regression Estimation and Properties

Primary Goal in Regression Analysis

• We want to estimate the PRF
– Yi = B1 + B2Xi + ui

• On the basis of the SRF

Page 20: Simple Linear Regression Estimation and Properties

One method
• Choose the Sample Regression Function such that the sum of the residuals is as small as possible

Page 21: Simple Linear Regression Estimation and Properties

Illustration and Problem

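[Figure: four observations scattered around a fitted line, Y against X, with residuals u1 = 10, u2 = -2, u3 = 2, u4 = -10.]

The problem: these residuals sum to 10 - 2 + 2 - 10 = 0, so the criterion is fully satisfied even though two of the points lie far from the line. Positive and negative residuals cancel.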

Page 22: Simple Linear Regression Estimation and Properties

Alternative Method
• Ordinary Least Squares (OLS) is a method of finding the linear model which minimizes the sum of the squared errors.
– Example: $(10)^2 + (-2)^2 + (2)^2 + (-10)^2 = 208$
• Under the classical assumptions, the OLS estimator is the best linear unbiased estimator (BLUE)
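The contrast between the two criteria can be checked directly with the four residuals from the illustration (10, -2, 2, -10). This is only a sketch of the arithmetic, not of an estimation routine.

```python
residuals = [10, -2, 2, -10]  # residuals from the illustration slide

# Criterion 1: plain sum of residuals. Large errors of opposite sign cancel.
sum_resid = sum(residuals)

# Criterion 2 (OLS): sum of squared residuals. Every error counts,
# and large errors count more than small ones.
sum_sq = sum(u ** 2 for u in residuals)

print(sum_resid, sum_sq)  # prints: 0 208
```

Squaring removes the sign, so a line that misses badly in both directions can no longer look like a perfect fit.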

Page 23: Simple Linear Regression Estimation and Properties

Good Spot for a break

Page 24: Simple Linear Regression Estimation and Properties

Minimizing the Sum of Squares
• Our goal is to minimize the sum of the squared errors.
• Since we have two unknowns, B1 and B2, we need to take the partial derivatives for the following equation:
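The equation itself did not survive in this transcript. A plausible reconstruction, assuming the usual OLS objective with residual $u_i = Y_i - B_1 - B_2 X_i$ (the name $S$ is introduced here for convenience), is:

```latex
\[
S(B_1, B_2) \;=\; \sum_{i=1}^{n} u_i^2 \;=\; \sum_{i=1}^{n} \left( Y_i - B_1 - B_2 X_i \right)^2
\]
```

The sum is a function of the two unknowns only, since the $X_i$ and $Y_i$ are observed data.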

Page 25: Simple Linear Regression Estimation and Properties

Partial Derivatives for B’s

• We start with our original equation:

• Now we take the partial derivatives
– First equation is the partial derivative with respect to B1
– Second equation is with respect to B2
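The derivative expressions are missing from the transcript. Assuming the objective is the standard sum of squared residuals, written here as $S(B_1, B_2) = \sum (Y_i - B_1 - B_2 X_i)^2$ (the name $S$ is an assumption), the chain rule gives:

```latex
\[
\frac{\partial S}{\partial B_1} = -2 \sum_{i=1}^{n} \left( Y_i - B_1 - B_2 X_i \right)
\]
\[
\frac{\partial S}{\partial B_2} = -2 \sum_{i=1}^{n} X_i \left( Y_i - B_1 - B_2 X_i \right)
\]
```

The extra factor $X_i$ in the second line comes from differentiating $-B_2 X_i$ with respect to $B_2$.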

Page 26: Simple Linear Regression Estimation and Properties

Set Equal to Zero
• Last set of equations:

• Next:

Page 27: Simple Linear Regression Estimation and Properties

The Normal Equations
• Last:
• Divide both equations by –2
• Multiply through
• Separate summation terms and rearrange:
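The resulting equations are not in the transcript. Assuming the first-order conditions are $-2\sum (Y_i - B_1 - B_2 X_i) = 0$ and $-2\sum X_i (Y_i - B_1 - B_2 X_i) = 0$, the three steps listed above lead to the standard normal equations:

```latex
\[
\sum_{i=1}^{n} Y_i = n B_1 + B_2 \sum_{i=1}^{n} X_i
\]
\[
\sum_{i=1}^{n} X_i Y_i = B_1 \sum_{i=1}^{n} X_i + B_2 \sum_{i=1}^{n} X_i^2
\]
```

These are two linear equations in the two unknowns $B_1$ and $B_2$; everything else is computed from the data.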

Page 28: Simple Linear Regression Estimation and Properties

Rewriting the Equation
• Last Equation:

• We can rewrite

Page 29: Simple Linear Regression Estimation and Properties

Solving Equation
• We have two equations with two unknowns, for which we can use algebra
• Multiply first equation by sum of Xi and second by n
• End up with…
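The scaled equations are missing here. Assuming the normal equations in their standard form, $\sum Y_i = n B_1 + B_2 \sum X_i$ and $\sum X_i Y_i = B_1 \sum X_i + B_2 \sum X_i^2$, multiplying the first by $\sum X_i$ and the second by $n$ would give:

```latex
\[
\begin{aligned}
\sum X_i \sum Y_i &= n B_1 \sum X_i + B_2 \left( \sum X_i \right)^2 \\
n \sum X_i Y_i &= n B_1 \sum X_i + n B_2 \sum X_i^2
\end{aligned}
\]
```

Both equations now contain the identical term $n B_1 \sum X_i$, which is what allows $B_1$ to be eliminated.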

Page 30: Simple Linear Regression Estimation and Properties

Subtract first equation from second and rearranging
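The result of this step is not in the transcript. Assuming the two scaled normal equations are $\sum X_i \sum Y_i = n B_1 \sum X_i + B_2 (\sum X_i)^2$ and $n \sum X_i Y_i = n B_1 \sum X_i + n B_2 \sum X_i^2$, subtracting the first from the second cancels the $B_1$ terms:

```latex
\[
n \sum X_i Y_i - \sum X_i \sum Y_i = B_2 \left[ n \sum X_i^2 - \left( \sum X_i \right)^2 \right]
\]
\[
B_2 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left( \sum X_i \right)^2}
\]
```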

Page 31: Simple Linear Regression Estimation and Properties

Last step
• Last equation

• Multiply numerator and denominator by 1/n…recall that

• End up with
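The formulas on this slide were lost. A reconstruction under the assumption that the previous result is $B_2 = \dfrac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - (\sum X_i)^2}$, using $\bar{X} = \frac{1}{n}\sum X_i$ and $\bar{Y} = \frac{1}{n}\sum Y_i$:

```latex
\[
B_2 = \frac{\sum X_i Y_i - n \bar{X} \bar{Y}}{\sum X_i^2 - n \bar{X}^2}
    = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
\]
```

The second equality uses the identities $\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - n \bar{X} \bar{Y}$ and $\sum (X_i - \bar{X})^2 = \sum X_i^2 - n \bar{X}^2$. Dividing numerator and denominator by $n - 1$ shows this is the sample covariance of X and Y over the sample variance of X.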

Page 32: Simple Linear Regression Estimation and Properties

We can now solve for B1

• If we go back to the first normal equation:
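The algebra is missing from the transcript. Assuming the first normal equation is $\sum Y_i = n B_1 + B_2 \sum X_i$, dividing through by $n$ and rearranging gives:

```latex
\[
\bar{Y} = B_1 + B_2 \bar{X}
\quad \Longrightarrow \quad
B_1 = \bar{Y} - B_2 \bar{X}
\]
```

So once the slope is known, the intercept is whatever value makes the line pass through the point of sample means $(\bar{X}, \bar{Y})$.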

Page 33: Simple Linear Regression Estimation and Properties

What Does B2 Mean?

• Equation for B2 may not seem to make intuitive sense at first

• But if we break it down into pieces we can begin to see the logic

Page 34: Simple Linear Regression Estimation and Properties

In sum…

• If the changes in Y are EQUAL to the changes in X, then B2 = 1

• If the changes in Y are LARGER than the changes in X, then B2 > 1

• If the changes in Y are SMALLER than the changes in X, then B2 < 1

Page 35: Simple Linear Regression Estimation and Properties

Let’s Do An Example!

Page 36: Simple Linear Regression Estimation and Properties

Calculating a and b
• Mean of X is 4
• Mean of Y is 12.71429

Page 37: Simple Linear Regression Estimation and Properties

Calculating B1 and B2

Page 38: Simple Linear Regression Estimation and Properties

Which Looks Like…This!

[Figure: "Regression of Y on X". x-axis: 0 to 8; y-axis: 0 to 30.]

Page 39: Simple Linear Regression Estimation and Properties

Practice Problem
• We have a sample of the amount of money each candidate spent in a state (in millions) and the percentage of the vote they received.

• Calculate the regression line and interpret.

Page 40: Simple Linear Regression Estimation and Properties

Data

State   % vote   Money spent (millions)
CA      40       10
FL      35       12
GA      15       4
MO      20       6
OH      40       11
VT      25       8
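One way to work the practice problem is sketched below. It assumes money spent is the explanatory variable X and percentage of the vote is Y, and uses the deviation-from-means formulas for the slope and intercept; the function name `fit_line` is made up for illustration.

```python
def fit_line(xs, ys):
    """Return (b1, b2): OLS intercept and slope for ys regressed on xs."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b2 = sxy / sxx           # slope: co-variation of X and Y over variation in X
    b1 = ybar - b2 * xbar    # intercept: line passes through the sample means
    return b1, b2

# Data from the slide, in the order CA, FL, GA, MO, OH, VT
money = [10, 12, 4, 6, 11, 8]      # X: money spent, in millions
vote = [40, 35, 15, 20, 40, 25]    # Y: percentage of the vote

b1, b2 = fit_line(money, vote)
print(f"vote% = {b1:.3f} + {b2:.3f} * money")
```

With these six observations the slope comes out at about 3.21 and the intercept at about 1.88. Read as a description of this sample, each additional million spent is associated with roughly 3.2 more percentage points of the vote; with only six states, this should not be taken as a causal estimate.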

Page 41: Simple Linear Regression Estimation and Properties

Numerical Properties of OLS
• Those properties that result from the method of OLS
– Expressed from observable quantities of X and Y
– Point Estimator for B’s
– Sample regression line passes through sample means of Y and X
– Sum of residuals is zero
– Residuals are uncorrelated with the predicted Yi
– Residuals uncorrelated with Xi
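These properties hold for any data set fitted by OLS with an intercept, so they can be verified numerically. The sketch below uses the practice-problem data (money spent as X, percentage of the vote as Y, an assumption about which variable is which) and the standard slope and intercept formulas.

```python
money = [10, 12, 4, 6, 11, 8]      # X: money spent, in millions
vote = [40, 35, 15, 20, 40, 25]    # Y: percentage of the vote
n = len(money)

xbar = sum(money) / n
ybar = sum(vote) / n
b2 = (sum((x - xbar) * (y - ybar) for x, y in zip(money, vote))
      / sum((x - xbar) ** 2 for x in money))
b1 = ybar - b2 * xbar

fitted = [b1 + b2 * x for x in money]
resid = [y - f for y, f in zip(vote, fitted)]

# Property: the line passes through the point of sample means.
line_at_xbar = b1 + b2 * xbar

# Property: residuals sum to zero.
sum_resid = sum(resid)

# Properties: residuals are uncorrelated with X and with the fitted values.
# Because the residuals have mean zero, a zero cross-product sum implies
# zero sample covariance.
resid_dot_x = sum(e * x for e, x in zip(resid, money))
resid_dot_fitted = sum(e * f for e, f in zip(resid, fitted))

print(line_at_xbar - ybar, sum_resid, resid_dot_x, resid_dot_fitted)
```

All four printed values are zero up to floating-point rounding. They follow directly from the two normal equations, not from any assumption about the disturbances.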

Page 42: Simple Linear Regression Estimation and Properties

Assumptions of Classical Linear Regression

• A1: Linear Regression Model-Linear in parameters

• A2: X values are fixed in repeated sampling.

• A3: Zero mean value of the disturbance term ui

• A4: Homoskedasticity or Equal Variance of ui.

Page 43: Simple Linear Regression Estimation and Properties

More Assumptions
• A5: No autocorrelation between disturbances

• A6: Zero covariance between ui and Xi

• A7: Number of observations n is greater than the number of parameters to be estimated

• A8: Variability in X values

Page 44: Simple Linear Regression Estimation and Properties

More Assumptions
• A9: Regression model is correctly specified.
– The correct variables are included
– We have the correct functional form
– Correct assumptions about the probability distributions of Yi, Xi and ui.
• A10: With multiple regression, we add the assumption of no perfect multicollinearity

Page 45: Simple Linear Regression Estimation and Properties

How “good” is the fit?

• To measure “reduction in errors” we need a benchmark for comparison.

• The mean of the dependent variable is a relevant and tractable benchmark for comparing predictions.

• The mean of Y represents our “best guess” at the value of Yi absent other information.

Page 46: Simple Linear Regression Estimation and Properties

Sums of Squares

• This gives us the following 'sum-of-squares' measures:

• Total Variation = Explained Variation + Unexplained Variation
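The formulas for the three measures are missing from the transcript. A reconstruction assuming the usual definitions, with $\hat{Y}_i$ the fitted value and $\bar{Y}$ the sample mean (TSS, ESS and USS are the abbreviations used on the next slide):

```latex
\[
\underbrace{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}_{\text{TSS (total)}}
=
\underbrace{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}_{\text{ESS (explained)}}
+
\underbrace{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}_{\text{USS (unexplained)}}
\]
```

The decomposition holds exactly for OLS with an intercept, because the residuals are uncorrelated with the fitted values and so the cross-product term vanishes.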

Page 47: Simple Linear Regression Estimation and Properties

How well does our model perform?

• R squared statistic
– = (TSS - USS)/TSS
– = ESS/TSS
• Bounded between 0 and 1
• Higher values indicate a better fit
• Lower values indicate that more of the variance is left unexplained
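The R squared calculation can be sketched as follows, using the practice-problem data (money spent as X, percentage of the vote as Y, which is an assumption about the intended setup) and the usual definitions of the total, explained and unexplained sums of squares.

```python
money = [10, 12, 4, 6, 11, 8]      # X: money spent, in millions
vote = [40, 35, 15, 20, 40, 25]    # Y: percentage of the vote
n = len(money)

xbar = sum(money) / n
ybar = sum(vote) / n
b2 = (sum((x - xbar) * (y - ybar) for x, y in zip(money, vote))
      / sum((x - xbar) ** 2 for x in money))
b1 = ybar - b2 * xbar
fitted = [b1 + b2 * x for x in money]

tss = sum((y - ybar) ** 2 for y in vote)                 # total variation
ess = sum((f - ybar) ** 2 for f in fitted)               # explained variation
uss = sum((y - f) ** 2 for y, f in zip(vote, fitted))    # unexplained variation

r2 = ess / tss
r2_alt = (tss - uss) / tss   # same value, computed from the residuals

print(round(r2, 4), round(r2_alt, 4))
```

The benchmark is the mean of Y: TSS is the squared error made by always guessing the mean, and R squared is the share of that error removed by using the regression line instead.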