Ordinary Least Squares Estimation


EC114 Introduction to Quantitative Economics
12. Ordinary Least Squares Estimation

Marcus Chambers

Department of Economics
University of Essex

24/26 January 2012

Ordinary Least Squares (OLS) Estimation

Recall that the population regression line is given by

$$E(Y) = \alpha + \beta X$$

and the sample regression line is given by

$$\hat{Y} = a + bX,$$

where $a$ and $b$ can be regarded as estimates of $\alpha$ and $\beta$.

Another way to think of these relationships is in terms of $Y$ itself:

$$Y = \alpha + \beta X + \epsilon, \qquad Y = a + bX + e,$$

where $\epsilon$ is the disturbance and $e$ is the residual.


It is clear that, if we vary the sample regression line in some way, we will obtain a different set of residuals.

In other words, if we vary the estimation method for the sample regression line, we will obtain a different set of residuals.


The best known method of fitting a straight line to a scatter diagram is Ordinary Least Squares (OLS).

The sample regression line is determined by the intercept $a$ and the slope $b$.

A good criterion in the choice of $a$ and $b$ is to make the residuals ‘small’ somehow.

Small residuals imply that the differences between the actual $Y$ and the fitted $\hat{Y}$ are small.

The OLS method of estimation chooses $a$ and $b$ in order to minimise the sum of the squares of the residuals:

$$\sum_{i=1}^{n} e_i^2 = e_1^2 + e_2^2 + \ldots + e_n^2.$$
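The criterion can be illustrated with a minimal sketch. The five data points below are invented for illustration (they are not the money-demand data), and the function name `sum_squared_residuals` is an assumption of this sketch. It evaluates the sum of squared residuals for three candidate lines, one of which is the OLS line for these toy data.

```python
# Toy data invented for illustration (not the money-demand data).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]


def sum_squared_residuals(a, b):
    """Sum of squared residuals e_i = Y_i - a - b*X_i for the line a + b*X."""
    return sum((y - a - b * x) ** 2 for x, y in zip(X, Y))


# Three candidate (intercept, slope) pairs; (0.05, 1.99) is the OLS line
# for these toy data, the other two are arbitrary alternatives.
candidates = {
    "OLS": (0.05, 1.99),
    "through origin": (0.0, 2.0),
    "flatter": (1.0, 1.5),
}
ssr = {name: sum_squared_residuals(a, b) for name, (a, b) in candidates.items()}

for name, value in ssr.items():
    print(f"{name:>15}: sum of squared residuals = {value:.3f}")
```

Each candidate line produces its own set of residuals; among these three, the OLS line gives the smallest sum of squares.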


We know that $e_i = Y_i - a - bX_i$, so that the sum of squared residuals can be written

$$S = \sum_i e_i^2 = \sum_i (Y_i - a - bX_i)^2,$$

which is a function of $a$ and $b$ alone because $Y_i$ and $X_i$ are the given data points.

The objective is to minimise $S$ with respect to $a$ and $b$.

To do this we need to partially differentiate $S$ with respect to $a$ and $b$ and set these derivatives equal to zero:

$$\frac{\partial S}{\partial a} = -2 \sum_i (Y_i - a - bX_i) = 0,$$

$$\frac{\partial S}{\partial b} = -2 \sum_i X_i (Y_i - a - bX_i) = 0.$$


As these derivatives are set equal to zero we can divide both sides by $-2$ and rearrange the terms to give:

$$\sum_i Y_i = na + b \sum_i X_i,$$

$$\sum_i X_i Y_i = a \sum_i X_i + b \sum_i X_i^2,$$

noting that $\sum_i a = na$.

These are known as the normal equations (but are not related to the normal distribution).

Note that, because $Y_i - a - bX_i = e_i$, we can also write the first-order conditions in the form:

$$\frac{\partial S}{\partial a} = -2 \sum_i e_i = 0,$$

$$\frac{\partial S}{\partial b} = -2 \sum_i X_i e_i = 0.$$
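The normal equations form a 2×2 linear system in $a$ and $b$. The sketch below solves it by Cramer's rule on invented toy data (not the money-demand data) and then checks that the residuals satisfy both first-order conditions. The variable names are assumptions of this sketch.

```python
# Toy data invented for illustration (not the money-demand data).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)

sum_x = sum(X)
sum_y = sum(Y)
sum_xx = sum(x * x for x in X)
sum_xy = sum(x * y for x, y in zip(X, Y))

# Normal equations:
#   sum_y  = n * a     + b * sum_x
#   sum_xy = a * sum_x + b * sum_xx
# Solve the 2x2 system by Cramer's rule.
det = n * sum_xx - sum_x ** 2
a = (sum_y * sum_xx - sum_x * sum_xy) / det
b = (n * sum_xy - sum_x * sum_y) / det

# The residuals should satisfy both first-order conditions
# (up to floating-point rounding).
residuals = [y - a - b * x for x, y in zip(X, Y)]
sum_e = sum(residuals)
sum_xe = sum(x * e for x, e in zip(X, residuals))

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"sum of e_i = {sum_e:.2e}, sum of X_i*e_i = {sum_xe:.2e}")
```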



We therefore have two equations in two unknowns which we can solve for $a$ and $b$.

The extension question on Problem Set 12 deals with this solution.

A compact representation of the solution is:

$$b = \frac{\sum_i x_i y_i}{\sum_i x_i^2}, \qquad a = \bar{Y} - b\bar{X},$$

where $x_i = X_i - \bar{X}$, $y_i = Y_i - \bar{Y}$, and $\bar{X}$ and $\bar{Y}$ are the sample means of $X$ and $Y$ respectively.

The above expressions for $a$ and $b$ are the OLS estimators of $\alpha$ and $\beta$.
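One possible route from the normal equations to the compact solution is sketched below; the Problem Set may well take a different one. Dividing the first normal equation by $n$ gives $a$, substituting into the second isolates $b$, and the last step uses $\sum_i X_i = n\bar{X}$ together with the identities $\sum_i x_i y_i = \sum_i X_i Y_i - n\bar{X}\bar{Y}$ and $\sum_i x_i^2 = \sum_i X_i^2 - n\bar{X}^2$.

```latex
\begin{aligned}
\bar{Y} &= a + b\bar{X}
  \quad\Longrightarrow\quad a = \bar{Y} - b\bar{X}, \\
\sum_i X_i Y_i &= (\bar{Y} - b\bar{X}) \sum_i X_i + b \sum_i X_i^2
  = n\bar{X}\bar{Y} + b\Bigl(\sum_i X_i^2 - n\bar{X}^2\Bigr), \\
b &= \frac{\sum_i X_i Y_i - n\bar{X}\bar{Y}}{\sum_i X_i^2 - n\bar{X}^2}
  = \frac{\sum_i x_i y_i}{\sum_i x_i^2}.
\end{aligned}
```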



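We can compute $a$ and $b$ from various sample sums, making use of the following:

$$\sum_i x_i^2 = \sum_i (X_i - \bar{X})^2 = \sum_i X_i^2 - \frac{\left(\sum_i X_i\right)^2}{n} = \sum_i X_i^2 - n\bar{X}^2,$$

$$\sum_i x_i y_i = \sum_i (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_i X_i Y_i - \frac{\sum_i X_i \sum_i Y_i}{n} = \sum_i X_i Y_i - n\bar{X}\bar{Y}.$$

In view of this another common expression for $b$ is:

$$b = \frac{\sum_i X_i Y_i - \dfrac{\sum_i X_i \sum_i Y_i}{n}}{\sum_i X_i^2 - \dfrac{\left(\sum_i X_i\right)^2}{n}}.$$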
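As a quick numerical check, the sketch below computes the two sums of deviations in all three equivalent forms on invented toy data (not the money-demand data). The variable names are assumptions of this sketch.

```python
# Toy data invented for illustration (not the money-demand data).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

sum_xx = sum(x * x for x in X)
sum_xy = sum(x * y for x, y in zip(X, Y))

# Form 1: sums of deviations from the sample means.
sxx_dev = sum((x - x_bar) ** 2 for x in X)
sxy_dev = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Form 2: raw sums with the (sum)^2 / n correction.
sxx_raw = sum_xx - sum(X) ** 2 / n
sxy_raw = sum_xy - sum(X) * sum(Y) / n

# Form 3: raw sums with the n * mean correction.
sxx_mean = sum_xx - n * x_bar ** 2
sxy_mean = sum_xy - n * x_bar * y_bar

b = sxy_raw / sxx_raw
print(f"sum x^2: {sxx_dev:.4f} {sxx_raw:.4f} {sxx_mean:.4f}")
print(f"sum xy : {sxy_dev:.4f} {sxy_raw:.4f} {sxy_mean:.4f}")
print(f"b = {b:.4f}")
```

The raw-sum forms are convenient for hand calculation, but in floating-point arithmetic they subtract two large, nearly equal numbers when the data have a large mean relative to their spread, so the deviation form is usually the safer choice in software.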



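The data on money stock ($Y$) and GDP ($X$) in Table 9.1 of Thomas yield:

$$\sum_i X_i = 132.004, \quad \sum_i X_i^2 = 1247.66, \quad \sum_i X_i Y_i = 220.956,$$

$$\sum_i Y_i = 23.718, \quad \sum_i Y_i^2 = 45.154, \quad n = 30.$$

Based on these quantities we obtain:

$$\sum_i x_i^2 = \sum_i X_i^2 - \frac{\left(\sum_i X_i\right)^2}{n} = 1247.66 - \frac{132.004^2}{30} = 666.86,$$

$$\sum_i x_i y_i = \sum_i X_i Y_i - \frac{\sum_i X_i \sum_i Y_i}{n} = 220.956 - \frac{132.004 \times 23.718}{30} = 116.60.$$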



The slope coefficient is therefore

$$b = \frac{\sum_i x_i y_i}{\sum_i x_i^2} = \frac{116.60}{666.86} = 0.1748.$$

We also find that

$$\bar{X} = \frac{132.004}{30} = 4.40, \qquad \bar{Y} = \frac{23.718}{30} = 0.7906,$$

and so the intercept is

$$a = \bar{Y} - b\bar{X} = 0.7906 - (0.1748 \times 4.40) = 0.0212.$$

The sample regression line is therefore

$$\hat{Y} = 0.021 + 0.175X.$$
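The hand calculation can be reproduced from the published sample sums alone. The sketch below takes the sums quoted from Table 9.1 as given; the variable names are assumptions of this sketch. Because those sums are rounded, the intermediate sums of deviations differ from the figures quoted above in the second decimal place, but the estimates agree with the slides to the precision shown.

```python
# Sample sums as quoted from Table 9.1 of Thomas (rounded as published).
n = 30
sum_x = 132.004
sum_xx = 1247.66
sum_xy = 220.956
sum_y = 23.718

# Sums of deviations from the raw sums.
sxx = sum_xx - sum_x ** 2 / n        # sum of x_i^2
sxy = sum_xy - sum_x * sum_y / n     # sum of x_i * y_i

# OLS slope and intercept.
b = sxy / sxx
x_bar = sum_x / n
y_bar = sum_y / n
a = y_bar - b * x_bar

print(f"sum x^2 = {sxx:.2f}, sum xy = {sxy:.2f}")
print(f"b = {b:.4f}, a = {a:.4f}")
print(f"fitted line: Y-hat = {a:.3f} + {b:.3f} X")
```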



Note that the residuals (the vertical distances between the data points and the line) are larger for the countries with larger GDP ($X$).




Note, too, that the sample regression line passes through the point $(\bar{X}, \bar{Y})$.

The reason can be seen by rearranging the equation for $a$, which gives $\bar{Y} = a + b\bar{X}$.

The point $(\bar{X}, \bar{Y})$ is known as the point of sample means.

In our example this point is $(4.40, 0.791)$.

Note that the intercept $a > 0$, although its value is small.

We had expected a relationship of the form $Y = bX$, suggesting $a = 0$.

Although $a$ is small we will want to know whether it is significantly different from zero; we shall consider testing the hypothesis that $a = 0$ at a later point.




The value $b = 0.175$ means that the demand for money per head will increase by \$175 whenever GDP per head increases by \$1000.

But a more interesting quantity is the income (GDP) elasticity of the demand for money.

We can use the previous results to compute an estimate of it; the required elasticity is given by the formula

$$\eta = \frac{dY}{dX} \cdot \frac{X}{Y}.$$

The elasticity varies along our sample regression line because the values of $X$ and $Y$ vary along the line.

It is common, however, to evaluate $\eta$ at the sample means of $X$ and $Y$, while $dY/dX$ can be estimated by $b$.



In our case the elasticity evaluated at the sample means is

$$\eta = 0.175 \times \frac{4.40}{0.791} = 0.973.$$

Thus we obtain a GDP elasticity close to unity.

A 1% rise in GDP per head leads to a 0.97% rise in demand for money per head.

It would be of interest to test the hypothesis that $\eta = 1$ and we will examine how to do this later on in the term.
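The elasticity calculation, and the way the elasticity changes along the fitted line, can be sketched as follows. The rounded estimates and sample means are taken from the slides; the function name `elasticity_at` and the two evaluation points (X = 1 and X = 10) are assumptions chosen for illustration.

```python
# Rounded estimates and sample means as reported in the slides.
a, b = 0.021, 0.175
x_bar, y_bar = 4.40, 0.791


def elasticity_at(x):
    """Point elasticity (dY/dX) * X / Y on the fitted line Y = a + b*X."""
    return b * x / (a + b * x)


# Elasticity evaluated at the point of sample means.
eta_at_means = b * x_bar / y_bar

print(f"elasticity at the sample means: {eta_at_means:.3f}")
for x in (1.0, 10.0):
    print(f"elasticity at X = {x:>4}: {elasticity_at(x):.3f}")
```

Because the fitted line passes through the point of sample means, `elasticity_at(x_bar)` agrees with `eta_at_means` up to rounding. With a positive intercept the elasticity is below one everywhere on the line and rises towards one as X grows.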


Goodness-of-fit

So far we have fitted the sample regression line

$$\hat{Y} = 0.021 + 0.175X$$

to our scatter of points in the money-income example.

The values of the intercept (0.021) and slope (0.175) were obtained by the method of ordinary least squares (OLS), which chooses these values so as to minimise the sum of squared residuals:

$$\sum_i e_i^2 = \sum_i (Y_i - a - bX_i)^2.$$

But we might want to ask the question: how well does our sample regression line fit the data?

Let’s begin by taking a look at the graph:



We can observe that the sample regression line passes ‘fairly close’ to each point in the scatter, although with greater dispersion for larger values of $X$ (GDP per head).

We need, however, to be more precise about this; in other words we need some sort of numerical measure of goodness of fit.



We will use the coefficient of determination, $R^2$, which is equal to the square of the correlation coefficient $R$, where

$$R = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}.$$

We know that $-1 \le R \le 1$, and so it follows that $0 \le R^2 \le 1$.

In our example of the demand for money we found that $R = 0.8787$.

Hence the coefficient of determination must be $R^2 = (0.8787)^2 = 0.772$.

In regression analysis it is possible to give a precise interpretation to the value 0.772 obtained for $R^2$.
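The correlation formula can be sketched on invented toy data (not the money-demand data), with the slide's own figure checked at the end. The variable names are assumptions of this sketch.

```python
import math

# Toy data invented for illustration (not the money-demand data).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Sums of squared deviations and cross-products.
sxx = sum((x - x_bar) ** 2 for x in X)
syy = sum((y - y_bar) ** 2 for y in Y)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Correlation coefficient and its square.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# The figure reported in the slides for the money-demand example.
r_money = 0.8787
r_squared_money = r_money ** 2

print(f"toy data: R = {r:.4f}, R^2 = {r_squared:.4f}")
print(f"money demand: R^2 = {r_squared_money:.3f}")
```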



Suppose we ask the question:

What proportion of the variation in the demand for money in our 30 countries can be attributed to the variation in GDP?

If our sample regression line is able to explain a high proportion of the variation in the demand for money then it must provide a good fit to the data.

Consider the next Figure, which refers to a single sample point, namely France, which is observation $i = 8$.

We have $Y_8 = 2.3912$; $\hat{Y}_8 = 1.6776$; and the overall sample mean is $\bar{Y} = 0.7906$.



The diagram shows the fitted line, the sample mean line, the residual $e_8 = Y_8 - \hat{Y}_8$, as well as the deviations $Y_8 - \bar{Y}$ and $\hat{Y}_8 - \bar{Y}$.



The variations in demand for money are measured relative to the mean.

The following relationship holds:

total variation = variation due to X + residual variation

$$Y_8 - \bar{Y} = (\hat{Y}_8 - \bar{Y}) + e_8$$

$$1.6006 = 0.8870 + 0.7136$$

Such a relationship holds for all points in the sample, so that we can write

$$(Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + e_i, \qquad i = 1, \ldots, n.$$



Note that these variations can be positive or negative, and that they only apply to a single point in the sample.

However, we require an overall measure for the entire sample, and when we talk about variation we usually have in mind a positive measure.

A measure of variation of $Y$ taken over the entire sample is the total sum of squares (SST):

$$\text{SST} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2.$$

This is the total variation in $Y$ that we attempt to explain by our regression line, and is always non-negative.

We have seen this sort of quantity before; dividing by $n - 1$ gives the sample variance.



A sample-wide measure of the variation in $Y$ due to $X$ is given by the explained sum of squares (SSE):

$$\text{SSE} = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2.$$

This quantity is also non-negative.

Finally, a measure of the total residual variation is the residual sum of squares (SSR):

$$\text{SSR} = \sum_{i=1}^{n} e_i^2,$$

which is also non-negative.



The following relationship holds:

total sum of squares = explained sum of squares + residual sum of squares

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} e_i^2$$

$$\text{SST} = \text{SSE} + \text{SSR}$$

The extension question on Problem Set 12 deals with this identity.

These quantities are used to define the coefficient of determination, $R^2$, as follows:

$$R^2 = \frac{\text{variation in } Y \text{ due to } X}{\text{total variation in } Y} = \frac{\text{SSE}}{\text{SST}}.$$
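The decomposition can be checked numerically with a minimal sketch on invented toy data (not the money-demand data). It fits the OLS line, forms the three sums of squares and confirms that they add up. The variable names are assumptions of this sketch.

```python
# Toy data invented for illustration (not the money-demand data).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# OLS estimates from the deviation-form expressions.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
a = y_bar - b * x_bar

# Fitted values and residuals.
fitted = [a + b * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

# The three sums of squares.
sst = sum((y - y_bar) ** 2 for y in Y)        # total
sse = sum((f - y_bar) ** 2 for f in fitted)   # explained
ssr = sum(e ** 2 for e in residuals)          # residual

r_squared = sse / sst

print(f"SST = {sst:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"SSE + SSR = {sse + ssr:.3f}, R^2 = {r_squared:.4f}")
```

The identity relies on the OLS first-order conditions, so it holds for the OLS line; for an arbitrary line the cross-product term between the explained and residual parts need not vanish.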



Alternative (but equivalent) expressions for $R^2$ include

$$R^2 = 1 - \frac{\text{SSR}}{\text{SST}},$$

obtained by making the substitution SSE = SST − SSR.

Another expression is

$$R^2 = b^2 \frac{\sum_i x_i^2}{\sum_i y_i^2},$$

where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$.

The derivation of this last expression requires showing that $\text{SSE} = b^2 \sum_i x_i^2$ and noting that $\text{SST} = \sum_i y_i^2$ (see the extension question on Problem Set 12).



In our demand for money example we have already shown that $R^2 = 0.772$ by squaring the correlation coefficient $R = 0.8787$.

However, we know that

$$b = 0.17485, \quad \sum_i x_i^2 = 666.86, \quad \sum_i y_i^2 = 26.403,$$

and hence an alternative derivation is

$$R^2 = b^2 \frac{\sum_i x_i^2}{\sum_i y_i^2} = \frac{0.17485^2 \times 666.86}{26.403} = 0.772.$$

This implies that just over 77% of the variation in the demand for money can be attributed to the variation in GDP.

Remember: $0 \le R^2 \le 1$.
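The same arithmetic in a short sketch, using only the three figures quoted above. It also recovers the implied explained and residual sums of squares, which come out close to the Model and Residual entries in the Stata output shown later in these notes. The variable names are assumptions of this sketch.

```python
# Figures as quoted in the slides.
b = 0.17485
sum_x2 = 666.86   # sum of squared deviations of X
sum_y2 = 26.403   # sum of squared deviations of Y, i.e. SST

# Explained and residual sums of squares implied by these figures.
sse = b ** 2 * sum_x2
ssr = sum_y2 - sse

# Two equivalent expressions for the coefficient of determination.
r_squared = sse / sum_y2
r_squared_alt = 1 - ssr / sum_y2

print(f"SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"R^2 = {r_squared:.3f}")
```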



In (a) and (c) $R^2 = 1$ because all points lie on a single sample line; in (a) $R = +1$ and in (c) $R = -1$.

In (b) $R = 0$ due to the lack of association between the two variables and hence $R^2 = 0$.



The correlation coefficient, $R$, is a measure of the strength of association between two variables, and says nothing about the direction of causation (if any exists).

The coefficient of determination, $R^2$, however, is based on the regression model $Y = \alpha + \beta X + \epsilon$ in which the causation is assumed to go from $X$ to $Y$.

However, we should be careful to refer to $R^2$ as the proportion of the variation in $Y$ attributed to $X$ rather than explained by $X$, because any such relationship could be spurious.


Computing OLS Estimates

In practice we use computer software for OLS calculations.

As an example, the Stata output for the money demand example is of the form:

. regress m g

      Source |       SS       df       MS              Number of obs =      30
-------------+------------------------------           F(  1,    28) =   94.88
       Model |  20.3862321     1  20.3862321           Prob > F      =  0.0000
    Residual |  6.01600434    28  .214857298           R-squared     =  0.7721
-------------+------------------------------           Adj R-squared =  0.7640
       Total |  26.4022364    29  .910421946           Root MSE      =  .46353

------------------------------------------------------------------------------
           m |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           g |   .1748489   .0179502     9.74   0.000     .1380795    .2116182
       _cons |   .0212579   .1157594     0.18   0.856    -.2158645    .2583803
------------------------------------------------------------------------------

Quite a lot of information is provided by default, but note that the estimates are given at the start of the final two rows (under the heading ‘Coef.’): the slope $b$ in the row for g and the intercept $a$ in the row for _cons.
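Several of the summary statistics in this output follow directly from the sums of squares and the coefficient table. The sketch below recomputes them from the printed figures, as a way of reading the table rather than a description of how Stata works internally. The formulas used are the standard ones for a regression with one explanatory variable, and the variable names are assumptions of this sketch.

```python
import math

# Figures copied from the Stata output above.
n = 30
ss_model, df_model = 20.3862321, 1
ss_resid, df_resid = 6.01600434, 28
ss_total = ss_model + ss_resid          # reported as 26.4022364

coef_g, se_g = 0.1748489, 0.0179502
coef_cons, se_cons = 0.0212579, 0.1157594

# Goodness-of-fit statistics.
r_squared = ss_model / ss_total
adj_r_squared = 1 - (ss_resid / df_resid) / (ss_total / (n - 1))

# F statistic: ratio of the model and residual mean squares.
f_stat = (ss_model / df_model) / (ss_resid / df_resid)

# Root MSE: square root of the residual mean square.
root_mse = math.sqrt(ss_resid / df_resid)

# t ratios: coefficient divided by its standard error.
t_g = coef_g / se_g
t_cons = coef_cons / se_cons

print(f"R-squared = {r_squared:.4f}, Adj R-squared = {adj_r_squared:.4f}")
print(f"F = {f_stat:.2f}, Root MSE = {root_mse:.5f}")
print(f"t(g) = {t_g:.2f}, t(_cons) = {t_cons:.2f}")
```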


Summary


Ordinary Least Squares (OLS) Estimation

Goodness-of-fit

Next week:

Non-Linear Models

