11 Multiple Regression
Applied Statistics 2 for the students of the
“Executive Program in Business Analytics and Business Intelligence”
Organized by IIM Ranchi
Edited by: Dr. K. Maddulety, NITIE, Mumbai. Mail: [email protected]
• Using Statistics
• The k-Variable Multiple Regression Model
• The F Test of a Multiple Regression Model
• How Good is the Regression
• Tests of the Significance of Individual Regression Parameters
• Testing the Validity of the Regression Model
• Using the Multiple Regression Model for Prediction

Multiple Regression (1)

• Qualitative Independent Variables
• Polynomial Regression
• Nonlinear Models and Transformations
• Multicollinearity
• Residual Autocorrelation and the Durbin-Watson Test
• Partial F Tests and Variable Selection Methods
• The Matrix Approach to Multiple Regression Analysis
• Summary and Review of Terms

Multiple Regression (2)
[Figure: Lines and Planes. A line in two dimensions and a plane in three dimensions.]

Any two points (A and B), or an intercept and slope (β0 and β1), define a line on a two-dimensional surface.

Any three points (A, B, and C), or an intercept and the coefficients of x1 and x2 (β0, β1, and β2), define a plane in three-dimensional space.
11-1 Using Statistics
The population regression model of a dependent variable, Y, on a set of k independent variables, X1, X2, ..., Xk is given by:

Y = β0 + β1X1 + β2X2 + ... + βkXk + ε

where β0 is the Y-intercept of the regression surface and each βi, i = 1, 2, ..., k, is the slope of the regression surface (sometimes called the response surface) with respect to Xi.

Model assumptions:
1. ε ~ N(0, σ²), independent of other errors.
2. The variables Xi are uncorrelated with the error term.
11-2 The k-Variable Multiple Regression Model
In a simple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression line.

In a multiple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression plane.

[Figure: a fitted line ŷ = b0 + b1x in the (X, Y) plane, and a fitted plane ŷ = b0 + b1x1 + b2x2 in (x1, x2, y) space]
Simple and Multiple Least-Squares Regression
The estimated regression relationship:

Ŷ = b0 + b1X1 + b2X2 + ... + bkXk

where Ŷ is the predicted value of Y, the value lying on the estimated regression surface. The terms b0, b1, ..., bk are the least-squares estimates of the population regression parameters βi.

The actual, observed value of Y is the predicted value plus an error:

yj = b0 + b1x1j + b2x2j + ... + bkxkj + ej
The Estimated Regression Relationship
Minimizing the sum of squared errors with respect to the estimated coefficients b0, b1, and b2 yields the following normal equations:

Σy = nb0 + b1Σx1 + b2Σx2
Σx1y = b0Σx1 + b1Σx1² + b2Σx1x2
Σx2y = b0Σx2 + b1Σx1x2 + b2Σx2²
Least-Squares Estimation: The 2-Variable Normal Equations
 Y    X1   X2   X1X2   X1²   X2²    X1Y   X2Y
 72   12    5    60    144    25    864   360
 76   11    8    88    121    64    836   608
 78   15    6    90    225    36   1170   468
 70   10    5    50    100    25    700   350
 68   11    3    33    121     9    748   204
 80   16    9   144    256    81   1280   720
 82   14   12   168    196   144   1148   984
 65    8    4    32     64    16    520   260
 62    8    3    24     64     9    496   186
 90   18   10   180    324   100   1620   900
---  ---  ---   ---   ----   ---   ----  ----
743  123   65   869   1615   509   9382  5040

Normal Equations:
743 = 10b0 + 123b1 + 65b2
9382 = 123b0 + 1615b1 + 869b2
5040 = 65b0 + 869b1 + 509b2

Solution:
b0 = 47.164942, b1 = 1.5990404, b2 = 1.1487479

Estimated regression equation:

Ŷ = 47.164942 + 1.5990404X1 + 1.1487479X2
Example 11-1
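Since the normal equations above form a 3×3 linear system, the slide's coefficients can be reproduced with any linear-algebra routine. A minimal sketch in Python (numpy assumed; not part of the original slides):

```python
import numpy as np

# Coefficient matrix and right-hand side copied from the normal equations above.
A = np.array([[10.0,  123.0,  65.0],
              [123.0, 1615.0, 869.0],
              [65.0,  869.0,  509.0]])
rhs = np.array([743.0, 9382.0, 5040.0])

b = np.linalg.solve(A, rhs)
print(b)  # approximately [47.164942, 1.5990404, 1.1487479]
```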
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
Total Deviation = Regression Deviation + Error Deviation
SST = SSR + SSE

[Figure: decomposition of the total deviation of one point about the mean of Y]
Y − Ȳ : Total deviation
Ŷ − Ȳ : Regression deviation
Y − Ŷ : Error deviation
Decomposition of the Total Deviation in a Multiple Regression Model
A statistical test for the existence of a linear relationship between Y and any or all of the independent variables X1, X2, ..., Xk:

H0: β1 = β2 = ... = βk = 0
H1: Not all the βi (i = 1, 2, ..., k) are 0

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square               F Ratio
Regression            SSR              k                    MSR = SSR/k               F = MSR/MSE
Error                 SSE              n − (k + 1)          MSE = SSE/(n − (k + 1))
Total                 SST              n − 1                MST = SST/(n − 1)
11-3 The F Test of a Multiple Regression Model
The test statistic, F = 86.34, is greater than the critical point of F(2, 7) for any common level of significance (p-value ≈ 0), so the null hypothesis is rejected, and we conclude that the dependent variable is related to one or more of the independent variables.

[Figure: F distribution with 2 and 7 degrees of freedom. At α = 0.01 the critical point is F0.01 = 9.55; the test statistic 86.34 lies far beyond it in the rejection region.]
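The p-value and critical point quoted above can be checked directly. A quick sketch (scipy assumed; not from the original slides), with k = 2 predictors and n = 10 observations:

```python
from scipy import stats

F, k, n = 86.34, 2, 10
df1, df2 = k, n - (k + 1)           # (2, 7) degrees of freedom
print(stats.f.sf(F, df1, df2))      # upper-tail p-value, ~1e-5, effectively 0
print(stats.f.ppf(0.99, df1, df2))  # critical point at alpha = 0.01, ~9.55
```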
Using the Template: Analysis of Variance Table (Example 11-1)
The multiple coefficient of determination, R², measures the proportion of the variation in the dependent variable that is explained by the combination of the independent variables in the multiple regression model:

R² = SSR/SST = 1 − SSE/SST

The mean square error is an unbiased estimator of the variance of the population errors, denoted by σ²:

MSE = SSE/(n − (k + 1)) = Σ(y − ŷ)²/(n − (k + 1))

The standard error of estimate:

s = √MSE

[Figure: errors y − ŷ about the estimated regression plane in (x1, x2, y) space]
11-4 How Good is the Regression
The adjusted multiple coefficient of determination, R̄², is the coefficient of determination with the SSE and SST divided by their respective degrees of freedom:

R̄² = 1 − [SSE/(n − (k + 1))] / [SST/(n − 1)]

[Figure: SST partitioned into SSR and SSE, with R² = SSR/SST = 1 − SSE/SST]

Example 11-1: s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%
Decomposition of the Sum of Squares and the Adjusted Coefficient of Determination
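These goodness-of-fit measures follow directly from SSE and SST. A minimal sketch (numpy assumed); the SSE and SST values below are backed out from the slide's s and R-sq for Example 11-1, so treat them as approximate:

```python
import numpy as np

def fit_measures(sse, sst, n, k):
    r2 = 1.0 - sse / sst                                     # R-sq
    r2_adj = 1.0 - (sse / (n - (k + 1))) / (sst / (n - 1))   # R-sq(adj)
    s = np.sqrt(sse / (n - (k + 1)))                         # standard error of estimate
    return r2, r2_adj, s

print(fit_measures(sse=25.56, sst=656.1, n=10, k=2))
# approximately (0.961, 0.950, 1.911), matching Example 11-1
```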
Source of Variation   Sum of Squares   Degrees of Freedom        Mean Square               F Ratio
Regression            SSR              k                         MSR = SSR/k               F = MSR/MSE
Error                 SSE              n − (k + 1) = n − k − 1   MSE = SSE/(n − (k + 1))
Total                 SST              n − 1                     MST = SST/(n − 1)

R² = SSR/SST = 1 − SSE/SST

R̄² = 1 − [SSE/(n − (k + 1))] / [SST/(n − 1)] = 1 − MSE/MST

F = MSR/MSE = [R²/(1 − R²)] · [(n − (k + 1))/k]
Measures of Performance in Multiple Regression and the ANOVA Table
Hypothesis tests about individual regression slope parameters:

(1) H0: β1 = 0   H1: β1 ≠ 0
(2) H0: β2 = 0   H1: β2 ≠ 0
. . .
(k) H0: βk = 0   H1: βk ≠ 0

Test statistic for test i:

t(n − (k + 1)) = (bi − 0) / s(bi)
11-5 Tests of the Significance of Individual Regression Parameters
Variable   Coefficient Estimate   Standard Error   t-Statistic
Constant   53.12                  5.43             9.783 *
X1         2.03                   0.22             9.227 *
X2         5.60                   1.30             4.308 *
X3         10.35                  6.88             1.504
X4         3.45                   2.70             1.259
X5         −4.25                  0.38             −11.184 *

n = 150, t0.025 = 1.96
Regression Results for Individual Parameters
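Each t-statistic in the table is just the coefficient estimate divided by its standard error, compared against t0.025 = 1.96. A minimal sketch (numpy assumed; values reproduce the table up to the slide's rounding):

```python
import numpy as np

coefs = np.array([53.12, 2.03, 5.60, 10.35, 3.45, -4.25])
ses   = np.array([5.43, 0.22, 1.30, 6.88, 2.70, 0.38])

t_stats = coefs / ses                 # t = (b_i - 0) / s(b_i)
reject = np.abs(t_stats) > 1.96       # reject H0: beta_i = 0 at alpha = 0.05
for t, r in zip(t_stats, reject):
    print(f"{t:8.3f} {'*' if r else ''}")
```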
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
Using the Template: Example 11-2
Regression results for Exports to Singapore
11-6 Testing the Validity of the Regression Model: Residual Plots
[Residual plot: residuals vs. M1]
It appears that the residuals are randomly distributed, with no pattern and with equal variance, as M1 increases.
11-6 Testing the Validity of the Regression Model: Residual Plots
[Residual plot: residuals vs. Price]
It appears that the residuals increase as Price increases; the variance of the residuals is not constant.
Normal Probability Plot for the Residuals: Example 11-2
A linear trend indicates that the residuals are normally distributed.
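The two diagnostics described above, a residuals-versus-predictor plot and a normal probability plot, can be drawn as follows. A sketch (matplotlib/scipy assumed) with placeholder data standing in for the Example 11-2 residuals:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
m1 = rng.uniform(4, 9, 67)               # stand-in for the M1 predictor
residuals = rng.normal(0, 0.33, 67)      # stand-in for regression residuals

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.scatter(m1, residuals)
ax1.axhline(0.0, linestyle="--")
ax1.set_xlabel("M1"); ax1.set_ylabel("Residual")

stats.probplot(residuals, dist="norm", plot=ax2)  # linear trend => ~normal residuals
plt.show()
```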
[Figure: Outliers. An outlier (*) lies far from the rest of the data; the regression line fitted with the outlier is pulled away from the regression line fitted without it.]

[Figure: Influential observations. A point (*) with a large value of xi, far from a cluster of points showing no relationship on its own, can dominate the slope of the regression line when all data are included.]
Investigating the Validity of the Regression: Outliers and Influential Observations
Unusual Observations
Obs.    M1    EXPORTS   Fit      Stdev.Fit   Residual   St.Resid
  1    5.10   2.6000    2.6420   0.1288      -0.0420    -0.14 X
  2    4.90   2.6000    2.6438   0.1234      -0.0438    -0.14 X
 25    6.20   5.5000    4.5949   0.0676       0.9051     2.80R
 26    6.30   3.7000    4.6311   0.0651      -0.9311    -2.87R
 50    8.30   4.3000    5.1317   0.0648      -0.8317    -2.57R
 67    8.20   5.6000    4.9474   0.0668       0.6526     2.02R

R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it large influence.
Outliers and Influential Observations: Example 11-2
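Flags like the R and X codes above rest on standardized residuals and leverage. A minimal sketch (numpy assumed) of both quantities; the 2.0 and 2p/n cutoffs here are common rules of thumb, not values from the slides:

```python
import numpy as np

def flag_unusual(X, y):
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])      # design matrix with intercept
    H = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T   # hat matrix
    h = np.diag(H)                             # leverages h_ii
    e = y - H @ y                              # residuals
    p = X1.shape[1]
    s = np.sqrt(e @ e / (n - p))               # standard error of estimate
    st_resid = e / (s * np.sqrt(1 - h))        # standardized residuals
    return st_resid, np.abs(st_resid) > 2.0, h > 2.0 * p / n  # "R" and "X" flags
```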
[Figure: estimated regression plane for Example 11-1, with Sales plotted against Advertising and Promotions]
11-7 Using the Multiple Regression Model for Prediction
A (1 − α)100% prediction interval for a value of Y given values of Xi:

ŷ ± t(α/2, n−(k+1)) √(s²(ŷ) + MSE)

A (1 − α)100% prediction interval for the conditional mean of Y given values of Xi:

ŷ ± t(α/2, n−(k+1)) s[E(Ŷ)]
Prediction in Multiple Regression
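Both intervals are produced directly by standard regression software. A sketch using statsmodels (assumed available), with placeholder data standing in for the Example 11-1 variables:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(30, 2))                     # illustrative predictors
y = 47 + 1.6 * X[:, 0] + 1.15 * X[:, 1] + rng.normal(0, 2, 30)

model = sm.OLS(y, sm.add_constant(X)).fit()
x_new = sm.add_constant(np.array([[8.0, 12.0]]), has_constant="add")
frame = model.get_prediction(x_new).summary_frame(alpha=0.05)
print(frame[["mean", "mean_ci_lower", "mean_ci_upper",   # interval for E[Y]
             "obs_ci_lower", "obs_ci_upper"]])           # interval for a new Y
```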
MOVIE   EARN   COST   PROM   BOOK
  1      28     4.2    1.0    0
  2      35     6.0    3.0    1
  3      50     5.5    6.0    1
  4      20     3.3    1.0    0
  5      75    12.5   11.0    1
  6      60     9.6    8.0    1
  7      15     2.5    0.5    0
  8      45    10.8    5.0    0
  9      50     8.4    3.0    1
 10      34     6.6    2.0    0
 11      48    10.7    1.0    1
 12      82    11.0   15.0    1
 13      24     3.5    4.0    0
 14      50     6.9   10.0    0
 15      58     7.8    9.0    1
 16      63    10.1   10.0    0
 17      30     5.0    1.0    1
 18      37     7.5    5.0    0
 19      45     6.4    8.0    1
 20      72    10.0   12.0    1

An indicator (dummy, binary) variable of qualitative level A:

Xh = 1 if level A is obtained; 0 if level A is not obtained

11-8 Qualitative (or Categorical) Independent Variables (in Regression)
EXAMPLE 11-3
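Using BOOK as the 0/1 indicator, the Example 11-3 data above can be fit directly. A sketch (pandas/statsmodels assumed; the fitted coefficients are not quoted on the slides):

```python
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "EARN": [28, 35, 50, 20, 75, 60, 15, 45, 50, 34,
             48, 82, 24, 50, 58, 63, 30, 37, 45, 72],
    "COST": [4.2, 6.0, 5.5, 3.3, 12.5, 9.6, 2.5, 10.8, 8.4, 6.6,
             10.7, 11.0, 3.5, 6.9, 7.8, 10.1, 5.0, 7.5, 6.4, 10.0],
    "PROM": [1.0, 3.0, 6.0, 1.0, 11.0, 8.0, 0.5, 5.0, 3.0, 2.0,
             1.0, 15.0, 4.0, 10.0, 9.0, 10.0, 1.0, 5.0, 8.0, 12.0],
    "BOOK": [0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1],
})
model = sm.OLS(df["EARN"], sm.add_constant(df[["COST", "PROM", "BOOK"]])).fit()
print(model.params)  # BOOK's coefficient shifts the intercept for book-based movies
```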
A regression with one quantitative variable (X1) and one qualitative variable (X2):

ŷ = b0 + b1x1 + b2x2

[Figure: two parallel lines in the (X1, Y) plane; the line for X2 = 1 has intercept b0 + b2 and the line for X2 = 0 has intercept b0]

A multiple regression with two quantitative variables (X1 and X2) and one qualitative variable (X3):

ŷ = b0 + b1x1 + b2x2 + b3x3

[Figure: two parallel regression planes in (x1, x2, y) space, separated vertically by b3]
Picturing Qualitative Variables in Regression
A regression with one quantitative variable (X1) and two qualitative variables (X2 and X3):

ŷ = b0 + b1x1 + b2x2 + b3x3

[Figure: three parallel lines in the (X1, Y) plane, with intercept b0 for X2 = 0 and X3 = 0, intercept b0 + b2 for X2 = 1 and X3 = 0, and intercept b0 + b3 for X2 = 0 and X3 = 1]

A qualitative variable with r levels or categories is represented with (r − 1) 0/1 (dummy) variables.

Category    X2   X3
Adventure   0    0
Drama       0    1
Romance     1    0
Picturing Qualitative Variables in Regression: Three Categories and Two Dummy Variables
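The (r − 1)-dummy rule above is exactly what standard data tools produce when one category is kept as the baseline. A small sketch (pandas assumed); note that pandas labels the columns by category, so the Drama and Romance columns play the roles of the slide's X3 and X2:

```python
import pandas as pd

categories = pd.Series(["Adventure", "Drama", "Romance", "Drama"])
dummies = pd.get_dummies(categories, drop_first=True).astype(int)
print(dummies)  # Adventure is the baseline: its rows are (0, 0)
```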
Salary = 8547 + 949 Education + 1258 Experience − 3256 Gender
(SE)      (32.6)  (45.1)         (78.5)           (212.4)
(t)       (262.2) (21.0)         (16.0)           (−15.3)

where Gender = 1 if female and 0 if male. On average, female salaries are $3256 below male salaries.
Using Qualitative Variables in Regression: Example 11-4
A regression with interaction between a quantitative variable (X1) and a qualitative variable (X2):

ŷ = b0 + b1x1 + b2x2 + b3x1x2

[Figure: the line for X2 = 0 has intercept b0 and slope b1; the line for X2 = 1 has intercept b0 + b2 and slope b1 + b3]
Interactions between Quantitative and Qualitative Variables: Shifting Slopes
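The cross-product column x1x2 is all the model needs to let the dummy shift the slope as well as the intercept. A minimal sketch (numpy assumed, illustrative data):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5])
x2 = np.array([0, 0, 0, 0, 1, 1, 1, 1])                   # qualitative dummy
y = np.array([2.1, 3.0, 4.2, 5.1, 4.0, 6.1, 8.0, 10.2])   # illustrative response

X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])  # design with interaction
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # b0, b1, b2, b3: the slope for the x2 = 1 group is b1 + b3
```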
One-variable polynomial regression model:

Y = β0 + β1X + β2X² + β3X³ + ... + βmX^m + ε

where m is the degree of the polynomial, the highest power of X appearing in the equation. The degree of the polynomial is the order of the model.

[Figure: fitted curves of increasing order, e.g. ŷ = b0 + b1X (first order), ŷ = b0 + b1X + b2X² (second order), and ŷ = b0 + b1X + b2X² + b3X³ (third order)]
11-9 Polynomial Regression
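A polynomial model is still linear in its coefficients, so it can be fit by ordinary least squares on powers of X. A minimal sketch (numpy assumed, illustrative data):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 5, 40)
y = 1.0 + 0.5 * x - 0.8 * x**2 + 0.15 * x**3 + rng.normal(0, 0.3, x.size)

m = 3                                     # order of the model
X = np.vander(x, m + 1, increasing=True)  # columns 1, x, x^2, x^3
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)                                  # estimates of b0 ... b3
```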
Polynomial Regression: Example 11-5
Variable   Estimate   Standard Error   t-Statistic
X1         2.34       0.92             2.54
X2         3.11       1.05             2.96
X1²        4.22       1.00             4.22
X2²        3.57       2.12             1.68
X1X2       2.77       2.30             1.20
Polynomial Regression: Other Variables and Cross-Product Terms
The multiplicative model:

Y = β0 X1^β1 X2^β2 X3^β3 ε

The logarithmic transformation:

log Y = log β0 + β1 log X1 + β2 log X2 + β3 log X3 + log ε

11-10 Nonlinear Models and Transformations: Multiplicative Model
The exponential model:

Y = β0 e^(β1X1) ε

The logarithmic transformation:

log Y = log β0 + β1X1 + log ε
Transformations: Exponential Model
[Figure: simple regression of SALES on ADVERT: Ŷ = 6.59271 + 1.19176X, R-squared = 0.895]

[Figure: regression of LOG(SALES) on LOGADV: Ŷ = 1.70082 + 0.553136X, R-squared = 0.947]

[Figure: regression of SALES on LOGADV: Ŷ = 3.66825 + 6.784X, R-squared = 0.978]

[Figure: residual plot, residuals vs. ŷ, for the regression of Sales on Log(Advertising)]
Plots of Transformed Variables
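The comparison illustrated above, fitting the raw, log-log, and semi-log forms and checking R², is easy to script. A sketch (numpy assumed), with placeholder data in place of the sales/advertising series:

```python
import numpy as np

def r_squared(x, y):
    b1, b0 = np.polyfit(x, y, 1)          # slope, intercept of a straight-line fit
    resid = y - (b0 + b1 * x)
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(3)
advert = rng.uniform(1, 15, 60)
sales = 3.7 + 6.8 * np.log(advert) + rng.normal(0, 1.0, 60)   # illustrative truth

print(r_squared(advert, sales))                   # sales on advertising
print(r_squared(np.log(advert), np.log(sales)))   # log(sales) on log(advertising)
print(r_squared(np.log(advert), sales))           # sales on log(advertising)
```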
• Square root transformation (√Y): useful when the variance of the regression errors is approximately proportional to the conditional mean of Y.
• Logarithmic transformation (log Y): useful when the variance of the regression errors is approximately proportional to the square of the conditional mean of Y.
• Reciprocal transformation (1/Y): useful when the variance of the regression errors is approximately proportional to the fourth power of the conditional mean of Y.
Variance Stabilizing Transformations
The logistic function:

E(Y|X) = e^(β0+β1X) / (1 + e^(β0+β1X))

[Figure: the S-shaped logistic curve, rising from 0 toward 1 as x increases]

Transformation to linearize the logistic function:

p′ = log[p / (1 − p)]
Regression with Dependent Indicator Variables
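The logit transformation above turns proportions into a quantity that is linear in X, which can then be fit by ordinary least squares. A minimal sketch (numpy assumed, illustrative proportions):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = np.array([0.08, 0.20, 0.45, 0.70, 0.90])   # illustrative proportions in (0, 1)

logit = np.log(p / (1 - p))                    # p' = log[p / (1 - p)]
b1, b0 = np.polyfit(x, logit, 1)               # straight-line fit to the logits
p_hat = np.exp(b0 + b1 * x) / (1 + np.exp(b0 + b1 * x))  # back-transform
print(b0, b1, p_hat)
```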
[Figure: pairs of X-variable directions at various angles]
Orthogonal X variables provide information from independent sources: no multicollinearity.
Perfectly collinear X variables provide identical information content: no regression is possible.
Some degree of collinearity: problems with the regression depend on the degree of collinearity.
A high degree of negative collinearity also causes problems with regression.
11-11: Multicollinearity
• Variances of regression coefficients are inflated.
• Magnitudes of regression coefficients may differ from what is expected.
• Signs of regression coefficients may not be as expected.
• Adding or removing variables produces large changes in coefficients.
• Removing a data point may cause large changes in coefficient estimates or signs.
• In some cases, the F ratio may be significant while the t ratios are not.
Effects of Multicollinearity
Detecting the Existence of Multicollinearity: Correlation Matrix of Independent Variables and Variance Inflation Factors
[Figure: VIF plotted against Rh². The VIF stays near 1 for small Rh² and rises sharply toward infinity as Rh² approaches 1.0]
The variance inflation factor associated with Xh:

VIF(Xh) = 1 / (1 − Rh²)

where Rh² is the R² value obtained for the regression of Xh on the other independent variables.
Variance Inflation Factor
Variance Inflation Factor (VIF)
Observation: The VIF (Variance Inflation Factor) values for the variables Lend and Price are both greater than 5, indicating that some degree of multicollinearity exists with respect to these two variables.
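The definition above translates directly into code: regress each column on the others and invert 1 − Rh². A minimal sketch (numpy assumed):

```python
import numpy as np

def vif(X):
    n, k = X.shape
    out = np.empty(k)
    for h in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, h, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, h], rcond=None)
        resid = X[:, h] - others @ beta
        r2_h = 1 - resid @ resid / np.sum((X[:, h] - X[:, h].mean()) ** 2)
        out[h] = 1.0 / (1.0 - r2_h)     # VIF(X_h) = 1 / (1 - R_h^2)
    return out
```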
• Drop a collinear variable from the regression.
• Change the sampling plan to include elements outside the multicollinearity range.
• Transformations of variables.
• Ridge regression.
Solutions to the Multicollinearity Problem
An autocorrelation is a correlation of the values of a variable with values of the same variable lagged one or more periods back. Consequences of autocorrelation include inaccurate estimates of variances and inaccurate predictions.
Lagged Residuals
 i    e_i    e_{i-1}   e_{i-2}   e_{i-3}   e_{i-4}
 1    1.0    *         *         *         *
 2    0.0    1.0       *         *         *
 3   -1.0    0.0       1.0       *         *
 4    2.0   -1.0       0.0       1.0       *
 5    3.0    2.0      -1.0       0.0       1.0
 6   -2.0    3.0       2.0      -1.0       0.0
 7    1.0   -2.0       3.0       2.0      -1.0
 8    1.5    1.0      -2.0       3.0       2.0
 9    1.0    1.5       1.0      -2.0       3.0
10   -2.5    1.0       1.5       1.0      -2.0
The Durbin-Watson test (first-order autocorrelation):

H0: ρ1 = 0
H1: ρ1 ≠ 0

The Durbin-Watson test statistic:

d = Σ_{i=2}^{n} (e_i − e_{i−1})² / Σ_{i=1}^{n} e_i²
11-12 Residual Autocorrelation and the Durbin-Watson Test
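Applied to the residual series in the lagged-residuals table above, the statistic is a one-liner. A minimal sketch (numpy assumed):

```python
import numpy as np

e = np.array([1.0, 0.0, -1.0, 2.0, 3.0, -2.0, 1.0, 1.5, 1.0, -2.5])
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)   # sum (e_i - e_{i-1})^2 / sum e_i^2
print(d)  # ~1.99, close to 2, suggesting little first-order autocorrelation
```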
        k = 1        k = 2        k = 3        k = 4        k = 5
  n    dL    dU     dL    dU     dL    dU     dL    dU     dL    dU
 15   1.08  1.36   0.95  1.54   0.82  1.75   0.69  1.97   0.56  2.21
 16   1.10  1.37   0.98  1.54   0.86  1.73   0.74  1.93   0.62  2.15
 17   1.13  1.38   1.02  1.54   0.90  1.71   0.78  1.90   0.67  2.10
 18   1.16  1.39   1.05  1.53   0.93  1.69   0.82  1.87   0.71  2.06
  .     .     .      .     .      .     .      .     .      .     .
 65   1.57  1.63   1.54  1.66   1.50  1.70   1.47  1.73   1.44  1.77
 70   1.58  1.64   1.55  1.67   1.52  1.70   1.49  1.74   1.46  1.77
 75   1.60  1.65   1.57  1.68   1.54  1.71   1.51  1.74   1.49  1.77
 80   1.61  1.66   1.59  1.69   1.56  1.72   1.53  1.74   1.51  1.77
 85   1.62  1.67   1.60  1.70   1.57  1.72   1.55  1.75   1.52  1.77
 90   1.63  1.68   1.61  1.70   1.59  1.73   1.57  1.75   1.54  1.78
 95   1.64  1.69   1.62  1.71   1.60  1.73   1.58  1.75   1.56  1.78
100   1.65  1.69   1.63  1.72   1.61  1.74   1.59  1.76   1.57  1.78

Critical Points of the Durbin-Watson Statistic: α = 0.05, n = Sample Size, k = Number of Independent Variables
[Decision regions for the Durbin-Watson test:
0 to dL: positive autocorrelation
dL to dU: test is inconclusive
dU to 4 − dU: no autocorrelation
4 − dU to 4 − dL: test is inconclusive
4 − dL to 4: negative autocorrelation]

For n = 67, k = 4: dU ≈ 1.73, so 4 − dU ≈ 2.27; dL ≈ 1.47, so 4 − dL ≈ 2.53 < 2.58.

H0 is rejected, and we conclude there is negative first-order autocorrelation.
Using the Durbin-Watson Statistic
Full model: Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

Reduced model: Y = β0 + β1X1 + β2X2 + ε

Partial F test:
H0: β3 = β4 = 0
H1: β3 and β4 are not both 0

Partial F statistic:

F(r, n − (k + 1)) = [(SSE_R − SSE_F) / r] / MSE_F

where SSE_R is the sum of squared errors of the reduced model, SSE_F is the sum of squared errors of the full model, MSE_F is the mean square error of the full model [MSE_F = SSE_F/(n − (k + 1))], and r is the number of variables dropped from the full model.
11-13 Partial F Tests and Variable Selection Methods
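The partial F statistic above compares the fit of the full and reduced models. A minimal sketch (scipy assumed; the SSE values are placeholders, not figures from the slides):

```python
from scipy import stats

sse_r, sse_f = 8.50, 6.99     # illustrative sums of squared errors
n, k, r = 67, 4, 2            # observations, full-model predictors, variables dropped

mse_f = sse_f / (n - (k + 1))
F = ((sse_r - sse_f) / r) / mse_f          # partial F statistic
p_value = stats.f.sf(F, r, n - (k + 1))
print(F, p_value)                          # reject H0 if p_value is small
```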
• All possible regressions: run regressions with all possible combinations of independent variables and select the best model.
Variable Selection Methods
A p-value of 0.001 indicates that we should reject the null hypothesis H0 that the slopes for Lend and Exch. are both zero.
• Stepwise procedures:
  • Forward selection: add one variable at a time to the model, on the basis of its F statistic.
  • Backward elimination: remove one variable at a time, on the basis of its F statistic.
  • Stepwise regression: add variables to and remove variables from the model, on the basis of the F statistic.
Variable Selection Methods
[Flowchart: stepwise regression]
1. Compute the F statistic for each variable not in the model.
2. Is there at least one variable with p-value < Pin? If not, stop.
3. Enter the most significant (smallest p-value) variable into the model.
4. Calculate the partial F for all variables in the model.
5. Is there a variable with p-value > Pout? If so, remove that variable.
6. Return to step 1.
Stepwise Regression
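A compact sketch (numpy/scipy assumed) of forward selection in the spirit of the flowchart above: repeatedly add the candidate whose partial F test has the smallest p-value, and stop when none clears Pin. (MINITAB's F-to-Enter threshold, shown in the output below, plays the same role.)

```python
import numpy as np
from scipy import stats

def sse(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return resid @ resid

def forward_select(X, y, p_in=0.05):
    n, k = X.shape
    chosen, remaining = [], list(range(k))
    while remaining:
        base = np.column_stack([np.ones(n)] + [X[:, j] for j in chosen])
        sse_r = sse(base, y)
        best = None
        for j in remaining:                       # partial F for each candidate
            full = np.column_stack([base, X[:, j]])
            sse_f = sse(full, y)
            df = n - full.shape[1]
            F = (sse_r - sse_f) / (sse_f / df)
            p = stats.f.sf(F, 1, df)
            if best is None or p < best[1]:
                best = (j, p)
        if best[1] > p_in:                        # no candidate clears P_in: stop
            break
        chosen.append(best[0])
        remaining.remove(best[0])
    return chosen
```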
MTB > STEPWISE 'EXPORTS' PREDICTORS 'M1' 'LEND' 'PRICE' 'EXCHANGE'

Stepwise Regression
F-to-Enter: 4.00  F-to-Remove: 4.00
Response is EXPORTS on 4 predictors, with N = 67

Step         1        2
Constant   0.9348  -3.4230
M1         0.520    0.361
T-Ratio    9.89     9.21
PRICE               0.0370
T-Ratio             9.05
S          0.495    0.331
R-Sq       60.08    82.48
Stepwise Regression: Using the Computer (MINITAB)
MTB > REGRESS 'EXPORTS' 4 'M1' 'LEND' 'PRICE' 'EXCHANGE';
SUBC> vif;
SUBC> dw.

Regression Analysis
The regression equation is
EXPORTS = -4.02 + 0.368 M1 + 0.0047 LEND + 0.0365 PRICE + 0.27 EXCHANGE

Predictor   Coef       Stdev      t-ratio   p       VIF
Constant    -4.015     2.766      -1.45     0.152
M1          0.36846    0.06385    5.77      0.000   3.2
LEND        0.00470    0.04922    0.10      0.924   5.4
PRICE       0.036511   0.009326   3.91      0.000   6.3
EXCHANGE    0.268      1.175      0.23      0.820   1.4

s = 0.3358  R-sq = 82.5%  R-sq(adj) = 81.4%

Analysis of Variance
SOURCE       DF   SS        MS       F       p
Regression    4   32.9463   8.2366   73.06   0.000
Error        62   6.9898    0.1127
Total        66   39.9361

Durbin-Watson statistic = 2.58
Using the Computer: MINITAB
Parameter Estimates
                   Parameter    Standard      T for H0:
Variable    DF     Estimate     Error         Parameter=0   Prob > |T|
INTERCEP    1      -4.015461    2.76640057    -1.452        0.1517
M1          1      0.368456     0.06384841    5.771         0.0001
LEND        1      0.004702     0.04922186    0.096         0.9242
PRICE       1      0.036511     0.00932601    3.915         0.0002
EXCHANGE    1      0.267896     1.17544016    0.228         0.8205

                   Variance
Variable    DF     Inflation
INTERCEP    1      0.00000000
M1          1      3.20719533
LEND        1      5.35391367
PRICE       1      6.28873181
EXCHANGE    1      1.38570639

Durbin-Watson D             2.583
(For Number of Obs.)        67
1st Order Autocorrelation   -0.321
Using the Computer: SAS (continued)
The population regression model in matrix form:

Y = Xβ + ε

where Y = [y1, y2, y3, ..., yn]′ is the n × 1 vector of observations on the dependent variable; X is the n × (k + 1) matrix whose first column is all 1s and whose remaining entries xij are the n observations on each of the k independent variables; β = [β0, β1, β2, ..., βk]′ is the (k + 1) × 1 vector of regression parameters; and ε = [ε1, ε2, ..., εn]′ is the n × 1 vector of errors.

The estimated regression model:

Y = Xb + e
11-15: The Matrix Approach to Regression Analysis (1)
The normal equations:

X′Xb = X′Y

Estimators:

b = (X′X)⁻¹X′Y

Predicted values:

Ŷ = Xb = X(X′X)⁻¹X′Y = HY

V(b) = σ²(X′X)⁻¹
s²(b) = MSE(X′X)⁻¹
The Matrix Approach to Regression Analysis (2)
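The matrix formulas above can be verified numerically on the Example 11-1 data. A minimal sketch (numpy assumed):

```python
import numpy as np

x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)
y  = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)

X = np.column_stack([np.ones(10), x1, x2])   # design matrix with intercept column
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y      # b = (X'X)^(-1) X'Y, ~[47.165, 1.599, 1.149]
H = X @ XtX_inv @ X.T      # hat matrix: Y_hat = H Y
e = y - H @ y              # residuals
mse = e @ e / (10 - 3)     # SSE / (n - (k + 1))
s2_b = mse * XtX_inv       # s^2(b) = MSE (X'X)^(-1)
print(b, np.sqrt(np.diag(s2_b)))  # estimates and their standard errors
```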