11 Multiple Regression
Applied Statistics 2 for the students of the
“Executive Program in Business Analytics and Business Intelligence”
Organized by IIM Ranchi
Edited by: Dr. K. Maddulety, NITIE, Mumbai. Mail: [email protected]
• Using Statistics
• The k-Variable Multiple Regression Model
• The F Test of a Multiple Regression Model
• How Good is the Regression
• Tests of the Significance of Individual Regression Parameters
• Testing the Validity of the Regression Model
• Using the Multiple Regression Model for Prediction

Multiple Regression (1)

• Qualitative Independent Variables
• Polynomial Regression
• Nonlinear Models and Transformations
• Multicollinearity
• Residual Autocorrelation and the Durbin-Watson Test
• Partial F Tests and Variable Selection Methods
• The Matrix Approach to Multiple Regression Analysis
• Summary and Review of Terms

Multiple Regression (2)
[Figure: Lines and Planes. A line in two dimensions and a plane in three dimensions.]

Any two points (A and B), or an intercept and slope (β0 and β1), define a line on a two-dimensional surface.

Any three points (A, B, and C), or an intercept and the coefficients of x1 and x2 (β0, β1, and β2), define a plane in three-dimensional space.
11-1 Using Statistics
The population regression model of a dependent variable, Y, on a set of k independent variables, X1, X2, ..., Xk is given by:

Y = β0 + β1X1 + β2X2 + ... + βkXk + ε

where β0 is the Y-intercept of the regression surface and each βi, i = 1, 2, ..., k, is the slope of the regression surface (sometimes called the response surface) with respect to Xi.

Model assumptions:
1. ε ~ N(0, σ²), independent of other errors.
2. The variables Xi are uncorrelated with the error term.
11-2 The k-Variable Multiple Regression Model
In a simple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression line.

In a multiple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression plane.

[Figure: a fitted line ŷ = b0 + b1x in the (X, Y) plane, and a fitted plane ŷ = b0 + b1x1 + b2x2 in (x1, x2, y) space]
Simple and Multiple Least-Squares Regression
The estimated regression relationship:

Ŷ = b0 + b1X1 + b2X2 + ... + bkXk

where Ŷ is the predicted value of Y, the value lying on the estimated regression surface. The terms b0, b1, ..., bk are the least-squares estimates of the population regression parameters βi.

The actual, observed value of Y is the predicted value plus an error:

yj = b0 + b1x1j + b2x2j + ... + bkxkj + ej
The Estimated Regression Relationship
Minimizing the sum of squared errors with respect to the estimated coefficients b0, b1, and b2 yields the following normal equations:

Σy = nb0 + b1Σx1 + b2Σx2
Σx1y = b0Σx1 + b1Σx1² + b2Σx1x2
Σx2y = b0Σx2 + b1Σx1x2 + b2Σx2²
Least-Squares Estimation: The 2-Variable Normal Equations
 Y    X1   X2   X1X2   X1²   X2²    X1Y   X2Y
 72   12    5    60    144    25    864   360
 76   11    8    88    121    64    836   608
 78   15    6    90    225    36   1170   468
 70   10    5    50    100    25    700   350
 68   11    3    33    121     9    748   204
 80   16    9   144    256    81   1280   720
 82   14   12   168    196   144   1148   984
 65    8    4    32     64    16    520   260
 62    8    3    24     64     9    496   186
 90   18   10   180    324   100   1620   900
---  ---  ---   ---   ----   ---   ----  ----
743  123   65   869   1615   509   9382  5040

Normal Equations:
743 = 10b0 + 123b1 + 65b2
9382 = 123b0 + 1615b1 + 869b2
5040 = 65b0 + 869b1 + 509b2

Solution:
b0 = 47.164942, b1 = 1.5990404, b2 = 1.1487479

Estimated regression equation:

Ŷ = 47.164942 + 1.5990404X1 + 1.1487479X2
Example 11-1
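Since the normal equations above form a 3×3 linear system, the slide's coefficients can be reproduced with any linear-algebra routine. A minimal sketch in Python (numpy assumed; not part of the original slides):

```python
import numpy as np

# Coefficient matrix and right-hand side copied from the normal equations above.
A = np.array([[10.0,  123.0,  65.0],
              [123.0, 1615.0, 869.0],
              [65.0,  869.0,  509.0]])
rhs = np.array([743.0, 9382.0, 5040.0])

b = np.linalg.solve(A, rhs)
print(b)  # approximately [47.164942, 1.5990404, 1.1487479]
```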
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
Total Deviation = Regression Deviation + Error Deviation
SST = SSR + SSE

[Figure: decomposition of the total deviation of one point about the mean of Y]
Y − Ȳ : Total deviation
Ŷ − Ȳ : Regression deviation
Y − Ŷ : Error deviation
Decomposition of the Total Deviation in a Multiple Regression Model
A statistical test for the existence of a linear relationship between Y and any or all of the independent variables X1, X2, ..., Xk:

H0: β1 = β2 = ... = βk = 0
H1: Not all the βi (i = 1, 2, ..., k) are 0

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square               F Ratio
Regression            SSR              k                    MSR = SSR/k               F = MSR/MSE
Error                 SSE              n − (k + 1)          MSE = SSE/(n − (k + 1))
Total                 SST              n − 1                MST = SST/(n − 1)
11-3 The F Test of a Multiple Regression Model
The test statistic, F = 86.34, is greater than the critical point of F(2, 7) for any common level of significance (p-value ≈ 0), so the null hypothesis is rejected, and we conclude that the dependent variable is related to one or more of the independent variables.

[Figure: F distribution with 2 and 7 degrees of freedom. At α = 0.01 the critical point is F0.01 = 9.55; the test statistic 86.34 lies far beyond it in the rejection region.]
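The p-value and critical point quoted above can be checked directly. A quick sketch (scipy assumed; not from the original slides), with k = 2 predictors and n = 10 observations:

```python
from scipy import stats

F, k, n = 86.34, 2, 10
df1, df2 = k, n - (k + 1)           # (2, 7) degrees of freedom
print(stats.f.sf(F, df1, df2))      # upper-tail p-value, ~1e-5, effectively 0
print(stats.f.ppf(0.99, df1, df2))  # critical point at alpha = 0.01, ~9.55
```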
Using the Template: Analysis of Variance Table (Example 11-1)
The multiple coefficient of determination, R², measures the proportion of the variation in the dependent variable that is explained by the combination of the independent variables in the multiple regression model:

R² = SSR/SST = 1 − SSE/SST

The mean square error is an unbiased estimator of the variance of the population errors, denoted by σ²:

MSE = SSE/(n − (k + 1)) = Σ(y − ŷ)²/(n − (k + 1))

The standard error of estimate:

s = √MSE

[Figure: errors y − ŷ about the estimated regression plane in (x1, x2, y) space]
11-4 How Good is the Regression
The adjusted multiple coefficient of determination, R̄², is the coefficient of determination with the SSE and SST divided by their respective degrees of freedom:

R̄² = 1 − [SSE/(n − (k + 1))] / [SST/(n − 1)]

[Figure: SST partitioned into SSR and SSE, with R² = SSR/SST = 1 − SSE/SST]

Example 11-1: s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%
Decomposition of the Sum of Squares and the Adjusted Coefficient of Determination
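These goodness-of-fit measures follow directly from SSE and SST. A minimal sketch (numpy assumed); the SSE and SST values below are backed out from the slide's s and R-sq for Example 11-1, so treat them as approximate:

```python
import numpy as np

def fit_measures(sse, sst, n, k):
    r2 = 1.0 - sse / sst                                     # R-sq
    r2_adj = 1.0 - (sse / (n - (k + 1))) / (sst / (n - 1))   # R-sq(adj)
    s = np.sqrt(sse / (n - (k + 1)))                         # standard error of estimate
    return r2, r2_adj, s

print(fit_measures(sse=25.56, sst=656.1, n=10, k=2))
# approximately (0.961, 0.950, 1.911), matching Example 11-1
```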
Source of Variation   Sum of Squares   Degrees of Freedom        Mean Square               F Ratio
Regression            SSR              k                         MSR = SSR/k               F = MSR/MSE
Error                 SSE              n − (k + 1) = n − k − 1   MSE = SSE/(n − (k + 1))
Total                 SST              n − 1                     MST = SST/(n − 1)

R² = SSR/SST = 1 − SSE/SST

R̄² = 1 − [SSE/(n − (k + 1))] / [SST/(n − 1)] = 1 − MSE/MST

F = MSR/MSE = [R²/(1 − R²)] · [(n − (k + 1))/k]
Measures of Performance in Multiple Regression and the ANOVA Table
Hypothesis tests about individual regression slope parameters:

(1) H0: β1 = 0   H1: β1 ≠ 0
(2) H0: β2 = 0   H1: β2 ≠ 0
. . .
(k) H0: βk = 0   H1: βk ≠ 0

Test statistic for test i:

t(n − (k + 1)) = (bi − 0) / s(bi)
11-5 Tests of the Significance of Individual Regression Parameters
Variable   Coefficient Estimate   Standard Error   t-Statistic
Constant   53.12                  5.43             9.783 *
X1         2.03                   0.22             9.227 *
X2         5.60                   1.30             4.308 *
X3         10.35                  6.88             1.504
X4         3.45                   2.70             1.259
X5         −4.25                  0.38             −11.184 *

n = 150, t0.025 = 1.96
Regression Results for Individual Parameters
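Each t-statistic in the table is just the coefficient estimate divided by its standard error, compared against t0.025 = 1.96. A minimal sketch (numpy assumed; values reproduce the table up to the slide's rounding):

```python
import numpy as np

coefs = np.array([53.12, 2.03, 5.60, 10.35, 3.45, -4.25])
ses   = np.array([5.43, 0.22, 1.30, 6.88, 2.70, 0.38])

t_stats = coefs / ses                 # t = (b_i - 0) / s(b_i)
reject = np.abs(t_stats) > 1.96       # reject H0: beta_i = 0 at alpha = 0.05
for t, r in zip(t_stats, reject):
    print(f"{t:8.3f} {'*' if r else ''}")
```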
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
Using the Template: Example 11-2
Regression results for Exports to Singapore
11-6 Testing the Validity of the Regression Model: Residual Plots
[Residual plot: residuals vs. M1]
It appears that the residuals are randomly distributed, with no pattern and with equal variance, as M1 increases.
11-6 Testing the Validity of the Regression Model: Residual Plots
[Residual plot: residuals vs. Price]
It appears that the residuals increase as Price increases; the variance of the residuals is not constant.
Normal Probability Plot for the Residuals: Example 11-2
A linear trend indicates that the residuals are normally distributed.
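The two diagnostics described above, a residuals-versus-predictor plot and a normal probability plot, can be drawn as follows. A sketch (matplotlib/scipy assumed) with placeholder data standing in for the Example 11-2 residuals:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
m1 = rng.uniform(4, 9, 67)               # stand-in for the M1 predictor
residuals = rng.normal(0, 0.33, 67)      # stand-in for regression residuals

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.scatter(m1, residuals)
ax1.axhline(0.0, linestyle="--")
ax1.set_xlabel("M1"); ax1.set_ylabel("Residual")

stats.probplot(residuals, dist="norm", plot=ax2)  # linear trend => ~normal residuals
plt.show()
```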
[Figure: Outliers. An outlier (*) lies far from the rest of the data; the regression line fitted with the outlier is pulled away from the regression line fitted without it.]

[Figure: Influential observations. A point (*) with a large value of xi, far from a cluster of points showing no relationship on its own, can dominate the slope of the regression line when all data are included.]
Investigating the Validity of the Regression: Outliers and Influential Observations
Unusual Observations
Obs.    M1    EXPORTS   Fit      Stdev.Fit   Residual   St.Resid
  1    5.10   2.6000    2.6420   0.1288      -0.0420    -0.14 X
  2    4.90   2.6000    2.6438   0.1234      -0.0438    -0.14 X
 25    6.20   5.5000    4.5949   0.0676       0.9051     2.80R
 26    6.30   3.7000    4.6311   0.0651      -0.9311    -2.87R
 50    8.30   4.3000    5.1317   0.0648      -0.8317    -2.57R
 67    8.20   5.6000    4.9474   0.0668       0.6526     2.02R

R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it large influence.
Outliers and Influential Observations: Example 11-2
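Flags like the R and X codes above rest on standardized residuals and leverage. A minimal sketch (numpy assumed) of both quantities; the 2.0 and 2p/n cutoffs here are common rules of thumb, not values from the slides:

```python
import numpy as np

def flag_unusual(X, y):
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])      # design matrix with intercept
    H = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T   # hat matrix
    h = np.diag(H)                             # leverages h_ii
    e = y - H @ y                              # residuals
    p = X1.shape[1]
    s = np.sqrt(e @ e / (n - p))               # standard error of estimate
    st_resid = e / (s * np.sqrt(1 - h))        # standardized residuals
    return st_resid, np.abs(st_resid) > 2.0, h > 2.0 * p / n  # "R" and "X" flags
```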
[Figure: estimated regression plane for Example 11-1, with Sales plotted against Advertising and Promotions]
11-7 Using the Multiple Regression Model for Prediction
A (1 − α)100% prediction interval for a value of Y given values of Xi:

ŷ ± t(α/2, n−(k+1)) √(s²(ŷ) + MSE)

A (1 − α)100% prediction interval for the conditional mean of Y given values of Xi:

ŷ ± t(α/2, n−(k+1)) s[E(Ŷ)]
Prediction in Multiple Regression
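Both intervals are produced directly by standard regression software. A sketch using statsmodels (assumed available), with placeholder data standing in for the Example 11-1 variables:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(30, 2))                     # illustrative predictors
y = 47 + 1.6 * X[:, 0] + 1.15 * X[:, 1] + rng.normal(0, 2, 30)

model = sm.OLS(y, sm.add_constant(X)).fit()
x_new = sm.add_constant(np.array([[8.0, 12.0]]), has_constant="add")
frame = model.get_prediction(x_new).summary_frame(alpha=0.05)
print(frame[["mean", "mean_ci_lower", "mean_ci_upper",   # interval for E[Y]
             "obs_ci_lower", "obs_ci_upper"]])           # interval for a new Y
```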
MOVIE   EARN   COST   PROM   BOOK
  1      28     4.2    1.0    0
  2      35     6.0    3.0    1
  3      50     5.5    6.0    1
  4      20     3.3    1.0    0
  5      75    12.5   11.0    1
  6      60     9.6    8.0    1
  7      15     2.5    0.5    0
  8      45    10.8    5.0    0
  9      50     8.4    3.0    1
 10      34     6.6    2.0    0
 11      48    10.7    1.0    1
 12      82    11.0   15.0    1
 13      24     3.5    4.0    0
 14      50     6.9   10.0    0
 15      58     7.8    9.0    1
 16      63    10.1   10.0    0
 17      30     5.0    1.0    1
 18      37     7.5    5.0    0
 19      45     6.4    8.0    1
 20      72    10.0   12.0    1

An indicator (dummy, binary) variable of qualitative level A:

Xh = 1 if level A is obtained; 0 if level A is not obtained

11-8 Qualitative (or Categorical) Independent Variables (in Regression)
EXAMPLE 11-3
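Using BOOK as the 0/1 indicator, the Example 11-3 data above can be fit directly. A sketch (pandas/statsmodels assumed; the fitted coefficients are not quoted on the slides):

```python
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "EARN": [28, 35, 50, 20, 75, 60, 15, 45, 50, 34,
             48, 82, 24, 50, 58, 63, 30, 37, 45, 72],
    "COST": [4.2, 6.0, 5.5, 3.3, 12.5, 9.6, 2.5, 10.8, 8.4, 6.6,
             10.7, 11.0, 3.5, 6.9, 7.8, 10.1, 5.0, 7.5, 6.4, 10.0],
    "PROM": [1.0, 3.0, 6.0, 1.0, 11.0, 8.0, 0.5, 5.0, 3.0, 2.0,
             1.0, 15.0, 4.0, 10.0, 9.0, 10.0, 1.0, 5.0, 8.0, 12.0],
    "BOOK": [0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1],
})
model = sm.OLS(df["EARN"], sm.add_constant(df[["COST", "PROM", "BOOK"]])).fit()
print(model.params)  # BOOK's coefficient shifts the intercept for book-based movies
```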
A regression with one quantitative variable (X1) and one qualitative variable (X2):

ŷ = b0 + b1x1 + b2x2

[Figure: two parallel lines in the (X1, Y) plane; the line for X2 = 1 has intercept b0 + b2 and the line for X2 = 0 has intercept b0]

A multiple regression with two quantitative variables (X1 and X2) and one qualitative variable (X3):

ŷ = b0 + b1x1 + b2x2 + b3x3

[Figure: two parallel regression planes in (x1, x2, y) space, separated vertically by b3]
Picturing Qualitative Variables in Regression
A regression with one quantitative variable (X1) and two qualitative variables (X2 and X3):

ŷ = b0 + b1x1 + b2x2 + b3x3

[Figure: three parallel lines in the (X1, Y) plane, with intercept b0 for X2 = 0 and X3 = 0, intercept b0 + b2 for X2 = 1 and X3 = 0, and intercept b0 + b3 for X2 = 0 and X3 = 1]

A qualitative variable with r levels or categories is represented with (r − 1) 0/1 (dummy) variables.

Category    X2   X3
Adventure   0    0
Drama       0    1
Romance     1    0
Picturing Qualitative Variables in Regression: Three Categories and Two Dummy Variables
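The (r − 1)-dummy rule above is exactly what standard data tools produce when one category is kept as the baseline. A small sketch (pandas assumed); note that pandas labels the columns by category, so the Drama and Romance columns play the roles of the slide's X3 and X2:

```python
import pandas as pd

categories = pd.Series(["Adventure", "Drama", "Romance", "Drama"])
dummies = pd.get_dummies(categories, drop_first=True).astype(int)
print(dummies)  # Adventure is the baseline: its rows are (0, 0)
```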
Salary = 8547 + 949 Education + 1258 Experience − 3256 Gender
(SE)      (32.6)  (45.1)         (78.5)           (212.4)
(t)       (262.2) (21.0)         (16.0)           (−15.3)

where Gender = 1 if female and 0 if male. On average, female salaries are $3256 below male salaries.
Using Qualitative Variables in Regression: Example 11-4
A regression with interaction between a quantitative variable (X1) and a qualitative variable (X2):

ŷ = b0 + b1x1 + b2x2 + b3x1x2

[Figure: the line for X2 = 0 has intercept b0 and slope b1; the line for X2 = 1 has intercept b0 + b2 and slope b1 + b3]
Interactions between Quantitative and Qualitative Variables: Shifting Slopes
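The cross-product column x1x2 is all the model needs to let the dummy shift the slope as well as the intercept. A minimal sketch (numpy assumed, illustrative data):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5])
x2 = np.array([0, 0, 0, 0, 1, 1, 1, 1])                   # qualitative dummy
y = np.array([2.1, 3.0, 4.2, 5.1, 4.0, 6.1, 8.0, 10.2])   # illustrative response

X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])  # design with interaction
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # b0, b1, b2, b3: the slope for the x2 = 1 group is b1 + b3
```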
One-variable polynomial regression model:

Y = β0 + β1X + β2X² + β3X³ + ... + βmX^m + ε

where m is the degree of the polynomial, the highest power of X appearing in the equation. The degree of the polynomial is the order of the model.

[Figure: fitted curves of increasing order, e.g. ŷ = b0 + b1X (first order), ŷ = b0 + b1X + b2X² (second order), and ŷ = b0 + b1X + b2X² + b3X³ (third order)]
11-9 Polynomial Regression
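A polynomial model is still linear in its coefficients, so it can be fit by ordinary least squares on powers of X. A minimal sketch (numpy assumed, illustrative data):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 5, 40)
y = 1.0 + 0.5 * x - 0.8 * x**2 + 0.15 * x**3 + rng.normal(0, 0.3, x.size)

m = 3                                     # order of the model
X = np.vander(x, m + 1, increasing=True)  # columns 1, x, x^2, x^3
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)                                  # estimates of b0 ... b3
```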
Polynomial Regression: Example 11-5
Variable   Estimate   Standard Error   t-Statistic
X1         2.34       0.92             2.54
X2         3.11       1.05             2.96
X1²        4.22       1.00             4.22
X2²        3.57       2.12             1.68
X1X2       2.77       2.30             1.20
Polynomial Regression: Other Variables and Cross-Product Terms
The multiplicative model:

Y = β0 X1^β1 X2^β2 X3^β3 ε

The logarithmic transformation:

log Y = log β0 + β1 log X1 + β2 log X2 + β3 log X3 + log ε

11-10 Nonlinear Models and Transformations: Multiplicative Model
The exponential model:

Y = β0 e^(β1X1) ε

The logarithmic transformation:

log Y = log β0 + β1X1 + log ε
Transformations: Exponential Model
[Figure: simple regression of SALES on ADVERT: Ŷ = 6.59271 + 1.19176X, R-squared = 0.895]

[Figure: regression of LOG(SALES) on LOGADV: Ŷ = 1.70082 + 0.553136X, R-squared = 0.947]

[Figure: regression of SALES on LOGADV: Ŷ = 3.66825 + 6.784X, R-squared = 0.978]

[Figure: residual plot, residuals vs. ŷ, for the regression of Sales on Log(Advertising)]
Plots of Transformed Variables
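The comparison illustrated above, fitting the raw, log-log, and semi-log forms and checking R², is easy to script. A sketch (numpy assumed), with placeholder data in place of the sales/advertising series:

```python
import numpy as np

def r_squared(x, y):
    b1, b0 = np.polyfit(x, y, 1)          # slope, intercept of a straight-line fit
    resid = y - (b0 + b1 * x)
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(3)
advert = rng.uniform(1, 15, 60)
sales = 3.7 + 6.8 * np.log(advert) + rng.normal(0, 1.0, 60)   # illustrative truth

print(r_squared(advert, sales))                   # sales on advertising
print(r_squared(np.log(advert), np.log(sales)))   # log(sales) on log(advertising)
print(r_squared(np.log(advert), sales))           # sales on log(advertising)
```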
• Square root transformation (√Y): useful when the variance of the regression errors is approximately proportional to the conditional mean of Y.
• Logarithmic transformation (log Y): useful when the variance of the regression errors is approximately proportional to the square of the conditional mean of Y.
• Reciprocal transformation (1/Y): useful when the variance of the regression errors is approximately proportional to the fourth power of the conditional mean of Y.
Variance Stabilizing Transformations
The logistic function:

E(Y|X) = e^(β0+β1X) / (1 + e^(β0+β1X))

[Figure: the S-shaped logistic curve, rising from 0 toward 1 as x increases]

Transformation to linearize the logistic function:

p′ = log[p / (1 − p)]
Regression with Dependent Indicator Variables
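The logit transformation above turns proportions into a quantity that is linear in X, which can then be fit by ordinary least squares. A minimal sketch (numpy assumed, illustrative proportions):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = np.array([0.08, 0.20, 0.45, 0.70, 0.90])   # illustrative proportions in (0, 1)

logit = np.log(p / (1 - p))                    # p' = log[p / (1 - p)]
b1, b0 = np.polyfit(x, logit, 1)               # straight-line fit to the logits
p_hat = np.exp(b0 + b1 * x) / (1 + np.exp(b0 + b1 * x))  # back-transform
print(b0, b1, p_hat)
```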
[Figure: pairs of X-variable directions at various angles]
Orthogonal X variables provide information from independent sources: no multicollinearity.
Perfectly collinear X variables provide identical information content: no regression is possible.
Some degree of collinearity: problems with the regression depend on the degree of collinearity.
A high degree of negative collinearity also causes problems with regression.
11-11: Multicollinearity
• Variances of regression coefficients are inflated.
• Magnitudes of regression coefficients may differ from what is expected.
• Signs of regression coefficients may not be as expected.
• Adding or removing variables produces large changes in coefficients.
• Removing a data point may cause large changes in coefficient estimates or signs.
• In some cases, the F ratio may be significant while the t ratios are not.
Effects of Multicollinearity
Detecting the Existence of Multicollinearity: Correlation Matrix of Independent Variables and Variance Inflation Factors
[Figure: VIF plotted against Rh². The VIF stays near 1 for small Rh² and rises sharply toward infinity as Rh² approaches 1.0]
The variance inflation factor associated with Xh:

VIF(Xh) = 1 / (1 − Rh²)

where Rh² is the R² value obtained for the regression of Xh on the other independent variables.
Variance Inflation Factor
Variance Inflation Factor (VIF)
Observation: The VIF (Variance Inflation Factor) values for the variables Lend and Price are both greater than 5, indicating that some degree of multicollinearity exists with respect to these two variables.
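The definition above translates directly into code: regress each column on the others and invert 1 − Rh². A minimal sketch (numpy assumed):

```python
import numpy as np

def vif(X):
    n, k = X.shape
    out = np.empty(k)
    for h in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, h, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, h], rcond=None)
        resid = X[:, h] - others @ beta
        r2_h = 1 - resid @ resid / np.sum((X[:, h] - X[:, h].mean()) ** 2)
        out[h] = 1.0 / (1.0 - r2_h)     # VIF(X_h) = 1 / (1 - R_h^2)
    return out
```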
• Drop a collinear variable from the regression.
• Change the sampling plan to include elements outside the multicollinearity range.
• Transformations of variables.
• Ridge regression.
Solutions to the Multicollinearity Problem
An autocorrelation is a correlation of the values of a variable with values of the same variable lagged one or more periods back. Consequences of autocorrelation include inaccurate estimates of variances and inaccurate predictions.
Lagged Residuals
 i    e_i    e_{i-1}   e_{i-2}   e_{i-3}   e_{i-4}
 1    1.0    *         *         *         *
 2    0.0    1.0       *         *         *
 3   -1.0    0.0       1.0       *         *
 4    2.0   -1.0       0.0       1.0       *
 5    3.0    2.0      -1.0       0.0       1.0
 6   -2.0    3.0       2.0      -1.0       0.0
 7    1.0   -2.0       3.0       2.0      -1.0
 8    1.5    1.0      -2.0       3.0       2.0
 9    1.0    1.5       1.0      -2.0       3.0
10   -2.5    1.0       1.5       1.0      -2.0
The Durbin-Watson test (first-order autocorrelation):

H0: ρ1 = 0
H1: ρ1 ≠ 0

The Durbin-Watson test statistic:

d = Σ_{i=2}^{n} (e_i − e_{i−1})² / Σ_{i=1}^{n} e_i²
11-12 Residual Autocorrelation and the Durbin-Watson Test
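Applied to the residual series in the lagged-residuals table above, the statistic is a one-liner. A minimal sketch (numpy assumed):

```python
import numpy as np

e = np.array([1.0, 0.0, -1.0, 2.0, 3.0, -2.0, 1.0, 1.5, 1.0, -2.5])
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)   # sum (e_i - e_{i-1})^2 / sum e_i^2
print(d)  # ~1.99, close to 2, suggesting little first-order autocorrelation
```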
        k = 1        k = 2        k = 3        k = 4        k = 5
  n    dL    dU     dL    dU     dL    dU     dL    dU     dL    dU
 15   1.08  1.36   0.95  1.54   0.82  1.75   0.69  1.97   0.56  2.21
 16   1.10  1.37   0.98  1.54   0.86  1.73   0.74  1.93   0.62  2.15
 17   1.13  1.38   1.02  1.54   0.90  1.71   0.78  1.90   0.67  2.10
 18   1.16  1.39   1.05  1.53   0.93  1.69   0.82  1.87   0.71  2.06
  .     .     .      .     .      .     .      .     .      .     .
 65   1.57  1.63   1.54  1.66   1.50  1.70   1.47  1.73   1.44  1.77
 70   1.58  1.64   1.55  1.67   1.52  1.70   1.49  1.74   1.46  1.77
 75   1.60  1.65   1.57  1.68   1.54  1.71   1.51  1.74   1.49  1.77
 80   1.61  1.66   1.59  1.69   1.56  1.72   1.53  1.74   1.51  1.77
 85   1.62  1.67   1.60  1.70   1.57  1.72   1.55  1.75   1.52  1.77
 90   1.63  1.68   1.61  1.70   1.59  1.73   1.57  1.75   1.54  1.78
 95   1.64  1.69   1.62  1.71   1.60  1.73   1.58  1.75   1.56  1.78
100   1.65  1.69   1.63  1.72   1.61  1.74   1.59  1.76   1.57  1.78

Critical Points of the Durbin-Watson Statistic: α = 0.05, n = Sample Size, k = Number of Independent Variables
[Decision regions for the Durbin-Watson test:
0 to dL: positive autocorrelation
dL to dU: test is inconclusive
dU to 4 − dU: no autocorrelation
4 − dU to 4 − dL: test is inconclusive
4 − dL to 4: negative autocorrelation]

For n = 67, k = 4: dU ≈ 1.73, so 4 − dU ≈ 2.27; dL ≈ 1.47, so 4 − dL ≈ 2.53 < 2.58.

H0 is rejected, and we conclude there is negative first-order autocorrelation.
Using the Durbin-Watson Statistic
Full model: Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

Reduced model: Y = β0 + β1X1 + β2X2 + ε

Partial F test:
H0: β3 = β4 = 0
H1: β3 and β4 are not both 0

Partial F statistic:

F(r, n − (k + 1)) = [(SSE_R − SSE_F) / r] / MSE_F

where SSE_R is the sum of squared errors of the reduced model, SSE_F is the sum of squared errors of the full model, MSE_F is the mean square error of the full model [MSE_F = SSE_F/(n − (k + 1))], and r is the number of variables dropped from the full model.
11-13 Partial F Tests and Variable Selection Methods
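The partial F statistic above compares the fit of the full and reduced models. A minimal sketch (scipy assumed; the SSE values are placeholders, not figures from the slides):

```python
from scipy import stats

sse_r, sse_f = 8.50, 6.99     # illustrative sums of squared errors
n, k, r = 67, 4, 2            # observations, full-model predictors, variables dropped

mse_f = sse_f / (n - (k + 1))
F = ((sse_r - sse_f) / r) / mse_f          # partial F statistic
p_value = stats.f.sf(F, r, n - (k + 1))
print(F, p_value)                          # reject H0 if p_value is small
```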
• All possible regressions: run regressions with all possible combinations of independent variables and select the best model.
Variable Selection Methods
A p-value of 0.001 indicates that we should reject the null hypothesis H0 that the slopes for Lend and Exch. are both zero.
• Stepwise procedures:
  • Forward selection: add one variable at a time to the model, on the basis of its F statistic.
  • Backward elimination: remove one variable at a time, on the basis of its F statistic.
  • Stepwise regression: add variables to and remove variables from the model, on the basis of the F statistic.
Variable Selection Methods
[Flowchart: stepwise regression]
1. Compute the F statistic for each variable not in the model.
2. Is there at least one variable with p-value < Pin? If not, stop.
3. Enter the most significant (smallest p-value) variable into the model.
4. Calculate the partial F for all variables in the model.
5. Is there a variable with p-value > Pout? If so, remove that variable.
6. Return to step 1.
Stepwise Regression
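A compact sketch (numpy/scipy assumed) of forward selection in the spirit of the flowchart above: repeatedly add the candidate whose partial F test has the smallest p-value, and stop when none clears Pin. (MINITAB's F-to-Enter threshold, shown in the output below, plays the same role.)

```python
import numpy as np
from scipy import stats

def sse(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return resid @ resid

def forward_select(X, y, p_in=0.05):
    n, k = X.shape
    chosen, remaining = [], list(range(k))
    while remaining:
        base = np.column_stack([np.ones(n)] + [X[:, j] for j in chosen])
        sse_r = sse(base, y)
        best = None
        for j in remaining:                       # partial F for each candidate
            full = np.column_stack([base, X[:, j]])
            sse_f = sse(full, y)
            df = n - full.shape[1]
            F = (sse_r - sse_f) / (sse_f / df)
            p = stats.f.sf(F, 1, df)
            if best is None or p < best[1]:
                best = (j, p)
        if best[1] > p_in:                        # no candidate clears P_in: stop
            break
        chosen.append(best[0])
        remaining.remove(best[0])
    return chosen
```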
MTB > STEPWISE 'EXPORTS' PREDICTORS 'M1' 'LEND' 'PRICE' 'EXCHANGE'

Stepwise Regression
F-to-Enter: 4.00  F-to-Remove: 4.00
Response is EXPORTS on 4 predictors, with N = 67

Step         1        2
Constant   0.9348  -3.4230
M1         0.520    0.361
T-Ratio    9.89     9.21
PRICE               0.0370
T-Ratio             9.05
S          0.495    0.331
R-Sq       60.08    82.48
Stepwise Regression: Using the Computer (MINITAB)
MTB > REGRESS 'EXPORTS' 4 'M1' 'LEND' 'PRICE' 'EXCHANGE';
SUBC> vif;
SUBC> dw.

Regression Analysis
The regression equation is
EXPORTS = -4.02 + 0.368 M1 + 0.0047 LEND + 0.0365 PRICE + 0.27 EXCHANGE

Predictor   Coef       Stdev      t-ratio   p       VIF
Constant    -4.015     2.766      -1.45     0.152
M1          0.36846    0.06385    5.77      0.000   3.2
LEND        0.00470    0.04922    0.10      0.924   5.4
PRICE       0.036511   0.009326   3.91      0.000   6.3
EXCHANGE    0.268      1.175      0.23      0.820   1.4

s = 0.3358  R-sq = 82.5%  R-sq(adj) = 81.4%

Analysis of Variance
SOURCE       DF   SS        MS       F       p
Regression    4   32.9463   8.2366   73.06   0.000
Error        62   6.9898    0.1127
Total        66   39.9361

Durbin-Watson statistic = 2.58
Using the Computer: MINITAB
Parameter Estimates
                   Parameter    Standard      T for H0:
Variable    DF     Estimate     Error         Parameter=0   Prob > |T|
INTERCEP    1      -4.015461    2.76640057    -1.452        0.1517
M1          1      0.368456     0.06384841    5.771         0.0001
LEND        1      0.004702     0.04922186    0.096         0.9242
PRICE       1      0.036511     0.00932601    3.915         0.0002
EXCHANGE    1      0.267896     1.17544016    0.228         0.8205

                   Variance
Variable    DF     Inflation
INTERCEP    1      0.00000000
M1          1      3.20719533
LEND        1      5.35391367
PRICE       1      6.28873181
EXCHANGE    1      1.38570639

Durbin-Watson D             2.583
(For Number of Obs.)        67
1st Order Autocorrelation   -0.321
Using the Computer: SAS (continued)
The population regression model in matrix form:

Y = Xβ + ε

where Y = [y1, y2, y3, ..., yn]′ is the n × 1 vector of observations on the dependent variable; X is the n × (k + 1) matrix whose first column is all 1s and whose remaining entries xij are the n observations on each of the k independent variables; β = [β0, β1, β2, ..., βk]′ is the (k + 1) × 1 vector of regression parameters; and ε = [ε1, ε2, ..., εn]′ is the n × 1 vector of errors.

The estimated regression model:

Y = Xb + e
11-15: The Matrix Approach to Regression Analysis (1)
The normal equations:

X′Xb = X′Y

Estimators:

b = (X′X)⁻¹X′Y

Predicted values:

Ŷ = Xb = X(X′X)⁻¹X′Y = HY

V(b) = σ²(X′X)⁻¹
s²(b) = MSE(X′X)⁻¹
The Matrix Approach to Regression Analysis (2)
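The matrix formulas above can be verified numerically on the Example 11-1 data. A minimal sketch (numpy assumed):

```python
import numpy as np

x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)
y  = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)

X = np.column_stack([np.ones(10), x1, x2])   # design matrix with intercept column
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y      # b = (X'X)^(-1) X'Y, ~[47.165, 1.599, 1.149]
H = X @ XtX_inv @ X.T      # hat matrix: Y_hat = H Y
e = y - H @ y              # residuals
mse = e @ e / (10 - 3)     # SSE / (n - (k + 1))
s2_b = mse * XtX_inv       # s^2(b) = MSE (X'X)^(-1)
print(b, np.sqrt(np.diag(s2_b)))  # estimates and their standard errors
```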