regression analysis part b calculation procedures
Post on 29-Jan-2016
34 Views
Preview:
DESCRIPTION
TRANSCRIPT
RegressionAnalysis
Part BCalculation Procedures
Read Chapters 3, 4 and 5of Forecasting and Time Series, An Applied Approach.
L01B MGS 8110 - Regression - Calculations 2
Part A – Basic Model & Parameter Estimation
Part B – Calculation Procedures
Part C – Inference: Confidence Intervals & Hypothesis Testing
Part D – Goodness of Fit
Part E – Model Building
Part F – Transformed Variables
Part G – Standardized Variables
Part H – Dummy Variables
Part I – Eliminating Intercept
Part J - Outliers
Part K – Regression Example #1
Part L – Regression Example #2
Part N – Non-linear Regression
Part P – Non-linear Example
Regression Analysis Modules
L01B MGS 8110 - Regression - Calculations 3
Alternative Calculation Procedures
- Manual - use Excel and type in the formulas and intermediate steps.
- Use the Data Analysis option of Excel.
- Use SPSS statistical software program.
L01B MGS 8110 - Regression - Calculations 4
Univariate Regression Data
Quarter R D Sales Quarter R D Sales1 9.25 40 15 29.25 1032 12.50 37 16 32.75 1173 17.50 50 17 30.00 1314 20.00 70 18 28.00 985 15.00 60 19 33.50 1126 18.00 60 20 38.25 1347 22.00 72 21 32.00 1538 25.25 88 22 25.25 1459 15.00 101 23 22.25 101
10 20.25 80 24 25.00 8911 24.25 81 25 26.25 9012 27.50 97 26 31.25 10513 25.00 110 27 30.00 12514 25.75 89 28 40.50 145
L01B MGS 8110 - Regression - Calculations 5
Manual CalculationsUnivariate Case
123456789101112131415161718192021222324252627282930313233
A B C D E F G H I J KX Y
Quarter R D Sales1 9.25 40 370.00 85.56 numerator = 5,175.02 =D31-(28)*(B32)*(C32)
2 12.50 37 462.50 156.25 denomiator = 1,495.92 =E31-(28)*(B32)^2
3 17.50 50 875.00 306.254 20.00 70 1,400.00 400.00 b = 3.46 =H3/H4
5 15.00 60 900.00 225.006 18.00 60 1,080.00 324.00 a = 9.15 =C32-(H6)*(B32)
7 22.00 72 1,584.00 484.008 25.25 88 2,222.00 637.569 15.00 101 1,515.00 225.0010 20.25 80 1,620.00 410.0611 24.25 81 1,964.25 588.0612 27.50 97 2,667.50 756.2513 25.00 110 2,750.00 625.0014 25.75 89 2,291.75 663.0615 29.25 103 3,012.75 855.5616 32.75 117 3,831.75 1,072.5617 30.00 131 3,930.00 900.0018 28.00 98 2,744.00 784.0019 33.50 112 3,752.00 1,122.2520 38.25 134 5,125.50 1,463.0621 32.00 153 4,896.00 1,024.0022 25.25 145 3,661.25 637.5623 22.25 101 2,247.25 495.0624 25.00 89 2,225.00 625.0025 26.25 90 2,362.50 689.0626 31.25 105 3,281.25 976.56 =B28^2
27 30.00 125 3,750.00 900.0028 40.50 145 5,872.50 1,640.25
Sum's = 701.50 2,683.00 72,393.75 19,071.00Mean's = 25.05 95.82 =SUM(E3:E30)
=B31/28
XY X2
xb-y a
b
ˆˆ
22ˆ
xnix
yxniyix
L01B MGS 8110 - Regression - Calculations 6
Excel, Data Analysis Calculations Univariate Case
X Y
Quarter R D Sales TOOLS / DATA ANALYSIS / Regression1 9.25 402 12.50 373 17.50 504 20.00 705 15.00 606 18.00 60 Regression Statistics7 22.00 728 25.25 889 15.00 10110 20.25 8011 24.25 8112 27.50 9713 25.00 11014 25.75 8915 29.25 10316 32.75 11717 30.00 13118 28.00 9819 33.50 11220 38.25 13421 32.00 15322 25.25 14523 22.25 10124 25.00 8925 26.25 9026 31.25 10527 30.00 12528 40.50 145
L01B MGS 8110 - Regression - Calculations 7
Excel, Data Analysis Calculations Univariate Case
(continued)
SUMMARY OUTPUTRegression Statistics
Multiple R 0.831R Square 0.691Adjusted R Square 0.680Standard Error 17.532Observations 28
ANOVAdf SS MS F Significance F
Regression 1 17902.6 17902.6 58.2 4.2298E-08Residual 26 7991.5 307.4Total 27 25894.1
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0%Upper 95.0%Intercept 9.15 11.83 0.77 0.45 -15.17 33.47 -15.17 33.47R D 3.46 0.45 7.63 0.00 2.53 4.39 2.53 4.39
L01B MGS 8110 - Regression - Calculations 8
SPSS Data Analysis Calculations Univariate Case
SPSS: Analyze/Regression/Linear/
L01B MGS 8110 - Regression - Calculations 9
SPSS Data Analysis Calculations Univariate Case
(continued)Model Summary
.831a .691 .680 17.532Model1
R R SquareAdjustedR Square
Std. Error ofthe Estimate
Predictors: (Constant), R Da.
ANOVAb
17902.572 1 17902.572 58.245 .000a
7991.535 26 307.367
25894.107 27
Regression
Residual
Total
Model1
Sum ofSquares df Mean Square F Sig.
Predictors: (Constant), R Da.
Dependent Variable: SALESb.
Coefficientsa
9.151 11.830 .774 .446
3.459 .453 .831 7.632 .000 1.000 1.000
(Constant)
R D
Model1
B Std. Error
UnstandardizedCoefficients
Beta
StandardizedCoefficients
t Sig. Tolerance VIF
Collinearity Statistics
Dependent Variable: SALESa.
L01B MGS 8110 - Regression - Calculations 10
Multivariate Regression Data
House Price Size Age1 68.70 2.05 3.432 54.90 1.70 11.613 51.50 1.47 8.314 71.60 1.75 0.005 58.40 1.94 7.416 40.70 1.19 31.707 51.70 1.56 16.108 71.90 1.95 2.059 57.10 1.60 1.7410 58.30 1.49 2.7611 73.50 1.91 0.0012 58.50 1.38 0.0013 49.10 1.55 12.6114 67.50 1.88 2.8015 53.70 1.60 7.0816 50.00 1.55 18.00
YXXX
1
β
L01B MGS 8110 - Regression - Calculations 11
Manual CalculationsMultivariate Case
(1 of 4)
Y
68.70 1 2.05 3.43 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
54.90 1 1.70 11.61 2.05 1.7 1.47 1.75 1.94 1.19 1.56 1.95 1.6 1.49 1.91 1.38 1.55 1.88 1.6 1.55
51.50 1 1.47 8.31 3.43 11.61 8.31 0 7.41 31.7 16.1 2.05 1.74 2.76 0 0 12.61 2.8 7.08 18
71.60 1 1.75 0.00
58.40 1 1.94 7.41
40.70 1 1.19 31.70
51.70 1 1.56 16.10
71.90 1 1.95 2.05
57.10 1 1.60 1.74
58.30 1 1.49 2.76
73.50 1 1.91 0.00
58.50 1 1.38 0.00
49.10 1 1.55 12.61
67.50 1 1.88 2.80
53.70 1 1.60 7.08
50.00 1 1.55 18.00 then Hi-light area for transposed array or Hold Shift & Control, Press Enter
Hi-light formula bar =TRANSPOSE(H2:J17) for the above pop-up menu
Hold Shift & Control, Press Enter instead of just pressing Enter for OK
X'X
L01B MGS 8110 - Regression - Calculations 12
Manual Calculations Multivariate Case
(2 of 4)
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1.00 2.05 3.432.05 1.7 1.47 1.75 1.94 1.19 1.56 1.95 1.6 1.49 1.91 1.38 1.55 1.88 1.6 1.55 1.00 1.70 11.63.43 11.61 8.31 0 7.41 31.7 16.1 2.05 1.74 2.76 0 0 12.61 2.8 7.08 18 1.00 1.47 8.31
1.00 1.75 01.00 1.94 7.411.00 1.19 31.7
16.0 26.6 125.6 1.00 1.56 16.126.6 45 191.1 1.00 1.95 2.05125.6 191.1 2090 1.00 1.60 1.74
1.00 1.49 2.761.00 1.91 01.00 1.38 01.00 1.55 12.61.00 1.88 2.81.00 1.60 7.081.00 1.55 18
X'X
XX'
Shift + Control then Enter
L01B MGS 8110 - Regression - Calculations 13
Manual Calculations Multivariate Case
(3 of 4)
16.00 26.57 125.60
26.57 44.96 191.13
125.60 191.13 2090.45
5.75146 -3.15652 -0.056962156
-3.15652 1.76875 0.027935964
-0.056962156 0.027935964 0.001346624
Verification1.0 1.5987E-14 -2.5580E-13
-8.8818E-15 1.0 1.2790E-13
-1.1102E-16 -2.7756E-16 1.0
X'X
(X'X)-1
Shift + Control then Enter
Shift + Control then Enter
L01B MGS 8110 - Regression - Calculations 14
Manual Calculations Multivariate Case
(4 of 4)
5.75146 -3.15652 -0.05696
-3.15652 1.76875 0.02794
-0.05696 0.02794 0.00135
X'Y
937.10
1583.52
6352.30
b=(X'X)-1(X'Y)
29.4198
20.3318
-0.5878
(X'X)-1
Shift + Control then Enter
L01B MGS 8110 - Regression - Calculations 15
Excel, Data Analysis Calculations Multivariate Case
House Price Size Age TOOLS / DATA ANALYSIS / Regression1 68.70 2.05 3.432 54.90 1.70 11.613 51.50 1.47 8.31 Regression Statistics4 71.60 1.75 0.005 58.40 1.94 7.416 40.70 1.19 31.707 51.70 1.56 16.108 71.90 1.95 2.059 57.10 1.60 1.7410 58.30 1.49 2.7611 73.50 1.91 0.0012 58.50 1.38 0.0013 49.10 1.55 12.6114 67.50 1.88 2.8015 53.70 1.60 7.0816 50.00 1.55 18.00
L01B MGS 8110 - Regression - Calculations 16
Excel, Data Analysis Calculations Multivariate Case
(continued)
SUMMARY OUTPUTRegression Statistics
Multiple R 0.914R Square 0.836Adjusted R Square0.810Standard Error 4.166Observations 16
ANOVAdf SS MS F Significance F
Regression 2 1146.2 573.1 33.0 0.0Residual 13 225.6 17.4Total 15 1371.8
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0%Upper 95.0%Intercept 29.42 9.99 2.94 0.011 7.84 51.00 7.84 51.00Size 20.33 5.54 3.67 0.003 8.36 32.30 8.36 32.30Age -0.59 0.15 -3.85 0.002 -0.92 -0.26 -0.92 -0.26
L01B MGS 8110 - Regression - Calculations 17
SPSS Data Analysis Calculations Multivariate Case
SPSS: Analyze/Regression/Linear/
L01B MGS 8110 - Regression - Calculations 18
SPSS Data Analysis Calculations Multivariate Case
(continued)
Model Summary
.914a .836 .810 4.1657Model1
R R SquareAdjustedR Square
Std. Error ofthe Estimate
Predictors: (Constant), AGE, SIZEa.
ANOVAb
1146.245 2 573.123 33.027 .000a
225.589 13 17.353
1371.834 15
Regression
Residual
Total
Model1
Sum ofSquares df Mean Square F Sig.
Predictors: (Constant), AGE, SIZEa.
Dependent Variable: PRICEb.
Coefficientsa
29.420 9.990 2.945 .011
20.332 5.540 .503 3.670 .003
-.588 .153 -.527 -3.845 .002
(Constant)
SIZE
AGE
Model1
B Std. Error
UnstandardizedCoefficients
Beta
StandardizedCoefficients
t Sig.
Dependent Variable: PRICEa.
L01B MGS 8110 - Regression - Calculations 19
Test for Multicollinearity by Correlation Analysis in Excel
High correlation between dependent variable and the independent variables is desirable.
High correlation between the independent variables is an undesirable. A potential multicollinearity condition.
Excel: TOOLS / DATA ANALYSIS / Correlation
House Price Size Age Price Size Age1 68.70 2.05 3.43 Price 12 54.90 1.70 11.61 Size 0.805 13 51.50 1.47 8.31 Age -0.816 -0.572 14 71.60 1.75 0.005 58.40 1.94 7.41 TOOLS / DATA ANALYSIS / Correlation6 40.70 1.19 31.707 51.70 1.56 16.108 71.90 1.95 2.059 57.10 1.60 1.7410 58.30 1.49 2.7611 73.50 1.91 0.0012 58.50 1.38 0.0013 49.10 1.55 12.6114 67.50 1.88 2.8015 53.70 1.60 7.0816 50.00 1.55 18.00
L01B MGS 8110 - Regression - Calculations 20
Test for Multicollinearity by Correlation Analysis in SPSS
High correlation between dependent variable and the independent variables is desirable.
High correlation between the independent variables is an undesirable, multicollinearity condition.
SPSS: Analysis / Correlate / Bivariate
Correlations
1 .805** -.816**
. .000 .000
16 16 16
.805** 1 -.572*
.000 . .020
16 16 16
-.816** -.572* 1
.000 .020 .
16 16 16
Pearson Correlation
Sig. (2-tailed)
N
Pearson Correlation
Sig. (2-tailed)
N
Pearson Correlation
Sig. (2-tailed)
N
PRICE
SIZE
AGE
PRICE SIZE AGE
Correlation is significant at the 0.01 level (2-tailed).**.
Correlation is significant at the 0.05 level (2-tailed).*.
L01B MGS 8110 - Regression - Calculations 21
How large will a correlation be when there is a multicollinearity condition?
Skip says: > .98 may be a problem.
Textbook says: > .90 may be a problem.
L01B MGS 8110 - Regression - Calculations 22
Test for Multicollinearity by VIF in SPSS
SPSS: Analysis / Regression / Linear
Coefficients St Coef t Sig. Collinearity StatisticsB Std. Error Beta Tolerance VIF
(Constant) 29.420 9.99 2.94 0.011SIZE 20.332 5.54 0.503 3.67 0.003 0.67 1.49AGE -0.588 0.15 -0.527 -3.85 0.002 0.67 1.49
Potential multicollinearity:– If largest Rj
2 > .9– If largest VIFj > 10– If Mean VIF >>> 1
L01B MGS 8110 - Regression - Calculations 23
Regression Age on SizeRegression Statistics VIF
Multiple R 0.572 1.49R Square 0.328 =1/(1-H6)
Adjusted R Square 0.280Standard Error 0.201Observations 16
ANOVAdf SS MS F Sig F
Regression 1 0.28 0.28 6.82 0.020Residual 14 0.57 0.04Total 15 0.84
CoefficientsStandard Error t Stat P-value Low 95% Hi 95%Intercept 1.78 0.07 25.82 0.000 1.64 1.93Age -0.02 0.01 -2.61 0.020 -0.03 0.00
Regression Size on AgeRegression Statistics VIF
Multiple R 0.572 1.49R Square 0.328 =1/(1-H25)
Adjusted R Square 0.280Standard Error 7.283Observations 16
ANOVAdf SS MS F Sig F
Regression 1 361.89 361.89 6.82 0.020Residual 14 742.60 53.04Total 15 1104.49
CoefficientsStandard Error t Stat P-value Low 95% Hi 95%Intercept 42.30 13.31 3.18 0.007 13.74 70.86Size -20.75 7.94 -2.61 0.020 -37.78 -3.71
Verification of Calculated VIF Values
Coefficients St Coef t Sig. Collinearity StatisticsB Std. Error Beta Tolerance VIF
(Constant) 29.420 9.99 2.94 0.011SIZE 20.332 5.54 0.503 3.67 0.003 0.67 1.49AGE -0.588 0.15 -0.527 -3.85 0.002 0.67 1.49
2
|1
1
SizeAge
AGE RVIF
L01B MGS 8110 - Regression - Calculations 24
Calculated VIF Values if only Excel is Available
ptoifor
RVIF
RVIF
pIDiIDiIDIDIDiIDiID
pIDIDIDIDID
1
1
1 termsgeneral moreIn
1
1
2#),....,1(#),1(#,...2#,1#|#
#
2#,...,3#,2#|1#
1#
The R2 for each of the independent variables versus all of the remaining independent variables is needed to calculate the VIF’s. That is, “p” linear regression would need to be calculated. There is a useful trick that can be used to avoid doing the “p” regressions. The procedure is described in the next slides.
L01B MGS 8110 - Regression - Calculations 25
Calculated the R2 for each of the independent variables.
.matrixn correlatio p p theis C where/1
ψ.matrix theof diagonal on the elements theare s'R required The
1diag
2
CI
L01B MGS 8110 - Regression - Calculations 26
Calculated the R2 (continued) .matrixn correlatio p p theis C where/1 1diag CI
These are the desired R2 ‘s
EXAMPLE DATAheight shldr pelvic chest thigh179.6 41.7 27.3 82.4 19175.6 37.5 29.1 84.1 5.5166.2 39.4 26.8 88.1 22173.8 41.2 27.6 97.6 19.5184.8 39.8 26.1 88.2 14.5189.1 43.3 30.1 101.2 22191.5 42.8 28.4 91 18180.2 41.6 27.3 90.4 5.5183.8 42.3 30.1 100.2 13.5163.1 37.2 24.2 80.5 7169.6 39.4 27.2 92 16.5171.6 39.1 27 86.2 25.5180 40.8 28.3 87.4 17.5
174.6 39.8 25.9 83.9 16.5181.8 40.6 29.5 95.1 32167.4 39.7 26.4 86 13173 41.2 26.9 96.1 11.5
179.8 40 29.8 100.9 15176.8 41.2 28.4 100.8 20.5179.3 41.4 31.6 90.1 9.5193.5 41.6 29.2 95.7 21178.8 39.3 27.1 83 16.5179.6 43.8 30.1 100.8 22172.6 40.9 27.3 91.5 22171.5 40.4 27.8 87.7 15.5168.9 39.8 26.7 83.9 6183.1 43.2 28.3 95.7 11163.6 37.5 26.6 84 15.5184.3 40.3 29 93.2 8.5181 42.8 29.7 90.3 8.5
180.2 41.4 28.7 88.1 13.5184.1 42 28.9 81.3 14178.9 42.5 28.7 95 16170 39.7 27.7 93.6 15
180.6 42.1 27.3 89.5 16179 40.8 28.2 90.3 26.5
186.6 42.5 31.5 100.3 27181.4 41.9 28.9 96.6 25.5176.5 40.7 29.1 86.5 20.5174 40.9 27 88.1 18
178.2 42.9 27.2 100.3 16.5177.1 39.4 27.6 85.5 16180 40.9 28.7 86.1 15
176.8 41.3 28.2 92.7 12.5176.3 39 26 83.3 7192.4 43.7 28.7 96.1 20.5175.2 39.4 27.3 90.8 19175.9 43.4 29.3 90.7 18174.6 42.3 29.2 82.6 3.5179 41.2 27.3 85.6 16
C =Correlation Tableheight shldr pelvic chest thigh
height 1.000 0.654 0.586 0.426 0.223shldr 0.654 1.000 0.582 0.554 0.205
pelvic 0.586 0.582 1.000 0.522 0.207chest 0.426 0.554 0.522 1.000 0.398thigh 0.223 0.205 0.207 0.398 1.000
C inverse = MINVERSE(B9:F13)height shldr pelvic chest thigh
height 1.984 -0.942 -0.612 0.055 -0.145shldr -0.942 2.189 -0.411 -0.639 0.102
pelvic -0.612 -0.411 1.846 -0.488 0.032chest 0.055 -0.639 -0.488 1.780 -0.488thigh -0.145 0.102 0.032 -0.488 1.199
Term by term reciprocal of C inverseheight shldr pelvic chest thigh
height 0.504 -1.061 -1.634 18.325 -6.907shldr -1.061 0.457 -2.436 -1.566 9.831
pelvic -1.634 -2.436 0.542 -2.050 31.567chest 18.325 -1.566 -2.050 0.562 -2.048thigh -6.907 9.831 31.567 -2.048 0.834
= R2
height shldr pelvic chest thigh
height 0.496shldr 0.543
pelvic 0.458chest 0.438thigh 0.166
L01B MGS 8110 - Regression - Calculations 27
Calculated the R2 (continued)
CONCLUSIONThe VIFs are the diagonals of the C-Inverse matrix (see previous slide).
ptoifor
RVIF
pIDiIDiIDIDIDiIDiID
1
1
12
#),....,1(#),1(#,...2#,1#|##
= R2
height shldr pelvic chest thigh
height 0.496shldr 0.543
pelvic 0.458chest 0.438thigh 0.166
= 1-R2
height shldr pelvic chest thigh
height 0.504shldr 0.457
pelvic 0.542chest 0.562thigh 0.834
VIF = = 1/(1-R2)height shldr pelvic chest thigh
height 1.984shldr 2.189
pelvic 1.846chest 1.780thigh 1.199
L01B MGS 8110 - Regression - Calculations 28
Verification of the R2 Calculations Individual regression
fits.R-square 0.496
Constant 56.77shldr 2.01
pelvic 1.44chest -0.0309thigh 0.0801
R-square 0.543Constant 10.18
height 0.102pelvic 0.207chest 0.0775thigh -0.0121
R-square 0.458Constant 1.504
height 0.071
shldr 0.202chest 0.0636thigh -0.0040
R-square 0.438Constant 3.83
height -0.0273shldr 1.351
pelvic 1.139
thigh 0.268
R-square 0.166Constant -24.616
height 0.110shldr -0.326
pelvic -0.112chest 0.416
C =Correlation Tableheight shldr pelvic chest thigh
height 1.000 0.654 0.586 0.426 0.223shldr 0.654 1.000 0.582 0.554 0.205
pelvic 0.586 0.582 1.000 0.522 0.207chest 0.426 0.554 0.522 1.000 0.398thigh 0.223 0.205 0.207 0.398 1.000
C inverse = MINVERSE(B9:F13)height shldr pelvic chest thigh
height 1.984 -0.942 -0.612 0.055 -0.145shldr -0.942 2.189 -0.411 -0.639 0.102
pelvic -0.612 -0.411 1.846 -0.488 0.032chest 0.055 -0.639 -0.488 1.780 -0.488thigh -0.145 0.102 0.032 -0.488 1.199
Term by term reciprocal of C inverseheight shldr pelvic chest thigh
height 0.504 -1.061 -1.634 18.325 -6.907shldr -1.061 0.457 -2.436 -1.566 9.831
pelvic -1.634 -2.436 0.542 -2.050 31.567chest 18.325 -1.566 -2.050 0.562 -2.048thigh -6.907 9.831 31.567 -2.048 0.834
= R2
height shldr pelvic chest thigh
height 0.496shldr 0.543
pelvic 0.458chest 0.438thigh 0.166
L01B MGS 8110 - Regression - Calculations 29
= R2
height shldr pelvic chest thigh
height 0.496shldr 0.543
pelvic 0.458chest 0.438thigh 0.166
= 1-R2
height shldr pelvic chest thigh
height 0.504shldr 0.457
pelvic 0.542chest 0.562thigh 0.834
VIF = = 1/(1-R2)height shldr pelvic chest thigh
height 1.984shldr 2.189
pelvic 1.846chest 1.780thigh 1.199
Unstandardized Coefficients Standardized Coefficientst Sig.B Std. Error Beta VIF
(Constant) -146.61 13.96 -10.50 0.00height 0.352 0.101 0.206 3.500 0.001 1.98shldr 0.590 0.446 0.082 1.322 0.193 2.19pelvic 0.927 0.452 0.116 2.049 0.046 1.85chest 1.162 0.107 0.605 10.874 0.000 1.78
thigh 0.426 0.086 0.227 4.967 0.000 1.20Dependent Variable: weight
Verification of the R2
Calculations (continued)
From SPSS output.
top related