15103903 multiple regression (1)
TRANSCRIPT
-
7/31/2019 15103903 Multiple Regression (1)
1/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-1
Multiple Regression Analysisand Model Building
-
7/31/2019 15103903 Multiple Regression (1)
2/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-2
Chapter Goals
After completing this chapter, you should be ableto:
explain model building using multiple regression
analysis apply multiple regression analysis to business
decision-making situations
analyze and interpret the computer output for amultiple regression model
test the significance of the independent variables ina multiple regression model
-
7/31/2019 15103903 Multiple Regression (1)
3/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-3
Chapter Goals
After completing this chapter, you should beable to:
recognize potential problems in multipleregression analysis and take steps to correct theproblems
incorporate qualitative variables into the
regression model by using dummy variables use variable transformations to model nonlinear
relationships
(continued)
-
7/31/2019 15103903 Multiple Regression (1)
4/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-4
The Multiple Regression Model
Idea: Examine the linear relationship between1 dependent (y) & 2 or more independent variables (xi)
xxxy kk22110 +++++=
kk22110 xbxbxbby ++++=
Population model:
Y-intercept Population slopes Random Error
Estimated(or predicted)value of y
Estimated slope coefficients
Estimated multiple regression model:
Estimatedintercept
-
7/31/2019 15103903 Multiple Regression (1)
5/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-5
Multiple Regression Model
Two variable model
y
x1
x2
22110 xbxbby ++=
Slopef
orvariabl
ex1
Slopeforvar
iablex2
-
7/31/2019 15103903 Multiple Regression (1)
6/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-6
Multiple Regression Model
Two variable model
y
x1
x2
22110 xbxbby ++=yi
yi
-
7/31/2019 15103903 Multiple Regression (1)
7/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-7
Multiple Regression Assumptions
The model errors are independent andrandom
The errors are normally distributed
The mean of the errors is zero Errors have a constant variance
e = (y y)
-
7/31/2019 15103903 Multiple Regression (1)
8/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-8
Model Specification
Decide what you want to do and select thedependent variable
Determine the potential independent variables foryour model
Gather sample data (observations) for all variables
-
7/31/2019 15103903 Multiple Regression (1)
9/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-9
The Correlation Matrix
Correlation between the dependent variable andselected independent variables can be found usingExcel: Formula Tab: Data Analysis / Correlation
Can check for statistical significance of correlationwith a t test
-
7/31/2019 15103903 Multiple Regression (1)
10/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-10
Example
A distributor of frozen desert pies wants toevaluate factors thought to influence demand
Dependent variable: Pie sales (units per week)
Independent variables: Price (in $)
Advertising ($100s)
Data are collected for 15 weeks
-
7/31/2019 15103903 Multiple Regression (1)
11/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-11
Pie Sales Model
Sales = b0 + b1 (Price)
+ b2 (Advertising)
Week Pie Sales Price($) Advertising($100s)
1 350 5.50 3.3
2 460 7.50 3.3
3 350 8.00 3.0
4 430 8.00 4.5
5 350 6.80 3.06 380 7.50 4.0
7 430 4.50 3.0
8 470 6.40 3.7
9 450 7.00 3.5
10 490 5.00 4.0
11 340 7.20 3.5
12 300 7.90 3.2
13 440 5.90 4.0
14 450 5.00 3.5
15 300 7.00 2.7
Pie Sales Price Advertising
Pie Sales 1
Price -0.44327 1
Advertising 0.55632 0.03044 1
Correlation matrix:
Multiple regression model:
-
7/31/2019 15103903 Multiple Regression (1)
12/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-12
Interpretation of EstimatedCoefficients
Slope (bi)
Estimates that the average value of y changes by bi
units for each 1 unit increase in Xi holding all other
variables constant Example: if b1 = -20, then sales (y) is expected to
decrease by an estimated 20 pies per week for each $1increase in selling price (x1), net of the effects of
changes due to advertising (x2) y-intercept (b0)
The estimated average value of y when all xi = 0
(assuming all xi = 0 is within the range of observed
values)
-
7/31/2019 15103903 Multiple Regression (1)
13/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-13
Pie Sales Correlation Matrix
Price vs. Sales : r = -0.44327 There is a negative association between
price and sales
Advertising vs. Sales : r = 0.55632 There is a positive association between
advertising and sales
Pie Sales Price Advertising
Pie Sales 1
Price -0.44327 1
Advertising 0.55632 0.03044 1
-
7/31/2019 15103903 Multiple Regression (1)
14/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-14
Scatter Diagrams
Sales vs. Price
0
100
200
300
400
500
600
0 2 4 6 8 10
Sales vs. Advertising
0
100
200
300
400
500
600
0 1 2 3 4 5
Sales
Sales
Price
Advertising
-
7/31/2019 15103903 Multiple Regression (1)
15/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-15
Estimating a Multiple LinearRegression Equation
Computer software is generally used to generatethe coefficients and measures of goodness of fitfor multiple regression
Excel: Data / Data Analysis / Regression
PHStat: Add-Ins / PHStat / Regression / Multiple Regression
-
7/31/2019 15103903 Multiple Regression (1)
16/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-16
Estimating a Multiple LinearRegression Equation
Excel: Data / Data Analysis / Regression
-
7/31/2019 15103903 Multiple Regression (1)
17/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-17
Estimating a Multiple LinearRegression Equation
PHStat: Add-Ins / PHStat / Regression / Multiple Regression
-
7/31/2019 15103903 Multiple Regression (1)
18/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-18
Multiple Regression Output
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
ertising)74.131(Advce)24.975(Pri-306.526Sales +=
-
7/31/2019 15103903 Multiple Regression (1)
19/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-19
The Multiple Regression Equation
ertising)74.131(Advce)24.975(Pri-306.526Sales +=
b1 = -24.975: sales
will decrease, onaverage, by 24.975
pies per week foreach $1 increase inselling price, net ofthe effects of changesdue to advertising
b2 = 74.131: sales will
increase, on average,by 74.131 pies per
week for each $100increase inadvertising, net of theeffects of changesdue to price
whereSales is in number of pies per week
Price is in $Advertising is in $100s.
-
7/31/2019 15103903 Multiple Regression (1)
20/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-20
Using The Model to MakePredictions
Predict sales for a week in which the sellingprice is $5.50 and advertising is $350:
Predicted salesis 428.62 pies
428.62
(3.5)74.131(5.50)24.975-306.526
ertising)74.131(Advce)24.975(Pri-306.526Sales
=
+=
+=
Note that Advertising isin $100s, so $350means that x2 = 3.5
-
7/31/2019 15103903 Multiple Regression (1)
21/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-21
Predictions in PHStat
PHStat | regression | multiple regression
Check theconfidence andprediction intervalestimates box
-
7/31/2019 15103903 Multiple Regression (1)
22/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-22
Input values
Predictions in PHStat(continued)
Predicted y value
-
7/31/2019 15103903 Multiple Regression (1)
23/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-23
Multiple Coefficient ofDetermination (R2)
Reports the proportion of total variation in yexplained by all x variables taken together
squaresofsumTotal
regressionsquaresofSum
SST
SSRR2 ==
-
7/31/2019 15103903 Multiple Regression (1)
24/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-24
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
.5214856493.3
29460.0
SST
SSRR2 ===
52.1% of the variation in pie sales
is explained by the variation inprice and advertising
Multiple Coefficient ofDetermination
(continued)
-
7/31/2019 15103903 Multiple Regression (1)
25/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-25
Adjusted R2
R2 never decreases when a new x variable isadded to the model This can be a disadvantage when comparing
models What is the net effect of adding a new variable?
We lose a degree of freedom when a new xvariable is added
Did the new x variable add enough explanatorypower to offset the loss of one degree offreedom?
-
7/31/2019 15103903 Multiple Regression (1)
26/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-26
Shows the proportion of variation in y explained byall x variables adjusted for the number of xvariablesused
(where n = sample size, k = number of independent variables)
Penalize excessive use of unimportant independentvariables
Smaller than R2
Useful in comparing among models
Adjusted R2
(continued)
=
1kn1n)R1(1R 22A
-
7/31/2019 15103903 Multiple Regression (1)
27/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-27
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
.44172R2A =44.2% of the variation in pie sales isexplained by the variation in price and
advertising, taking into account the samplesize and number of independent variables
Multiple Coefficient ofDetermination
(continued)
-
7/31/2019 15103903 Multiple Regression (1)
28/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-28
Is the Model Significant?
F-Test for Overall Significance of the Model
Shows if there is a linear relationship between all of
the x variables considered together and y
Use F test statistic
Hypotheses:
H0:
1=
2= =
k= 0 (no linear relationship)
HA: at least one i 0 (at least one independentvariable affects y)
-
7/31/2019 15103903 Multiple Regression (1)
29/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-29
F-Test for Overall Significance
Test statistic:
where F has (numerator) D1 = k and
(denominator) D2 = (n k 1)
degrees of freedom
(continued)
MSE
MSR
1kn
SSEk
SSR
F =
=
-
7/31/2019 15103903 Multiple Regression (1)
30/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-30
6.53862252.8
14730.0
MSE
MSRF ===
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
(continued)
F-Test for Overall Significance
With 2 and 12 degreesof freedom P-value forthe F-Test
-
7/31/2019 15103903 Multiple Regression (1)
31/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-31
H0: 1 = 2 = 0
HA: 1 and 2 not both zero
= .05
df1= 2 df2 = 12
Test Statistic:
Decision:
Conclusion:
Reject H0 at = 0.05
The regression model does explaina significant portion of thevariation in pie sales
(There is evidence that at least oneindependent variable affects y )
0 = .05
F.05
= 3.885Reject H0Do not
reject H0
6.5386MSE
MSRF ==
CriticalValue:
F
= 3.885
F-Test for Overall Significance(continued)
F
A I di id l V i bl
-
7/31/2019 15103903 Multiple Regression (1)
32/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-32
Are Individual VariablesSignificant?
Use t-tests of individual variable slopes
Shows if there is a linear relationship between the
variable xi and y Hypotheses:
H0: i = 0 (no linear relationship)
HA: i 0 (linear relationship does existbetween xi and y)
A I di id l V i bl
-
7/31/2019 15103903 Multiple Regression (1)
33/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-33
Are Individual VariablesSignificant?
H0: i = 0 (no linear relationship)
HA: i 0 (linear relationship does exist
between xi and y )
Test Statistic:
(df = n k 1)
ib
i
s0bt =
(continued)
A I di id l V i bl
-
7/31/2019 15103903 Multiple Regression (1)
34/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-34
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
t-value for Price is t = -2.306, withp-value .0398
t-value for Advertising is t = 2.855,
with p-value .0145
(continued)
Are Individual VariablesSignificant?
I f b t th Sl
-
7/31/2019 15103903 Multiple Regression (1)
35/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-35
d.f. = 15-2-1 = 12
= .05
t /2 = 2.1788
Inferences about the Slope:tTest Example
H0: i = 0
HA: i 0
The test statistic for each variable falls inthe rejection region (p-values < .05)
There is evidence that bothPrice and Advertising affectpie sales at = .05
From Excel output:
Reject H0 for each variable
Coefficients Standard Error t Stat P-value
Price -24.97509 10.83213 -2.30565 0.03979
Advertising 74.13096 25.96732 2.85478 0.01449
Decision:
Conclusion:Reject H0Reject H0
/2=.025
-t/2Do not reject H0
0t/2
/2=.025
-2.1788 2.1788
C fid I t l E ti t
-
7/31/2019 15103903 Multiple Regression (1)
36/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-36
Confidence Interval Estimatefor the Slope
Confidence interval for the population slope 1(the effect of changes in price on pie sales):
Example: Weekly sales are estimated to be reduced bybetween 1.37 to 48.58 pies for each increase of $1 in theselling price
ib2/istb
Coefficients Standard Error Lower 95% Upper 95%
Intercept 306.52619 114.25389 57.58835 555.46404
Price -24.97509 10.83213 -48.57626 -1.37392
Advertising 74.13096 25.96732 17.55303 130.70888
where t has(n k 1) d.f.
St d d D i ti f th
-
7/31/2019 15103903 Multiple Regression (1)
37/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-37
Standard Deviation of theRegression Model
The estimate of the standard deviation of theregression model is:
MSEkn
SSEs =
=
1
Is this value large or small? Must compare to themean size of y for comparison
St d d D i ti f th
-
7/31/2019 15103903 Multiple Regression (1)
38/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-38
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
The standard deviation of theregression model is 47.46
(continued)
Standard Deviation of theRegression Model
St d d D i ti f th
-
7/31/2019 15103903 Multiple Regression (1)
39/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-39
The standard deviation of the regression model is47.46
A rough prediction range for pie sales in a given
week is Pie sales in the sample were in the 300 to 500 per
week range, so this range is probably too large to
be acceptable. The analyst may want to look foradditional variables that can explain more of thevariation in weekly sales
(continued)
Standard Deviation of theRegression Model
94.22(47.46) =
-
7/31/2019 15103903 Multiple Regression (1)
40/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-40
Multicollinearity
Multicollinearity: High correlation exists
between two independent variables
This means the two variables contributeredundant information to the multiple regression
model
-
7/31/2019 15103903 Multiple Regression (1)
41/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-41
Multicollinearity
Including two highly correlated independent
variables can adversely affect the regression
results
No new information provided
Can lead to unstable coefficients (large
standard error and low t-values)
Coefficient signs may not match prior
expectations
(continued)
S I di ti f S
-
7/31/2019 15103903 Multiple Regression (1)
42/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-42
Some Indications of SevereMulticollinearity
Incorrect signs on the coefficients Large change in the value of a previous
coefficient when a new variable is added to the
model A previously significant variable becomes
insignificant when a new independent variableis added
The estimate of the standard deviation of themodel increases when a variable is added tothe model
-
7/31/2019 15103903 Multiple Regression (1)
43/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-43
Qualitative (Dummy) Variables
Categorical explanatory variable (dummyvariable) with two or more levels: yes or no, on or off, male or female
coded as 0 or 1 Regression intercepts are different if the
variable is significant Assumes equal slopes for other variables The number of dummy variables needed is
(number of levels 1)
Dummy Variable Model Example
-
7/31/2019 15103903 Multiple Regression (1)
44/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-44
Dummy-Variable Model Example(with 2 Levels)
Let:
y = pie sales
x1 = price
x2 = holiday (X2 = 1 if a holiday occurred during the week)(X2 = 0 if there was no holiday that week)
210 xbxbby 21 ++=
Dummy Variable Model Example
-
7/31/2019 15103903 Multiple Regression (1)
45/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-45
Sameslope
Dummy-Variable Model Example(with 2 Levels)
(continued)
x1 (Price)
y (sales)
b0 + b2
b0
1010
12010
xbb(0)bxbby
xb)b(b(1)bxbby
121
121
+=++=
++=++= Holiday
No Holiday
Differentintercept
Holiday
NoHoliday
If H0: 2 = 0 is
rejected, then
Holiday has asignificant effecton pie sales
Interpreting the Dummy Variable
-
7/31/2019 15103903 Multiple Regression (1)
46/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-46
Sales: number of pies sold per week
Price: pie price in $
Holiday:
Interpreting the Dummy VariableCoefficient (with 2 Levels)
Example:
1 If a holiday occurred during the week
0 If no holiday occurred
b2 = 15: on average, sales were 15 pies greater inweeks with a holiday than in weeks without aholiday, given the same price
)15(Holiday30(Price)-300Sales +=
Dummy Variable Models
-
7/31/2019 15103903 Multiple Regression (1)
47/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-47
Dummy-Variable Models(more than 2 Levels)
The number of dummy variables is one less thanthe number of levels
Example:
y = house price ; x1 = square feet
The style of the house is also thought to matter:
Style = ranch, split level, condo
Three levels, so two dummyvariables are needed
Dummy Variable Models
-
7/31/2019 15103903 Multiple Regression (1)
48/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-48
Dummy-Variable Models(more than 2 Levels)
=
=
notif0
levelsplitif1x
notif0
ranchif1x 32
3210 xbxbxbby 321 +++=
b2 shows the impact on price if the house is aranch style, compared to a condo
b3 shows the impact on price if the house is a split
level style, compared to a condo
(continued)
Let the default category be condo
Interpreting the Dummy Variable
-
7/31/2019 15103903 Multiple Regression (1)
49/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-49
Interpreting the Dummy VariableCoefficients (with 3 Levels)
With the same square feet, aranch will have an estimatedaverage price of 23.53thousand dollars more than acondo
With the same square feet, aranch will have an estimatedaverage price of 18.84thousand dollars more than acondo.
Suppose the estimated equation is
321 18.84x23.53x0.045x20.43y +++=
18.840.045x20.43y 1 ++=
23.530.045x20.43y1
++=
10.045x20.43y +=For a condo: x2 = x3 = 0
For a ranch: x3 = 0
For a split level: x2 = 0
-
7/31/2019 15103903 Multiple Regression (1)
50/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-50
Model Building
Goal is to develop a model with the best set ofindependent variables Easier to interpret if unimportant variables are removed Lower probability of collinearity
Stepwise regression procedure Provide evaluation of alternative models as variables
are added
Best-subset approach Try all combinations and select the best using the
highest adjusted R2 and lowest s
-
7/31/2019 15103903 Multiple Regression (1)
51/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-51
Idea: develop the least squaresregression equation in steps, either
through forward selection, backwardelimination, or through standard stepwiseregression
Stepwise Regression
-
7/31/2019 15103903 Multiple Regression (1)
52/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-52
Best Subsets Regression
Idea: estimate all possible regression equationsusing all possible combinations of independentvariables
Choose the best fit by looking for the highestadjusted R2 and lowest standard error s
Stepwise regression and best subsetsregression can be performed using PHStat,
Minitab, or other statistical software packages
-
7/31/2019 15103903 Multiple Regression (1)
53/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-53
Aptness of the Model
Diagnostic checks on the model includeverifying the assumptions of multipleregression:
Errors are independent and random Error are normally distributed Errors have constant variance
Each xi is linearly related to y
)yy(ei =Errors (or Residuals) are given by
-
7/31/2019 15103903 Multiple Regression (1)
54/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-54
Residual Analysis
Non-constant variance Constant variance
x x
residuals
residuals
Not Independent Independent
x
residua
ls
x
residua
ls
-
7/31/2019 15103903 Multiple Regression (1)
55/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-55
The Normality Assumption
Errors are assumed to be normally distributed
Standardized residuals can be calculated bycomputer
Examine a histogram or a normal probability plot ofthe standardized residuals to check for normality
-
7/31/2019 15103903 Multiple Regression (1)
56/57
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 15-56
Chapter Summary
Developed the multiple regression model Tested the significance of the multiple
regression model Developed adjusted R2
Tested individual regression coefficients Used dummy variables
-
7/31/2019 15103903 Multiple Regression (1)
57/57
Chapter Summary
Described multicollinearity Discussed model building
Stepwise regression Best subsets regression
Examined residual plots to check modelassumptions
(continued)