multiple regression (1)
TRANSCRIPT
![Page 1: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/1.jpg)
Slide 1
Multiple Regression (1) By Shakeel Nouman M.Phil Statistics Govt. College University Lahore, Statistical Officer
Shakeel Nouman, M.Phil Statistics
Multiple Regression (1)
![Page 2: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/2.jpg)
Slide 2
11 Multiple Regression (1)
• Using Statistics
• The k-Variable Multiple Regression Model
• The F Test of a Multiple Regression Model
• How Good is the Regression
• Tests of the Significance of Individual Regression Parameters
• Testing the Validity of the Regression Model
• Using the Multiple Regression Model for Prediction
![Page 3: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/3.jpg)
Slide 3
11 Multiple Regression (2)
• Qualitative Independent Variables
• Polynomial Regression
• Nonlinear Models and Transformations
• Multicollinearity
• Residual Autocorrelation and the Durbin-Watson Test
• Partial F Tests and Variable Selection Methods
• The Matrix Approach to Multiple Regression Analysis
• Summary and Review of Terms
![Page 4: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/4.jpg)
Slide 4
11-1 Using Statistics
Lines: Any two points (A and B), or an intercept and slope ($\beta_0$ and $\beta_1$), define a line on a two-dimensional surface.
[Figure: a line in the (x, y) plane through points A and B, with intercept $\beta_0$ and slope $\beta_1$.]
Planes: Any three points (A, B, and C), or an intercept and coefficients of x1 and x2 ($\beta_0$, $\beta_1$, and $\beta_2$), define a plane in three-dimensional space.
[Figure: a plane over the (x1, x2) axes through points A, B, and C.]
![Page 5: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/5.jpg)
Slide 5
11-2 The k-Variable Multiple Regression Model
The population regression model of a dependent variable, Y, on a set of k independent variables, X1, X2, ..., Xk is given by:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$$
where $\beta_0$ is the Y-intercept of the regression surface and each $\beta_i$, i = 1, 2, ..., k, is the slope of the regression surface (sometimes called the response surface) with respect to $X_i$.
[Figure: the regression plane $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ plotted over the (x1, x2) axes.]
Model assumptions:
1. $\varepsilon \sim N(0, \sigma^2)$, independent of other errors.
2. The variables $X_i$ are uncorrelated with the error term.
![Page 6: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/6.jpg)
Slide 6
Simple and Multiple Least-Squares Regression
In a simple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression line:
$$\hat{y} = b_0 + b_1 x$$
In a multiple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression plane:
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$
[Figure: a fitted line in the (X, Y) plane beside a fitted plane over the (x1, x2) axes.]
![Page 7: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/7.jpg)
Slide 7
The Estimated Regression Relationship
The estimated regression relationship:
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$$
where $\hat{Y}$ is the predicted value of Y, the value lying on the estimated regression surface. The terms $b_0, \ldots, b_k$ are the least-squares estimates of the population regression parameters $\beta_i$.
The actual, observed value of Y is the predicted value plus an error:
$$y_j = b_0 + b_1 x_{1j} + b_2 x_{2j} + \cdots + b_k x_{kj} + e_j$$
![Page 8: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/8.jpg)
Slide 8
Least-Squares Estimation: The 2-Variable Normal Equations
Minimizing the sum of squared errors with respect to the estimated coefficients b0, b1, and b2 yields the following normal equations:
$$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$$
$$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$$
$$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$$
![Page 9: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/9.jpg)
Slide 9
Example 11-1

| Y | X1 | X2 | X1X2 | X1² | X2² | X1Y | X2Y |
|---|---|---|---|---|---|---|---|
| 72 | 12 | 5 | 60 | 144 | 25 | 864 | 360 |
| 76 | 11 | 8 | 88 | 121 | 64 | 836 | 608 |
| 78 | 15 | 6 | 90 | 225 | 36 | 1170 | 468 |
| 70 | 10 | 5 | 50 | 100 | 25 | 700 | 350 |
| 68 | 11 | 3 | 33 | 121 | 9 | 748 | 204 |
| 80 | 16 | 9 | 144 | 256 | 81 | 1280 | 720 |
| 82 | 14 | 12 | 168 | 196 | 144 | 1148 | 984 |
| 65 | 8 | 4 | 32 | 64 | 16 | 520 | 260 |
| 62 | 8 | 3 | 24 | 64 | 9 | 496 | 186 |
| 90 | 18 | 10 | 180 | 324 | 100 | 1620 | 900 |
| 743 | 123 | 65 | 869 | 1615 | 509 | 9382 | 5040 |

The last row gives the column totals.

Normal Equations:
743 = 10b0 + 123b1 + 65b2
9382 = 123b0 + 1615b1 + 869b2
5040 = 65b0 + 869b1 + 509b2

b0 = 47.164942
b1 = 1.5990404
b2 = 1.1487479

Estimated regression equation:
$$\hat{Y} = 47.164942 + 1.5990404 X_1 + 1.1487479 X_2$$
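The normal equations of Example 11-1 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names (`A`, `rhs`, and so on) are illustrative and not part of the original slides. It builds the 3×3 system from the raw data and solves it.

```python
import numpy as np

# Data from Example 11-1 (columns Y, X1, X2 of the table).
y = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)
x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)
n = len(y)

# Coefficient matrix and right-hand side of the three normal equations.
A = np.array([
    [n,        x1.sum(),        x2.sum()],
    [x1.sum(), (x1 ** 2).sum(), (x1 * x2).sum()],
    [x2.sum(), (x1 * x2).sum(), (x2 ** 2).sum()],
])
rhs = np.array([y.sum(), (x1 * y).sum(), (x2 * y).sum()])

# Solve the linear system for the least-squares estimates.
b0, b1, b2 = np.linalg.solve(A, rhs)
print(f"b0 = {b0:.6f}, b1 = {b1:.7f}, b2 = {b2:.7f}")
```

The printed estimates should agree with the values quoted on the slide up to rounding.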
![Page 10: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/10.jpg)
Slide 10
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
![Page 11: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/11.jpg)
Slide 11
Decomposition of the Total Deviation in a Multiple Regression Model
Total Deviation = Regression Deviation + Error Deviation
SST = SSR + SSE
Total deviation: $Y - \bar{Y}$
Regression deviation: $\hat{Y} - \bar{Y}$
Error deviation: $Y - \hat{Y}$
[Figure: an observed point above the regression plane over the (x1, x2) axes, with its total deviation split into the regression deviation and the error deviation.]
![Page 12: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/12.jpg)
Slide 12
11-3 The F Test of a Multiple Regression Model
A statistical test for the existence of a linear relationship between Y and any or all of the independent variables X1, X2, ..., Xk:
H0: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
H1: Not all the $\beta_i$ (i = 1, 2, ..., k) are 0

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F Ratio |
|---|---|---|---|---|
| Regression | SSR | k | MSR = SSR / k | F = MSR / MSE |
| Error | SSE | n - (k+1) | MSE = SSE / (n - (k+1)) | |
| Total | SST | n - 1 | MST = SST / (n - 1) | |
![Page 13: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/13.jpg)
Slide 13
Using the Template: Analysis of Variance Table (Example 11-1)
The test statistic, F = 86.34, is greater than the critical point of F(2, 7) for any common level of significance (p-value ≈ 0), so the null hypothesis is rejected, and we might conclude that the dependent variable is related to one or more of the independent variables.
[Figure: F distribution with 2 and 7 degrees of freedom, showing the critical point F0.01 = 9.55 ($\alpha$ = 0.01) and the test statistic 86.34 far in the right tail.]
![Page 14: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/14.jpg)
Slide 14
11-4 How Good is the Regression
The multiple coefficient of determination, $R^2$, measures the proportion of the variation in the dependent variable that is explained by the combination of the independent variables in the multiple regression model:
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
The mean square error is an unbiased estimator of the variance of the population errors, $\varepsilon$, denoted by $\sigma^2$:
$$MSE = \frac{SSE}{n - (k+1)} = \frac{\sum (y - \hat{y})^2}{n - (k+1)}$$
Standard error of estimate:
$$s = \sqrt{MSE}$$
[Figure: observed points around the regression plane over the (x1, x2) axes, with the errors $y - \hat{y}$ marked.]
![Page 15: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/15.jpg)
Slide 15
Decomposition of the Sum of Squares and the Adjusted Coefficient of Determination
SST = SSR + SSE
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
The adjusted multiple coefficient of determination, $\bar{R}^2$, is the coefficient of determination with the SSE and SST divided by their respective degrees of freedom:
$$\bar{R}^2 = 1 - \frac{SSE / (n - (k+1))}{SST / (n - 1)}$$
Example 11-1: s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%
![Page 16: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/16.jpg)
Slide 16
Measures of Performance in Multiple Regression and the ANOVA Table

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F Ratio |
|---|---|---|---|---|
| Regression | SSR | k | MSR = SSR / k | F = MSR / MSE |
| Error | SSE | n - (k+1) = n - k - 1 | MSE = SSE / (n - (k+1)) | |
| Total | SST | n - 1 | MST = SST / (n - 1) | |

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
$$\bar{R}^2 = 1 - \frac{SSE / (n - (k+1))}{SST / (n - 1)} = 1 - \frac{MSE}{MST}$$
$$F = \frac{R^2}{1 - R^2} \cdot \frac{n - (k+1)}{k}$$
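The ANOVA quantities and measures of performance can be reproduced for Example 11-1 with a few lines of NumPy. This is a sketch under the assumption that an ordinary least-squares fit via `numpy.linalg.lstsq` matches the template's computation; the variable names are illustrative.

```python
import numpy as np

# Data from Example 11-1.
y = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)
x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(y), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

n = len(y)
k = X.shape[1] - 1                      # number of independent variables

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
sse = np.sum((y - y_hat) ** 2)          # error sum of squares
ssr = sst - sse                         # regression sum of squares

msr = ssr / k
mse = sse / (n - (k + 1))
mst = sst / (n - 1)

f_ratio = msr / mse
r2 = 1 - sse / sst
r2_adj = 1 - mse / mst
s = np.sqrt(mse)                        # standard error of estimate

# The F ratio expressed through R-squared, as in the last formula.
f_from_r2 = (r2 / (1 - r2)) * ((n - (k + 1)) / k)

print(f"F = {f_ratio:.2f}, s = {s:.3f}, R-sq = {r2:.3f}, R-sq(adj) = {r2_adj:.3f}")
```

The results should agree with the values reported for Example 11-1 (F = 86.34, s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%) up to rounding.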
![Page 17: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/17.jpg)
Slide 17
11-5 Tests of the Significance of Individual Regression Parameters
Hypothesis tests about individual regression slope parameters:
(1) H0: $\beta_1 = 0$, H1: $\beta_1 \neq 0$
(2) H0: $\beta_2 = 0$, H1: $\beta_2 \neq 0$
. . .
(k) H0: $\beta_k = 0$, H1: $\beta_k \neq 0$
Test statistic for test i:
$$t_{(n - (k+1))} = \frac{b_i - 0}{s(b_i)}$$
![Page 18: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/18.jpg)
Slide 18
Regression Results for Individual Parameters

| Variable | Coefficient Estimate | Standard Error | t-Statistic |
|---|---|---|---|
| Constant | 53.12 | 5.43 | 9.783 * |
| X1 | 2.03 | 0.22 | 9.227 * |
| X2 | 5.60 | 1.30 | 4.308 * |
| X3 | 10.35 | 6.88 | 1.504 |
| X4 | 3.45 | 2.70 | 1.259 |
| X5 | -4.25 | 0.38 | -11.184 * |

n = 150, t0.025 = 1.96. An asterisk marks a t-statistic whose absolute value exceeds 1.96.
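The t-statistics in the table are each coefficient estimate divided by its standard error. A minimal sketch of that calculation follows; the dictionary layout and names are illustrative assumptions, and the critical value 1.96 is the one quoted on the slide.

```python
# Coefficient estimates and standard errors from the table.
results = {
    "Constant": (53.12, 5.43),
    "X1": (2.03, 0.22),
    "X2": (5.60, 1.30),
    "X3": (10.35, 6.88),
    "X4": (3.45, 2.70),
    "X5": (-4.25, 0.38),
}
t_crit = 1.96  # t(0.025) for n = 150, as quoted on the slide

# t = (b_i - 0) / s(b_i); significant when |t| exceeds the critical value.
t_stats = {name: coef / se for name, (coef, se) in results.items()}
significant = {name: abs(t) > t_crit for name, t in t_stats.items()}

for name, t in t_stats.items():
    flag = "*" if significant[name] else ""
    print(f"{name:<8} t = {t:8.3f} {flag}")
```

Because the tabulated estimates and standard errors are rounded, a recomputed ratio can differ slightly from the tabulated t-statistic.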
![Page 19: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/19.jpg)
Slide 19
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
![Page 20: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/20.jpg)
Slide 20
Using the Template: Example 11-2
Regression results for Exports to Singapore
![Page 21: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/21.jpg)
Slide 21
11-6 Testing the Validity of the Regression Model: Residual Plots
Residuals vs M1
It appears that the residuals are randomly distributed with no pattern and with equal variance as M1 increases.
![Page 22: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/22.jpg)
Slide 22
11-6 Testing the Validity of the Regression Model: Residual Plots
Residuals vs Price
It appears that the spread of the residuals increases as Price increases, so the variance of the residuals is not constant.
![Page 23: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/23.jpg)
Slide 23
Normal Probability Plot for the Residuals: Example 11-2
Linear trend indicates residuals are normally distributed
![Page 24: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/24.jpg)
Slide 24
Investigating the Validity of the Regression: Outliers and Influential Observations
Outliers
[Figure: scatter plot of y against x with one outlier (*), showing the regression line with the outlier and the regression line without the outlier.]
Influential Observations
[Figure: scatter plot of y against x with a cluster of points showing no relationship and one point (*) with a large value of x; the regression line when all data are included is drawn through the cluster toward that point.]
![Page 25: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/25.jpg)
Slide 25
Outliers and Influential Observations: Example 11-2
Unusual Observations

| Obs. | M1 | EXPORTS | Fit | Stdev.Fit | Residual | St.Resid |
|---|---|---|---|---|---|---|
| 1 | 5.10 | 2.6000 | 2.6420 | 0.1288 | -0.0420 | -0.14 X |
| 2 | 4.90 | 2.6000 | 2.6438 | 0.1234 | -0.0438 | -0.14 X |
| 25 | 6.20 | 5.5000 | 4.5949 | 0.0676 | 0.9051 | 2.80 R |
| 26 | 6.30 | 3.7000 | 4.6311 | 0.0651 | -0.9311 | -2.87 R |
| 50 | 8.30 | 4.3000 | 5.1317 | 0.0648 | -0.8317 | -2.57 R |
| 67 | 8.20 | 5.6000 | 4.9474 | 0.0668 | 0.6526 | 2.02 R |

R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it large influence.
![Page 26: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/26.jpg)
Slide 26
11-7 Using the Multiple Regression Model for Prediction
Estimated Regression Plane for Example 11-1
[Figure: the estimated regression plane, with Sales (63.42 to 89.76) plotted against Advertising (8.00 to 18.00) and Promotions (3 to 12).]
![Page 27: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/27.jpg)
Slide 27
Prediction in Multiple Regression
A $(1 - \alpha)100\%$ prediction interval for a value of Y given values of $X_i$:
$$\hat{y} \pm t_{(\alpha/2,\,(n - (k+1)))} \sqrt{s^2(\hat{y}) + MSE}$$
A $(1 - \alpha)100\%$ confidence interval for the conditional mean of Y given values of $X_i$:
$$\hat{y} \pm t_{(\alpha/2,\,(n - (k+1)))}\, s[\hat{E}(Y)]$$
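The two intervals can be sketched for the Example 11-1 data. Several things here are assumptions rather than part of the slides: the new point (X1 = 10, X2 = 5) is chosen only for illustration, the standard error of the fitted mean is computed from the standard matrix expression MSE · x0'(X'X)⁻¹x0, and the critical value 2.365 is the tabulated t(0.025) for 7 degrees of freedom.

```python
import numpy as np

# Data from Example 11-1.
y = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)
x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)

X = np.column_stack([np.ones_like(y), x1, x2])
n, p = X.shape                               # p = k + 1 parameters

xtx_inv = np.linalg.inv(X.T @ X)
b = xtx_inv @ X.T @ y
mse = np.sum((y - X @ b) ** 2) / (n - p)

# Hypothetical new point: X1 = 10, X2 = 5 (leading 1 for the intercept).
x_new = np.array([1.0, 10.0, 5.0])
y_hat = x_new @ b

# Standard error of the estimated conditional mean, then of a new value.
se_mean = np.sqrt(mse * (x_new @ xtx_inv @ x_new))
se_pred = np.sqrt(se_mean ** 2 + mse)

t_crit = 2.365                               # assumed table value, t(0.025, 7 df)

ci_mean = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pi_value = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)

print(f"y_hat = {y_hat:.3f}")
print(f"95% interval for the conditional mean: ({ci_mean[0]:.2f}, {ci_mean[1]:.2f})")
print(f"95% prediction interval for a value:   ({pi_value[0]:.2f}, {pi_value[1]:.2f})")
```

The interval for a single value of Y is always wider than the interval for the conditional mean, because it adds the MSE term under the square root.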
![Page 28: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/28.jpg)
Slide 28
11-8 Qualitative (or Categorical) Independent Variables (in Regression)
Example 11-3

| MOVIE | EARN | COST | PROM | BOOK |
|---|---|---|---|---|
| 1 | 28 | 4.2 | 1.0 | 0 |
| 2 | 35 | 6.0 | 3.0 | 1 |
| 3 | 50 | 5.5 | 6.0 | 1 |
| 4 | 20 | 3.3 | 1.0 | 0 |
| 5 | 75 | 12.5 | 11.0 | 1 |
| 6 | 60 | 9.6 | 8.0 | 1 |
| 7 | 15 | 2.5 | 0.5 | 0 |
| 8 | 45 | 10.8 | 5.0 | 0 |
| 9 | 50 | 8.4 | 3.0 | 1 |
| 10 | 34 | 6.6 | 2.0 | 0 |
| 11 | 48 | 10.7 | 1.0 | 1 |
| 12 | 82 | 11.0 | 15.0 | 1 |
| 13 | 24 | 3.5 | 4.0 | 0 |
| 14 | 50 | 6.9 | 10.0 | 0 |
| 15 | 58 | 7.8 | 9.0 | 1 |
| 16 | 63 | 10.1 | 10.0 | 0 |
| 17 | 30 | 5.0 | 1.0 | 1 |
| 18 | 37 | 7.5 | 5.0 | 0 |
| 19 | 45 | 6.4 | 8.0 | 1 |
| 20 | 72 | 10.0 | 12.0 | 1 |

An indicator (dummy, binary) variable of qualitative level A:
$$X_h = \begin{cases} 1 & \text{if level A is obtained} \\ 0 & \text{if level A is not obtained} \end{cases}$$
![Page 29: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/29.jpg)
Slide 29
Picturing Qualitative Variables in Regression
A regression with one quantitative variable (X1) and one qualitative variable (X2):
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$
[Figure: two parallel lines of Y against X1; the line for X2 = 0 has intercept b0 and the line for X2 = 1 has intercept b0 + b2.]
A multiple regression with two quantitative variables (X1 and X2) and one qualitative variable (X3):
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$$
[Figure: two parallel planes over the (x1, x2) axes, separated vertically by b3.]
![Page 30: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/30.jpg)
Slide 30
Picturing Qualitative Variables in Regression: Three Categories and Two Dummy Variables
A regression with one quantitative variable (X1) and two qualitative variables (X2 and X3):
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$$
[Figure: three parallel lines of Y against X1; the line for X2 = 0 and X3 = 0 has intercept b0, the line for X2 = 1 and X3 = 0 has intercept b0 + b2, and the line for X2 = 0 and X3 = 1 has intercept b0 + b3.]
A qualitative variable with r levels or categories is represented with (r-1) 0/1 (dummy) variables.

| Category | X2 | X3 |
|---|---|---|
| Adventure | 0 | 0 |
| Drama | 0 | 1 |
| Romance | 1 | 0 |
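The (r-1) dummy coding of the three movie categories can be sketched as a small helper. The function name `encode_dummies` and its arguments are hypothetical; the baseline level (Adventure) is the category coded with all zeros in the table.

```python
def encode_dummies(values, levels, baseline):
    """Code a qualitative variable with r levels as (r - 1) 0/1 columns.

    The baseline level gets zeros in every column; each remaining level
    gets its own indicator column, in the order given by `levels`.
    """
    dummy_levels = [level for level in levels if level != baseline]
    return [[1 if value == level else 0 for level in dummy_levels]
            for value in values]


# Column order after dropping the baseline: X2 = Romance, X3 = Drama.
levels = ["Adventure", "Romance", "Drama"]
movies = ["Adventure", "Drama", "Romance"]

rows = encode_dummies(movies, levels, baseline="Adventure")
for movie, (x2, x3) in zip(movies, rows):
    print(f"{movie:<10} X2 = {x2}  X3 = {x3}")
```

Using only r - 1 columns avoids perfect collinearity with the intercept, which a full set of r indicator columns would create.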
![Page 31: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/31.jpg)
Slide 31
Using Qualitative Variables in Regression: Example 11-4
Salary = 8547 + 949 Education + 1258 Experience - 3256 Gender

| | Constant | Education | Experience | Gender |
|---|---|---|---|---|
| Coefficient | 8547 | 949 | 1258 | -3256 |
| (SE) | (32.6) | (45.1) | (78.5) | (212.4) |
| (t) | (262.2) | (21.0) | (16.0) | (-15.3) |

$$\text{Gender} = \begin{cases} 1 & \text{if Female} \\ 0 & \text{if Male} \end{cases}$$
On average, female salaries are $3256 below male salaries at the same levels of education and experience.
![Page 32: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/32.jpg)
Slide 32
Interactions between Quantitative and Qualitative Variables: Shifting Slopes
A regression with interaction between a quantitative variable (X1) and a qualitative variable (X2):
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2$$
[Figure: Y against X1. The line for X2 = 0 has intercept b0 and slope b1; the line for X2 = 1 has intercept b0 + b2 and slope b1 + b3.]
![Page 33: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/33.jpg)
Slide 33
11-9 Polynomial Regression
One-variable polynomial regression model:
$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \cdots + \beta_m X^m + \varepsilon$$
where m is the degree of the polynomial, the highest power of X appearing in the equation. The degree of the polynomial is the order of the model.
[Figure: two plots of Y against X1. The first compares the straight line $\hat{y} = b_0 + b_1 X$ with the quadratic $\hat{y} = b_0 + b_1 X + b_2 X^2$ ($b_2 < 0$); the second compares the straight line $\hat{y} = b_0 + b_1 X$ with the cubic $\hat{y} = b_0 + b_1 X + b_2 X^2 + b_3 X^3$.]
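A polynomial regression is fitted as a multiple regression whose independent variables are the powers of X. The sketch below uses hypothetical, noise-free data (so the fit recovers the generating coefficients exactly) and assumes NumPy; none of the numbers come from the slides.

```python
import numpy as np

# Hypothetical data generated from a known quadratic, without noise.
x = np.linspace(0.0, 5.0, 11)
y = 2.0 + 1.5 * x - 0.4 * x ** 2

m = 2  # degree (order) of the polynomial model

# Design matrix with columns 1, x, x^2, ..., x^m.
X = np.vander(x, m + 1, increasing=True)

# Least-squares estimates b0, b1, ..., bm.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print("b0, b1, b2 =", np.round(b, 6))
```

With real data, the significance of the highest-order term would then be checked with its t-statistic, as for any other regression coefficient.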
![Page 34: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/34.jpg)
Slide 34
Polynomial Regression: Example 11-5
![Page 35: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/35.jpg)
Slide 35
Polynomial Regression: Other Variables and Cross-Product Terms

| Variable | Estimate | Standard Error | T-statistic |
|---|---|---|---|
| X1 | 2.34 | 0.92 | 2.54 |
| X2 | 3.11 | 1.05 | 2.96 |
| X1² | 4.22 | 1.00 | 4.22 |
| X2² | 3.57 | 2.12 | 1.68 |
| X1X2 | 2.77 | 2.30 | 1.20 |
![Page 36: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/36.jpg)
Slide 36
11-10 Nonlinear Models and Transformations: Multiplicative Model
The multiplicative model:
$$Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2} X_3^{\beta_3} \varepsilon$$
The logarithmic transformation:
$$\log Y = \log \beta_0 + \beta_1 \log X_1 + \beta_2 \log X_2 + \beta_3 \log X_3 + \log \varepsilon$$
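After the logarithmic transformation, the multiplicative model is linear in its parameters and can be fitted by ordinary least squares. The following sketch uses hypothetical, noise-free data (the multiplicative error is set to 1) with assumed parameter values, so the fit should recover them; NumPy is assumed.

```python
import numpy as np

# Hypothetical parameters of a multiplicative model (not from the slides).
beta0, beta1, beta2, beta3 = 3.0, 0.5, 1.2, -0.7

rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(30, 3))     # positive X values so logs exist
y = beta0 * X[:, 0] ** beta1 * X[:, 1] ** beta2 * X[:, 2] ** beta3

# Regress log Y on log X1, log X2, log X3 (with an intercept column).
design = np.column_stack([np.ones(len(y)), np.log(X)])
coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)

beta0_hat = np.exp(coef[0])                  # intercept estimates log(beta0)
print("beta0 estimate:", round(beta0_hat, 6))
print("exponent estimates:", np.round(coef[1:], 6))
```

The intercept of the transformed regression estimates log β0, so it is exponentiated to return to the original scale.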
![Page 37: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/37.jpg)
Slide 37
Transformations: Exponential Model
The exponential model:
$$Y = \beta_0 e^{\beta_1 X_1} \varepsilon$$
The logarithmic transformation:
$$\log Y = \log \beta_0 + \beta_1 X_1 + \log \varepsilon$$
![Page 38: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/38.jpg)
Slide 38
Plots of Transformed Variables
[Figure: Simple Regression of Sales on Advertising (SALES against ADVERT). R-Squared = 0.895, Y = 6.59271 + 1.19176 X]
[Figure: Regression of Log(Sales) on Log(Advertising) (LOGSALE against LOGADV). R-Squared = 0.947, Y = 1.70082 + 0.553136 X]
[Figure: Regression of Sales on Log(Advertising) (SALES against LOGADV). R-Squared = 0.978, Y = 3.66825 + 6.784 X]
[Figure: Residual Plots: Sales vs Log(Advertising) (RESIDS against Y-HAT).]
![Page 39: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/39.jpg)
Slide 39
Variance Stabilizing Transformations
• Square root transformation: $Y' = \sqrt{Y}$
Useful when the variance of the regression errors is approximately proportional to the conditional mean of Y
• Logarithmic transformation: $Y' = \log(Y)$
Useful when the variance of regression errors is approximately proportional to the square of the conditional mean of Y
• Reciprocal transformation: $Y' = 1/Y$
Useful when the variance of the regression errors is approximately proportional to the fourth power of the conditional mean of Y
![Page 40: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/40.jpg)
Slide 40
Regression with Dependent Indicator Variables
The logistic function:
$$E(Y \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$
Transformation to linearize the logistic function:
$$p' = \log\left(\frac{p}{1 - p}\right)$$
[Figure: Logistic Function, an S-shaped curve of y against x rising from 0 to 1.]
![Page 41: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/41.jpg)
Slide 41
11-11: Multicollinearity
[Figure: four pairs of vectors x1 and x2 illustrating degrees of collinearity, with the captions below.]
• Orthogonal X variables provide information from independent sources. No multicollinearity.
• Perfectly collinear X variables provide identical information content. No regression.
• Some degree of collinearity. Problems with regression depend on the degree of collinearity.
• A high degree of negative collinearity also causes problems with regression.
![Page 42: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/42.jpg)
Slide 42
Effects of Multicollinearity
• Variances of regression coefficients are inflated.
• Magnitudes of regression coefficients may be different from what are expected.
• Signs of regression coefficients may not be as expected.
• Adding or removing variables produces large changes in coefficients.
• Removing a data point may cause large changes in coefficient estimates or signs.
• In some cases, the F ratio may be significant while the t ratios are not.
![Page 43: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/43.jpg)
Slide 43
Detecting the Existence of Multicollinearity: Correlation Matrix of Independent Variables and Variance Inflation Factors
![Page 44: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/44.jpg)
Slide 44
Variance Inflation Factor
The variance inflation factor associated with $X_h$:
$$VIF(X_h) = \frac{1}{1 - R_h^2}$$
where $R_h^2$ is the $R^2$ value obtained for the regression of $X_h$ on the other independent variables.
[Figure: Relationship between VIF and $R_h^2$, with $R_h^2$ from 0.0 to 1.0 on the horizontal axis and VIF from 0 to 100 on the vertical axis.]
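The definition above translates directly into code: regress each independent variable on the others, take the resulting R², and invert 1 - R². The sketch below assumes NumPy; the function name `vif` and the two small data sets are hypothetical and chosen only to show the two extremes.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X.

    X is an (n, k) array of independent variables without an intercept
    column. Column h is regressed on the other columns plus an intercept,
    and VIF(X_h) = 1 / (1 - R_h^2).
    """
    n, k = X.shape
    factors = []
    for h in range(k):
        target = X[:, h]
        others = np.column_stack([np.ones(n), np.delete(X, h, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2_h = 1.0 - np.sum(resid ** 2) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2_h))
    return np.array(factors)


# Orthogonal (uncorrelated) columns: no multicollinearity, VIF = 1.
orthogonal = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])

# Nearly collinear columns: x2 is x1 plus a small alternating perturbation.
x1 = np.arange(10.0)
x2 = x1 + 0.1 * np.array([1, -1] * 5)
collinear = np.column_stack([x1, x2])

print("orthogonal:", vif(orthogonal))
print("nearly collinear:", vif(collinear))
```

Perfectly collinear columns would give R² = 1 and a division by zero, which corresponds to the "no regression" case.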
![Page 45: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/45.jpg)
Slide 45
Variance Inflation Factor (VIF)
Observation: The VIF (Variance Inflation Factor) values for the variables Lend and Price are both greater than 5. This indicates that some degree of multicollinearity exists with respect to these two variables.
![Page 46: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/46.jpg)
Slide 46
Solutions to the Multicollinearity Problem
• Drop a collinear variable from the regression
• Change in sampling plan to include elements outside the multicollinearity range
• Transformations of variables
• Ridge regression
![Page 47: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/47.jpg)
Slide 47
11-12 Residual Autocorrelation and the Durbin-Watson Test
An autocorrelation is a correlation of the values of a variable with values of the same variable lagged one or more periods back. Consequences of autocorrelation include inaccurate estimates of variances and inaccurate predictions.

Lagged Residuals

| i | $e_i$ | $e_{i-1}$ | $e_{i-2}$ | $e_{i-3}$ | $e_{i-4}$ |
|---|---|---|---|---|---|
| 1 | 1.0 | * | * | * | * |
| 2 | 0.0 | 1.0 | * | * | * |
| 3 | -1.0 | 0.0 | 1.0 | * | * |
| 4 | 2.0 | -1.0 | 0.0 | 1.0 | * |
| 5 | 3.0 | 2.0 | -1.0 | 0.0 | 1.0 |
| 6 | -2.0 | 3.0 | 2.0 | -1.0 | 0.0 |
| 7 | 1.0 | -2.0 | 3.0 | 2.0 | -1.0 |
| 8 | 1.5 | 1.0 | -2.0 | 3.0 | 2.0 |
| 9 | 1.0 | 1.5 | 1.0 | -2.0 | 3.0 |
| 10 | -2.5 | 1.0 | 1.5 | 1.0 | -2.0 |

The Durbin-Watson test (first-order autocorrelation):
H0: $\rho_1 = 0$
H1: $\rho_1 \neq 0$
The Durbin-Watson test statistic:
$$d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$
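As an illustration, the Durbin-Watson statistic can be computed for the ten residuals listed in the lagged-residuals table. This is a sketch assuming NumPy; applying the statistic to this small illustrative series is an assumption, since the slide uses the table only to show lagging.

```python
import numpy as np

# Residuals from the first column of the lagged-residuals table.
e = np.array([1.0, 0.0, -1.0, 2.0, 3.0, -2.0, 1.0, 1.5, 1.0, -2.5])

# Numerator: squared differences of successive residuals (i = 2, ..., n).
# Denominator: sum of squared residuals (i = 1, ..., n).
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

print(f"Durbin-Watson d = {d:.4f}")  # 58.75 / 29.5, about 1.99
```

Values of d near 2 are consistent with no first-order autocorrelation; values near 0 or 4 point to positive or negative autocorrelation, respectively.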
![Page 48: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/48.jpg)
Slide 48
Critical Points of the Durbin-Watson Statistic: $\alpha$ = 0.05, n = Sample Size, k = Number of Independent Variables

| n | k=1 dL | k=1 dU | k=2 dL | k=2 dU | k=3 dL | k=3 dU | k=4 dL | k=4 dU | k=5 dL | k=5 dU |
|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 1.08 | 1.36 | 0.95 | 1.54 | 0.82 | 1.75 | 0.69 | 1.97 | 0.56 | 2.21 |
| 16 | 1.10 | 1.37 | 0.98 | 1.54 | 0.86 | 1.73 | 0.74 | 1.93 | 0.62 | 2.15 |
| 17 | 1.13 | 1.38 | 1.02 | 1.54 | 0.90 | 1.71 | 0.78 | 1.90 | 0.67 | 2.10 |
| 18 | 1.16 | 1.39 | 1.05 | 1.53 | 0.93 | 1.69 | 0.82 | 1.87 | 0.71 | 2.06 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 65 | 1.57 | 1.63 | 1.54 | 1.66 | 1.50 | 1.70 | 1.47 | 1.73 | 1.44 | 1.77 |
| 70 | 1.58 | 1.64 | 1.55 | 1.67 | 1.52 | 1.70 | 1.49 | 1.74 | 1.46 | 1.77 |
| 75 | 1.60 | 1.65 | 1.57 | 1.68 | 1.54 | 1.71 | 1.51 | 1.74 | 1.49 | 1.77 |
| 80 | 1.61 | 1.66 | 1.59 | 1.69 | 1.56 | 1.72 | 1.53 | 1.74 | 1.51 | 1.77 |
| 85 | 1.62 | 1.67 | 1.60 | 1.70 | 1.57 | 1.72 | 1.55 | 1.75 | 1.52 | 1.77 |
| 90 | 1.63 | 1.68 | 1.61 | 1.70 | 1.59 | 1.73 | 1.57 | 1.75 | 1.54 | 1.78 |
| 95 | 1.64 | 1.69 | 1.62 | 1.71 | 1.60 | 1.73 | 1.58 | 1.75 | 1.56 | 1.78 |
| 100 | 1.65 | 1.69 | 1.63 | 1.72 | 1.61 | 1.74 | 1.59 | 1.76 | 1.57 | 1.78 |
![Page 49: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/49.jpg)
Slide 49
Using the Durbin-Watson Statistic

| Range of d | Conclusion |
|---|---|
| 0 to dL | Positive autocorrelation |
| dL to dU | Test is inconclusive |
| dU to 4-dU | No autocorrelation |
| 4-dU to 4-dL | Test is inconclusive |
| 4-dL to 4 | Negative autocorrelation |

For n = 67, k = 4: dU ≈ 1.73, 4-dU ≈ 2.27, dL ≈ 1.47, 4-dL ≈ 2.53 < 2.58. H0 is rejected, and we conclude there is negative first-order autocorrelation.
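The decision regions can be written as a small function. The name `durbin_watson_decision` is hypothetical, and the handling of values that fall exactly on a boundary is an assumption, since the slide does not specify it.

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic d using the critical points dL and dU.

    Boundary handling (which region a value exactly equal to a critical
    point belongs to) is an assumption made for this sketch.
    """
    if d < d_lower:
        return "positive autocorrelation"
    if d < d_upper:
        return "inconclusive"
    if d <= 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"


# Example from the slide: n = 67, k = 4, dL of about 1.47, dU of about 1.73, d = 2.58.
print(durbin_watson_decision(2.58, 1.47, 1.73))
```

With these inputs d exceeds 4 - dL ≈ 2.53, so the function reports negative first-order autocorrelation, in line with the conclusion stated on the slide.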
![Page 50: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/50.jpg)
Slide 50
11-13 Partial F Tests and Variable Selection Methods
Full model:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon$$
Reduced model:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$
Partial F test:
H0: $\beta_3 = \beta_4 = 0$
H1: $\beta_3$ and $\beta_4$ not both 0
Partial F statistic:
$$F_{(r,\,(n - (k+1)))} = \frac{(SSE_R - SSE_F) / r}{MSE_F}$$
where $SSE_R$ is the sum of squared errors of the reduced model, $SSE_F$ is the sum of squared errors of the full model; $MSE_F$ is the mean square error of the full model [$MSE_F = SSE_F / (n - (k+1))$]; r is the number of variables dropped from the full model.
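The partial F statistic is simple to compute once both models have been fitted. The function name `partial_f` and the numbers in the usage example are hypothetical and are not results from the slides.

```python
def partial_f(sse_reduced, sse_full, r, n, k):
    """Partial F statistic for dropping r variables from a k-variable full model.

    sse_reduced and sse_full are the sums of squared errors of the reduced
    and full models; n is the sample size. The statistic has r and
    n - (k + 1) degrees of freedom.
    """
    mse_full = sse_full / (n - (k + 1))
    return ((sse_reduced - sse_full) / r) / mse_full


# Hypothetical values: dropping r = 2 of k = 4 variables with n = 20.
f_stat = partial_f(sse_reduced=10.0, sse_full=6.0, r=2, n=20, k=4)
print(f"partial F = {f_stat:.2f} with (2, 15) degrees of freedom")
```

The statistic is then compared with the critical point of the F distribution with r and n - (k+1) degrees of freedom.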
![Page 51: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/51.jpg)
Slide 51
Variable Selection Methods
• All possible regressions
Run regressions with all possible combinations of independent variables and select best model
A p-value of 0.001 indicates that we should reject the null hypothesis H0: the slopes for Lend and Exch. are zero.
![Page 52: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/52.jpg)
Slide 52
Variable Selection Methods
• Stepwise procedures
Forward selection
» Add one variable at a time to the model, on the basis of its F statistic
Backward elimination
» Remove one variable at a time, on the basis of its F statistic
Stepwise regression
» Adds variables to the model and subtracts variables from the model, on the basis of the F statistic
![Page 53: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/53.jpg)
Slide 53
Stepwise Regression
[Flowchart, reconstructed as steps:]
1. Compute the F statistic for each variable not in the model.
2. Is there at least one variable with p-value < Pin? If no, stop.
3. If yes, enter the most significant (smallest p-value) variable into the model.
4. Calculate the partial F for all variables in the model.
5. Is there a variable with p-value > Pout? If yes, remove the variable. In either case, return to step 1.
![Page 54: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/54.jpg)
Slide 54
Stepwise Regression: Using the Computer (MINITAB)
MTB > STEPWISE 'EXPORTS' PREDICTORS 'M1' 'LEND' 'PRICE' 'EXCHANGE'

Stepwise Regression
F-to-Enter: 4.00 F-to-Remove: 4.00
Response is EXPORTS on 4 predictors, with N = 67

| Step | 1 | 2 |
|---|---|---|
| Constant | 0.9348 | -3.4230 |
| M1 | 0.520 | 0.361 |
| T-Ratio | 9.89 | 9.21 |
| PRICE | | 0.0370 |
| T-Ratio | | 9.05 |
| S | 0.495 | 0.331 |
| R-Sq | 60.08 | 82.48 |
![Page 55: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/55.jpg)
Slide 55
Using the Computer: MINITAB
MTB > REGRESS 'EXPORTS' 4 'M1' 'LEND' 'PRICE' 'EXCHANGE';
SUBC> vif;
SUBC> dw.

Regression Analysis
The regression equation is
EXPORTS = - 4.02 + 0.368 M1 + 0.0047 LEND + 0.0365 PRICE + 0.27 EXCHANGE

| Predictor | Coef | Stdev | t-ratio | p | VIF |
|---|---|---|---|---|---|
| Constant | -4.015 | 2.766 | -1.45 | 0.152 | |
| M1 | 0.36846 | 0.06385 | 5.77 | 0.000 | 3.2 |
| LEND | 0.00470 | 0.04922 | 0.10 | 0.924 | 5.4 |
| PRICE | 0.036511 | 0.009326 | 3.91 | 0.000 | 6.3 |
| EXCHANGE | 0.268 | 1.175 | 0.23 | 0.820 | 1.4 |

s = 0.3358 R-sq = 82.5% R-sq(adj) = 81.4%

Analysis of Variance

| SOURCE | DF | SS | MS | F | p |
|---|---|---|---|---|---|
| Regression | 4 | 32.9463 | 8.2366 | 73.06 | 0.000 |
| Error | 62 | 6.9898 | 0.1127 | | |
| Total | 66 | 39.9361 | | | |

Durbin-Watson statistic = 2.58
![Page 56: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/56.jpg)
Slide 56
Using the Computer: SAS (continued)
Parameter Estimates

| Variable | DF | Parameter Estimate | Standard Error | T for H0: Parameter=0 | Prob > \|T\| |
|---|---|---|---|---|---|
| INTERCEP | 1 | -4.015461 | 2.76640057 | -1.452 | 0.1517 |
| M1 | 1 | 0.368456 | 0.06384841 | 5.771 | 0.0001 |
| LEND | 1 | 0.004702 | 0.04922186 | 0.096 | 0.9242 |
| PRICE | 1 | 0.036511 | 0.00932601 | 3.915 | 0.0002 |
| EXCHANGE | 1 | 0.267896 | 1.17544016 | 0.228 | 0.8205 |

| Variable | DF | Variance Inflation |
|---|---|---|
| INTERCEP | 1 | 0.00000000 |
| M1 | 1 | 3.20719533 |
| LEND | 1 | 5.35391367 |
| PRICE | 1 | 6.28873181 |
| EXCHANGE | 1 | 1.38570639 |

Durbin-Watson D: 2.583
(For Number of Obs.): 67
1st Order Autocorrelation: -0.321
![Page 57: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/57.jpg)
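Slide 57
11-15: The Matrix Approach to Regression Analysis (1)
The population regression model:
$$
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & x_{11} & x_{12} & x_{13} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & x_{23} & \cdots & x_{2k} \\
1 & x_{31} & x_{32} & x_{33} & \cdots & x_{3k} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & x_{n3} & \cdots & x_{nk}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \vdots \\ \beta_k \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_n \end{bmatrix}
$$
$$Y = X\beta + \varepsilon$$
The estimated regression model:
$$Y = Xb + e$$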
![Page 58: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/58.jpg)
Slide 58
The Matrix Approach to Regression Analysis (2)
The normal equations:
$$X'Xb = X'Y$$
Estimators:
$$b = (X'X)^{-1}X'Y$$
Predicted values:
$$\hat{Y} = Xb = X(X'X)^{-1}X'Y = HY$$
$$V(b) = \sigma^2 (X'X)^{-1}$$
$$s^2(b) = MSE\,(X'X)^{-1}$$
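The matrix formulas can be tried on the data of Example 11-1. The sketch below assumes NumPy and forms the inverse of X'X explicitly to mirror the formulas; in practice a least-squares solver is usually preferred for numerical stability. Variable names are illustrative.

```python
import numpy as np

# Data from Example 11-1.
y = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)
x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)

# Design matrix X with a leading column of ones.
X = np.column_stack([np.ones_like(y), x1, x2])
n, p = X.shape                               # p = k + 1

xtx_inv = np.linalg.inv(X.T @ X)

b = xtx_inv @ X.T @ y                        # estimators: b = (X'X)^-1 X'Y
H = X @ xtx_inv @ X.T                        # hat matrix
y_hat = H @ y                                # predicted values: HY = Xb
e = y - y_hat                                # residuals

mse = (e @ e) / (n - p)
cov_b = mse * xtx_inv                        # s^2(b) = MSE (X'X)^-1
se_b = np.sqrt(np.diag(cov_b))               # standard errors of b0, b1, b2

print("b    =", np.round(b, 6))
print("s(b) =", np.round(se_b, 4))
```

The diagonal of `cov_b` holds the estimated variances of the coefficients, whose square roots are the standard errors used in the individual t tests.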
![Page 59: Multiple regression (1)](https://reader036.vdocuments.us/reader036/viewer/2022070509/58a663161a28ab1c5b8b6877/html5/thumbnails/59.jpg)
Slide 59
Name: Shakeel Nouman
M.Phil (Statistics), GC University, . (Degree awarded by GC University)
M.Sc (Statistics), GC University, . (Degree awarded by GC University)
Statistical Officer (BS-17) (Economics & Marketing Division)
Livestock Production Research Institute Bahadurnagar (Okara), Livestock & Dairy Development Department, Govt. of Punjab
Religion: Christian
Domicile: Punjab (Lahore)
Contact #: 0332-4462527, 0321-9898767
E.Mail: [email protected] [email protected]