Collinearity
Symptoms of collinearity
• Collinearity between independent variables
– High r²
• High vif of variables in model
• Variables significant in simple regression, but not in multiple regression
• Variables not significant in multiple regression, but multiple regression model (as a whole) significant
• Large changes in coefficient estimates between full and reduced models
• Large standard errors in multiple regression models despite high power
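As a rough illustration of how r² and vif relate, the sketch below uses the standard definition VIF = 1 / (1 − R²), where R² comes from regressing one predictor on the others. With exactly two predictors, R² is just the squared correlation between them. The function names are hypothetical and not taken from the slides.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def vif_from_r2(r2):
    """VIF for a predictor whose R^2 against the other predictors is r2."""
    if r2 >= 1.0:
        return math.inf  # perfect collinearity: the coefficient is not identifiable
    return 1.0 / (1.0 - r2)

def vif_two_predictors(x1, x2):
    """With exactly two predictors, both share the same VIF = 1 / (1 - r^2)."""
    return vif_from_r2(pearson_r(x1, x2) ** 2)
```

For example, a squared correlation of 0.9 between two predictors corresponds to a vif of about 10, which inflates each coefficient's standard error by roughly the square root of 10.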
Collinearity and confounding independent variables
Two independent variables, correlated with each other, where
both influence the response
Methods
• Truth: y = 10 + 3x1 + 3x2 + N(0,2)
• x1 = U[0,10]
• x2 = x1 + N(0,z) where
– z = U[0.5,20]
• Run simple regression between y and x1
• Run multiple regression between y and x1 + x2
• No interactions!
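The simulation described above can be sketched in a few lines of standard-library Python. Several details are assumptions, not taken from the slides: the second argument of N(·,·) is read as a standard deviation, the sample size is set to 2000, z is fixed at 0.5 for one run (the slides draw it from U[0.5,20]), and all function names are hypothetical.

```python
import math
import random

def simulate(n, z, rng, beta2=3.0):
    """One data set: x1 ~ U[0,10], x2 = x1 + N(0,z), y = 10 + 3*x1 + beta2*x2 + N(0,2).
    Assumption: z and 2 are standard deviations."""
    x1 = [rng.uniform(0.0, 10.0) for _ in range(n)]
    x2 = [a + rng.gauss(0.0, z) for a in x1]
    y = [10.0 + 3.0 * a + beta2 * b + rng.gauss(0.0, 2.0) for a, b in zip(x1, x2)]
    return x1, x2, y

def _centered(v):
    m = sum(v) / len(v)
    return [u - m for u in v]

def _dot(a, b):
    return sum(u * w for u, w in zip(a, b))

def fit_simple(x, y):
    """OLS slope and its standard error for y ~ x (intercept handled by centering)."""
    cx, cy = _centered(x), _centered(y)
    sxx = _dot(cx, cx)
    slope = _dot(cx, cy) / sxx
    rss = sum((v - slope * u) ** 2 for u, v in zip(cx, cy))
    return slope, math.sqrt(rss / (len(x) - 2) / sxx)

def fit_multiple(x1, x2, y):
    """OLS slopes and standard errors for y ~ x1 + x2, with no interaction term."""
    c1, c2, cy = _centered(x1), _centered(x2), _centered(y)
    s11, s22, s12 = _dot(c1, c1), _dot(c2, c2), _dot(c1, c2)
    s1y, s2y = _dot(c1, cy), _dot(c2, cy)
    det = s11 * s22 - s12 ** 2  # approaches zero as collinearity becomes perfect
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = sum((v - b1 * u - b2 * w) ** 2 for u, w, v in zip(c1, c2, cy))
    s2 = rss / (len(y) - 3)  # residual variance with three fitted parameters
    return (b1, math.sqrt(s2 * s22 / det)), (b2, math.sqrt(s2 * s11 / det))

rng = random.Random(1)  # fixed seed so the run is repeatable
x1, x2, y = simulate(n=2000, z=0.5, rng=rng)
slope_simple, se_simple = fit_simple(x1, y)
(b1, se_b1), (b2, se_b2) = fit_multiple(x1, x2, y)
print(f"y~x1:    slope for x1 = {slope_simple:.2f} (SE {se_simple:.3f})")
print(f"y~x1+x2: b1 = {b1:.2f} (SE {se_b1:.3f}), b2 = {b2:.2f} (SE {se_b2:.3f})")
```

Under these assumptions the simple regression should absorb the effect of the omitted x2 into the slope for x1 (near 6 instead of 3), while the multiple regression should recover slopes near 3 for both variables at the cost of larger standard errors.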
Simple regression: y~x1
[Plot: variance inflation factor vs. collinearity between x1 and x2]
Simple regression: y~x1
[Plot: coefficient estimate for x1 vs. collinearity between x1 and x2]
Simple regression: y~x1
[Plot: SE of estimate for x1 vs. collinearity between x1 and x2]
Simple regression: y~x1
[Plot: P-value for coefficient estimate for x1 (log scale) vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: coefficient estimate for x1 vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: SE of estimate for x1 vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: P-value for coefficient estimate for x1 (log scale) vs. collinearity between x1 and x2]
Collinearity and redundant independent variables
Two independent variables, correlated with each other, where only one influences the response,
although we don’t know which one
Methods
• Truth: y = 10 + 3x1 + N(0,2)
• x1 = U[0,10]
• x2 = x1 + N(0,z) where
– z = U[0.5,20]
• Run simple regression between y and x1
• Run multiple regression between y and x1 + x2
• No interactions!
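In this redundant-variable setup, x2 has no effect of its own, yet a simple regression of y on x2 still picks up a slope because x2 tracks x1. A minimal sketch, assuming N(0,z) and N(0,2) are given as standard deviations and using hypothetical function names: in large samples the slope of y~x2 tends to 3·var(x1) / (var(x1) + z²), with var(x1) = 10²/12 for U[0,10].

```python
import random

def slope(x, y):
    """OLS slope for the simple regression y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_redundant(n, z, rng):
    """x1 ~ U[0,10], x2 = x1 + N(0,z), y = 10 + 3*x1 + N(0,2); x2 has no direct effect.
    Assumption: z and 2 are standard deviations."""
    x1 = [rng.uniform(0.0, 10.0) for _ in range(n)]
    x2 = [a + rng.gauss(0.0, z) for a in x1]
    y = [10.0 + 3.0 * a + rng.gauss(0.0, 2.0) for a in x1]
    return x1, x2, y

def expected_slope_on_x2(z):
    """Large-sample slope of y ~ x2: cov(x2, y) / var(x2)."""
    var_x1 = 10.0 ** 2 / 12.0  # variance of U[0,10]
    return 3.0 * var_x1 / (var_x1 + z ** 2)

rng = random.Random(7)  # fixed seed so the run is repeatable
for z in (0.5, 2.0, 8.0):
    x1, x2, y = simulate_redundant(5000, z, rng)
    print(f"z={z}: slope y~x2 = {slope(x2, y):.2f}, "
          f"large-sample value = {expected_slope_on_x2(z):.2f}")
```

The weaker the collinearity (larger z), the closer the slope for x2 falls toward zero. When x2 is nearly a copy of x1, the simple regression alone cannot tell the two variables apart.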
Simple regression: y~x1
[Plot: variance inflation factor vs. collinearity between x1 and x2]
Simple regression: y~x1
[Plot: coefficient estimate for x1 vs. collinearity between x1 and x2]
Simple regression: y~x1
[Plot: SE of coefficient estimate for x1 vs. collinearity between x1 and x2]
Simple regression: y~x2
[Plot: coefficient estimate for x2 vs. collinearity between x1 and x2]
Simple regression: y~x2
[Plot: SE of coefficient estimate for x2 vs. collinearity between x1 and x2]
Simple regression: y~x2
[Plot: P-value for coefficient estimate for x2 vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: coefficient estimate for x1 vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: SE of estimate for x1 vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: coefficient estimate for x2 vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: SE of estimate for x2 vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: P-value for estimate for x1 (log scale) vs. collinearity between x1 and x2]
Multiple regression: y~x1+x2
[Plot: P-value for estimate for x2 vs. collinearity between x1 and x2]
What to do?
• Be sure to calculate collinearity and vif among independent variables (before you start your analysis)
• Pay attention to how coefficient estimates and variable significance change as variables are removed or added
• Be careful to identify potentially confounding variables prior to data collection
Is a variable redundant or confounding?
• Think!
• Extreme collinearity
– Redundant
• Large changes in coefficient estimates of both variables between full and reduced models
– Confounding
• Large changes in coefficient estimates of one variable between full and reduced models
– Redundant – full model estimate close to zero
• Uncertain – assume confounding
– Multiple regression always produces unbiased estimates (on average) regardless of type of collinearity
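The unbiasedness claim can be checked against the standard least-squares algebra. The notation below is introduced here and is not from the slides. It assumes the full model is correctly specified (linear, no omitted predictors, errors with mean zero and constant variance σ²) and that collinearity is not perfect, so that XᵀX is invertible. Here s_j² is the sample variance of predictor x_j, and R_j² is the R² from regressing x_j on the other predictors.

```latex
\begin{aligned}
y &= X\beta + \varepsilon, \qquad
  \mathbb{E}[\varepsilon \mid X] = 0, \qquad
  \operatorname{Var}(\varepsilon \mid X) = \sigma^2 I \\
\hat\beta &= (X^\top X)^{-1} X^\top y
  = \beta + (X^\top X)^{-1} X^\top \varepsilon \\
\mathbb{E}[\hat\beta \mid X] &= \beta \\
\operatorname{Var}(\hat\beta_j \mid X) &=
  \frac{\sigma^2}{(n-1)\, s_j^2} \cdot \frac{1}{1 - R_j^2}
\end{aligned}
```

Under these assumptions collinearity does not appear in the expectation at all. It enters only through the last factor, which is the vif, so it widens the standard errors without shifting the estimates on average.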
What to do? Confounding variables
• Be sure to sample in a manner that eliminates collinearity
– Collinearity may be due to real collinearity or sampling artifact
• Use multiple regression
– May have large standard errors if strong collinearity
• Include confounding variables even if non-significant
• Get more data
– Decreases standard errors (vif)
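To see how more data offsets collinearity, the sketch below evaluates the textbook standard error of an OLS slope, sqrt(vif · σ² / ((n − 1) · var(x))). The function name and the example values (σ = 2, var(x) = 100/12, R² = 0.9 among predictors) are illustrative assumptions.

```python
import math

def se_slope(sigma, n, var_x, r2_with_other_predictors):
    """Theoretical SE of an OLS slope: sqrt(VIF * sigma^2 / ((n - 1) * var_x))."""
    vif = 1.0 / (1.0 - r2_with_other_predictors)
    return math.sqrt(vif * sigma ** 2 / ((n - 1) * var_x))

# Same collinearity (R^2 = 0.9, so VIF = 10) at three sample sizes.
for n in (26, 101, 401):
    se = se_slope(sigma=2.0, n=n, var_x=100.0 / 12.0, r2_with_other_predictors=0.9)
    print(f"n={n}: SE = {se:.3f}")
```

In this formula the vif term depends only on the correlation among the predictors. Extra data collected at the same collinearity therefore lowers the standard error through n, with each fourfold increase in n − 1 halving it.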
What to do? Redundant variables
• Determine which variable explains response best using P-values from regression and changes in coefficient estimates with variable addition and removal
• Do not include redundant variable in final model
– Reduces vif
• Try a variable reduction technique like PCA
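For two correlated predictors, PCA has a closed form that makes the idea easy to see. The sketch below assumes the variables are standardized first (PCA on the correlation matrix, a choice not specified in the slides). That matrix, [[1, r], [r, 1]], has eigenvalues 1 + r and 1 − r with eigenvectors (1, 1)/√2 and (1, −1)/√2. Function names are hypothetical.

```python
import math

def standardize(v):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(v)
    m = sum(v) / n
    sd = math.sqrt(sum((u - m) ** 2 for u in v) / (n - 1))
    return [(u - m) / sd for u in v]

def first_pc_two_vars(x1, x2):
    """First principal component scores of two standardized variables,
    plus the fraction of total variance that component explains."""
    z1, z2 = standardize(x1), standardize(x2)
    n = len(z1)
    r = sum(a * b for a, b in zip(z1, z2)) / (n - 1)  # sample correlation
    sign = 1.0 if r >= 0 else -1.0  # leading eigenvector is (1, sign) / sqrt(2)
    scores = [(a + sign * b) / math.sqrt(2.0) for a, b in zip(z1, z2)]
    explained = (1.0 + abs(r)) / 2.0  # leading eigenvalue over the total of 2
    return scores, explained
```

The resulting scores could replace x1 and x2 as a single predictor. The cost is that the fitted coefficient then describes the combined component, not either original variable.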