Regression and Correlation
Jake Blanchard, Fall 2010. We can use regression to find relationships between random variables. This does not necessarily imply causation. Correlation can be used to measure predictability.
![Page 1: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/1.jpg)
Regression and Correlation
Jake Blanchard
Fall 2010
![Page 2: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/2.jpg)
Introduction
We can use regression to find relationships between random variables.
This does not necessarily imply causation.
Correlation can be used to measure predictability.
![Page 3: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/3.jpg)
Regression with Constant Variance
Linear regression: $E(Y \mid X = x) = \alpha + \beta x$
In general, the variance is a function of $x$.
If we assume the variance is constant, the analysis is simplified.
Define the total error $\Delta^2$ as the sum of the squares of the errors.
![Page 4: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/4.jpg)
Linear Regression
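$$\Delta^2 = \sum_{i=1}^{n} \left(y_i - \alpha - \beta x_i\right)^2$$

$$\frac{\partial \Delta^2}{\partial \alpha} = \sum_{i=1}^{n} 2\left(y_i - \alpha - \beta x_i\right)(-1) = 0$$

$$\frac{\partial \Delta^2}{\partial \beta} = \sum_{i=1}^{n} 2\left(y_i - \alpha - \beta x_i\right)(-x_i) = 0$$

Solve:

$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$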
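The closed-form solution of the two normal equations can be sketched in a few lines of Python. This is a minimal illustration rather than code from the lecture: the function name `fit_line` and the data are made up, and the data are chosen to lie exactly on $y = 1 + 2x$ so the result is easy to check.

```python
def fit_line(x, y):
    """Least-squares estimates (alpha, beta) for E(Y|X=x) = alpha + beta*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical data lying exactly on y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
alpha, beta = fit_line(x, y)
print(alpha, beta)
```

With noise-free data the fit recovers the intercept and slope of the generating line; with real data the same two formulas give the estimates that minimize $\Delta^2$.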
![Page 5: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/5.jpg)
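Variance in Regression Analysis
The relevant variance is conditional: $\mathrm{Var}(Y \mid X = x)$

$$s_{Y|X}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left(y_i - \hat{\alpha} - \hat{\beta} x_i\right)^2$$

$$s_{Y|X}^2 = \frac{1}{n-2} \left[ \sum_{i=1}^{n} (y_i - \bar{y})^2 - \hat{\beta}^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]$$

$$s_{Y|X}^2 = \frac{\Delta^2}{n-2}$$

$$r^2 = 1 - \frac{s_{Y|X}^2}{s_Y^2}$$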
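As a small numerical sketch of these definitions, the Python below computes the conditional variance and $r^2$ for a made-up data set. The function names are hypothetical, and the sample variance $s_Y^2$ is assumed to use the $n-1$ divisor, which the slide does not state.

```python
def conditional_variance(x, y, alpha, beta):
    """s^2_{Y|X}: sum of squared residuals divided by n - 2."""
    n = len(x)
    if n <= 2:
        raise ValueError("need more than 2 points")
    delta_sq = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return delta_sq / (n - 2)


def r_squared(y, s2_cond):
    """r^2 = 1 - s^2_{Y|X} / s^2_Y, with s^2_Y using n - 1 (an assumption)."""
    n = len(y)
    y_bar = sum(y) / n
    s2_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    return 1.0 - s2_cond / s2_y


# Hypothetical data; alpha = 0.0 and beta = 1.9 are its least-squares estimates
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
s2 = conditional_variance(x, y, alpha=0.0, beta=1.9)
r2 = r_squared(y, s2)
print(s2, r2)
```

For this data set the residual sum of squares is 0.70, so $s_{Y|X}^2 = 0.35$, and $s_Y^2 = 6.25$ gives $r^2 = 0.944$.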
![Page 6: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/6.jpg)
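Confidence Intervals
The estimated regression coefficients are t-distributed with $n-2$ degrees of freedom.
With $\bar{Y}_i = \hat{\alpha} + \hat{\beta} x_i$ the estimated mean at $x_i$, the statistic below is thus t-distributed with $n-2$ degrees of freedom:

$$\frac{\bar{Y}_i - \mu_{Y|x_i}}{s_{Y|x} \sqrt{\dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}}$$

And the confidence interval is (here $\alpha$ denotes the significance level, not the intercept):

$$\left\langle \mu_{Y|x_i} \right\rangle_{1-\alpha} = \bar{y}_i \pm t_{1-\alpha/2,\,n-2}\; s_{Y|x} \sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$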
![Page 7: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/7.jpg)
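Example
Example 8.1: data for compressive strength ($q$) of stiff clay as a function of "blow counts" ($N$), with $n = 10$.

$$\bar{N} = 18.7, \qquad \bar{q} = 2.123$$

$$s_N^2 = \frac{1}{9}\left(\sum N_i^2 - n\bar{N}^2\right) = 95.12$$

$$s_q^2 = \frac{1}{9}\left(\sum q_i^2 - n\bar{q}^2\right) = 1.22$$

$$\hat{\beta} = \frac{\sum N_i q_i - n\bar{N}\bar{q}}{\sum N_i^2 - n\bar{N}^2} = 0.112$$

$$\hat{\alpha} = \bar{q} - \hat{\beta}\bar{N} = 0.029$$

$$s_{q|N}^2 = \frac{\Delta^2}{n-2} = \frac{0.305}{8} = 0.038$$

The 95% confidence interval follows from

$$\left\langle \mu_{Y|x_i} \right\rangle_{1-\alpha} = \bar{y}_i \pm t_{1-\alpha/2,\,n-2}\; s_{Y|X} \sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$

$$t_{0.975,\,8} = 2.306$$

At $N = 4$: $\bar{y} = 0.029 + 0.112 \times 4 = 0.477$

$$\left\langle \mu_{q|N} \right\rangle_{0.95} = 0.477 \pm 2.306 \sqrt{0.038} \sqrt{\frac{1}{10} + \frac{(4 - 18.7)^2}{4353 - 10 \times 18.7^2}} = (0.21,\ 0.744)$$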
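The interval at $N = 4$ can be reproduced from the summary values on the slide ($n = 10$, $\bar{N} = 18.7$, $\sum N_i^2 = 4353$, $s_{q|N}^2 = 0.038$, $\hat{\alpha} = 0.029$, $\hat{\beta} = 0.112$, $t_{0.975,8} = 2.306$). The sketch below assumes those values were read correctly from the transcript; the function name `mean_response_ci` is hypothetical, and the t quantile is passed in rather than computed.

```python
import math


def mean_response_ci(x0, alpha, beta, s2_cond, n, x_bar, sxx, t_crit):
    """Confidence interval for the mean of Y at x0 in simple linear regression.

    sxx is the sum of (x_j - x_bar)^2; t_crit is the t quantile with n - 2
    degrees of freedom, supplied by the caller (for example from a table).
    """
    y_hat = alpha + beta * x0
    half_width = t_crit * math.sqrt(s2_cond) * math.sqrt(
        1.0 / n + (x0 - x_bar) ** 2 / sxx
    )
    return y_hat - half_width, y_hat + half_width


# Summary values as read from the Example 8.1 slide (assumed, see lead-in)
n = 10
n_bar = 18.7
sum_n_sq = 4353.0
sxx = sum_n_sq - n * n_bar ** 2  # sum of squared deviations of N

lo, hi = mean_response_ci(
    x0=4.0, alpha=0.029, beta=0.112, s2_cond=0.038,
    n=n, x_bar=n_bar, sxx=sxx, t_crit=2.306,
)
print(round(lo, 2), round(hi, 2))
```

The result is roughly 0.21 to 0.74, in line with the interval quoted in the example.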
![Page 8: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/8.jpg)
Plot
![Page 9: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/9.jpg)
Correlation Estimate
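$$\hat{\rho} = \frac{1}{n-1} \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$$

$$\hat{\rho} = \frac{1}{n-1} \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{s_x s_y}$$

$$\hat{\rho} = \hat{\beta}\, \frac{s_x}{s_y}$$

$$\hat{\rho}^2 = 1 - \frac{n-2}{n-1} \frac{s_{Y|x}^2}{s_y^2}$$

For large $n$ this approaches $r^2 = 1 - s_{Y|x}^2 / s_y^2$.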
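The correlation estimate and its link to the regression slope can be checked numerically. The sketch below uses made-up data and a hypothetical helper `sample_correlation`; it assumes $s_x$ and $s_y$ are sample standard deviations with the $n-1$ divisor.

```python
import math


def sample_correlation(x, y):
    """Return (rho_hat, beta_hat, s_x, s_y) for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_x = math.sqrt(sxx / (n - 1))
    s_y = math.sqrt(syy / (n - 1))
    rho = sxy / ((n - 1) * s_x * s_y)
    beta = sxy / sxx  # least-squares slope of y on x
    return rho, beta, s_x, s_y


# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
rho, beta, s_x, s_y = sample_correlation(x, y)

# The two printed values should agree: rho_hat = beta_hat * s_x / s_y
print(rho, beta * s_x / s_y)
```

Both expressions give the same number because the slope and the correlation share the numerator $\sum (x_i - \bar{x})(y_i - \bar{y})$.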
![Page 10: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/10.jpg)
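Regression with Non-Constant Variance
Now relax the assumption of constant variance.
Data in regions with large conditional variance are weighted less.

$$\mathrm{Var}(Y \mid X = x) = \sigma^2 g^2(x), \qquad E(Y \mid X = x) = \alpha + \beta x$$

Weights:

$$w_i' = \frac{1}{\mathrm{Var}(Y \mid X = x_i)} = \frac{1}{\sigma^2 g^2(x_i)}$$

$$\Delta^2 = \sum_{i=1}^{n} w_i' \left(y_i - \alpha - \beta x_i\right)^2$$

$$w_i = \sigma^2 w_i' = \frac{1}{g^2(x_i)}$$

$$\hat{\alpha} = \frac{\sum_{i=1}^{n} w_i y_i - \hat{\beta} \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i y_i - \sum_{i=1}^{n} w_i y_i \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left(\sum_{i=1}^{n} w_i x_i\right)^2}$$

$$s^2 = \frac{\sum_{i=1}^{n} w_i \left(y_i - \hat{\alpha} - \hat{\beta} x_i\right)^2}{n-2}$$

$$s_{Y|x}^2 = s^2 g^2(x)$$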
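A weighted least-squares fit along these lines can be sketched as follows. This is an illustration under stated assumptions, not the lecture's code: `fit_weighted_line` is a hypothetical name, the caller supplies $g$, and the data are invented. Setting $g(x) = 1$ reduces it to the constant-variance case.

```python
def fit_weighted_line(x, y, g):
    """Weighted least squares with weights w_i = 1 / g(x_i)^2.

    Returns (alpha, beta, s2), where s2 estimates sigma^2 so that the
    conditional variance at x is s2 * g(x)^2.
    """
    n = len(x)
    if n <= 2:
        raise ValueError("need more than 2 points")
    w = [1.0 / g(xi) ** 2 for xi in x]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    beta = (sw * swxy - swy * swx) / (sw * swxx - swx ** 2)
    alpha = (swy - beta * swx) / sw
    s2 = sum(
        wi * (yi - alpha - beta * xi) ** 2 for wi, xi, yi in zip(w, x, y)
    ) / (n - 2)
    return alpha, beta, s2


# Hypothetical data, fitted with g(x) = x (standard deviation grows with x)
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
alpha_w, beta_w, s2_w = fit_weighted_line(x, y, g=lambda v: v)

# With g(x) = 1 the weights are equal and the ordinary fit is recovered
alpha_u, beta_u, s2_u = fit_weighted_line(x, y, g=lambda v: 1.0)
print(alpha_w, beta_w, s2_w)
print(alpha_u, beta_u, s2_u)
```

Note that $g$ must be nonzero at every data point, so $g(x) = x$ cannot be used if any $x_i$ is zero.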
![Page 11: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/11.jpg)
Example (8.2)
Data for maximum settlement ($x$) of storage tanks and maximum differential settlement ($y$).
From looking at the data, assume $g(x) = x$ (that is, the standard deviation of $y$ increases linearly with $x$):

$$\mathrm{Var}(Y \mid X = x) = \sigma^2 x^2, \qquad w_i = \frac{1}{x_i^2}$$
![Page 12: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/12.jpg)
Example (8.2) continued
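$$\bar{x} = 1.65, \qquad \bar{y} = 1.11, \qquad s_x = 0.923, \qquad s_y = 0.627$$

$$y = 0.045 + 0.65\,x$$

$$s^2 = 0.0589, \qquad s_{Y|x} = 0.243\,x$$

$$\hat{\rho}_{x,y} = 0.96$$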
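To show how such a result might be used, the sketch below evaluates the fitted mean and the conditional standard deviation at a chosen settlement. It assumes the slide's values are read as $y = 0.045 + 0.65x$ and $s_{Y|x} = 0.243x$; the function name and the input $x = 2$ are hypothetical. The last lines are only a rough consistency check, since $\hat{\rho} = \hat{\beta} s_x / s_y$ holds exactly only for the unweighted fit.

```python
def predict_differential_settlement(x):
    """Mean and conditional standard deviation of y at settlement x.

    Coefficients are the values read from the Example 8.2 slide (assumed).
    """
    mean = 0.045 + 0.65 * x
    std = 0.243 * x  # standard deviation grows linearly with x, g(x) = x
    return mean, std


mean, std = predict_differential_settlement(2.0)
print(mean, std)

# Rough check of the correlation using the unweighted relation
# rho ~ beta * s_x / s_y, with s_x = 0.923 and s_y = 0.627 from the slide
rho_check = 0.65 * 0.923 / 0.627
print(round(rho_check, 2))
```

At $x = 2$ the mean is 1.345 with a standard deviation of 0.486, and the rough check gives about 0.96.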
![Page 13: Regression and Correlation](https://reader035.vdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/13.jpg)
Multiple Regression
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}$$
“Nonlinear” Regression
$$E(Y \mid x) = \alpha + \beta\, g(x)$$
Use LINEST in Excel
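Outside Excel, the same kind of least-squares fit can be sketched with NumPy's `numpy.linalg.lstsq`. This is an alternative to LINEST, not something the slides use; the helper name `fit_multiple` and both data sets are made up. The "nonlinear" case is handled by fitting against the transformed variable $g(x)$, here assumed to be $g(x) = x^2$.

```python
import numpy as np


def fit_multiple(X, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + sum(bj * xj).

    X has one row per observation and one column per regressor.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


# Multiple regression: hypothetical data generated from y = 1 + 2*x1 - 3*x2
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, -2, 0, 2]
coef = fit_multiple(X, y)
print(coef)

# "Nonlinear" regression: regress y on g(x) = x**2 (assumed form of g)
x = np.array([1.0, 2.0, 3.0, 4.0])
y_nl = 2.0 + 0.5 * x ** 2
coef_nl = fit_multiple((x ** 2).reshape(-1, 1), y_nl)
print(coef_nl)
```

Because the model stays linear in the coefficients, transforming the regressor first lets the same linear solver handle both cases.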