Regression
• Correlation measures the strength of the linear relationship
• Great! But what is that relationship? How do we describe it?
– regression, regression line, regression equation
• Regression line is used for prediction
Predicting weights from heights
• Independent variable: height
• Dependent variable: weight
• How can we predict one from the other?
• Regression is to a scatter plot as the mean is to a histogram.
Weights vs. Heights
Salary by years employed
[Scatter plot: SALARY (20000 to 70000) against YRSEM (-5 to 30)]
Regression by local averages
• Approximation of local averages by regression line
• Inappropriate use of regression line (use other methods)
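The idea of a regression line as a smoothed version of local averages can be sketched in R. The data below are simulated for illustration only (they are not the salary data from the slides), and the names `local_avg` and `comparison` are made up for this sketch. When the local averages fall roughly on a straight line, the fitted line should track them closely.

```r
# Simulated data (an assumption, not the slides' data):
# x = years employed, y = salary, with a roughly linear trend.
set.seed(1)
x <- rep(0:9, each = 20)
y <- 28000 + 1100 * x + rnorm(length(x), sd = 3000)

# Local averages: the mean of y within each value of x.
local_avg <- tapply(y, x, mean)

# Least-squares line fitted to all the points.
fit <- lm(y ~ x)
line_at_x <- predict(fit, newdata = data.frame(x = 0:9))

# Side-by-side comparison of the local averages and the line.
comparison <- data.frame(
  x = 0:9,
  local_avg = as.numeric(local_avg),
  line = as.numeric(line_at_x)
)
print(round(comparison))
```

If the local averages curved away from the line, that would be a sign that a straight line is the wrong summary for the scatter plot.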
The equation of a line
$y = a + bx$
• a represents the y-intercept
– when x equals zero, y equals a
– Is this always meaningful in the context of a problem?
– Is it always useful in defining a line?
• b represents the slope of the line (rise/run)
– for every unit change in x, y changes by b.
– Does this mean that if we physically change x by one unit, y will change by b units? Say we gain another year of experience. Will our salary go up by 1107?
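A small sketch of what the intercept and slope mean for a line y = a + bx, using the coefficients of the salary line that appears later in the slides (28394 and 1107). The function name `predict_salary` is made up for this example.

```r
# Coefficients of the fitted salary line shown later in the slides.
a <- 28394  # intercept: predicted salary at 0 years employed
b <- 1107   # slope: change in predicted salary per extra year employed

# Hypothetical helper that evaluates the line y = a + b * x.
predict_salary <- function(years) a + b * years

predict_salary(0)                        # 28394, the intercept
predict_salary(11) - predict_salary(10)  # 1107, the slope
```

The difference of 1107 compares the predictions for two groups of people who differ by one year of employment. It is generally read as a statement about averages across people, not a promise that one person's salary rises by 1107 after another year.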
Regression equation
• What is the predicted weight of somebody whose height is h cm?
• w = intercept + slope × h
• This is known as the regression equation.
• How do we get this formula?
• We have a statistical model
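One way to get the intercept and slope in R is `lm()`, the same function the slides use later for the ozone data. The heights and weights below are invented for illustration, and the data frame name `people` is an assumption. The sketch applies the regression equation by hand and checks it against `predict()`.

```r
# Hypothetical heights (cm) and weights (kg); illustrative only.
people <- data.frame(
  height = c(150, 155, 160, 165, 170, 175, 180, 185),
  weight = c(52, 55, 60, 62, 68, 72, 75, 80)
)

# Fit the model weight = intercept + slope * height.
fit <- lm(weight ~ height, data = people)
intercept <- coef(fit)[["(Intercept)"]]
slope <- coef(fit)[["height"]]

# Regression equation applied by hand for h = 172 cm ...
h <- 172
w_by_hand <- intercept + slope * h

# ... and the same prediction obtained through predict().
w_predict <- predict(fit, newdata = data.frame(height = h))

c(by_hand = w_by_hand, predict = as.numeric(w_predict))
```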
Regression line by minimising residual errors
[Scatter plot: SALARY (20000 to 70000) against YRSEM (-5 to 30), with the fitted line $y = 28394 + 1107x$ and one residual marked]
$y_i = a + b x_i + \varepsilon_i$, where $\varepsilon_i$ = error of i-th obs from regression line
• The best candidate line will minimise these errors
• No line can make all errors vanish (some +ve, some –ve)
• Minimise the sum of squared errors, $\sum_i \varepsilon_i^2$
• Minimising gives regression line
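Minimising the sum of squared errors has a standard closed-form solution: slope b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept a = ȳ − b x̄. The sketch below uses made-up data and the hypothetical names `sse`, `a_hat` and `b_hat` to show that nudging either coefficient away from this solution increases the sum of squared errors.

```r
# Made-up data, for illustration only.
x <- c(1, 3, 5, 8, 12, 15, 20, 25)
y <- c(30, 31, 35, 36, 42, 44, 51, 55) * 1000

# Sum of squared errors for a candidate line y = a + b * x.
sse <- function(a, b) sum((y - (a + b * x))^2)

# Closed-form least-squares slope and intercept.
b_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a_hat <- mean(y) - b_hat * mean(x)

# Residuals from the least-squares line: some positive, some negative.
resid_hat <- y - (a_hat + b_hat * x)

sse(a_hat, b_hat)        # smallest achievable value
sse(a_hat + 500, b_hat)  # larger: intercept moved
sse(a_hat, b_hat + 50)   # larger: slope moved
```

The residuals of the least-squares line sum to zero, which is why the raw errors cannot be minimised directly and the squares are used instead.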
Regression and correlation
• Want to predict weight for those people who are 1 SD more than avg. height.
• SD line says:
– pred. wt. = overall avg. wt. + SD of wt.
• Regression line says:
– Predicted wt. = overall avg. wt. + r × SD of wt.
• For people who are k SDs away from avg. height:
– Predicted wt. = overall avg. wt. + r × k × SD of wt.
• Clearly valid for r = 0 or r = 1
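The rule "predicted wt. = avg. wt. + r × k × SD of wt." can be checked against a fitted line. The heights and weights below are invented for illustration, and the variable names are assumptions of this sketch.

```r
# Hypothetical heights (cm) and weights (kg); illustrative only.
height <- c(150, 155, 160, 165, 170, 175, 180, 185)
weight <- c(52, 55, 60, 62, 68, 72, 75, 80)

r <- cor(height, weight)
fit <- lm(weight ~ height)

# A person k SDs above average height.
k <- 1
h <- mean(height) + k * sd(height)

# Prediction from the fitted regression line ...
from_lm <- as.numeric(predict(fit, newdata = data.frame(height = h)))

# ... matches the rule based on r and the SD of weight.
from_r <- mean(weight) + r * k * sd(weight)

# The SD line ignores r, so for k > 0 it predicts at least as much.
sd_line <- mean(weight) + k * sd(weight)

c(from_lm = from_lm, from_r = from_r, sd_line = sd_line)
```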
RMS error of regression
• RMS error = $\sqrt{1 - r^2}$ × SD of y
• RMS error is inversely related to the strength of the correlation: the closer |r| is to 1, the smaller the RMS error
RMS error is to regression what SD is to average
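A quick numerical check of the relation RMS error = sqrt(1 − r²) × SD of y, on made-up data. The sketch assumes the SD is computed with divisor n (not n − 1), so that it is on the same footing as the root-mean-square of the residuals.

```r
# Made-up data, for illustration only.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 1.8, 3.5, 3.0, 4.8, 4.1, 6.2, 5.5)

fit <- lm(y ~ x)
r <- cor(x, y)

# Root-mean-square of the residuals.
rms_error <- sqrt(mean(resid(fit)^2))

# SD of y with divisor n (an assumption of this sketch).
sd_y <- sqrt(mean((y - mean(y))^2))

c(rms_error = rms_error, formula = sqrt(1 - r^2) * sd_y)
```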
Residuals
residual = observed - predicted
Example: ozone vs. temperature
> air[,c(1,3)]
ozone temperature
3.45 67
3.30 72
2.29 74
2.62 62
2.84 65
. . .
> cor(ozone,temperature)
[1] 0.7531038
Fitting a regression model in S
> ozone.lm <- lm(ozone ~ temperature, data = air)
Coefficients:
             Value Std. Error t value Pr(>|t|)
(Intercept)  -2.23       0.46   -4.82   0.0000
temperature   0.07       0.01   11.95   0.0000
Multiple R-Squared: 0.5672
> var(ozone)
[1] 0.7928069
> var(resid(ozone.lm))
[1] 0.3431544
> cor(ozone,temperature)
[1] 0.7531038
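The three numbers reported above fit together. Using only the values printed in the session, the share of variance left in the residuals and the squared correlation should both reproduce the Multiple R-Squared of 0.5672. The variable names are made up for this check.

```r
# Values copied from the session output above.
var_ozone <- 0.7928069  # var(ozone)
var_resid <- 0.3431544  # var(resid(ozone.lm))
r <- 0.7531038          # cor(ozone, temperature)

# R-squared as the share of variance explained by the line.
r_squared_from_var <- 1 - var_resid / var_ozone

# R-squared as the squared correlation.
r_squared_from_cor <- r^2

# Ratio of residual SD to SD of ozone, i.e. sqrt(1 - r^2).
rms_over_sd <- sqrt(var_resid / var_ozone)

round(c(r_squared_from_var, r_squared_from_cor), 4)  # both about 0.5672
```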
Checking model appropriateness
What assumptions have we made in the regression model?
Checking model assumptions in S-plus
> par(mfrow=c(2,3))
> plot(ozone.lm)
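The usual assumptions checked by such plots are a linear trend, roughly constant spread of the residuals, and approximately normal errors. A minimal sketch in R of three of these checks drawn by hand is given below. The data are simulated (not the `air` data from the slides), with coefficients chosen only to loosely resemble the fit above.

```r
# Simulated data (an assumption): a linear trend plus normal noise.
set.seed(42)
temperature <- runif(100, min = 57, max = 97)
ozone <- -2.2 + 0.07 * temperature + rnorm(100, sd = 0.6)

fit <- lm(ozone ~ temperature)
res <- resid(fit)
fitted_vals <- fitted(fit)

par(mfrow = c(1, 3))

# Linearity: residuals should scatter evenly around zero.
plot(fitted_vals, res, xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lty = 2)

# Constant spread: no trend in the size of the residuals.
plot(fitted_vals, sqrt(abs(res)),
     xlab = "Fitted", ylab = "sqrt(abs(Residuals))")

# Normality: points should lie close to the reference line.
qqnorm(res)
qqline(res)
```

A curved pattern in the first plot, a fan shape in the second, or strong bends in the third would each suggest that a straight-line model with constant error variance is not appropriate.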
Residual diagnostics for ozone data
[Six diagnostic plots: Residuals against Fitted : temperature; sqrt(abs(Residuals)) against fits; ozone against Fitted : temperature; Residuals against Quantiles of Standard Normal; Fitted Values and Residuals against f-value; Cook's Distance against Index. Labelled observations: 17, 20, 23, 45, 77]
Pizza party at the Frat.
• How many laps would you predict a pledge could run if he ate 6 slices of pizza?
• How many laps if he ate 9 slices of pizza?
• A pledge shows off and eats 35 slices of pizza. How many laps would you predict he would run?
[Scatter plot: DISTANCE (2 to 20) against SLICES (0 to 12), with fitted line $y = 20 - 1.5x$ and $r = -0.965$]
Beware of extrapolation
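The three pizza questions can be worked through in R. This sketch assumes the fitted line on the slide is y = 20 − 1.5x, with x in slices and y in laps. The minus sign is inferred from the plot's axis ranges (distance 2 to 20 over 0 to 12 slices), since the slide text lost its operators. The names `predict_laps` and `observed_range` are made up.

```r
# Assumed fitted line: laps = 20 - 1.5 * slices (sign inferred, see above).
predict_laps <- function(slices) 20 - 1.5 * slices

# Range of slices actually covered by the plot's x-axis.
observed_range <- c(0, 12)

slices <- c(6, 9, 35)
laps <- predict_laps(slices)

# Flag predictions that fall inside the range of the data.
inside <- slices >= observed_range[1] & slices <= observed_range[2]

data.frame(slices, laps, inside)
# 6 slices -> 11 laps, 9 slices -> 6.5 laps, 35 slices -> -32.5 laps
```

The prediction for 35 slices is a negative number of laps, which is meaningless: the line was only fitted to data between 0 and 12 slices and says nothing reliable beyond that range.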