![Page 1: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/1.jpg)
Simple Linear Regression
![Page 2: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/2.jpg)
Start by exploring the data
• Construct a scatterplot
• Does a linear relationship between variables exist?
• Is the relationship strong?
• How much variation can be explained by a linear relationship with the independent or explanatory variable?
![Page 3: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/3.jpg)
Beers and BAC
[Figure: regression plot of BAC (y-axis, 0.0 to 0.2) against Beers (x-axis, 1 to 9) with the fitted line.]
BAC = -0.0127006 + 0.0179638 Beers
S = 0.0204410, R-Sq = 80.0%, R-Sq(adj) = 78.6%
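As a small illustration of how the fitted line is used, the sketch below plugs a number of beers into the equation reported on the plot. The function name `predict_bac` is hypothetical; only the two coefficients come from the slide.

```python
# Coefficients copied from the regression output shown on the slide.
INTERCEPT = -0.0127006
SLOPE = 0.0179638


def predict_bac(beers: float) -> float:
    """Predicted BAC from the fitted line BAC = INTERCEPT + SLOPE * beers."""
    return INTERCEPT + SLOPE * beers


# Evaluate the line at a few values inside the plotted range (1 to 9 beers).
for beers in (1, 5, 9):
    print(f"{beers} beers -> predicted BAC {predict_bac(beers):.4f}")
```

The slope says that each additional beer raises the predicted BAC by about 0.018.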
![Page 4: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/4.jpg)
Variance “Candy Bar”
Explained Unexplained
• The R-sq value estimates the percentage of variation explained by a linear relationship with the independent or explanatory variable. Unless this estimate is 100% (or very near), it is not sufficient on its own.
• The amounts of information explained and left unexplained by the model are measured by sums of squares.
![Page 5: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/5.jpg)
Decomposition of information into explained and unexplained parts
![Page 6: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/6.jpg)
Residuals
A residual is the difference between an observed value of the dependent variable and the value predicted by the regression line.
Residual = (observed y) - (predicted y) = y - ŷ
They help us assess the fit of a regression line.
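The definition above can be sketched in a few lines of Python. The five data points are hypothetical toy values chosen for easy arithmetic (they are not the beers and BAC data), and the least-squares formulas for the slope and intercept are the standard ones, which the slides do not spell out.

```python
# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Standard least-squares estimates of slope (b1) and intercept (b0).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual = observed y minus predicted y, one per observation.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for xi, yi, fi, ri in zip(x, y, fitted, residuals):
    print(f"x={xi}  observed={yi}  predicted={fi:.2f}  residual={ri:+.2f}")
```

For a least-squares line that includes an intercept, the residuals sum to zero, which is a quick sanity check on the fit.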
![Page 7: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/7.jpg)
Variance “Candy Bar”
Explained Unexplained
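$$\underbrace{\sum (y - \bar{y})^2}_{\text{SS Total}} = \underbrace{\sum (\hat{y} - \bar{y})^2}_{\text{SS explained by model}} + \underbrace{\sum (y - \hat{y})^2}_{\text{SS Error}}$$
Systematic SS + Random SS = Total SS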
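The decomposition can be checked numerically. This is a minimal sketch on hypothetical toy data (not the slide data); the variable names `ss_total`, `ss_model`, and `ss_error` are illustrative labels for Total SS, Systematic SS, and Random SS.

```python
# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares line.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

# The three sums of squares in the decomposition.
ss_total = sum((yi - y_bar) ** 2 for yi in y)                 # Total SS
ss_model = sum((fi - y_bar) ** 2 for fi in fitted)            # Systematic SS
ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # Random SS

print(f"SS model {ss_model:.2f} + SS error {ss_error:.2f} = {ss_model + ss_error:.2f}")
print(f"SS total {ss_total:.2f}")
```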
![Page 8: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/8.jpg)
Model assumptions about the random errors (ε)
• The distribution is NORMAL
• The mean is ZERO
• The variance (σ²) is CONSTANT for all values of x
• Errors associated with any two observations are independent
![Page 9: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/9.jpg)
Assessing the utility of the model: model variance
• Variance is the variability of the random error (σ²).
• The higher the variability of the random error, the greater the error of prediction.
• σ² is estimated with s² (often called the mean square for error, MSE).
• Variance: $s^2 = SSE / (n - 2)$, where n - 2 is the degrees of freedom.
• Standard error: $s = \sqrt{s^2}$. This is like a standard deviation; with the standard error, we are looking at deviation from the line.
• Approximately 95% of observed y values will lie within 2s of their respective predicted values.
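The estimates above can be computed directly from the residuals. The sketch below uses hypothetical toy data (not the slide data) and also counts how many observations fall within 2s of their predicted values; with only five points the 95% rule of thumb is illustrative at best.

```python
import math

# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# s^2 = SSE / (n - 2): the mean square for error (MSE).
sse = sum(r ** 2 for r in residuals)
s2 = sse / (n - 2)

# s is the standard error of the regression (typical deviation from the line).
s = math.sqrt(s2)

# Count observations whose residual is within 2s of zero.
within_2s = sum(1 for r in residuals if abs(r) <= 2 * s)

print(f"SSE = {sse:.2f}, s^2 = {s2:.2f}, s = {s:.4f}")
print(f"{within_2s} of {n} observations lie within 2s of the line")
```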
![Page 10: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/10.jpg)
Assessing the utility of the model: Slope
• Does y change as x changes? Does x contribute information for the prediction of y?
• Hypotheses: $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \neq 0$
• Test this with the t-statistic or p-value (p < .05); these values are included in software output.
$$t = \frac{b_1}{SE_{b_1}}$$
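A minimal sketch of the slope test on hypothetical toy data (not the slide data). It assumes the usual formula $SE_{b_1} = s / \sqrt{\sum (x - \bar{x})^2}$, which the slides do not state, and compares |t| with the two-sided 5% critical value for 3 degrees of freedom (3.182, from a t table) instead of computing a p-value.

```python
import math

# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Two-sided 5% critical value of the t distribution with n - 2 = 3 df (table value).
T_CRITICAL = 3.182

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the regression, then of the slope (assumed standard formula).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(s_xx)

# t = b1 / SE_b1 tests H0: beta1 = 0 against Ha: beta1 != 0.
t_stat = b1 / se_b1
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"b1 = {b1:.3f}, SE_b1 = {se_b1:.4f}, t = {t_stat:.3f}")
print("Reject H0 at the 5% level" if reject_h0 else "Fail to reject H0 at the 5% level")
```

With only five points the slope is positive but not statistically significant, which shows why the t-test matters even when the line appears to fit.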
![Page 11: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/11.jpg)
Assessing the utility of the model: Correlation Coefficient r
• Measure of the strength and direction of the linear relationship between x and y
• Always between -1 and +1
• High correlation does not imply causality
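For reference, r can be computed from the same sums used to fit the line. This sketch uses hypothetical toy data (not the slide data) and the standard formula $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$, which is an addition and does not appear on the slides.

```python
import math

# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Pearson correlation coefficient; its sign matches the sign of the slope.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"r = {r:.4f}")
```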
![Page 12: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/12.jpg)
Assessing the utility of the model: Coefficient of Determination (r²)
• The R-squared value is the percentage of the variation in y explained by the model.
• For linear regression, the higher the value, the better the model.
$$r^2 = \frac{SS_{yy} - SSE}{SS_{yy}} = \frac{\text{Explained sample variability}}{\text{Total sample variability}}$$
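The ratio of explained to total variability can be computed as follows. The data are hypothetical toy values (not the slide data), and `ss_yy` is an illustrative name for the total sum of squares of y.

```python
# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# Total sample variability of y and the part left unexplained by the line.
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# r^2 = explained variability / total variability.
r_squared = (ss_yy - sse) / ss_yy

print(f"r^2 = {r_squared:.2f} ({r_squared:.0%} of the variation in y explained)")
```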
![Page 13: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/13.jpg)
Using the model for estimation and prediction: Confidence interval for mean response
• For any specific value of x, written x*, the estimated mean response is $\hat{y} = b_0 + b_1 x^*$.
• A confidence interval for the mean response adds to this estimate a margin of error based on the standard error of the estimated mean response, $SE_{\hat{\mu}}$.
• Confidence intervals widen as the value of x moves further from its mean.
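A minimal sketch of this interval on hypothetical toy data (not the slide data). It assumes the standard formula $SE_{\hat{\mu}} = s \sqrt{1/n + (x^* - \bar{x})^2 / \sum (x - \bar{x})^2}$, which the slides do not give, and a t table value of 3.182 for a 95% interval with 3 degrees of freedom. The function name `mean_response_ci` is hypothetical.

```python
import math

# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# t* for a 95% interval with n - 2 = 3 degrees of freedom (table value).
T_STAR = 3.182

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))


def mean_response_ci(x_star: float) -> tuple[float, float]:
    """95% confidence interval for the mean response at x = x_star."""
    y_hat = b0 + b1 * x_star
    se_mu = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
    margin = T_STAR * se_mu
    return y_hat - margin, y_hat + margin


# The interval is narrowest at the mean of x (here 3) and widens away from it.
for x_star in (3, 4, 5):
    low, high = mean_response_ci(x_star)
    print(f"x* = {x_star}: ({low:.3f}, {high:.3f})")
```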
![Page 14: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/14.jpg)
Confidence interval for mean response
[Figure: regression plot of BAC (y-axis, 0.0 to 0.2) against Beers (x-axis, 1 to 9) showing the regression line and 95% CI bands.]
BAC = -0.0127006 + 0.0179638 Beers
S = 0.0204410, R-Sq = 80.0%, R-Sq(adj) = 78.6%
![Page 15: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/15.jpg)
Prediction interval for a future observation
• Similar to the confidence interval for the mean response
• The standard error used in the prediction interval, $SE_{\hat{y}}$, includes:
  • Variability due to the fact that the least-squares line is not exactly equal to the true regression line
  • Variability of the future response variable y around the subpopulation mean
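The two sources of variability can be seen in the standard formula $SE_{\hat{y}} = s \sqrt{1 + 1/n + (x^* - \bar{x})^2 / \sum (x - \bar{x})^2}$, which is assumed here since the slides do not state it. The sketch below uses hypothetical toy data (not the slide data) and a t table value of 3.182 for 3 degrees of freedom; the function names are hypothetical.

```python
import math

# Hypothetical toy data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# t* for a 95% interval with n - 2 = 3 degrees of freedom (table value).
T_STAR = 3.182

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))


def se_mean_response(x_star: float) -> float:
    """Standard error of the estimated mean response (uncertainty in the line)."""
    return s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)


def se_prediction(x_star: float) -> float:
    """Standard error for one future observation: the extra 1 under the root
    adds the scatter of an individual y around the subpopulation mean."""
    return s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / s_xx)


x_star = 4
y_hat = b0 + b1 * x_star
margin = T_STAR * se_prediction(x_star)
print(f"95% PI at x* = {x_star}: ({y_hat - margin:.3f}, {y_hat + margin:.3f})")
print(f"SE for prediction {se_prediction(x_star):.4f} > SE for mean {se_mean_response(x_star):.4f}")
```

Because of the extra term, the prediction interval is always wider than the confidence interval for the mean response at the same x*.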
![Page 16: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/16.jpg)
Prediction interval for a future observation
[Figure: regression plot of BAC (y-axis, 0.0 to 0.2) against Beers (x-axis, 1 to 9) showing the regression line, 95% CI bands, and wider 95% PI bands.]
BAC = -0.0127006 + 0.0179638 Beers
S = 0.0204410, R-Sq = 80.0%, R-Sq(adj) = 78.6%
![Page 17: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/17.jpg)
In the MINITAB regression window, you might want to…
• Set confidence levels in Options
• Enter a value for prediction in Options
• Store Residuals and Fits in Storage
• Display full table of fits and residuals in Results (select last bullet)
![Page 18: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/18.jpg)
Beware of Extrapolation
Extrapolation is the use of a regression line for prediction far outside the range of values of the independent variable x that you used to obtain the line. Such predictions are often not accurate, because the data give no evidence that the linear pattern continues outside that range.
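One way to guard against accidental extrapolation is to record the range of x used to fit the line and flag predictions outside it. This is only a sketch: the coefficients come from the beers and BAC output earlier in the deck, the range of 1 to 9 beers is read off the plot axis, and the function name `predict_with_range_check` is hypothetical.

```python
# Coefficients from the beers and BAC regression output in the slides.
INTERCEPT = -0.0127006
SLOPE = 0.0179638

# Range of x assumed from the plot axis (1 to 9 beers).
X_MIN, X_MAX = 1, 9


def predict_with_range_check(beers: float) -> tuple[float, bool]:
    """Return (predicted BAC, is_extrapolation).

    is_extrapolation is True when beers lies outside the fitted range of x.
    """
    prediction = INTERCEPT + SLOPE * beers
    is_extrapolation = not (X_MIN <= beers <= X_MAX)
    return prediction, is_extrapolation


for beers in (5, 20):
    bac, flagged = predict_with_range_check(beers)
    note = " (extrapolation, do not trust)" if flagged else ""
    print(f"{beers} beers -> predicted BAC {bac:.3f}{note}")
```

A simple in-range check like this treats any value outside the data as suspect, which is stricter than the "far outside" wording above but is easy to apply consistently.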
![Page 19: Simple Linear Regression](https://reader036.vdocuments.us/reader036/viewer/2022083004/56812a5f550346895d8dcea1/html5/thumbnails/19.jpg)
Example from book: p. 138
How can we tell if it is reasonable to fit a linear regression model?
Let’s run the analysis and interpret the results