Section 12.3
Regression Analysis
HAWKES LEARNING SYSTEMS
math courseware specialists
Copyright © 2008 by Hawkes Learning
Systems/Quant Systems, Inc.
All rights reserved.
Definitions:
• Residual – the difference between the actual value and the predicted value; also known as the error in the predicted value.
Regression, Inference, and Model Building
12.3 Regression Analysis
Residual $= y - \hat{y}$
where y = the actual value occurring in the population and $\hat{y}$ = the predicted value occurring in the sample
TI-84 Plus Instructions:
1. Press STAT, then EDIT
2. Type the x-variable values into L1
3. Type the y-variable values into L2
4. Then highlight L3 and enter the formula for $b_0 + b_1 x$ using the actual values of the coefficients. For example, -3.811 + 0.865L1.
5. Then highlight L4 and enter the formula L2-L3
Determine the residual:
The table below gives data from a local school district on children’s ages and their reading levels. For this data, a reading level of 4.3 indicates that the child is 3/10 of the way through the fourth-grade year. Children’s ages are given in years. We know the regression line is $\hat{y} = -3.811 + 0.865x$. Use this equation to calculate an estimate, $\hat{y}$, for each value of the independent variable, x, and then use the estimate to calculate the residual for each value of y.
Age 6 7 8 9 10 11 12 13 14 15
Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7
Solution:
All calculations can be performed at once on a calculator.
Solution (continued):
The results will be as follows:
Age   Reading Level   Predicted Value   Residual
6     1.3             1.379             -0.079
7     2.2             2.244             -0.044
8     3.7             3.109              0.591
9     4.1             3.974              0.126
10    4.9             4.839              0.061
11    5.2             5.704             -0.504
12    6.0             6.569             -0.569
13    7.1             7.434             -0.334
14    8.5             8.299              0.201
15    9.7             9.164              0.536
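The same table can also be produced outside the calculator. The following is a minimal Python sketch, not part of the original slides; it assumes the rounded coefficients -3.811 and 0.865 from the example, and the names `predict`, `predicted`, and `residuals` are chosen here for illustration only.

```python
# Data from the example: ages (x) and reading levels (y).
ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
reading_levels = [1.3, 2.2, 3.7, 4.1, 4.9, 5.2, 6.0, 7.1, 8.5, 9.7]

# Rounded coefficients used in the example (assumed): y-hat = -3.811 + 0.865x.
B0, B1 = -3.811, 0.865


def predict(x):
    """Return the predicted reading level for age x."""
    return B0 + B1 * x


# Predicted value for each age, then residual = actual - predicted.
predicted = [predict(x) for x in ages]
residuals = [y - y_hat for y, y_hat in zip(reading_levels, predicted)]

# Print one row per child: age, reading level, predicted value, residual.
for x, y, y_hat, e in zip(ages, reading_levels, predicted, residuals):
    print(f"{x:>3} {y:>5.1f} {y_hat:>7.3f} {e:>7.3f}")
```

This mirrors the calculator steps: `predicted` plays the role of list L3 and `residuals` the role of list L4.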
Residual:
• The residual of each value reflects how far the original data point is from the point on the regression line.
• Graphically, the residual is the vertical distance from the original data point to the point on the regression line.
Errors Shown Graphically:
Sum of Squared Errors:
• The value calculated by summing the squares of the errors is the sum of squared errors, SSE: $SSE = \sum (y - \hat{y})^2$.
• If the data points are very far from the regression line, then the sum of squared errors will be large, and the linear model will be worse at predicting the value of y.
• If the data points are very close to the regression line, then the sum of squared errors will be small, and the linear model will be better at predicting the value of y.
• The line that fits the data “best” is the one with the smallest value of SSE.
Calculate the sum of squared errors:
From the previous example, calculate the sum of squared errors.
Age   Reading Level   Predicted Value   Residual   Squared Error
6     1.3             1.379             -0.079     0.00624
7     2.2             2.244             -0.044     0.00194
8     3.7             3.109              0.591     0.34928
9     4.1             3.974              0.126     0.01588
10    4.9             4.839              0.061     0.00372
11    5.2             5.704             -0.504     0.25402
12    6.0             6.569             -0.569     0.32376
13    7.1             7.434             -0.334     0.11156
14    8.5             8.299              0.201     0.04040
15    9.7             9.164              0.536     0.28730
Adding the values in the last column, we get SSE = 1.3941.
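As a cross-check on the column total, here is a short Python sketch (an illustration added here, not taken from the slides). It assumes the rounded regression coefficients -3.811 and 0.865 from the example; the variable names are hypothetical.

```python
# Data from the example: ages (x) and reading levels (y).
ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
reading_levels = [1.3, 2.2, 3.7, 4.1, 4.9, 5.2, 6.0, 7.1, 8.5, 9.7]

# Rounded coefficients from the example (assumed): y-hat = -3.811 + 0.865x.
B0, B1 = -3.811, 0.865

# Squared error for each data point: (actual - predicted)^2.
squared_errors = [(y - (B0 + B1 * x)) ** 2 for x, y in zip(ages, reading_levels)]

# SSE is the sum of the squared errors.
sse = sum(squared_errors)
print(f"SSE = {sse:.4f}")
```

Because the table rounds each squared error to five decimal places before adding, the total computed here may differ from the slide's value in the last digit or two.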
Standard Error of Estimate:
• The standard error of estimate, $S_e$, is a measure of how much the data points deviate from the regression line.
• This is analogous to how the standard deviation measures how much the data deviate from the sample mean.
• The smaller the standard error of estimate, the closer the data points are to the regression line.
Determine the standard error of estimate:
Calculate the standard error of estimate for the data given previously for age and reading level.
Age 6 7 8 9 10 11 12 13 14 15
Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7
Solution:
n = 10 and SSE = 1.3941
$S_e = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{1.3941}{10 - 2}} \approx 0.417$
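The calculation above is easy to check in a few lines of Python. This sketch is an added illustration; it assumes the usual formula for the standard error of estimate, the square root of SSE divided by n - 2, and the function name `standard_error_of_estimate` is hypothetical.

```python
import math


def standard_error_of_estimate(sse, n):
    """Return S_e = sqrt(SSE / (n - 2)) for a simple linear regression."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    return math.sqrt(sse / (n - 2))


# Values from the example: SSE = 1.3941 and n = 10.
s_e = standard_error_of_estimate(1.3941, 10)
print(f"S_e = {s_e:.3f}")
```

The divisor is n - 2 rather than n - 1 because two quantities, the slope and the y-intercept, are estimated from the sample.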
Definitions:
• Prediction Interval – a confidence interval for the predicted value of the dependent variable.
• Bivariate Normal Distribution – a distribution where, for any given fixed value of the independent variable, $x_0$, the possible sample values of the dependent variable, y, are normally distributed about the regression line, with the mean of the normal distribution equal to $\hat{y}$ and the standard deviation of the normal distribution the same for each value of $x_0$.
Bivariate Normal Distribution:
Margin of Error:
$E = t_{\alpha/2} \, S_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\sum x^2 - \left(\sum x\right)^2}}$
where d.f. = n – 2, $\bar{x}$ = the sample mean, $x_0$ = the fixed value of x, and n = the sample size
Prediction Interval for an Individual y:
$\hat{y} - E < y < \hat{y} + E$
where $\hat{y}$ is the point estimate and E is the margin of error
Steps to Determine the Prediction Interval for an Individual y:
1. Find the regression equation for the sample data.
2. Use the regression equation to calculate the point estimate for the given value of x.
3. Calculate the sample statistics necessary to calculate the margin of error.
4. Calculate the margin of error.
5. Construct the prediction interval.
Construct the prediction interval:
Construct a 95% prediction interval for the reading level of a child who is 8 years old.
Age 6 7 8 9 10 11 12 13 14 15
Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7
Solution:
We know from a previous example that the regression equation is $\hat{y} = -3.811 + 0.865x$.
Now we will calculate the point estimate for the given value of x.
$\hat{y} = -3.811 + 0.865(8) = 3.109$
Solution (continued):
Calculate the sample statistics necessary to calculate the margin of error:
n = 10, $\bar{x}$ = 10.5, $\sum x$ = 105, $\sum x^2$ = 1185, $t_{\alpha/2}$ = 2.306, $S_e$ = 0.4174417
Calculate the margin of error:
$E = (2.306)(0.4174417)\sqrt{1 + \frac{1}{10} + \frac{10(8 - 10.5)^2}{10(1185) - (105)^2}} \approx 1.044$
Solution (continued):
Construct the prediction interval:
3.109 - 1.044 < y < 3.109 + 1.044
2.065 < y < 4.153
(2.065, 4.153)
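The whole prediction-interval calculation can be sketched in Python as a check on the arithmetic. This is an added illustration under stated assumptions: the margin of error is taken to be the critical t-value times $S_e$ times the square root of 1 + 1/n + n(x0 - x̄)² / (nΣx² - (Σx)²), which reproduces the 1.044 in this example. The critical value 2.306 and $S_e$ = 0.4174417 are copied from the example rather than computed, and the function name `margin_of_error` is hypothetical.

```python
import math

# Ages (x) from the example.
ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

# Values taken from the example (assumed, not computed here).
B0, B1 = -3.811, 0.865   # rounded regression coefficients
T_CRIT = 2.306           # t_{alpha/2} for 95% confidence, d.f. = n - 2 = 8
S_E = 0.4174417          # standard error of estimate


def margin_of_error(x0, xs, s_e, t_crit):
    """Margin of error for predicting an individual y at x = x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    # Term that grows as x0 moves away from the sample mean.
    spread = n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    return t_crit * s_e * math.sqrt(1 + 1 / n + spread)


x0 = 8
y_hat = B0 + B1 * x0                       # point estimate
E = margin_of_error(x0, ages, S_E, T_CRIT)
lower, upper = y_hat - E, y_hat + E        # prediction interval endpoints

print(f"point estimate = {y_hat:.3f}, E = {E:.3f}")
print(f"prediction interval: ({lower:.3f}, {upper:.3f})")
```

Trying other values of `x0` shows that the interval is narrowest near the mean age of 10.5 and widens toward the ends of the data.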
Confidence Intervals for the Slope and the y-Intercept of the Regression Equation:
Using Microsoft Excel, we can construct confidence intervals for the population slope and y-intercept parameters, $\beta_1$ and $\beta_0$, respectively.
Construct the confidence intervals:
Construct a 95% confidence interval for the slope, $\beta_1$, and the y-intercept, $\beta_0$, of the regression equation for age and reading level.
Solution:
Begin by entering the data into Microsoft Excel.
Age 6 7 8 9 10 11 12 13 14 15
Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7
Solution (continued):
Next, choose DATA ANALYSIS from the TOOLS menu. Choose REGRESSION from the options listed. Enter the necessary information as shown below.
Solution (continued):
The results are as follows:
• “Multiple R” is the correlation coefficient.
• “R Square” is just that, $r^2$.
• “Standard Error” is the standard error of estimate, $S_e$.
• The intersection of the Residual row and SS column is the SSE.
Solution (continued):
The blue box contains the values for the coefficients in the regression line: $b_0 = -3.81090909$ and $b_1 = 0.864848485$.
The red box contains the lower and upper endpoints of the confidence intervals for the y-intercept and slope.
$-4.9645964 < \beta_0 < -2.6572218$ and $0.758867258 < \beta_1 < 0.97082971$
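For readers without Excel, the endpoints can be approximated in Python. This sketch is an added illustration and is not the method shown on the slides. It assumes the standard formulas for the standard errors of the slope and y-intercept in simple linear regression, and it takes the critical value 2.306 (95% confidence, d.f. = 8) from the earlier example. All variable names are hypothetical.

```python
import math

# Data from the example: ages (x) and reading levels (y).
ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
reading_levels = [1.3, 2.2, 3.7, 4.1, 4.9, 5.2, 6.0, 7.1, 8.5, 9.7]

T_CRIT = 2.306  # t_{alpha/2} for 95% confidence, d.f. = 8 (taken from the example)

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(reading_levels) / n

# Least-squares slope and y-intercept.
s_xx = sum((x - x_bar) ** 2 for x in ages)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, reading_levels))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of estimate from the unrounded residuals.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ages, reading_levels))
s_e = math.sqrt(sse / (n - 2))

# Standard errors of the coefficients (standard formulas, assumed here).
se_b1 = s_e / math.sqrt(s_xx)
se_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx)

# Confidence intervals: estimate plus or minus t times its standard error.
slope_ci = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)
intercept_ci = (b0 - T_CRIT * se_b0, b0 + T_CRIT * se_b0)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"slope CI:     ({slope_ci[0]:.4f}, {slope_ci[1]:.4f})")
print(f"intercept CI: ({intercept_ci[0]:.4f}, {intercept_ci[1]:.4f})")
```

Because the critical value is rounded to three decimal places, the endpoints should agree with the Excel output only to about four or five decimal places.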