Section 12.3 Regression Analysis HAWKES LEARNING SYSTEMS math courseware specialists Copyright © 2008 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved.

Upload: christian-young

Post on 05-Jan-2016


Section 12.3

Regression Analysis

HAWKES LEARNING SYSTEMS

math courseware specialists

Copyright © 2008 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved.

Regression, Inference, and Model Building

12.3 Regression Analysis

Definitions:

• Residual – the difference between the actual value and the predicted value; also known as the error in the predicted value.

$\text{Residual} = y - \hat{y}$

where $y$ = the actual value occurring in the population and $\hat{y}$ = the predicted value occurring in the sample.


TI-84 Plus Instructions:

1. Press STAT, then EDIT

2. Type the x-variable values into L1

3. Type the y-variable values into L2

4. Then highlight L3 and enter the formula for b0 + b1x using the actual values of b0 and b1. For example, -3.811 + 0.865L1.

5. Then highlight L4 and enter the formula L2-L3


Determine the residual:

The table below gives data from a local school district on a child’s age and their reading level. For this data, a reading level of 4.3 would indicate 3/10 of the year through the fourth grade. Children’s ages are given in years. We know the regression line is $\hat{y} = -3.811 + 0.865x$. Use this equation to calculate an estimate, $\hat{y}$, for each value of the independent variable, x, and then use the estimate to calculate the residual for each value of y.

Solution:

All calculations can be performed at once on a calculator.

Age 6 7 8 9 10 11 12 13 14 15

Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7


Solution (continued):

The results will be as follows:

Age   Reading Level   Predicted Value   Residual
6     1.3             1.379             –0.079
7     2.2             2.244             –0.044
8     3.7             3.109             0.591
9     4.1             3.974             0.126
10    4.9             4.839             0.061
11    5.2             5.704             –0.504
12    6.0             6.569             –0.569
13    7.1             7.434             –0.334
14    8.5             8.299             0.201
15    9.7             9.164             0.536
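The calculator steps can also be mirrored in a few lines of Python. This is only a sketch: the list names `ages`, `reading_levels`, `predicted`, and `residuals` are illustrative stand-ins for the calculator lists L1 through L4, and the coefficients are the rounded values -3.811 and 0.865 used in the example.

```python
# Data from the age / reading-level example.
ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]                          # role of L1
reading_levels = [1.3, 2.2, 3.7, 4.1, 4.9, 5.2, 6.0, 7.1, 8.5, 9.7]  # role of L2

# Rounded coefficients of the regression line used in the example.
b0, b1 = -3.811, 0.865

# Role of L3: predicted value b0 + b1*x for each age.
predicted = [round(b0 + b1 * x, 3) for x in ages]

# Role of L4: residual = actual value - predicted value (L2 - L3).
residuals = [round(y - y_hat, 3) for y, y_hat in zip(reading_levels, predicted)]

for x, y, y_hat, r in zip(ages, reading_levels, predicted, residuals):
    print(f"{x:>3} {y:>5.1f} {y_hat:>7.3f} {r:>7.3f}")
```

Each printed row should correspond to one row of the results table: age, reading level, predicted value, residual.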


Residual:


• The residual of each value reflects how far the original data point is from the point on the regression line.

• Graphically, the residual is the vertical distance from the original data point to the point on the regression line.


Errors Shown Graphically:


Sum of Squared Errors:


• The value calculated by summing the squares of the errors (residuals) is the sum of squared errors, SSE.

• If the data points are very far from the regression line, then the sum of squared errors will be large, and the linear model will be worse at predicting the value of y.

• If the data points are very close to the regression line, then the sum of squared errors will be small, and the linear model will be better at predicting the value of y.

• The line that fits the data “best” is the one with the smallest value of SSE.


Calculate the sum of squared errors:

From the previous example, calculate the sum of squared errors.

Age   Reading Level   Predicted Value   Residual   Squared Error
6     1.3             1.379             –0.079     0.00624
7     2.2             2.244             –0.044     0.00194
8     3.7             3.109             0.591      0.34928
9     4.1             3.974             0.126      0.01588
10    4.9             4.839             0.061      0.00372
11    5.2             5.704             –0.504     0.25402
12    6.0             6.569             –0.569     0.32376
13    7.1             7.434             –0.334     0.11156
14    8.5             8.299             0.201      0.04040
15    9.7             9.164             0.536      0.28730

Adding the values in the last column, we get SSE = 1.3941.
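The same sum can be checked with a short Python sketch. The helper name `sum_of_squared_errors` is made up for this illustration, and the second line (intercept -4.0, slope 0.9) is a hypothetical alternative that does not come from the slides. It is included only to show that a different line gives a larger SSE on these data.

```python
ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
reading_levels = [1.3, 2.2, 3.7, 4.1, 4.9, 5.2, 6.0, 7.1, 8.5, 9.7]

def sum_of_squared_errors(b0, b1, xs, ys):
    """Sum the squared residuals (y - y_hat)^2 for the line y_hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Regression line from the example (rounded coefficients).
sse_regression = sum_of_squared_errors(-3.811, 0.865, ages, reading_levels)

# Hypothetical alternative line, chosen only for comparison.
sse_other = sum_of_squared_errors(-4.0, 0.9, ages, reading_levels)

print(f"SSE (regression line):  {sse_regression:.4f}")
print(f"SSE (alternative line): {sse_other:.4f}")
```

The first value should agree with the 1.3941 found above, and the alternative line should give a noticeably larger SSE, which is what the "smallest SSE" criterion for the best-fitting line predicts.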


Standard Error of Estimate:


• The standard error of estimate, $S_e$, is a measure of how much the data points deviate from the regression line.

• This is analogous to how the standard deviation measures how much the data deviate from the sample mean.

• The smaller the value of the standard error of estimate is, the closer the data points are to the regression line.


Determine the standard error of estimate:

Calculate the standard error of estimate for the data given previously for age and reading level.

Age 6 7 8 9 10 11 12 13 14 15

Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7

Solution:

n = 10 and SSE = 1.3941, so

$S_e = \sqrt{\dfrac{SSE}{n - 2}} = \sqrt{\dfrac{1.3941}{10 - 2}} \approx 0.417$
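As a quick check, the calculation can be sketched in Python, assuming the usual formula for simple linear regression, in which the standard error of estimate is the square root of SSE divided by n - 2. The function name `standard_error_of_estimate` is illustrative only.

```python
import math

def standard_error_of_estimate(sse, n):
    """Return S_e = sqrt(SSE / (n - 2)) for a simple linear regression."""
    if n <= 2:
        raise ValueError("need more than two data points")
    return math.sqrt(sse / (n - 2))

# Values from the age / reading-level example.
s_e = standard_error_of_estimate(1.3941, 10)
print(round(s_e, 3))
```

The divisor is n - 2 rather than n - 1 because two parameters, the slope and the y-intercept, are estimated from the sample.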


Definitions:

• Prediction Interval – a confidence interval for the predicted value of the dependent variable.

• Bivariate Normal Distribution – a distribution in which, for any given fixed value of the independent variable, x0, the possible sample values of the dependent variable, y, are normally distributed about the regression line, with the mean of the normal distribution equal to $\hat{y}$ and the standard deviation of the normal distribution the same for each value of x0.


Bivariate Normal Distribution:


Margin of Error:

$E = t_{\alpha/2}\, S_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n\left(x_0 - \bar{x}\right)^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}}$

where d.f. = n – 2, $\bar{x}$ = the sample mean, $x_0$ = the fixed value of x, and n = the sample size.


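Prediction Interval for an Individual y:

$\hat{y} - E < y < \hat{y} + E$, or $\left(\hat{y} - E,\ \hat{y} + E\right)$

where $\hat{y}$ is the point estimate for the given value of x and E is the margin of error.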


Steps to Determine the Prediction Interval for an Individual y:


1. Find the regression equation for the sample data.

2. Use the regression equation to calculate the point estimate for the given value of x.

3. Calculate the sample statistics necessary to calculate the margin of error.

4. Calculate the margin of error.

5. Construct the prediction interval.
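The five steps above can be sketched as one Python function. This is an illustration under stated assumptions, not the courseware's own procedure: the function name `prediction_interval` and its parameters are hypothetical, and the critical value `t_crit` must be supplied by the caller (for example, from a t-table with n - 2 degrees of freedom). The margin of error is assumed to be the critical value times the standard error of estimate times the square root of 1 + 1/n + n(x0 - x̄)² / (nΣx² - (Σx)²).

```python
import math

def prediction_interval(xs, ys, x0, t_crit):
    """Return (lower, upper) prediction-interval endpoints for y at x = x0."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Step 1: least-squares regression equation y_hat = b0 + b1 * x.
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y - b1 * sum_x) / n

    # Step 2: point estimate at the given value of x.
    y_hat = b0 + b1 * x0

    # Step 3: sample statistics needed for the margin of error.
    x_bar = sum_x / n
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))

    # Step 4: margin of error.
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)

    # Step 5: prediction interval.
    return y_hat - margin, y_hat + margin

ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
reading_levels = [1.3, 2.2, 3.7, 4.1, 4.9, 5.2, 6.0, 7.1, 8.5, 9.7]

# 95% interval for an 8-year-old; 2.306 is the t critical value for 8 d.f.
low, high = prediction_interval(ages, reading_levels, x0=8, t_crit=2.306)
print(f"({low:.3f}, {high:.3f})")
```

Because this sketch uses unrounded regression coefficients, its endpoints can differ in the third decimal place from a hand calculation that rounds the coefficients to three decimals.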


Construct the prediction interval:

Construct a 95% prediction interval for the reading level of a child who is 8 years old.

Age 6 7 8 9 10 11 12 13 14 15

Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7

Solution:

We know from a previous example that the regression equation is $\hat{y} = -3.811 + 0.865x$.

Now we will calculate the point estimate for the given value of x.

$\hat{y} = -3.811 + 0.865(8) = 3.109$


Solution (continued):

Calculate the sample statistics necessary to calculate the margin of error:

$n = 10$, $\bar{x} = 10.5$, $\sum x = 105$, $\sum x^2 = 1185$, $t_{\alpha/2} = 2.306$, $S_e = 0.4174417$

Calculate the margin of error:

$E = 2.306\,(0.4174417)\sqrt{1 + \dfrac{1}{10} + \dfrac{10\,(8 - 10.5)^2}{10\,(1185) - (105)^2}} \approx 1.044$


Solution (continued):

Construct the prediction interval:


3.109 – 1.044 < y < 3.109 + 1.044

2.065 < y < 4.153

(2.065, 4.153)


Confidence Intervals for the Slope and the y-Intercept of the Regression Equation:


Using Microsoft Excel, we can construct confidence intervals for the population slope and y-intercept parameters, $\beta_1$ and $\beta_0$, respectively.


Construct the confidence intervals:

Construct a 95% confidence interval for the slope, $\beta_1$, and the y-intercept, $\beta_0$, of the regression equation for age and reading level.

Solution:

Begin by entering the data into Microsoft Excel.

Age 6 7 8 9 10 11 12 13 14 15

Reading Level 1.3 2.2 3.7 4.1 4.9 5.2 6.0 7.1 8.5 9.7


Solution (continued):

Next, choose DATA ANALYSIS from the TOOLS menu. Choose REGRESSION from the options listed. Enter the necessary information as shown below.


Solution (continued):

The results are as follows:


“Multiple R” is the correlation coefficient.

“R Square” is just that, $r^2$.

“Standard Error” is the standard error of estimate, $S_e$.

The intersection of the Residual row and SS column is the SSE.


Solution (continued):


The blue box contains the values for the coefficients in the regression line: b0 = –3.81090909 and b1 = 0.864848485.

The red box contains the lower and upper endpoints of the confidence intervals for the y-intercept and slope.

–4.9645964 < $\beta_0$ < –2.6572218 and 0.758867258 < $\beta_1$ < 0.97082971
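For readers without Excel, the same intervals can be approximated with a short Python sketch. It is not Excel's own procedure, and the slides do not show these formulas. It assumes the standard textbook standard errors for simple linear regression: the standard error of estimate divided by the square root of Σ(x - x̄)² for the slope, and the standard error of estimate times the square root of 1/n + x̄² / Σ(x - x̄)² for the y-intercept. It also assumes the critical value 2.306 for 8 degrees of freedom used earlier in this section. All variable names are illustrative.

```python
import math

ages = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
reading_levels = [1.3, 2.2, 3.7, 4.1, 4.9, 5.2, 6.0, 7.1, 8.5, 9.7]
t_crit = 2.306  # assumed t critical value: 95% confidence, n - 2 = 8 d.f.

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(reading_levels) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in ages)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, reading_levels))

# Least-squares slope and y-intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of estimate from the residuals.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ages, reading_levels))
s_e = math.sqrt(sse / (n - 2))

# Standard errors of the coefficients (standard textbook formulas).
se_b1 = s_e / math.sqrt(s_xx)
se_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx)

slope_ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
intercept_ci = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)

print("slope:    ", slope_ci)
print("intercept:", intercept_ci)
```

The endpoints should agree with the Excel output to about three decimal places. Small differences are expected because the critical value here is rounded to 2.306.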