© 2008 Pearson Addison-Wesley. All rights reserved 13-6-1 Chapter 1 Section 13-6 Regression and Correlation

Upload: randolph-arnold

Post on 31-Dec-2015



Chapter 1

Section 13-6 Regression and Correlation


Regression and Correlation

• Linear Regression

• Correlation


Regression

One important branch of inferential statistics, called regression analysis, is used to


Regression

Suppose that we wish to get an idea of how the number of hours spent preparing for a final exam relates to the score on the exam. The data collected are shown below.

Hours 1 2 3 4 5 6 7 8 9 10

Score 50 62 62 74 70 86 78 90 96 94


Linear Regression

The first step in analyzing these data is to graph the results as shown in the scatter diagram on the next slide.


Scatter Diagram

[Figure: scatter diagram of the data. Horizontal axis: Hours Studying (0 to 15). Vertical axis: Exam Score (0 to 120).]


Linear Regression

Once a scatter diagram has been produced, we can draw a curve that best fits the pattern exhibited by the sample points. The best-fitting curve for the sample points is called an estimated regression curve. If the points in the scatter diagram seem to lie approximately along a straight line, the relationship is assumed to be linear, and the line that best fits the data points is called the estimated regression line.


Estimated Regression Line

[Figure: scatter diagram with the estimated regression line. Horizontal axis: Hours Studying (0 to 15). Vertical axis: Exam Score (0 to 120).]


Linear Regression

If we let x denote hours studying and y denote exam score in the data of the previous slide and assume that the best-fitting curve is a line, then the equation of that line will take the form

y = ax + b,

where a is the slope of the line and b is the y-coordinate of the y-intercept. To identify the estimated regression line, we must find the values of the “regression coefficients” a and b.


Linear Regression

For each x-value in the data set, the corresponding y-value usually differs from the value it would have if the data point were exactly on the line. These differences are shown in the figure by vertical line segments. The most common procedure is to choose the line where the sum of the squares of all these differences is minimized. This is called the method of least squares, and the resulting line is called the least squares line.
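To make the "sum of the squares" criterion concrete, the sketch below computes that sum for two candidate lines against the hours and exam score data from the earlier slide. Both candidate lines (y = 5x + 50 and y = 4x + 55) and the helper name `sum_squared_differences` are hypothetical choices made for illustration; neither is claimed to be the least squares line.

```python
# Hours studied and exam scores, copied from the data slide.
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scores = [50, 62, 62, 74, 70, 86, 78, 90, 96, 94]


def sum_squared_differences(a, b, xs, ys):
    """Sum of squared vertical differences between the points and y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Two hypothetical candidate lines, chosen only for illustration.
sse_line_1 = sum_squared_differences(5, 50, hours, scores)  # y = 5x + 50
sse_line_2 = sum_squared_differences(4, 55, hours, scores)  # y = 4x + 55

print(sse_line_1, sse_line_2)  # 201 250
```

By this criterion the first candidate fits the data better than the second. The method of least squares goes further and selects the line whose sum is smallest among all possible lines.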


Regression Coefficient Formulas

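The least squares line y’ = ax + b that provides the best fit to the data points (x1, y1), (x2, y2), …, (xn, yn) has

$$a = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2} \quad\text{and}\quad b = \frac{\sum y - a\left(\sum x\right)}{n}.$$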


Example: Computing a Least Squares Line

Find the equation of the least squares line for the hours and exam score data.

Hours 1 2 3 4 5 6 7 8 9 10

Score 50 62 62 74 70 86 78 90 96 94


Example: Computing a Least Squares Line

Solution
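The worked solution is not reproduced in this transcript. As a sketch only, the code below applies the standard least squares formulas for the slope a and intercept b to the hours and exam score data, printing the intermediate sums so each step can be checked by hand. The variable names are illustrative assumptions, not taken from the slides.

```python
# Hours studied (x) and exam scores (y), copied from the example.
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scores = [50, 62, 62, 74, 70, 86, 78, 90, 96, 94]

n = len(hours)
sum_x = sum(hours)                                   # sum of x
sum_y = sum(scores)                                  # sum of y
sum_xy = sum(x * y for x, y in zip(hours, scores))   # sum of x*y
sum_x2 = sum(x * x for x in hours)                   # sum of x^2

# Slope and intercept of the least squares line y' = a*x + b.
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = (sum_y - a * sum_x) / n

print(n, sum_x, sum_y, sum_xy, sum_x2)  # 10 55 762 4592 385
print(round(a, 2), round(b, 2))         # 4.86 49.47
```

Under these calculations the least squares line is approximately y’ = 4.86x + 49.47.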


Example: Predicting from a Least Squares Line

Use the result from the previous example to predict the exam score for a student who studied 6.5 hours.

Solution
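The solution is not reproduced in this transcript. One way to sketch it is to recompute the least squares coefficients from the data and evaluate the line at x = 6.5. The function name `least_squares_line` is a hypothetical helper introduced here for illustration.

```python
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scores = [50, 62, 62, 74, 70, 86, 78, 90, 96, 94]


def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y' = a*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b


a, b = least_squares_line(hours, scores)

# Evaluate the fitted line at 6.5 hours of study.
predicted_score = a * 6.5 + b
print(round(predicted_score, 1))  # 81.1
```

The predicted score is therefore about 81. Because 6.5 hours lies inside the observed range of 1 to 10 hours, the prediction is an interpolation rather than an extrapolation.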


Correlation

One common measure of the strength of the linear relationship in the sample is called the sample correlation coefficient, denoted r. It is calculated from the sample data according to the formula on the next slide.


Sample Correlation Coefficient Formula

In linear regression, the strength of the linear relationship is measured by the correlation coefficient

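$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,\cdot\,\sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}}.$$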

r is always between –1 and 1, or perhaps equal to –1 or 1.


Correlation Coefficient

Values of exactly 1 or –1 indicate that the least squares line goes exactly through all the data points. If r is close to 1 or –1, but not exactly equal, then the line comes “close,” and the linear correlation between x and y is “strong.” If r is equal, or nearly equal, to 0, there is no linear correlation or the correlation is weak. If r is neither close to 0 nor close to 1 or –1, we might describe the linear correlation as “moderate.”


Correlation Coefficient

A positive value of r indicates that the linear relationship between x and y is direct; as x increases, y tends to increase. A negative value of r indicates that there is an inverse relationship between x and y; as x increases, y tends to decrease.


Example: Finding a Correlation Coefficient

Find r for the hours and exam score data.

Hours 1 2 3 4 5 6 7 8 9 10

Score 50 62 62 74 70 86 78 90 96 94

Solution


Example: Finding a Correlation Coefficient

Solution (continued)
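The worked solution is not reproduced in this transcript. As a sketch, the code below evaluates the standard sample correlation coefficient formula on the hours and exam score data; the variable names are illustrative assumptions.

```python
import math

hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scores = [50, 62, 62, 74, 70, 86, 78, 90, 96, 94]

n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_xy = sum(x * y for x, y in zip(hours, scores))
sum_x2 = sum(x * x for x in hours)
sum_y2 = sum(y * y for y in scores)

# Numerator and denominator of the sample correlation coefficient.
numerator = n * sum_xy - sum_x * sum_y
denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
               * math.sqrt(n * sum_y2 - sum_y ** 2))

r = numerator / denominator
print(sum_y2, numerator)  # 60196 4010
print(round(r, 3))        # 0.956
```

Since r is positive and close to 1, the linear correlation between hours studied and exam score would be described as strong and direct.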