Upload: asher-smith

Post on 28-Dec-2015

© The McGraw-Hill Companies, Inc., 2000

Business and Finance College
Principles of Statistics

Lecture 10

aaed EL Rabai, week 12 - 2010

Lecture 10: Correlation and Regression

Outline

11-1 Introduction

11-2 Scatter Plots

11-3 Correlation

11-4 Regression

11-5 Coefficient of Determination and Standard Error of Estimate

Objectives

Draw a scatter plot for a set of ordered pairs.

Find the correlation coefficient.

Test the hypothesis H0: ρ = 0.

Find the equation of the regression line.


Find the coefficient of determination.

Find the standard error of estimate.

11-2 Scatter Plots

A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.

11-2 Scatter Plots - Example

Construct a scatter plot for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects.

The data is given on the next slide.

Subject   Age, x   Pressure, y
A         43       128
B         48       120
C         56       135
D         61       143
E         67       141
F         70       152
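As a rough illustration of how the six ordered pairs become a scatter plot, the sketch below draws a crude text plot using only the Python standard library. The function name `text_scatter` and the grid size are assumptions made for this example; a plotting library would normally be used instead.

```python
# Age (x) and systolic blood pressure (y) for subjects A-F, from the table above.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]


def text_scatter(xs, ys, width=40, height=12):
    """Return a crude text scatter plot: x runs left to right, y bottom to top.

    Hypothetical helper for illustration; assumes xs and ys each contain
    at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return "\n".join("".join(line).rstrip() for line in grid)


plot = text_scatter(ages, pressures)
print(plot)
```

The points drift upward from left to right, which is the pattern a positive relationship produces.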

[Figure: scatter plot of pressure (y) against age (x) for the six subjects, showing a positive relationship.]

11-2 Scatter Plots - Other Examples

[Figure: scatter plot of final grade against number of absences, showing a negative relationship.]

[Figure: scatter plot of y against x, showing no relationship.]

11-3 Correlation Coefficient

The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two variables.

The sample correlation coefficient is denoted by r; the population correlation coefficient is denoted by ρ.

11-3 Range of Values for the Correlation Coefficient

The correlation coefficient ranges from -1 to +1. Values near -1 indicate a strong negative relationship, values near +1 indicate a strong positive relationship, and values near 0 indicate no linear relationship.

11-3 Formula for the Correlation Coefficient r

$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\left(\sum x^2\right) - \left(\sum x\right)^2\right]\left[n\left(\sum y^2\right) - \left(\sum y\right)^2\right]}}$$

where n is the number of data pairs.
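The formula translates almost term by term into code. The following is a minimal sketch in plain Python; the function name `correlation_coefficient` is an assumption for this example, and the data are the age and blood pressure values from the scatter plot example.

```python
import math


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r, computed from the raw sums."""
    n = len(xs)  # number of data pairs
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]
r = correlation_coefficient(ages, pressures)
print(round(r, 3))  # 0.897
```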

11-3 Correlation Coefficient - Example (Verify)

Compute the correlation coefficient for the age and blood pressure data.

$$\sum x = 345, \quad \sum y = 819, \quad \sum xy = 47{,}634, \quad \sum x^2 = 20{,}399, \quad \sum y^2 = 112{,}443$$

Substituting in the formula for r gives r = 0.897.
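For readers who want to verify the result by hand, the substitution can be written out as follows, with n = 6 data pairs and the sums computed from the age and blood pressure table (Σx = 345, Σy = 819, Σxy = 47,634, Σx² = 20,399, Σy² = 112,443):

```latex
r = \frac{6(47{,}634) - (345)(819)}
         {\sqrt{\left[6(20{,}399) - 345^2\right]\left[6(112{,}443) - 819^2\right]}}
  = \frac{3249}{\sqrt{(3369)(3897)}}
  \approx \frac{3249}{3623.4}
  \approx 0.897
```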

Business and Finance College
Principles of Statistics

Lecture 11

aaed EL Rabai, week 11 - 2010

11-3 The Significance of the Correlation Coefficient

The population correlation coefficient, ρ, is the correlation between all possible pairs of data values (x, y) taken from a population.

H0: ρ = 0
H1: ρ ≠ 0

This tests for a significant correlation between the variables in the population.
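The slides that carry out this test are not part of this transcript. As a sketch only, the code below assumes the standard t test for a correlation coefficient, t = r·sqrt((n - 2) / (1 - r²)) with n - 2 degrees of freedom, and a significance level of α = 0.05. Both the choice of α and the function name `correlation_t_statistic` are assumptions, not taken from the lecture.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r, n = 0.897, 6  # values from the age and blood pressure example
t = correlation_t_statistic(r, n)

# Assumed: two-tailed test at alpha = 0.05 with 4 degrees of freedom;
# 2.776 is the critical value read from a t table.
critical_value = 2.776
reject_h0 = abs(t) > critical_value
print(round(t, 2), reject_h0)
```

Under these assumptions the test statistic exceeds the critical value, so H0 would be rejected and the correlation judged significant.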

11-4 Regression

The scatter plot for the age and blood pressure data displays a linear pattern.

We can model this relationship with a straight line.

This line is called the line of best fit or the regression line.

The equation of the line is y = a + bx.

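11-4 Formulas for the Regression Line y = a + bx

$$a = \frac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2}$$

$$b = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2}$$

where a is the y intercept and b is the slope of the line.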
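The two formulas share a denominator, which makes them straightforward to compute together. The following is a minimal sketch in plain Python; the function name `regression_line` is an assumption for this example, and the data are the age and blood pressure values used throughout the lecture.

```python
def regression_line(xs, ys):
    """Return (a, b) for the line y = a + bx, computed from the raw sums."""
    n = len(xs)  # number of data pairs
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2  # shared by both formulas
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator  # y intercept
    b = (n * sum_xy - sum_x * sum_y) / denominator       # slope
    return a, b


ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]
a, b = regression_line(ages, pressures)
print(round(a, 3), round(b, 3))  # 81.048 0.964
```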

11-4 Example

Find the equation of the regression line for the age and the blood pressure data.

Substituting into the formulas gives a = 81.048 and b = 0.964 (verify).

Hence, y = 81.048 + 0.964x. Note that a represents the intercept and b the slope of the line.
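The verification can be written out by hand, using n = 6 and the sums from the age and blood pressure data (Σx = 345, Σy = 819, Σxy = 47,634, Σx² = 20,399):

```latex
a = \frac{(819)(20{,}399) - (345)(47{,}634)}{6(20{,}399) - 345^2}
  = \frac{16{,}706{,}781 - 16{,}433{,}730}{3369}
  = \frac{273{,}051}{3369}
  \approx 81.048

b = \frac{6(47{,}634) - (345)(819)}{6(20{,}399) - 345^2}
  = \frac{3249}{3369}
  \approx 0.964
```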

[Figure: scatter plot of pressure against age with the regression line y = 81.048 + 0.964x drawn through the points.]

11-4 Using the Regression Line to Predict

The regression line can be used to predict a value for the dependent variable (y) for a given value of the independent variable (x).

Caution: Use x values within the experimental region when predicting y values.

11-4 Example

Use the equation of the regression line to predict the blood pressure for a person who is 50 years old.

Since y = 81.048 + 0.964x, then y = 81.048 + 0.964(50) = 129.248 ≈ 129.2.

Note that the value of 50 is within the range of x values.
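The prediction and the caution about staying inside the observed range can be combined in one small function. This is a sketch only: the name `predict_pressure` is hypothetical, and the bounds 43 and 70 are simply the smallest and largest ages in the sample, used here as one possible reading of "the experimental region".

```python
def predict_pressure(age, a=81.048, b=0.964, x_min=43, x_max=70):
    """Predict systolic pressure from age using y = a + bx.

    Refuses to extrapolate: ages outside [x_min, x_max], the range of
    ages observed in the sample, raise ValueError.
    """
    if not x_min <= age <= x_max:
        raise ValueError(
            f"age {age} is outside the observed range {x_min}-{x_max}"
        )
    return a + b * age


print(round(predict_pressure(50), 1))  # 129.2
```

Calling `predict_pressure(90)` raises `ValueError` rather than returning a value, since 90 lies outside the ages that were sampled.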

11-5 Coefficient of Determination and Standard Error of Estimate

The coefficient of determination, denoted by r², is a measure of the variation of the dependent variable that is explained by the regression line and the independent variable.

r² is the square of the correlation coefficient.

The coefficient of nondetermination is (1 - r²).

Example: If r = 0.90, then r² = 0.81. That is, 81% of the variation in the dependent variable is explained by the regression line, and the remaining 19% is unexplained.
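To connect the definition with the data, the sketch below computes r² for the age and blood pressure example as explained variation divided by total variation, and checks that it agrees with the square of r. The variable names are assumptions for this example. The intercept is computed as ȳ - b·x̄, which is algebraically equivalent to the sums-based formula for a.

```python
import math

ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]
n = len(ages)

sum_x, sum_y = sum(ages), sum(pressures)
sum_xy = sum(x * y for x, y in zip(ages, pressures))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in pressures)

# Slope and intercept of the regression line y = a + bx.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n  # equivalent to the sums-based formula for a

mean_y = sum_y / n
predicted = [a + b * x for x in ages]

# Explained variation: how far the predicted values lie from the mean of y.
explained = sum((p - mean_y) ** 2 for p in predicted)
# Total variation: how far the observed values lie from the mean of y.
total = sum((y - mean_y) ** 2 for y in pressures)

r_squared = explained / total
nondetermination = 1 - r_squared

# The same quantity obtained by squaring the correlation coefficient.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(round(r_squared, 3), round(nondetermination, 3))  # 0.804 0.196
```

So roughly 80% of the variation in blood pressure in this sample is explained by the regression line on age.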