© The McGraw-Hill Companies, Inc., 2000
Business and Finance College
Principles of Statistics
Lecture 10
aaed EL Rabai, week 12 - 2010
Correlation and Regression
Outline
11-1 Introduction
11-2 Scatter Plots
11-3 Correlation
11-4 Regression
11-5 Coefficient of Determination and Standard Error of Estimate
Objectives
Draw a scatter plot for a set of ordered pairs.
Find the correlation coefficient.
Test the hypothesis H0: ρ = 0.
Find the equation of the regression line.
Find the coefficient of determination.
Find the standard error of estimate.
11-2 Scatter Plots
A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.
11-2 Scatter Plots - Example
Construct a scatter plot for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects.
The data are given below.
Subject   Age, x   Pressure, y
A         43       128
B         48       120
C         56       135
D         61       143
E         67       141
F         70       152
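A scatter plot of these six pairs can be sketched without any plotting library. The snippet below is only an illustration: `text_scatter` is a hypothetical helper, not something from the lecture, and the grid size is an arbitrary choice. In practice a plotting package would normally be used instead.

```python
# Age (x) and systolic blood pressure (y) for subjects A-F, from the table above.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]


def text_scatter(xs, ys, width=30, height=10):
    """Return a crude character-grid scatter plot (hypothetical helper)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value onto the grid: smallest -> 0, largest -> last cell.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # The first grid line is printed at the top, so flip the row index
        # to put large y values at the top of the plot.
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line).rstrip() for line in grid)


plot = text_scatter(ages, pressures)
print(plot)
```

The points drift from the lower left to the upper right, which is the pattern of a positive relationship.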
[Figure: scatter plot of Pressure (vertical axis, 120 to 150) against Age (horizontal axis, 40 to 70) for the six subjects. Positive Relationship]
11-2 Scatter Plots - Other Examples
[Figure: scatter plot of Final grade (vertical axis, 40 to 90) against Number of absences (horizontal axis, 5 to 15). Negative Relationship]
[Figure: scatter plot of y (vertical axis, 0 to 10) against x (horizontal axis, 0 to 70). No Relationship]
11-3 Correlation Coefficient
The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two variables.
The sample correlation coefficient is denoted by r. The population correlation coefficient is denoted by ρ.
11-3 Range of Values for the Correlation Coefficient
The value of r ranges from -1 to +1.
r close to -1: strong negative relationship
r close to 0: no linear relationship
r close to +1: strong positive relationship
11-3 Formula for the Correlation Coefficient r
\[
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
\]
where n is the number of data pairs.
11-3 Correlation Coefficient - Example (Verify)
Compute the correlation coefficient for the age and blood pressure data.
Σx = 345, Σy = 819, Σxy = 47,634, Σx² = 20,399, Σy² = 112,443.
Substituting in the formula for r gives r = 0.897.
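The "verify" step can be checked with a few lines of Python. This is a minimal sketch that applies the computational formula for r to the six data pairs. The function name `correlation` is chosen here for illustration.

```python
import math

# Age (x) and systolic blood pressure (y) for the six subjects.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]


def correlation(xs, ys):
    """Sample correlation coefficient r from the computational formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = correlation(ages, pressures)
print(round(r, 3))  # 0.897
```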
Lecture 11
aaed EL Rabai, week 11 - 2010
11-3 The Significance of the Correlation Coefficient
The population correlation coefficient, ρ, is the correlation between all possible pairs of data values (x, y) taken from a population.
H0: ρ = 0
H1: ρ ≠ 0
This tests for a significant correlation between the variables in the population.
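The transcript does not show the test statistic itself. A common choice for this hypothesis is the t test with n - 2 degrees of freedom, t = r·sqrt((n - 2) / (1 - r²)). The sketch below applies it to the age and blood pressure example (r = 0.897, n = 6). The significance level α = 0.05 is an assumption, not a value from the lecture, and 2.776 is the two-tailed t-table entry for 4 degrees of freedom at that level.

```python
import math

r = 0.897  # sample correlation from the age / blood pressure example
n = 6      # number of data pairs

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t = r * math.sqrt((n - 2) / (1 - r ** 2))

# Assumed significance level: alpha = 0.05, two-tailed, d.f. = 4.
critical_value = 2.776

reject_h0 = abs(t) > critical_value
print(round(t, 2), reject_h0)
```

Under these assumptions the statistic falls beyond the critical value, so H0 would be rejected and the correlation judged significant.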
11-4 Regression
The scatter plot for the age and blood pressure data displays a linear pattern.
We can model this relationship with a straight line.
This line is called the line of best fit or the regression line.
The equation of the line is y = a + bx.
11-4 Formulas for the Regression Line y = a + bx
\[
a = \frac{\left(\sum y\right)\left(\sum x^{2}\right) - \left(\sum x\right)\left(\sum xy\right)}{n\sum x^{2} - \left(\sum x\right)^{2}}
\qquad
b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^{2} - \left(\sum x\right)^{2}}
\]
where a is the y intercept and b is the slope of the line.
11-4 Example
Find the equation of the regression line for the age and the blood pressure data.
Substituting into the formulas gives a = 81.048 and b = 0.964 (verify).
Hence, y = 81.048 + 0.964x.
Note: a represents the intercept and b the slope of the line.
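As a quick check of the "verify" step, the intercept and slope can be computed directly from the sums. This is a minimal sketch. The function name `regression_line` is illustrative only.

```python
# Age (x) and systolic blood pressure (y) for the six subjects.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]


def regression_line(xs, ys):
    """Return (a, b) for the line y = a + bx using the computational formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Both formulas share the same denominator.
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


a, b = regression_line(ages, pressures)
print(round(a, 3), round(b, 3))  # 81.048 0.964
```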
[Figure: scatter plot of Pressure (vertical axis, 120 to 150) against Age (horizontal axis, 40 to 70) with the regression line y = 81.048 + 0.964x drawn through the points]
11-4 Using the Regression Line to Predict
The regression line can be used to predict a value for the dependent variable (y) for a given value of the independent variable (x).
Caution: Use x values within the experimental region when predicting y values.
11-4 Example
Use the equation of the regression line to predict the blood pressure for a person who is 50 years old.
Since y = 81.048 + 0.964x, then y = 81.048 + 0.964(50) = 129.248 ≈ 129.2.
Note that the value of 50 is within the range of x values (43 to 70).
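The prediction and the caution about the experimental region can be combined in one small function. This is a sketch only: `predict_pressure` is a hypothetical helper, the coefficients are the rounded values from the example, and the limits 43 and 70 are the smallest and largest ages in the data.

```python
# Rounded coefficients of the fitted line y = a + bx from the example.
A = 81.048
B = 0.964

# Smallest and largest ages in the sample (the experimental region).
X_MIN, X_MAX = 43, 70


def predict_pressure(age):
    """Predict systolic pressure, refusing ages outside the sampled range."""
    if not X_MIN <= age <= X_MAX:
        raise ValueError(
            f"age {age} is outside the experimental region {X_MIN} to {X_MAX}"
        )
    return A + B * age


print(round(predict_pressure(50), 3))  # 129.248
```

Calling `predict_pressure(90)` raises a `ValueError` instead of returning an extrapolated value.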
11-5 Coefficient of Determination and Standard Error of Estimate
The coefficient of determination, denoted by r², is a measure of the variation of the dependent variable that is explained by the regression line and the independent variable.
r² is the square of the correlation coefficient.
The coefficient of nondetermination is (1 - r²).
Example: If r = 0.90, then r² = 0.81.
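For the age and blood pressure data, the same idea can be checked numerically. The sketch below computes r², the coefficient of nondetermination, and the ratio of explained variation to total variation about the mean of y, which should agree with r². The variable names are illustrative, and the variation sums are computed from their definitions rather than from any formula shown in the lecture.

```python
import math

# Age (x) and systolic blood pressure (y) for the six subjects.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(pressures)
sum_xy = sum(x * y for x, y in zip(ages, pressures))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in pressures)

# Correlation coefficient r from the computational formula.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r_squared = r ** 2

# Regression line y = a + bx, used to split the variation in y.
denominator = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
b = (n * sum_xy - sum_x * sum_y) / denominator

mean_y = sum_y / n
total_variation = sum((y - mean_y) ** 2 for y in pressures)
explained_variation = sum((a + b * x - mean_y) ** 2 for x in ages)

print(round(r_squared, 3))      # 0.804
print(round(1 - r_squared, 3))  # 0.196
print(round(explained_variation / total_variation, 3))  # 0.804
```

So about 80% of the variation in blood pressure in this sample is explained by the regression line on age, and about 20% is not.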