The Correlation Coefficient. Social Security Numbers

The Correlation Coefficient

Upload: rodger-hilary-robbins

Post on 31-Dec-2015

TRANSCRIPT

Page 1: The Correlation Coefficient. Social Security Numbers

The Correlation Coefficient

Page 2: The Correlation Coefficient. Social Security Numbers

Social Security Numbers

Page 3: The Correlation Coefficient. Social Security Numbers

A Scatter Diagram

Page 4: The Correlation Coefficient. Social Security Numbers

The Point of Averages

• Where is the center of the cloud?

• Take the average of the x-values and the average of the y-values; this is the point of averages.

• It locates the center of the cloud.

• Similarly, take the SD of the x-values and the SD of the y-values.
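
As a small illustration of the bullets above, the sketch below computes the point of averages and the two SDs for the five (x, y) pairs used in the worked example later in these slides. It assumes the SD is the population SD (divide by n), which is what reproduces the value SD = 2 quoted for the x-values; the variable names are illustrative only.

```python
from statistics import fmean, pstdev

# The five (x, y) pairs from the worked example in these slides.
xs = [1, 3, 4, 5, 7]
ys = [5, 9, 7, 1, 13]

# Point of averages: (average of the x-values, average of the y-values).
point_of_averages = (fmean(xs), fmean(ys))

# Population SDs (divide by n, not n - 1); assumed because it matches
# the SD = 2 given for the x-values in the example.
sd_x = pstdev(xs)
sd_y = pstdev(ys)

print(point_of_averages)  # (4.0, 7.0)
print(sd_x, sd_y)         # 2.0 4.0
```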

Page 5: The Correlation Coefficient. Social Security Numbers

Examples

Page 6: The Correlation Coefficient. Social Security Numbers

The Correlation Coefficient

• An association can be stronger or weaker.

• Remember: a strong association means that knowing one variable helps to predict the other variable to a large extent.

• The correlation coefficient is a numerical value expressing the strength of the association.

Page 7: The Correlation Coefficient. Social Security Numbers

The Correlation Coefficient

• We denote the correlation coefficient by r.

• If r = 0, the cloud is completely formless; there is no correlation between the variables.

• If r = 1, all the points lie exactly on a line that slopes upward (not necessarily the line y = x) and there is perfect correlation.
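
A short derivation may help show why points on an upward-sloping line give r = 1. It uses the slides' definition of r as the average of the products of standard units. The symbols (n points, averages written with a bar, slope a > 0, intercept b) are notation assumed here, not taken from the slides, and the SD of x is assumed to be nonzero.

```latex
% Assume every point satisfies y_i = a x_i + b with a > 0.
\begin{align*}
\bar{y} &= a\bar{x} + b, \qquad \mathrm{SD}_y = a\,\mathrm{SD}_x \\
\frac{y_i - \bar{y}}{\mathrm{SD}_y}
  &= \frac{a\,(x_i - \bar{x})}{a\,\mathrm{SD}_x}
   = \frac{x_i - \bar{x}}{\mathrm{SD}_x} \\
r &= \frac{1}{n}\sum_{i=1}^{n}
     \left(\frac{x_i - \bar{x}}{\mathrm{SD}_x}\right)^{2}
   = \frac{\mathrm{SD}_x^{2}}{\mathrm{SD}_x^{2}} = 1
\end{align*}
```

Each point has the same value in standard units on both axes, so every product is a square and their average is exactly 1, whatever the slope a > 0 is.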

Page 8: The Correlation Coefficient. Social Security Numbers

Strong and Weak

Page 9: The Correlation Coefficient. Social Security Numbers

The Correlation Coefficient

• What about negative values?

• The correlation coefficient is always between –1 and 1. A negative value indicates a negative association; a positive value indicates a positive association.

• Note that –0.90 shows the same degree of association as +0.90, only negative instead of positive.

Page 11: The Correlation Coefficient. Social Security Numbers

Computing the Correlation Coefficient

1. Convert each variable to standard units.

2. Multiply the two standard-unit values for each point. The average of these products is the correlation coefficient r.

r = average of [(x in standard units) × (y in standard units)]
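
Written out in symbols, the rule above might look as follows. The notation (n data points, bars for averages, SD with a subscript for each variable) is assumed here for illustration and does not appear in the slides.

```latex
\[
r \;=\; \frac{1}{n}\sum_{i=1}^{n}
  \left(\frac{x_i - \bar{x}}{\mathrm{SD}_x}\right)
  \left(\frac{y_i - \bar{y}}{\mathrm{SD}_y}\right)
\]
```

Each factor in parentheses is one value converted to standard units (step 1), and the sum divided by n is the average of the products (step 2).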

Page 12: The Correlation Coefficient. Social Security Numbers

Example

x    y
1    5
3    9
4    7
5    1
7    13

We must first convert to standard units.

Find the average and the SD of the x-values: average = 4, SD = 2.

Convert each x-value to standard units: subtract the average from the value, then divide by the SD.

Then do the same for the y-values (average = 7, SD = 4).
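
The conversion step can be sketched in a few lines. The helper name standard_units is made up for this illustration, and the population SD (divide by n) is assumed because it reproduces the SD = 2 stated for the x-values.

```python
from statistics import fmean, pstdev

def standard_units(values):
    """Convert values to standard units: (value - average) / SD."""
    avg = fmean(values)
    sd = pstdev(values)  # population SD (divide by n); an assumption
    return [(v - avg) / sd for v in values]

xs = [1, 3, 4, 5, 7]
ys = [5, 9, 7, 1, 13]

print(standard_units(xs))  # [-1.5, -0.5, 0.0, 0.5, 1.5]
print(standard_units(ys))  # [-0.5, 0.5, 0.0, -1.5, 1.5]
```

The helper divides by the SD, so it would fail on a list whose values are all equal (SD = 0).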

Page 13: The Correlation Coefficient. Social Security Numbers

Example

               Standard units
x    y       x       y      Product
1    5     -1.5    -0.5      0.75
3    9     -0.5     0.5     -0.25
4    7      0.0     0.0      0.00
5    1      0.5    -1.5     -0.75
7    13     1.5     1.5      2.25

Page 14: The Correlation Coefficient. Social Security Numbers

Example

• Finally, take the average of the products:

r = average of [(x in standard units) × (y in standard units)]

• In this example, r = (0.75 – 0.25 + 0.00 – 0.75 + 2.25) / 5 = 2.00 / 5 = 0.40.
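
The whole calculation can be reproduced with a short script. The first part follows the slides' method (standard units, products, average). The second part is a cross-check using an algebraically equivalent formula that is not in the slides: (average of xy minus average of x times average of y) divided by the product of the SDs. The helper names are illustrative.

```python
from math import sqrt

xs = [1, 3, 4, 5, 7]
ys = [5, 9, 7, 1, 13]

def average(values):
    return sum(values) / len(values)

def sd(values):
    """Population SD: root of the average squared deviation."""
    avg = average(values)
    return sqrt(average([(v - avg) ** 2 for v in values]))

# Method from the slides: standard units, then the average of the products.
zx = [(x - average(xs)) / sd(xs) for x in xs]
zy = [(y - average(ys)) / sd(ys) for y in ys]
products = [a * b for a, b in zip(zx, zy)]
r = average(products)

# Cross-check with an equivalent formula (an addition, not in the slides).
avg_xy = average([x * y for x, y in zip(xs, ys)])
r_check = (avg_xy - average(xs) * average(ys)) / (sd(xs) * sd(ys))

print(products)                      # [0.75, -0.25, 0.0, -0.75, 2.25]
print(round(r, 2), round(r_check, 2))  # 0.4 0.4
```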

Page 15: The Correlation Coefficient. Social Security Numbers

The SD line

• If there is some association, the points in the scatter diagram cluster around a line. But around which line?

• Generally, this is the SD line. It is the line through the point of averages.

• It climbs at the rate of one vertical SD for each horizontal SD.

• Its slope is (SD of y) / (SD of x) when the correlation is positive, and –(SD of y) / (SD of x) when the correlation is negative.
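
A minimal sketch of the SD line, assuming it is wanted in slope-intercept form. The function name sd_line and the handling of r = 0 (treated as positive here) are choices made for this illustration; the slides do not say what to do when r is exactly zero.

```python
def sd_line(avg_x, avg_y, sd_x, sd_y, r):
    """Return (slope, intercept) of the SD line through the point of averages."""
    if sd_x == 0:
        raise ValueError("SD of x is zero, so the slope is undefined")
    # The sign of the slope follows the sign of r.
    # r == 0 is treated as positive here; that is an assumption.
    sign = -1.0 if r < 0 else 1.0
    slope = sign * sd_y / sd_x
    # The line passes through (avg_x, avg_y).
    intercept = avg_y - slope * avg_x
    return slope, intercept

# Summary statistics from the worked example: averages 4 and 7, SDs 2 and 4, r = 0.40.
slope, intercept = sd_line(4, 7, 2, 4, 0.40)
print(slope, intercept)  # 2.0 -1.0
```

For the example data the SD line is y = 2x – 1: it rises 4 units (one SD of y) for every 2 units (one SD of x) and passes through (4, 7).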

Page 16: The Correlation Coefficient. Social Security Numbers

Five-point Summary

• Remember the five-point summary of a data set: minimum, lower quartile, median, upper quartile, and maximum.

• A five-point summary for a scatter plot is: the average of the x-values, the SD of the x-values, the average of the y-values, the SD of the y-values, and the correlation coefficient r.