Introduction to Correlation
Correlation
The news is filled with examples of correlation
Miles flown in an airplane vs …
Driving faster than the speed limit vs …
Women who smoke during pregnancy…
If you eat only fast food for 30 days…
How Do You Calculate Correlation in Excel?
Make an XY scatterplot of the data, putting one variable on the x-axis and one variable on the y-axis.
Insert a linear trendline on the graph and display the R² value
Interpret the results
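Outside Excel, the same steps can be sketched in a few lines of Python. This is a minimal sketch and not a description of what Excel does internally: it fits a least-squares line and reports R². The function name `linear_trendline` and the height/weight numbers are invented for illustration and do not come from any of the example spreadsheets.

```python
def linear_trendline(xs, ys):
    """Fit y = slope * x + intercept by least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation) / (total variation)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data, invented for illustration only.
heights_in = [60, 62, 64, 66, 68, 70, 72]
weights_lb = [115, 120, 130, 128, 145, 160, 158]

slope, intercept, r_squared = linear_trendline(heights_in, weights_lb)
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r_squared:.3f}")
```

One variable goes in `xs` (the x-axis) and the other in `ys` (the y-axis), mirroring the scatterplot step.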
Interpreting the Results
• The higher the R² value, the more closely the points follow the trendline, and the more influence the variables might have on each other
• If you only have a few data points, then you need a higher R² value in order to conclude there is a correlation
• Crude estimate: if R² > 0.5, most people say there is a correlation; if R² < 0.3, the correlation is essentially non-existent
• R² between 0.3 and 0.5? Gray area!
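The crude estimate above can be written as a small helper. This is only a sketch of the rule of thumb, not a statistical test: the function name `interpret_r_squared` is made up here, and the sketch ignores the number of data points, which the second bullet says also matters.

```python
def interpret_r_squared(r_squared):
    """Classify an R^2 value using the crude 0.3 / 0.5 rule of thumb."""
    if r_squared > 0.5:
        return "correlation"
    if r_squared < 0.3:
        return "essentially no correlation"
    # Everything from 0.3 to 0.5 inclusive falls in the gray area.
    return "gray area"


for value in (0.74, 0.40, 0.10):
    print(value, "->", interpret_r_squared(value))
```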
Examples
Look at: CigarettesBirthweight.xls, SpeedLimits.xls, HeightWeight.xls, Grades.xls, WineConsumption.xls, BreastCancerTemperature.xls
How Do We Calculate Correlation in SPSS?
In SPSS, click on Analyze -> Correlate -> Bivariate (two variables)
Select the two columns of data you want to analyze (move them from the left box to the right box)
You can actually pick more than two columns, but stick with bivariate for now
How Do We Calculate Correlation in SPSS?
Make sure the checkbox for Pearson Correlation Coefficients is checked
Click OK to run the correlation
You should get an output window something like the following slide
The correlation between height and weight is 0.861
The Pearson Correlation value is not the same as Excel’s R-squared value; Pearson’s value can be positive or negative
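The two numbers are related: for a straight-line trendline with a single x variable, R-squared is the square of Pearson's value, so squaring throws away the sign. A small sketch using the 0.861 reported above (the variable names are illustrative, and the R-squared shown assumes the same data and a linear trendline):

```python
import math

pearson_r = 0.861            # Pearson value reported for height and weight
r_squared = pearson_r ** 2   # R-squared a linear trendline would show for the same data

print(round(r_squared, 3))   # 0.741

# A negative correlation of the same strength gives the same R-squared,
# because squaring discards the sign.
assert math.isclose((-pearson_r) ** 2, r_squared)
```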
Positive and Negative Correlation (Pearson’s)
Positive correlation: as the values of one variable increase, the values of a second variable increase (values from 0 to 1.0)
Negative correlation: as the values of one variable increase, the values of a second variable decrease (values from 0 to -1.0)
Interpreting Pearson’s Correlation
Positive and Negative Correlation
There is a negative correlation between TV viewing and class grades: students who spend more time watching TV tend to have lower grades (or, students with higher grades tend to spend less time watching TV).
Positive and Negative Correlation
[Figure: two panels, "Positive correlation" and "Negative correlation"]
Positive and Negative Correlation
When looking for correlation, a positive correlation is not necessarily stronger than a negative correlation; strength depends on how far the value is from 0, not on its sign
Which correlation (Pearson) is the greatest?
-.34    .72    -.81    .40    -.12
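One way to check the answer, reading "greatest" as "strongest": compare the values by their distance from 0 rather than by their sign. A minimal sketch with the five values from the slide:

```python
correlations = [-0.34, 0.72, -0.81, 0.40, -0.12]

# Strength is the absolute value; the sign only gives the direction.
strongest = max(correlations, key=abs)
print(strongest)  # -0.81
```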
Notice: R² is the same here
You can’t always tell relationships just by looking at the graph.
What Can We Conclude?
If two variables are correlated, then we can predict one based on the other
But correlation does NOT imply cause!
It might be the case that having more education causes a person to earn a higher income.
It might be the case that having higher income allows a person to go to school more.
There could also be a third variable. Or a fourth. Or a fifth…
What Can We Conclude?
Causality – one variable, say A, actually causes the change in B
Sheer coincidence – A and B really do not have anything to do with each other but happen to go up or down simultaneously
What Can We Conclude?
Common underlying cause or causes (the most important case) – A is correlated with B, but there is a third factor C (the common underlying cause) that causes the changes in both A and B.
Example: as ice cream sales go up, so do crime rates. A plausible common cause is hot weather, which could push both up.
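A common underlying cause can be illustrated with simulated numbers. In this sketch, everything is made up: C is a hypothetical common cause, A and B each depend on C plus their own random noise, and neither depends on the other. The helper `pearson` is written out by hand so the example needs only the standard library.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated data: A and B are both driven by C, never by each other.
c = [rng.gauss(0, 1) for _ in range(2000)]
a = [2.0 * ci + rng.gauss(0, 1) for ci in c]
b = [3.0 * ci + rng.gauss(0, 1) for ci in c]

print(f"r(A, B) = {pearson(a, b):.2f}")

# Subtracting C's contribution leaves only the independent noise,
# and the correlation between A and B largely disappears.
a_noise = [ai - 2.0 * ci for ai, ci in zip(a, c)]
b_noise = [bi - 3.0 * ci for bi, ci in zip(b, c)]
print(f"r(A, B) with C removed = {pearson(a_noise, b_noise):.2f}")
```

A and B come out strongly correlated even though neither causes the other, which is the pattern the slide describes.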