correlation 3rd
Correlation
The aim is to investigate the linear association between two continuous variables.
Correlation: measures the closeness of the association.
Correlation
Subject   Body wt. (kg) (X)   X²    Plasma volume (liter) (Y)   Y²    XY
1         58                        2.75
2         70                        2.86
3         74                        3.37
4         63                        2.76
5         62                        2.62
6         70.5                      3.49
7         71                        3.05
8         66                        3.12
Total     ∑X                  ∑X²   ∑Y                          ∑Y²   ∑XY
Correlation, cont.
In general, high plasma volume tends to be associated with high body weight. This association is measured by the Pearson correlation coefficient (r).
Correlation coefficient (r)
$$ r = \frac{\sum xy - \dfrac{(\sum x)(\sum y)}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}} $$
[Figure: Scatter diagram of plasma volume (liter) against body weight (kg), showing the linear regression line y = 0.0434x + 0.0998, R² = 0.5818]
Results
n = 8
∑x = 535, mean x = 66.875
∑x² = 35983.5
∑y = 24.02, mean y = 3.0025
∑y² = 72.798
∑xy = 1615.295
$$ r = \frac{8.96}{\sqrt{(205.38)(0.6780)}} = 0.76 $$
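The arithmetic above can be reproduced with a short Python sketch. It uses only the totals listed in the results; the function name `pearson_r_from_sums` and its argument names are illustrative and do not come from the slides.

```python
import math

def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson r from the summary totals (hypothetical helper)."""
    s_xy = sum_xy - sum_x * sum_y / n    # numerator: corrected sum of products
    s_xx = sum_x2 - sum_x ** 2 / n       # corrected sum of squares of x
    s_yy = sum_y2 - sum_y ** 2 / n       # corrected sum of squares of y
    return s_xy / math.sqrt(s_xx * s_yy)

# Totals as listed in the results (n = 8 subjects).
r = pearson_r_from_sums(
    n=8,
    sum_x=535.0,
    sum_y=24.02,
    sum_x2=35983.5,
    sum_y2=72.798,
    sum_xy=1615.295,
)
print(round(r, 2))  # 0.76
```

Working from the totals rather than the individual readings mirrors the hand calculation, where the three bracketed terms come to about 8.96, 205.38 and 0.6780.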
Significance test
A t test is used to test whether r is significantly different from zero, i.e. whether the observed correlation could simply be due to chance.
$$ t = r\sqrt{\frac{n-2}{1-r^2}} $$
Significance test, cont.
d.f. = n - 2
$$ t = 0.76\sqrt{\frac{6}{1-(0.76)^2}} = 2.86 $$
d.f. = 6
P < 0.05
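As a check on the worked value, the t statistic can be computed in a few lines of Python. This is a minimal sketch: the helper name `t_for_correlation` is hypothetical, and r is taken as the rounded 0.76 used in the hand calculation.

```python
import math

def t_for_correlation(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r = 0.76   # rounded correlation coefficient from the example
n = 8      # number of subjects
df = n - 2
t = t_for_correlation(r, n)
print(df, round(t, 2))  # 6 2.86
```

The result is then compared with a t table at 6 degrees of freedom to obtain the P value.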
Significance test, cont.
Note: The significance level is a function of both the size of the correlation coefficient and the number of observations.
A weak correlation may therefore be statistically significant if it is based on a large number of observations, while a strong correlation may fail to achieve significance if there are only a few observations.
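To make this note concrete, the sketch below applies the same t formula to two made-up cases: a weak correlation from a large sample and a strong correlation from a small one. The r and n values are illustrative assumptions, not data from the slides, and the critical values are the two-sided 5% points as read from a standard t table.

```python
import math

def t_for_correlation(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Illustrative cases (assumed values, not from the slides).
# Critical values: two-sided 5% points from a standard t table.
cases = {
    "weak r, large n": {"r": 0.20, "n": 200, "t_crit": 1.972},  # d.f. = 198
    "strong r, small n": {"r": 0.70, "n": 6, "t_crit": 2.776},  # d.f. = 4
}

significant = {}
for label, case in cases.items():
    t = t_for_correlation(case["r"], case["n"])
    significant[label] = t > case["t_crit"]
    print(label, round(t, 2), "significant" if significant[label] else "not significant")
```

In this sketch r = 0.20 with 200 observations exceeds its critical value, while r = 0.70 with only 6 observations does not.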
Interpretation of r
r is always a number between -1 and +1.
r is positive if x and y tend to be high or low together, and the larger its value, the closer the association.
r is negative if high values of y tend to go with low values of x, and vice versa.
r value       Interpretation
r = 0         No correlation
0 < r < 1     Imperfect positive (direct) correlation
r = 1         Perfect positive (direct) correlation
-1 < r < 0    Imperfect negative (inverse) correlation
r = -1        Perfect negative (inverse) correlation
Guideline values for interpreting the value of r
r                Degree of association
± 1              Perfect
± 0.7 to ± 1     Strong
± 0.4 to ± 0.7   Moderate
< ± 0.4          Weak
0.0              No association
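The guideline table can be expressed as a small lookup function. This is a sketch only: the name `degree_of_association` is hypothetical, and because the table does not say which band the boundary values 0.4 and 0.7 belong to, the code assumes they fall in the higher band.

```python
def degree_of_association(r):
    """Map r to the guideline label (boundary handling is an assumption)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the guideline is symmetric in the sign of r
    if size == 1.0:
        return "Perfect"
    if size == 0.0:
        return "No association"
    if size >= 0.7:   # assumption: 0.7 itself counts as strong
        return "Strong"
    if size >= 0.4:   # assumption: 0.4 itself counts as moderate
        return "Moderate"
    return "Weak"

print(degree_of_association(0.76))  # Strong
```

Applied to the plasma volume example, r = 0.76 falls in the strong band.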
Notes
r measures only a linear relationship. When there is a strong non-linear relationship, r may be zero or close to zero.
So we have to draw a scatter diagram first to identify any non-linear relationship.
If you calculate r without examining the data, you may miss a strong but non-linear relationship.
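A small constructed example shows how this can happen. In the sketch below, y is completely determined by x (y = x²), yet r comes out as zero because the relationship is symmetric rather than linear. The data and the helper name `pearson_r` are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from paired observations, using the sums-based formula."""
    n = len(xs)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up data: a perfect but non-linear (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # 0.0
```

A scatter diagram of these points would show a clear U shape, which the value of r alone does not reveal.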
The coefficient of determination (r²)
When r = 0.58, then r² = 0.34.
This means that 34% of the variation in the values of y may be accounted for by knowing the values of x, or vice versa.
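As a worked illustration, the same calculation can be applied to the plasma volume example, taking the rounded r = 0.76 from the results:

```latex
\[
r = 0.58 \;\Rightarrow\; r^2 = (0.58)^2 \approx 0.34
\qquad\qquad
r = 0.76 \;\Rightarrow\; r^2 = (0.76)^2 \approx 0.58
\]
```

On this reading, roughly 58% of the variation in plasma volume may be accounted for by body weight, which is in line with the R² reported on the scatter diagram.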