r xy. when two variables are correlated, we can predict a score on one variable from a score on the...
Post on 19-Dec-2015
![Page 1: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/1.jpg)
rxy
![Page 2: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/2.jpg)
rxy
• When two variables are correlated, we can predict a score on one variable from a score on the other
• The stronger the correlation, the more accurate our prediction will be
![Page 3: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/3.jpg)
rxy
• We need a measure of the “strength” of a correlation
![Page 4: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/4.jpg)
rxy
• We need a number that gets bigger when big numbers are paired with big numbers and small numbers are paired with small numbers
• We need a number that gets smaller when big numbers are paired with small numbers and small numbers are paired with big numbers
![Page 5: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/5.jpg)
rxy
• Remember the height/weight example:
• A big number indicates this (strong positive correlation)
[Figure: heights 5’ to 5’10 matched with weights 100 to 150 for individuals a to f]
![Page 6: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/6.jpg)
rxy
• Remember the height/weight example:
• A small number indicates this (strong negative correlation)
[Figure: heights 5’ to 5’10 matched with weights 100 to 150 for individuals a to f]
![Page 7: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/7.jpg)
rxy
• Two sets of scores, xi and yi
• What could we do?
![Page 8: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/8.jpg)
rxy
• What could we do?
$$\sum_{i=1}^{n} (x_i y_i)$$
![Page 9: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/9.jpg)
rxy
• What could we do?
• When pairs are multiplied and the products are summed up:
– Greatest when big numbers are paired with big numbers and small numbers with small numbers
– Least when small numbers are paired with big numbers and big numbers are paired with small numbers
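A minimal sketch of this idea, using the heights and weights from the earlier example. The exact pairings here are assumed for illustration (heights are converted to inches, and the helper name `sum_of_products` is hypothetical):

```python
# Heights in inches (5'0" to 5'10") and weights, as in the earlier example.
# The pairings below are assumed for illustration only.
heights = [60, 62, 64, 66, 68, 70]
weights = [100, 110, 120, 130, 140, 150]


def sum_of_products(xs, ys):
    """Multiply each pair of scores and add up the products."""
    return sum(x * y for x, y in zip(xs, ys))


# Big numbers paired with big numbers, small with small.
matched = sum_of_products(heights, weights)

# Big numbers paired with small numbers, small with big.
reversed_pairs = sum_of_products(heights, list(reversed(weights)))

print(matched, reversed_pairs)  # 49100 48400
```

The same twelve numbers give a larger sum when they are paired big-with-big than when they are paired big-with-small.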
![Page 10: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/10.jpg)
rxy
• Analogy: this gets you the most money
[Figure: pennies, quarters, loonies]
![Page 11: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/11.jpg)
rxy
• Analogy: this gets you the least…
[Figure: pennies, quarters, loonies]
![Page 12: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/12.jpg)
rxy
• analogy:
Because:
3 x $1 plus 2 x $0.25 plus 1 x $0.01
is more than
1 x $1 plus 2 x $0.25 plus 3 x $0.01
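The arithmetic can be checked with a short sketch. Amounts are kept in cents to avoid rounding issues, and the names `values` and `total_cents` are illustrative only:

```python
# Coin values in cents (a loonie is the Canadian one-dollar coin).
values = {"loonie": 100, "quarter": 25, "penny": 1}


def total_cents(counts):
    """Sum of (number of coins x coin value) over the three coin types."""
    return sum(counts[coin] * values[coin] for coin in values)


# Big count paired with big value, small count with small value.
most = total_cents({"loonie": 3, "quarter": 2, "penny": 1})

# Big count paired with small value, small count with big value.
least = total_cents({"loonie": 1, "quarter": 2, "penny": 3})

print(most, least)  # 351 153, i.e. $3.51 versus $1.53
```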
![Page 13: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/13.jpg)
rxy
• But there’s a problem
$$\sum_{i=1}^{n} (x_i y_i)$$
Not a good measure because the value ultimately depends on n AND the size of the numbers
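A small sketch of the problem, using made-up scores. The relationship between x and y is the same in all three cases, yet the sum changes:

```python
def sum_of_products(xs, ys):
    """Multiply each pair of scores and add up the products."""
    return sum(x * y for x, y in zip(xs, ys))


# Made-up scores in which y is always exactly twice x.
xs = [1, 2, 3]
ys = [2, 4, 6]

base = sum_of_products(xs, ys)                    # 28
more_pairs = sum_of_products(xs * 2, ys * 2)      # same pairs listed twice: 56
bigger_numbers = sum_of_products([x * 10 for x in xs], ys)  # x scaled by 10: 280

print(base, more_pairs, bigger_numbers)
```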
![Page 14: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/14.jpg)
rxy
• Try this
$$\frac{\sum_{i=1}^{n} (x_i y_i)}{n}$$
![Page 15: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/15.jpg)
rxy
• Try this
Still not so good: it doesn’t depend on n anymore, but it does depend on the size of the x’s and y’s
$$\frac{\sum_{i=1}^{n} (x_i y_i)}{n}$$
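As a rough illustration with made-up scores: dividing by n removes the effect of listing the same pairs twice, but adding a constant to every x (which leaves the relationship untouched) still changes the result:

```python
def mean_product(xs, ys):
    """Average of the pairwise products."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs)


# Made-up scores in which y is always exactly twice x.
xs = [1, 2, 3]
ys = [2, 4, 6]

base = mean_product(xs, ys)                        # 28 / 3
more_pairs = mean_product(xs * 2, ys * 2)          # 56 / 6, the same value
shifted = mean_product([x + 100 for x in xs], ys)  # 1228 / 3, much larger

print(base, more_pairs, shifted)
```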
![Page 16: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/16.jpg)
rxy
• How about multiplying deviation scores?
– comparing each variable relative to its respective mean
$$\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$
![Page 17: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/17.jpg)
rxy
• Multiply deviation scores
Now the value depends on the spread of the data
$$\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$
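A sketch with made-up scores: deviation scores remove the effect of adding a constant, but stretching the scale of x (for example, changing its units) still changes the value. The function name is illustrative:

```python
def mean_deviation_product(xs, ys):
    """Average product of deviations from each variable's own mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Made-up scores in which y is always exactly twice x.
xs = [1, 2, 3]
ys = [2, 4, 6]

base = mean_deviation_product(xs, ys)                         # 4 / 3
shifted = mean_deviation_product([x + 100 for x in xs], ys)   # unchanged
stretched = mean_deviation_product([x * 10 for x in xs], ys)  # 10 times larger

print(base, shifted, stretched)
```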
![Page 18: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/18.jpg)
rxy
• So standardize the scores
$$\frac{\sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{S_x}\right)\left(\frac{y_i - \bar{y}}{S_y}\right)}{n}$$
![Page 19: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/19.jpg)
rxy
• This measures strength of correlation:
$$\frac{\sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{S_x}\right)\left(\frac{y_i - \bar{y}}{S_y}\right)}{n} = \frac{\sum_{i=1}^{n} (z_{x_i} z_{y_i})}{n} = r_{xy}$$
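The formula can be sketched directly in code. This assumes S is the standard deviation computed by dividing by n (matching the n in the formula), which is what `statistics.pstdev` does. The data and the name `pearson_r` are illustrative:

```python
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Mean cross-product of z-scores (standard deviations divide by n)."""
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / len(xs)


# Made-up scores with a strong but imperfect positive relationship.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)                                # 0.8 (up to rounding)
r_rescaled = pearson_r([x * 2.54 for x in xs], ys)   # units no longer matter
r_reversed = pearson_r(xs, list(reversed(xs)))       # -1.0 (up to rounding)

print(r, r_rescaled, r_reversed)
```

Standardizing removes the dependence on spread: rescaling x leaves the result unchanged.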
![Page 20: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/20.jpg)
rxy
• rxy ranges from -1.0 indicating a perfect negative correlation to +1.0 indicating a perfect positive correlation
• An rxy of zero indicates no linear correlation whatsoever. Scores are random with respect to each other.
![Page 21: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/21.jpg)
rxy
• rxy also has a geometric meaning
![Page 22: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/22.jpg)
rxy
• rxy also has a geometric meaning
• Recall that the mean of the zx and zy distributions is zero and each z-score is a deviation from the mean
![Page 23: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/23.jpg)
rxy
• Each point lands in one of four quadrants
[Figure: a point (zx, zy) plotted on the zx and zy axes]
![Page 24: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/24.jpg)
rxy
• notice that:
both zx and zy are positive
$$r_{xy} = \frac{\sum_{i=1}^{n} (z_{x_i} z_{y_i})}{n}$$
![Page 25: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/25.jpg)
rxy
• notice that:
zx is negative and zy is positive
$$r_{xy} = \frac{\sum_{i=1}^{n} (z_{x_i} z_{y_i})}{n}$$
![Page 26: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/26.jpg)
rxy
• notice that:
zx is negative and zy is negative
$$r_{xy} = \frac{\sum_{i=1}^{n} (z_{x_i} z_{y_i})}{n}$$
![Page 27: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/27.jpg)
rxy
• notice that:
zx is positive and zy is negative
$$r_{xy} = \frac{\sum_{i=1}^{n} (z_{x_i} z_{y_i})}{n}$$
![Page 28: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/28.jpg)
rxy
• So
Thus if most points tend to fall around a line with a positive (45 degree) slope (I and III), the cross-products will tend to be positive
[Figure: the four quadrants, II and I on top, III and IV below]
![Page 29: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/29.jpg)
rxy
• So
If most points tend to fall around a line with a negative slope (II and IV), the cross-products will tend to be negative
Thus if most points tend to fall around a line with a positive (45 degree) slope (I and III), the cross-products will tend to be positive
[Figure: the four quadrants, II and I on top, III and IV below]
![Page 30: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/30.jpg)
rxy
• So
If the points were randomly scattered about, the negative and positive cross-products would tend to cancel
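A rough sketch of the quadrant picture, using made-up scores. Points in quadrants I and III contribute positive cross-products, and points in II and IV contribute negative ones. The helper names are hypothetical:

```python
from collections import Counter
from statistics import mean, pstdev


def z_scores(values):
    """Convert raw scores to z-scores (standard deviation divides by n)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def quadrant(zx, zy):
    """Name the quadrant of the point (zx, zy)."""
    if zx > 0 and zy > 0:
        return "I"
    if zx < 0 and zy > 0:
        return "II"
    if zx < 0 and zy < 0:
        return "III"
    if zx > 0 and zy < 0:
        return "IV"
    return "on an axis"


# Made-up scores that mostly rise together.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]

points = list(zip(z_scores(xs), z_scores(ys)))
counts = Counter(quadrant(zx, zy) for zx, zy in points)
r = sum(zx * zy for zx, zy in points) / len(points)

print(dict(counts), r)
```

With these scores most points land in quadrants I and III, so the positive cross-products outweigh the negative ones and r comes out positive.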
![Page 31: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/31.jpg)
Covariance
• A related measure of the relationship between scores on two different variables is the covariance
$$S_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$
![Page 32: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/32.jpg)
Covariance
• Notice that the variance ($S_x^2$) is the covariance between a variable and itself!
$$S_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$
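A quick numerical check of this point, with made-up scores. The `covariance` helper is illustrative and divides by n, as in the formula. `statistics.pvariance` is the population variance, which also divides by n:

```python
from statistics import mean, pvariance


def covariance(xs, ys):
    """Average cross-product of deviation scores (divides by n)."""
    mean_x, mean_y = mean(xs), mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


# Made-up scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

cov_xy = covariance(xs, ys)   # 1.6
cov_xx = covariance(xs, xs)   # 2.0
var_x = pvariance(xs)         # 2, the same as the covariance of x with itself

print(cov_xy, cov_xx, var_x)
```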
![Page 33: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/33.jpg)
Regression
• If two variables are perfectly correlated (r = +1.0 or -1.0) then one can exactly predict a score on one variable given a score on the other
![Page 34: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/34.jpg)
Regression
• For example: a university charges a $250 registration fee plus $100 per credit
![Page 35: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/35.jpg)
Regression
• tuition = $100(X) + $250
– where X is the number of credits
• Notice this is a linear relationship (an equation of the form y = ax + b)
– a = $100/credit
– b = $250
– x = number of credits
![Page 36: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/36.jpg)
Regression
• Tuition as a function of credit hours is a straight line
• There is a perfect correlation between credit hours and tuition
• You could predict perfectly the tuition required given the number of credit hours
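The tuition example can be sketched as follows. The credit loads are made up, and `pearson_r` is an illustrative helper that computes rxy as the mean cross-product of z-scores:

```python
from statistics import mean, pstdev

REGISTRATION_FEE = 250   # dollars, charged once
COST_PER_CREDIT = 100    # dollars per credit


def tuition(credits):
    """Tuition in dollars: y = ax + b with a = 100 and b = 250."""
    return COST_PER_CREDIT * credits + REGISTRATION_FEE


def pearson_r(xs, ys):
    """Mean cross-product of z-scores (standard deviations divide by n)."""
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    return sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ) / len(xs)


# Made-up credit loads for five students.
credit_loads = [3, 6, 9, 12, 15]
tuitions = [tuition(c) for c in credit_loads]

r = pearson_r(credit_loads, tuitions)

print(tuition(12))  # 1450
print(r)            # 1.0, up to floating-point rounding
```

Because every point falls exactly on the line, the correlation is perfect and the prediction has no error.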
![Page 37: R xy. When two variables are correlated, we can predict a score on one variable from a score on the other The stronger the correlation, the more accurate](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649d385503460f94a11f1f/html5/thumbnails/37.jpg)
Next Time
• Regression - read chapter 8