![Page 1: Association Predicting One Variable from Another](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2e5503460f94a06536/html5/thumbnails/1.jpg)
Association
Predicting One Variable from Another
Correlation
• Usually refers to Pearson’s r computed on two interval/ratio scale variables.
• Its square, r², measures the degree to which variance in one variable is “explained” by a second variable
• It measures the strength of a linear relationship between the variables
Definition of r
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$$
Properties of r
• r is symmetrical and varies from -1 to +1
• 0 indicates no linear relationship (a non-linear relationship may still exist)
• ±1 indicates a perfect correlation (knowledge of one variable makes it possible to predict the second one without any error).
Properties of r²
• r² is symmetrical and varies from 0 to 1
• r² is the proportion of the variability in one variable that is “explained by” the other variable
• cor.test(x, y, method="pearson")
• cor(x, y, method="pearson")
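As a minimal sketch of the definition and properties above, the R code below computes r directly from the formula and compares it with cor(). The vectors x and y are made-up illustrative data, not values from these slides.

```r
# Illustrative data (made up for this sketch)
x <- c(2, 4, 5, 7, 9, 10)
y <- c(1, 3, 4, 8, 9, 12)

# Deviations from the means
dx <- x - mean(x)
dy <- y - mean(y)

# Pearson's r from the definition
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Built-in equivalent
r_builtin <- cor(x, y, method = "pearson")

# r squared: proportion of variability "explained"
r_squared <- r_manual^2

# For a simple linear regression of y on x, R-squared equals r squared
r2_lm <- summary(lm(y ~ x))$r.squared

print(c(r_manual = r_manual, r_builtin = r_builtin,
        r_squared = r_squared, r2_lm = r2_lm))
```

Swapping x and y leaves both r and r² unchanged, which is what “symmetrical” means here.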
Spearman’s rho
• For rank/ordinal data
• Pearson correlation computed on ranks
• If the Spearman coefficient is larger than Pearson, it may indicate a non-linear relationship
• Ties make it difficult to compute exact p values
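The points above can be checked with a short R sketch. The data are made up: y is an exponential function of x, so the relationship is perfectly monotonic but not linear.

```r
# Illustrative data (made up): monotonic but strongly non-linear
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- exp(x)

pearson  <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")

# Spearman's rho is Pearson's r computed on the ranks
spearman_by_ranks <- cor(rank(x), rank(y), method = "pearson")

print(c(pearson = pearson, spearman = spearman,
        spearman_by_ranks = spearman_by_ranks))
```

Here Spearman's rho is larger than Pearson's r, consistent with a non-linear relationship. With tied values, cor.test(x, y, method="spearman") warns that it cannot compute an exact p value and falls back to an approximation.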
Kendall’s tau
• For rank/ordinal data
• Evaluate pairs of observations (xi, yi) and (xj, yj)
• Concordant: (xi > xj) and (yi > yj) OR (xi < xj) and (yi < yj)
• Discordant: (xi > xj) and (yi < yj) OR (xi < xj) and (yi > yj)
Kendall’s tau-a
$$\tau_a = \frac{(\text{No. Concordant}) - (\text{No. Discordant})}{\tfrac{1}{2}\, n(n-1)}$$
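Kendall’s tau-b
• Divide by the total number of pairs adjusted for all ties (t_i and u_i are the sizes of the groups of tied values in x and in y)
$$\tau_b = \frac{(\text{No. Concordant}) - (\text{No. Discordant})}{\sqrt{\left(\frac{n(n-1)}{2} - \sum \frac{t_i(t_i-1)}{2}\right)\left(\frac{n(n-1)}{2} - \sum \frac{u_i(u_i-1)}{2}\right)}}$$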
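A minimal sketch of the tau-a and tau-b formulas in R, counting concordant and discordant pairs directly. The data are made up and include one tie in x and one in y. The tie terms are read as sums over groups of tied values, which is an interpretation of the formula above rather than something the slides spell out.

```r
# Illustrative data (made up) with one tie in x and one tie in y
x <- c(1, 2, 2, 4, 5, 6)
y <- c(2, 1, 3, 3, 6, 5)
n <- length(x)

# All pairs (i, j) with i < j
idx <- which(upper.tri(matrix(0, n, n)), arr.ind = TRUE)
sx <- sign(x[idx[, 1]] - x[idx[, 2]])
sy <- sign(y[idx[, 1]] - y[idx[, 2]])

# Concordant: differences have the same sign; discordant: opposite signs
concordant <- sum(sx * sy > 0)
discordant <- sum(sx * sy < 0)

# tau-a: divide by the total number of pairs
n_pairs <- n * (n - 1) / 2
tau_a <- (concordant - discordant) / n_pairs

# tau-b: subtract the pairs tied on x and the pairs tied on y
tx <- table(x)
ty <- table(y)
ties_x <- sum(tx * (tx - 1) / 2)
ties_y <- sum(ty * (ty - 1) / 2)
tau_b <- (concordant - discordant) /
  sqrt((n_pairs - ties_x) * (n_pairs - ties_y))

print(c(tau_a = tau_a, tau_b = tau_b,
        builtin = cor(x, y, method = "kendall")))
```

In R, cor(x, y, method="kendall") applies the tie adjustment, so it should agree with tau_b here rather than with tau_a.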
Kendall’s tau-c
• For grouped (tabulated) data where the table is not square (rows ≠ columns)
$$\tau_c = \frac{(\text{No. Concordant}) - (\text{No. Discordant})}{\frac{N^2}{2}\left[\frac{\min(r,c) - 1}{\min(r,c)}\right]}$$
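The tau-c formula can be sketched in R for a small non-square table. The 2 x 3 counts below are made up for illustration, and the rows and columns are assumed to be in increasing order of their ordinal categories.

```r
# Illustrative 2 x 3 table of counts (made up); categories assumed ordered
tab <- matrix(c(10, 5, 2,
                 3, 8, 9), nrow = 2, byrow = TRUE)
nr <- nrow(tab)
nc <- ncol(tab)

conc <- 0
disc <- 0
for (i in seq_len(nr)) {
  for (j in seq_len(nc)) {
    below <- seq_len(nr) > i
    # Concordant pairs: cells below and to the right
    conc <- conc + tab[i, j] * sum(tab[below, seq_len(nc) > j])
    # Discordant pairs: cells below and to the left
    disc <- disc + tab[i, j] * sum(tab[below, seq_len(nc) < j])
  }
}

N <- sum(tab)
m <- min(nr, nc)
tau_c <- (conc - disc) / ((N^2 / 2) * ((m - 1) / m))

print(c(concordant = conc, discordant = disc, tau_c = tau_c))
```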
Nominal Measures
• Measures based on Chi-Square:
– Phi coefficient
– Cramer’s V
– Contingency coefficient
– Odds ratio
Phi and Cramer’s V
• Phi ranges from 0 to 1 in a 2x2 table but can exceed 1 in larger tables. Cramer’s V adds a correction to keep the maximum value at 1 or less:
$$\phi = \sqrt{\frac{\chi^2}{N}} \qquad V = \sqrt{\frac{\chi^2}{N \times \min(r-1,\, c-1)}}$$
Contingency Coefficient
• Ranges from 0 to <1, with the upper limit depending on the number of rows and columns; values approaching 1 indicate a strong relationship and 0 indicates no relationship
$$C = \sqrt{\frac{\chi^2}{\chi^2 + N}}$$
Odds Ratio
• For 2 x 2 tables it shows the relative odds between the two variables
a b
c d
$$\alpha = \frac{a/c}{b/d} = \frac{ad}{bc}$$
> Table <- xtabs(~Sex+Goods, data=EWG2)
> Table
        Goods
Sex      Absent Present
  Female     38      28
  Male       16      30
> ChiSq <- chisq.test(Table)
> ChiSq
        Pearson's Chi-squared test with Yates' continuity correction
data:  Table
X-squared = 4.7644, df = 1, p-value = 0.02905
> library(vcd)
> assocstats(Table)
                    X^2 df P(> X^2)
Likelihood Ratio 5.7073  1 0.016894
Pearson          5.6404  1 0.017552
Phi-Coefficient   : 0.224
Contingency Coeff.: 0.219
Cramer's V        : 0.224
> cor(as.numeric(EWG2$Sex), as.numeric(EWG2$Goods), use="complete.obs")
[1] 0.2244111
> oddsratio(Table, log=FALSE)
[1] 2.544643
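As a cross-check, the statistics above can be reproduced in base R from the formulas on the earlier slides. This sketch assumes only the four cell counts printed in the Sex by Goods table. It does not use the EWG2 data frame or the vcd package, and the object name tab is introduced here for illustration.

```r
# Cell counts copied from the printed Sex x Goods table
tab <- matrix(c(38, 28,
                16, 30),
              nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("Female", "Male"),
                              Goods = c("Absent", "Present")))

# Pearson chi-square without the Yates correction
# (matches the Pearson X^2 of 5.6404 reported above)
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
N <- sum(tab)

# Phi, Cramer's V and the contingency coefficient from chi-square
phi <- sqrt(chi2 / N)
V   <- sqrt(chi2 / (N * min(nrow(tab) - 1, ncol(tab) - 1)))
C   <- sqrt(chi2 / (chi2 + N))

# Odds ratio ad / bc for the 2 x 2 table
odds_ratio <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

print(round(c(chi2 = chi2, phi = phi, V = V, C = C,
              odds_ratio = odds_ratio), 4))
```

In a 2 x 2 table min(r-1, c-1) is 1, so V equals phi. Phi also equals the absolute value of the Pearson correlation between the two variables coded numerically, which is why the cor() call above returns 0.2244.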