bivariate linear correlation

Post on 09-Feb-2016


Bivariate Linear Correlation

Linear Function

•Y = a + bX

Fixed and Random Variables

• A FIXED variable is one for which you have every possible value of interest in your sample.
– Example: Subject sex, female or male.

• A RANDOM variable is one where the sample values are randomly obtained from the population of values.
– Example: Height of subject.

Correlation & Regression

• If Y is random and X is fixed, the model is a regression model.
• If both Y and X are random, the model is a correlation model.
• Psychologists generally do not know this.
• They think
– Correlation = compute the correlation coefficient, r
– Regression = find an equation to predict Y from X

Scatter Plot

[Figures: four scatter plots of Y against X, titled Perfect Positive Linear, Perfect Negative Linear, Perfect Positive Monotonic, and Perfect Negative Monotonic.]

Nonmonotonic Relationship

[Figure: scatter plot of Performance (Y) against Test Anxiety (X).]

For the data plotted below, the linear r = 0, but the quadratic r = 1.
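The slide's data are not reproduced in this transcript, so the sketch below uses hypothetical U-shaped data to illustrate the same point. Because Y is here an exact quadratic function of X, correlating Y with the squared deviation of X from its mean stands in for the multiple R from a quadratic regression; that shortcut is an assumption of this example, not the slide's procedure.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sscp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sscp / math.sqrt(ss_x * ss_y)

# Hypothetical data: Y is a perfect U-shaped (quadratic) function of X.
x = [1, 2, 3, 4, 5]
y = [(v - 3) ** 2 for v in x]

linear_r = pearson_r(x, y)            # the linear component is absent
x_squared_dev = [(v - 3) ** 2 for v in x]
quadratic_r = pearson_r(x_squared_dev, y)  # the quadratic component is perfect

print(f"linear r = {linear_r:.2f}, quadratic r = {quadratic_r:.2f}")
```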

Burgers (X) and Beer (Y)

Subject     X      Y      XY
1           5      8      40
2           4      10     40
3           3      4      12
4           2      6      12
5           1      2      2
Sum         15     30     106
Mean        3      6
St. Dev.    1.581  3.162

A Scatter Plot of Our Data

[Figure: scatter plot of Beers (Y) against Burgers (X).]

Burger (X)-Beer (Y) Correlation

$$SSCP = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{N}$$

$$SS_Y = \sum (Y - \bar{Y})(Y - \bar{Y}) = \sum YY - \frac{(\sum Y)(\sum Y)}{N}$$

$$SSCP = 106 - \frac{15(30)}{5} = 16$$

$$COV = \frac{SSCP}{N - 1} = \frac{16}{4} = 4$$

$$r = \frac{COV(X, Y)}{s_x s_y} = \frac{4}{1.581(3.162)} = .80$$
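The hand computation can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative and the data are the five burger and beer scores from the table.

```python
import math

# Burgers (X) and beers (Y) for the five subjects in the table.
x = [5, 4, 3, 2, 1]
y = [8, 10, 4, 6, 2]
n = len(x)

# Sum of cross-products via the computational formula.
sum_xy = sum(a * b for a, b in zip(x, y))      # 106
sscp = sum_xy - sum(x) * sum(y) / n            # 106 - 15(30)/5

# Covariance uses N - 1 in the denominator.
cov = sscp / (n - 1)

def sample_sd(values):
    """Sample standard deviation (N - 1 denominator)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

s_x, s_y = sample_sd(x), sample_sd(y)
r = cov / (s_x * s_y)

print(f"SSCP = {sscp:.0f}, COV = {cov:.0f}, r = {r:.2f}")
```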

H₀: ρ = 0

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{.8\sqrt{3}}{\sqrt{1 - .64}} = 2.309$$

• df = n – 2 = 3
• Now get an exact p value and construct a confidence interval

Get Exact p Value

• COMPUTE p=2*CDF.T(-ABS(t),df).
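Outside SPSS, the same two-tailed p can be approximated as follows. The Python standard library has no t distribution, so this sketch integrates the t density numerically with Simpson's rule; the function names are hypothetical, and a statistics package (for example SciPy's `scipy.stats.t.sf`) would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p = 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

def t_from_r(r, n):
    """t statistic for testing that rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

t5 = t_from_r(0.8, 5)            # n = 5, df = 3
p5 = two_tailed_p(t5, 3)
t10 = t_from_r(0.8, 10)          # n = 10, df = 8
p10 = two_tailed_p(t10, 8)

print(f"n = 5:  t = {t5:.3f}, p = {p5:.3f}")
print(f"n = 10: t = {t10:.3f}, p = {p10:.3f}")
```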

Go To Vassar

• http://vassarstats.net/

N increased to 10.

Presenting the Results

• The correlation between my friends’ burger consumption and their beer consumption fell short of statistical significance, r(n = 5) = .8, p = .10, 95% CI [-.28, .99].

• Among my friends, beer consumption was positively, significantly related to burger consumption, r(n = 10) = .8, p = .006, 95% CI [.34, .95].
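The slides obtain these intervals from VassarStats and do not say how they are computed. The sketch below assumes the usual Fisher r-to-z transformation, which does reproduce both reported intervals; `fisher_ci` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Confidence interval for rho via Fisher's r-to-z transformation."""
    z = math.atanh(r)                      # r transformed to z
    se = 1 / math.sqrt(n - 3)              # standard error of z
    crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Build the interval on the z scale, then transform back to r.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

ci5 = fisher_ci(0.8, 5)
ci10 = fisher_ci(0.8, 10)

print(f"n = 5:  95% CI [{ci5[0]:.2f}, {ci5[1]:.2f}]")
print(f"n = 10: 95% CI [{ci10[0]:.2f}, {ci10[1]:.2f}]")
```

With the same r, doubling the sample size narrows the interval enough that it no longer includes zero.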

Assumptions

1. Homoscedasticity across Y|X
2. Normality of Y|X
3. Normality of Y ignoring X
4. Homoscedasticity across X|Y
5. Normality of X|Y
6. Normality of X ignoring Y

• The first three should look familiar; we made them with the pooled variances t.

Bivariate Normal

When Do Assumptions Apply?

• Only when employing t or F.
• That is, obtaining a p value
• or constructing a confidence interval.

Shrunken r²

$$\text{shrunken } r^2 = 1 - \frac{(1 - r^2)(n - 1)}{n - 2} = 1 - \frac{(1 - .64)(4)}{3} = .52$$

• This reduces the bias in estimation of ρ².
• As sample size increases, (n-1)/(n-2) approaches 1, and the amount of correction is reduced.
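A small sketch of the shrinkage formula, showing how the correction fades as n grows. The function name is illustrative, and the large sample size is a hypothetical value chosen only for contrast.

```python
def shrunken_r2(r, n):
    """Shrunken r-squared: 1 - (1 - r^2)(n - 1)/(n - 2)."""
    return 1 - (1 - r ** 2) * (n - 1) / (n - 2)

small_n = shrunken_r2(0.8, 5)      # the burger and beer example
large_n = shrunken_r2(0.8, 500)    # hypothetical large sample

print(f"n = 5:   shrunken r2 = {small_n:.2f}")
print(f"n = 500: shrunken r2 = {large_n:.3f}")
```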

Do not use Pearson r if the relationship is not linear. If it is monotonic, use Spearman rho.

Every time X increases, Y decreases – accordingly we have here a perfect, negative, monotonic relationship

Pearson r measures the strength of the linear relationship. Notice that it is NOT perfect here.

Spearman rho measures the strength of monotonic relationship. Notice that it IS perfect here.
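The contrast can be reproduced with hypothetical data in which Y = 1/X, so that Y falls every time X rises but not along a straight line. In this sketch Spearman rho is computed as Pearson r on the ranks, which assumes there are no tied scores.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sscp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sscp / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Ranks from 1 (smallest) to n; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Hypothetical data: perfectly monotonic (decreasing) but not linear.
x = [1, 2, 3, 4, 5, 6]
y = [1 / v for v in x]

r = pearson_r(x, y)                    # strong, but short of -1
rho = pearson_r(ranks(x), ranks(y))    # exactly -1

print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```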

Uses of Correlation Analysis

• Measure the degree of linear association
• Correlation and causation
– Correlation is necessary but not sufficient for inferring causation
– Third variable problems
• Reliability
• Validity
• Independent Samples t – point biserial r
– Y = a + b Group (Group is 0 or 1)
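The link between the independent samples t and the point biserial r can be checked numerically. The scores below are hypothetical, made up for this sketch: coding group membership as 0 or 1, correlating it with Y, and converting r to t gives the same t as the pooled variances test.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sscp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sscp / math.sqrt(ss_x * ss_y)

def pooled_t(g0, g1):
    """Independent samples t with pooled variances."""
    m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
    ss0 = sum((v - m0) ** 2 for v in g0)
    ss1 = sum((v - m1) ** 2 for v in g1)
    pooled_var = (ss0 + ss1) / (len(g0) + len(g1) - 2)
    se = math.sqrt(pooled_var * (1 / len(g0) + 1 / len(g1)))
    return (m1 - m0) / se

# Hypothetical scores for two groups.
group0 = [2, 4, 6]
group1 = [5, 7, 9]

# Point biserial r: Pearson r between the 0/1 group code and the scores.
codes = [0] * len(group0) + [1] * len(group1)
scores = group0 + group1
n = len(scores)
r_pb = pearson_r(codes, scores)
t_from_r = r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)
t_pooled = pooled_t(group0, group1)

print(f"r_pb = {r_pb:.3f}, t from r = {t_from_r:.3f}, pooled t = {t_pooled:.3f}")
```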

Uses of Correlation Analysis

• Contingency tables: Rows = a + b Columns

• Multiple correlation/regression

$$Y = a + b_1X_1 + b_2X_2 + \cdots + b_pX_p$$

$$GPA_{ECU} = a + b_1 SAT_{Verbal} + b_2 SAT_{Math} + b_p GPA_{HighSchool}$$

Uses of Correlation Analysis

• Analysis of variance (ANOVA)

$$Y = a + b_1 Group_1? + b_2 Group_2? + \cdots + b_{k-1} Group_{k-1}?$$

• PolitConserv = a + b1 Republican? + b2 Democrat?
– k = 3, the third group is all others

• Canonical correlation/regression

$$(a_1X_1 + a_2X_2 + \cdots) = (b_1Y_1 + b_2Y_2 + \cdots)$$

Uses of Correlation Analysis

• Canonical correlation/regression
– (homophobia, homo-aggression) = (psychopathic deviance, masculinity, hypomania, clinical defensiveness)
– High homonegativity = hypomanic, unusually frank, stereotypically masculine, psychopathically deviant (antisocial)

Factors Affecting Size of r

• Range restrictions
– Without variance there can’t be covariance
• Extraneous variance
– The more things affecting Y (other than X), the smaller the r.
• Interactions – the relationship between X and Y is modified by Z
– If not included in the model, reduces the r.

Power Analysis

$$\delta = \rho\sqrt{n - 1}$$

Cohen’s Guidelines

• .10 – small but not trivial
• .30 – medium
• .50 – large

PSYC 6430 Addendum

• The remaining slides cover material I do not typically cover in the undergraduate course.

Correcting for Measurement Error

• If reliability is not 1, the r will underestimate the correlation between the latent variables.

• We can estimate the correlation between the true scores this way:

$$r_{X_tY_t} = \frac{r_{XY}}{\sqrt{r_{XX}\,r_{YY}}}$$

• rXX and rYY are reliabilities

Example

• r between misanthropy and support for animal rights = .36 among persons with a nonidealistic ethical ideology

$$r_{X_tY_t} = \frac{.36}{\sqrt{.78(.93)}} = .42$$
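The correction is a one-line computation. A minimal sketch with an illustrative function name, using the observed r of .36 and the two reliabilities from the example:

```python
import math

def disattenuated_r(r_xy, r_xx, r_yy):
    """Estimated correlation between true scores, given the two reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Values from the example: observed r = .36, reliabilities .78 and .93.
corrected = disattenuated_r(0.36, 0.78, 0.93)

print(f"corrected r = {corrected:.2f}")
```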

Comparing Correlation/Regression Coefficients

• Weaver, B., & Wuensch, K. L. (2013). SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients. Behavior Research Methods, 45, 880-895. doi:10.3758/s13428-012-0289-7

H₀: ρ₁ = ρ₂

• Is the correlation between X and Y the same in one population as in another?
• The correlation between misanthropy and support for animal rights was significantly greater in nonidealists (r = .36) than in idealists (r = .02)
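One common way to test this hypothesis with two independent samples is Fisher's z test, sketched below. This is not necessarily the procedure behind the result quoted here, and the slides do not give the group sizes, so the n of 100 per group is a purely hypothetical placeholder and the printed z and p are illustrative only.

```python
import math
from statistics import NormalDist

def compare_independent_r(r1, n1, r2, n2):
    """Fisher z test of H0: rho1 = rho2 for two independent samples."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed
    return z, p

# r values are from the slide; n = 100 per group is HYPOTHETICAL.
z, p = compare_independent_r(0.36, 100, 0.02, 100)

print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```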

H₀: ρ_WX = ρ_WY

• We have data on three variables. Does the correlation between X and W differ from that between Y and W?
• W is GPA, X is SATverbal, Y is SATmath.
• See Williams’ procedure in our text.
• See other procedures referenced in my handout.

H₀: ρ_WX = ρ_YZ

• Raghunathan, T. E., Rosenthal, R., & Rubin, D. B. (1996). Comparing correlated but nonoverlapping correlations. Psychological Methods, 1, 178-183.

• Example: Is the correlation between verbal aptitude and math aptitude the same at 10 years of age as at 20 years of age (longitudinal data)?

H₀: ρ = nonzero value

• A meta-analysis shows that the correlation between X and Y averages .39.
• You suspect it is not .39 in the population in which you are interested.
• H₀: ρ = .39.
