Reliability and Validity: What Is Measured and How Well


Upload: margaret-horton

Post on 28-Dec-2015


TRANSCRIPT

Page 1: Reliability and Validity what is measured and how well

Reliability and Validity

What is measured and how well

Page 2

Reliability

Consistency
- Does the test agree with itself?

Stability
- Does the test agree with itself over time?

Agreement
- Do different raters agree with each other?

Page 3

Consistency Reliability

Equivalent forms reliability
- Correlation between the scores on two parallel forms of a test

Internal consistency reliability
- Correlation between half sections of the test (Split Half), or between all of the items (Internal Consistency)
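Both internal consistency estimates can be sketched in a few lines. The example below is illustrative only: the `responses` data are hypothetical, the split is an odd/even item split, the half-test correlation is stepped up with the Spearman-Brown formula, and the all-items estimate is Cronbach's alpha. The slides do not specify any of these choices.

```python
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(scores):
    """Odd/even split-half correlation, stepped up with Spearman-Brown."""
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)

def cronbach_alpha(scores):
    """Cronbach's alpha from a respondents-by-items score matrix."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]

print("split-half:", round(split_half_reliability(responses), 3))
print("alpha:", round(cronbach_alpha(responses), 3))
```

Both functions return 1.0 when every item gives each respondent the same score, which is the "test agrees with itself" extreme.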

Page 4

Stability Reliability

Test-Retest Reliability

The same test is given at two different administrations to the same group of respondents.

Correlation between time 1 and time 2.
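A minimal sketch of the test-retest calculation, assuming a Pearson correlation and hypothetical scores for five respondents (`time1`, `time2` are made-up names and values):

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical scores for the same five respondents at two administrations.
time1 = [10, 12, 14, 16, 18]
time2 = [11, 12, 15, 15, 19]

test_retest = pearson_r(time1, time2)
print("test-retest r:", round(test_retest, 3))

# A correlation ignores uniform shifts: if everyone gains 3 points,
# the rank order is unchanged and r is still 1.
shifted = [score + 3 for score in time1]
print("r after uniform gain:", round(pearson_r(time1, shifted), 3))
```

The second print illustrates a limit of the correlation as a stability index: it reflects whether respondents keep their relative standing, not whether the scores themselves stay the same.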

Page 5

Agreement Reliability

Inter-Rater Reliability
- Correlation between raters
- Correlation between rater and expert
- % agreement between raters
- % agreement between rater and expert
- Chance-corrected methods (Kappa)
- Variance partitioning methods
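Percent agreement and a chance-corrected index can be compared side by side. The sketch below assumes two raters assigning categorical labels and uses Cohen's kappa as the chance-corrected method; the ratings are hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters give the same label."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement beyond what the raters' base rates predict.

    Undefined (division by zero) when expected agreement is exactly 1,
    e.g. when both raters use a single category for every case.
    """
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical pass/fail ratings of the same ten cases by two raters.
rater_a = ["pass", "pass", "fail", "pass", "fail",
           "fail", "pass", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "fail", "pass", "fail",
           "pass", "pass", "pass", "fail", "pass"]

print("agreement:", percent_agreement(rater_a, rater_b))
print("kappa:", round(cohens_kappa(rater_a, rater_b), 3))
```

With these ratings the raters agree on 8 of 10 cases, but kappa is noticeably lower because about half of that agreement would be expected from the raters' base rates alone.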

Page 6

The Radio Signal Analogy

Signal to noise ratio

Total Signal Received = True Signal + Noise

Signal / (Signal + Noise)

Page 7

A Little Math About Reliability

$$X = T + E$$

Observed Score = True Score + Error

$$\sigma_X^2 = \sigma_T^2 + \sigma_E^2$$

The spread of Observed scores = The spread in True scores + The spread in Error scores.
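The variance decomposition rests on an assumption the slide leaves implicit: true scores and errors are uncorrelated. A short derivation under that classical test theory assumption:

```latex
\begin{aligned}
\sigma_X^2 &= \operatorname{Var}(T + E) \\
           &= \sigma_T^2 + \sigma_E^2 + 2\,\operatorname{Cov}(T, E) \\
           &= \sigma_T^2 + \sigma_E^2
              \quad \text{when } \operatorname{Cov}(T, E) = 0 .
\end{aligned}
```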

Page 8

A Little Math About Reliability

$$r_{xx'} = \frac{\sigma_T^2}{\sigma_X^2}$$

$$r_{xx'} = 1 - \frac{\sigma_E^2}{\sigma_X^2}$$
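The two forms of the reliability coefficient can be checked numerically. In the sketch below the true scores and errors are hypothetical and were chosen so that the errors average zero and are uncorrelated with the true scores. With real test data the true scores are not observable, so this is an illustration of the algebra rather than an estimation procedure.

```python
from statistics import pvariance

# Hypothetical true scores and errors (errors sum to zero and are
# uncorrelated with the true scores by construction).
true_scores = [10, 12, 14, 16, 18]
errors = [1, -2, 2, -2, 1]
observed = [t + e for t, e in zip(true_scores, errors)]

var_t = pvariance(true_scores)   # spread in true scores
var_e = pvariance(errors)        # spread in error scores
var_x = pvariance(observed)      # spread in observed scores

# Reliability as the share of observed variance that is true variance,
# and equivalently as one minus the share that is error variance.
reliability_from_true = var_t / var_x
reliability_from_error = 1 - var_e / var_x

print("var_x:", var_x, "var_t + var_e:", var_t + var_e)
print("reliability:", round(reliability_from_true, 3),
      round(reliability_from_error, 3))
```

This is the radio analogy in numbers: true variance plays the role of signal, error variance the role of noise, and reliability is signal / (signal + noise).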

Page 9

Reliability and PRI Scores

Table 1
Properties of the PRI scale and subscale scores

Scale / Subscale                                   Items   Alpha

Perceived Control
  Efficacy                                             6   0.661
  Mastery                                              6   0.800
  Persistence                                          5   0.708
  Total                                               17   0.897

Maintaining Perspective
  Maintaining a Flexible Perspective                   5   0.736
  Maintaining Self-Direction                           3   0.624
  Cognitive Restructuring of Perspective               3   0.591
  Knowing your Limits                                  4   0.572
  Total                                               15   0.873

Social Resourcefulness
  Reciprocity in Relationships                         5   0.556
  Comfort in Relationships                             2   0.602
  Feedback from Relationships                          3   0.624
  Assistance in Relationships                          4   0.579
  Total                                               14   0.822

Scanning
  Anticipation of Demands                              6   0.602
  Recognition of Opportunities to Prevent Stress       4   0.633
  Planning Ahead                                       4   0.719
  Follow Through                                       4   0.592
  Total                                               18   0.861

Self-Acceptance
  Identify Comfort                                     6   0.647
  Accepting Limitations                                4   0.610
  Balance                                              5   0.713
  Total                                               15   0.850

Preventive Resources                                  82   0.996

Note. n=344.

Page 10

Validity

Validity is the degree to which a test measures what it is intended to measure.

Validity is the meaningfulness, appropriateness, and usefulness of the inferences made from the information a test provides.

Page 11

Validity

“Truth” and “Use”

What is the test really measuring?

For whom is the test appropriate?

How should the information the test provides be used?

Page 12

Constructs

Assumptions we make when we use a test:

The subject possesses some true amount of the latent theoretical construct that the test is designed to measure.

Depression, Coping, Math Aptitude, etc.

Page 13

Constructs

The amount of the construct the subject possesses is not directly measurable.

Observable behaviors can represent the latent construct (ability, trait, etc.) and can be measured.

The goal is to measure as many of these observable behaviors as we can and to measure them accurately.

Page 14

Types of Validity

Content Validity

Does the test cover all of the intended content?

Measured by expert opinion.

Page 15

Types of Validity

Concurrent Validity

Does the test agree with other existing measures of the same construct?

Correlations between the test scores and scores from other measures.

Page 16

Types of Validity

Types of Concurrent Validity Evidence

Convergent Validity

Discriminant Validity
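The slide names the two kinds of evidence without defining them. In the usual usage, convergent evidence is a high correlation with another measure of the same construct, and discriminant evidence is a low correlation with a measure of a different construct. A small sketch with hypothetical scores (all three variable names and their values are invented for illustration):

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical scores for the same five respondents.
new_test = [10, 12, 14, 16, 18]            # the test being validated
same_construct = [11, 12, 15, 15, 19]      # established measure, same construct
different_construct = [3, 1, 4, 2, 3]      # measure of an unrelated construct

convergent_r = pearson_r(new_test, same_construct)
discriminant_r = pearson_r(new_test, different_construct)

print("convergent r:", round(convergent_r, 3))
print("discriminant r:", round(discriminant_r, 3))
```

The pattern, not either number alone, is the evidence: the test should correlate more strongly with measures of its own construct than with measures of other constructs.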

Page 17

Types of Validity

Known Groups Validity

Does the test distinguish between groups of subjects with known differences on the construct or related constructs?

Page 18

Known Groups Validity and the PRI

Group a: Symptom-Free (n=118). Group b: Substance Use (n=54). Comparison: a vs. b.

PRI Scale                 a: Mean (SD)    b: Mean (SD)    t          Effect Size
Perceived Control         3.990 (0.410)   3.787 (0.425)   2.974**    0.494
Maintaining Perspective   4.038 (0.412)   3.795 (0.418)   3.576***   0.590
Social Resourcefulness    3.996 (0.387)   3.823 (0.372)   2.742**    0.445
Self-Acceptance           4.069 (0.370)   3.783 (0.453)   4.369***   0.771
Scanning                  3.877 (0.419)   3.618 (0.412)   3.783***   0.618
Preventive Resources      3.991 (0.358)   3.759 (0.349)   3.972***   0.648

Note. *=p<.05,**=p<.01,***=p<.001.

Page 19

Known Groups Validity and the PRI

Group a: Symptom-Free (n=118). Group e: Anxiety (n=37). Comparison: a vs. e.

PRI Scale                 a: Mean (SD)    e: Mean (SD)    t          Effect Size
Perceived Control         3.990 (0.410)   3.733 (0.598)   2.440*     0.626
Maintaining Perspective   4.038 (0.412)   3.787 (0.542)   2.586*     0.609
Social Resourcefulness    3.996 (0.387)   3.812 (0.450)   2.423*     0.475
Self-Acceptance           4.069 (0.370)   3.742 (0.565)   3.308**    0.884
Scanning                  3.877 (0.419)   3.675 (0.547)   2.059*     0.482
Preventive Resources      3.991 (0.358)   3.748 (0.480)   2.843**    0.679

Note. *=p<.05,**=p<.01,***=p<.001.

Page 20

Known Groups Validity and the PRI

Group a: Symptom-Free (n=118). Group f: Depression (n=78). Comparison: a vs. f.

PRI Scale                 a: Mean (SD)    f: Mean (SD)    t          Effect Size
Perceived Control         3.990 (0.410)   3.749 (0.513)   3.642***   0.588
Maintaining Perspective   4.038 (0.412)   3.777 (0.456)   4.139***   0.633
Social Resourcefulness    3.996 (0.387)   3.803 (0.439)   3.237***   0.499
Self-Acceptance           4.069 (0.370)   3.727 (0.503)   5.155***   0.924
Scanning                  3.877 (0.419)   3.655 (0.464)   3.471***   0.530
Preventive Resources      3.991 (0.358)   3.741 (0.420)   4.457***   0.697

Note. *=p<.05,**=p<.01,***=p<.001.
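The slides do not say how the t statistics and effect sizes in these tables were computed. As an assumption, the sketch below uses Welch's unequal-variance t and an effect size equal to the mean difference divided by the Symptom-Free group's SD (Glass's delta). Applied to the Page 19 Perceived Control row (Symptom-Free vs. Anxiety), these choices give values close to the tabled 2.440 and 0.626, which is suggestive but not confirmation; other rows or tables may have been computed differently.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic from group summary statistics."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

def glass_delta(mean_ref, sd_ref, mean_cmp):
    """Mean difference in units of the reference group's SD."""
    return (mean_ref - mean_cmp) / sd_ref

# Summary statistics from the Page 19 table, Perceived Control row.
symptom_free = {"mean": 3.990, "sd": 0.410, "n": 118}
anxiety = {"mean": 3.733, "sd": 0.598, "n": 37}

t_value = welch_t(symptom_free["mean"], symptom_free["sd"], symptom_free["n"],
                  anxiety["mean"], anxiety["sd"], anxiety["n"])
effect_size = glass_delta(symptom_free["mean"], symptom_free["sd"],
                          anxiety["mean"])

print("t:", round(t_value, 3))
print("effect size:", round(effect_size, 3))
```

Small differences from the tabled values are expected because the means and SDs in the table are rounded to three decimals.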

Page 21

Consequential Validity

Is the test information useful for decision making?

Does it have any unintended consequences?

Can the information be misused?

Page 22

Predictive Validity

Can the test be used to predict future behavior?

Like Concurrent Validity (both are Criterion Validity), but some time passes between the test and the criterion.

SAT and GPA.

Page 23

Construct Validity

All validity is really construct validity.

Does it measure what it is intended to measure?

Does the test agree with the theory in the field?

Does it reveal the true amount of the construct that a subject possesses?

Page 24

Other Related Issues

Tests should have Face Validity.

Does the subject believe the test is measuring the intended construct?

Some tests do not directly reveal what is being measured.

Page 25

Other Related Issues

Reliability and validity are properties of the information that a test provides, NOT of the test itself.

The farther away you get from the original purpose for which a test was developed and validated, the weaker the inferences that can be made.

Page 26

Other Related Issues

No single indicator is sufficient for decision making. A battery of indicators, or sources of information, is always better.

Reliability is a necessary condition for the correct use of a test, but not a sufficient one.

Page 27

Other Related Issues

Validity is the most important property of the information a test provides.

Consistent information.
Truthful information.
Useful information.

Page 28

The Credibility of a Witness

                    High Validity                              Low Validity
High Reliability    Keeps story straight; tells the truth      Keeps story straight; lies
Low Reliability     Inconsistencies; tries to tell the truth   Inconsistencies; lies

Page 29

The Usefulness of a Car

                    High Validity                              Low Validity
High Reliability    Starts and runs; handles all conditions    Starts and runs; needs good weather
Low Reliability     Inconsistencies; handles all conditions    Inconsistencies; needs good weather

Page 30

Finding Lost Keys on a Dark Street

                    High Validity                              Low Validity
High Reliability    Bright lights; shine on the keys           Bright lights; light the wrong places
Low Reliability     Flickers; shine on the keys                Flickers; light the wrong places