Reliability

Chapter 3

Classical Test Theory

Every observed score is a combination of true score plus error.

Obs. = T + E

$$\text{reliability} = \frac{s_T^2}{s_O^2} = 1 - \frac{s_E^2}{s_O^2}$$
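
A minimal sketch of the variance ratio behind Obs. = T + E: reliability is the share of observed-score variance that comes from true scores. It assumes true scores and errors are uncorrelated, so observed variance is the sum of the two. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def reliability_from_variances(true_var: float, error_var: float) -> float:
    """Share of observed-score variance attributable to true scores.

    Assumption: true scores and errors are uncorrelated, so
    observed variance = true variance + error variance.
    """
    observed_var = true_var + error_var
    if observed_var <= 0:
        raise ValueError("observed variance must be positive")
    return true_var / observed_var


# Hypothetical values: true-score variance 80, error variance 20.
true_var, error_var = 80.0, 20.0
observed_var = true_var + error_var

ratio_form = reliability_from_variances(true_var, error_var)  # s_T^2 / s_O^2
complement_form = 1 - error_var / observed_var                # 1 - s_E^2 / s_O^2
print(ratio_form, complement_form)
```

Both forms give the same value, which is why reliability can be read either as the proportion of true variance or as one minus the proportion of error variance.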

Reliability

Systematic versus unsystematic error

Reliability only measures unsystematic error

Correlation

Correlation is a statistical technique that is often used in estimating reliability

Correlation coefficient: a numerical indicator of the relationship between two sets of data.

Positive Correlation

[Figure: plot illustrating a positive correlation; both axes run from 0 to 100.]

Negative Correlation

[Figure: plot illustrating a negative correlation; both axes run from 0 to 100.]

Pearson Product-Moment Correlation

$$r = \frac{\sum z_1 z_2}{N}$$
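
The Pearson coefficient can be computed as the mean cross-product of z-scores, r = (sum of z1 * z2) / N. The sketch below follows that form. It assumes the population standard deviation (dividing by N), which is what makes the division by N at the end consistent. The helper names and the sample scores are hypothetical.

```python
from math import sqrt


def z_scores(values):
    """Convert raw scores to z-scores using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson r as the mean cross-product of paired z-scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be the same length, with at least 2 pairs")
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical scores from two administrations of the same instrument.
first = [10, 20, 30, 40, 50]
second = [12, 25, 29, 41, 55]
print(round(pearson_r(first, second), 3))
```

Applied to scores from two administrations of the same instrument, this is the calculation behind a test-retest coefficient.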

Types of Reliability

Test-Retest

Alternate or Parallel Forms

Internal Consistency Measures

Internal Consistency Measures

Split-half reliability (Spearman-Brown formula)

Kuder-Richardson formulas (KR-20, KR-21)

Coefficient Alpha
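
Two of the internal consistency measures listed above can be sketched briefly. The Spearman-Brown step estimates full-length reliability from the correlation between two halves, r_full = 2 * r_half / (1 + r_half). Coefficient alpha compares the sum of item variances with the variance of total scores. The function names and the data layout (one row per examinee, one column per item) are assumptions made for this illustration.

```python
from statistics import pvariance


def spearman_brown(r_half: float) -> float:
    """Step a split-half correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def coefficient_alpha(item_scores) -> float:
    """Coefficient alpha from a list of rows (one row of item scores per examinee)."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across examinees.
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of each examinee's total score.
    total_var = pvariance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 4 examinees answering 3 items scored 0/1.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(spearman_brown(0.6))
print(coefficient_alpha(responses))
```

With items scored 0/1, as in this hypothetical data, the same computation coincides with KR-20, because each item variance reduces to p * q.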

Non-typical Situations

Speed tests

Criterion-referenced tests

Evaluating Reliability Coefficients

Examine purpose for using instrument

Have knowledge about the reliability coefficients of other instruments in the area

Examine characteristics of particular clients against reliability coefficients: SES, age, culture/ethnicity

Standard Error of Measurement

$$SEM = s\sqrt{1 - r}$$

Provides an estimate of the range of scores a person would obtain if they took an instrument over and over again.

Based on the premise that when individuals take a test multiple times, the scores fall into a normal distribution.

Example of SEM

Sam’s SAT Verbal = 550; r = .91; s = 100

$$SEM = 100\sqrt{1 - .91} = 100\sqrt{.09} = 100(.3) = 30$$

68% of the time, Sam’s true score would fall between 520 and 580.

95% of the time, Sam’s true score would fall between 490 and 610.

99.7% of the time, Sam’s true score would fall between 460 and 640.
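
Sam's example can be reproduced in a few lines as a check on the arithmetic: SEM = s * sqrt(1 - r), with bands of 1, 2, and 3 SEMs around the observed score. The function names are hypothetical. The inputs (550, r = .91, s = 100) are the ones given above.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = s * sqrt(1 - r)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(observed: float, sem: float, n_sems: int):
    """Range of scores within n_sems standard errors of the observed score."""
    return (observed - n_sems * sem, observed + n_sems * sem)


sem = standard_error_of_measurement(100, 0.91)
for n_sems in (1, 2, 3):
    low, high = score_band(550, sem, n_sems)
    print(f"+/- {n_sems} SEM: {low:.0f} to {high:.0f}")
```

A lower reliability coefficient widens every band, so the same observed score supports a less precise statement about the true score.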

Using SEM to evaluate a score

Standard Error of Difference

A measure used by a counselor to examine the difference between two scores and determine if there is a significant difference.
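
The slides do not give a formula here. A commonly used form combines the two scores' standard errors of measurement as SE_diff = sqrt(SEM_1^2 + SEM_2^2), which assumes the measurement errors of the two scores are independent. The sketch below uses that form. The function names, the cutoff of 2 standard errors, and the example scores are assumptions for illustration only.

```python
from math import sqrt


def standard_error_of_difference(sem_1: float, sem_2: float) -> float:
    """Combine two SEMs, assuming the two scores' errors are independent."""
    return sqrt(sem_1 ** 2 + sem_2 ** 2)


def differs_beyond_error(score_1, score_2, sem_1, sem_2, n_se=2.0) -> bool:
    """True if the gap between scores exceeds n_se standard errors of difference.

    n_se=2.0 is an illustrative cutoff (roughly a 95% criterion), not a rule
    taken from the source.
    """
    se_diff = standard_error_of_difference(sem_1, sem_2)
    return abs(score_1 - score_2) > n_se * se_diff


# Hypothetical scores, each with an SEM of 30.
print(standard_error_of_difference(30, 30))
print(differs_beyond_error(550, 600, 30, 30))
```

With these hypothetical numbers, the 50-point gap is smaller than two standard errors of difference, so it would not be treated as a significant difference under this cutoff.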

Alternative Theoretical Model

Generalizability or Domain Sampling Theory

Focus is on estimating the extent to which specific sources of variation under defined conditions are contributing to the score on the instrument.
