ail apresentation(kumazawa)

Evaluating validity of criterion-referenced test score interpretations and uses. Takaaki Kumazawa, Kanto Gakuin University ([email protected]). Kintai Bridge, Japan (wiki)

Upload: takakumazawa

Post on 07-Jul-2015



DESCRIPTION

I evaluated the validity of criterion-referenced placement test score interpretations and uses using Kane’s (2006) argument-based validity framework.

TRANSCRIPT

Page 1: Ail apresentation(kumazawa)

Evaluating validity of criterion-referenced test score interpretations and uses

Takaaki Kumazawa

Kanto Gakuin University

([email protected])

Kintai Bridge, Japan (wiki)

Page 2: Ail apresentation(kumazawa)

Purpose

• The purpose of my talk is to evaluate the validity of criterion-referenced placement test score interpretations and uses using Kane’s (2006) argument-based validity framework

• This presentation is based on a paper I published in the JALT Journal (http://jalt-publications.org/jj/issues/2013-05_35.1)

Page 3: Ail apresentation(kumazawa)

Classical view of validity

• Validity: the extent to which a test measures what it is supposed to measure

• Three types of validity

  - Criterion-related validity: correlation between an established valid measure and the test under development

  - Content validity: experts’ judgment on whether items measure what they are supposed to measure

  - Construct validity: statistical examination of whether items measure what they are supposed to measure

Page 4: Ail apresentation(kumazawa)

Current view of Validity

• Validity is “the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests” (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education [AERA, APA, & NCME], 1999, p. 9).

Page 5: Ail apresentation(kumazawa)

Argument-based validity framework

Interpretive argument: laying out the argument that the inferences to be made are theoretically valid

Validity argument: evaluating the interpretive argument by providing warrants

Observation → (scoring) → Observed score → (generalization) → Universe score → (extrapolation) → Target score → (decision) → Use

Page 6: Ail apresentation(kumazawa)

Interpretive argument

• Scoring inference

  - To what extent do examinees get placement items correct, and do high-scoring examinees get more placement items correct?

• Generalization inference

  - To what extent are placement items consistently sampled from a domain and sufficient in number to reduce measurement error?

• Extrapolation inference

  - To what extent does the difficulty of placement items match the objectives of a reading course?

• Decision inference

  - To what extent do placement decisions made to place examinees in their proper level of the course have an impact on washback in the course?

Page 7: Ail apresentation(kumazawa)

Participants

• 428 Japanese first-year university students majoring in law

• TOEIC scores of about 250-450

• Three courses in the English program: Reading, Listening, and TOEIC skills

• Proficiency-based program with three levels

  - Level 1: 60 high-scoring students. Major objective of the reading course: improve reading skills such as fast reading

  - Level 2: about 300 students

  - Level 3: 50 low-scoring students. Major objective of the reading course: re-learn junior high and high school grammar

Page 8: Ail apresentation(kumazawa)

Criterion-referenced placement test

• Grammar (k = 40)

  - Items are taken from textbooks used in junior high and high schools

  - Grammar points: present, past, and future tenses, continuous, relative pronoun, gerund, participle, etc.

  - Sample: Hi, I ( ) Ken.

    1. am 2. are 3. is 4. be

• Vocabulary (k = 40)

  - Items are taken from the high-frequency 1000-3000 word levels of the JACET 8000 corpus

  - Sample: Bring

    1. 送る (send) 2. 持ってくる (bring) 3. 鳴る (ring) 4. 購入する (buy)

• Reading (k = 10)

  - Two passages are taken from two textbooks used in Level 1 and Level 3 reading classes

  - Sample: How do they travel?

    1. by plane 2. by bus 3. by car 4. by train

Page 9: Ail apresentation(kumazawa)

Procedures

• On the first day of the semester, the placement test was administered in 45 minutes

• A grammar pretest (k = 55, α = .85) was given on the first day of class in Level 2 classes (n = 51) and Level 3 classes (n = 49)

• Thirty 90-minute lessons were taught over two semesters

• The same grammar test was given as a posttest (α = .92) on the last day of class to the same students (n = 51, 49)

• A course evaluation survey was given to the same students (n = 51, 49)
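The α values reported for the pretest and posttest are presumably Cronbach's alpha; the slides do not say so explicitly, so that reading is an assumption. A minimal sketch of how such a coefficient is computed from a persons-by-items score matrix, with `cronbach_alpha` and the toy data being illustrative names and numbers rather than anything from the study:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a 2-D array (rows = examinees, columns = items)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                                 # number of items
    item_variances = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_variance = scores.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1.0 - item_variances / total_variance)

# Toy data (invented): 5 examinees x 4 dichotomously scored items.
toy_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(f"alpha = {alpha:.2f}")
```

With real data, each row would hold one student's 55 item scores on the grammar test.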

Page 10: Ail apresentation(kumazawa)

Backing for scoring inference

• Item facility

  - 7 items below .29

  - 62 items between .30 and .70

  - 21 items above .71

• Item discrimination

  - 4 items below .19

  - 86 items above .20

• Rasch item difficulty estimates: -3.79 to 2.33

• Infit MS: 0.80 to 1.30
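The classical item statistics above can be sketched as follows. Item facility is the proportion of examinees answering an item correctly. The slides do not state which discrimination index was used, so the upper-third minus lower-third difference below is an assumption, and `item_statistics` and the toy matrix are illustrative only:

```python
import numpy as np

def item_statistics(responses):
    """Return (facility, discrimination) per item.

    responses: 2-D array, rows = examinees, columns = items, 1 = correct.
    Discrimination here is the upper-third minus lower-third facility
    (an assumed index; the slides do not name the one actually used).
    """
    responses = np.asarray(responses, dtype=float)
    third = responses.shape[0] // 3
    if third == 0:
        raise ValueError("need at least three examinees")
    facility = responses.mean(axis=0)
    totals = responses.sum(axis=1)
    order = np.argsort(totals, kind="stable")   # lowest total scores first
    low_group = responses[order[:third]]
    high_group = responses[order[-third:]]
    discrimination = high_group.mean(axis=0) - low_group.mean(axis=0)
    return facility, discrimination

# Toy data (invented): 6 examinees x 3 items.
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
facility, discrimination = item_statistics(toy_responses)

# Count items in the bands used on the slide.
too_hard = int((facility < 0.30).sum())
acceptable = int(((facility >= 0.30) & (facility <= 0.70)).sum())
too_easy = int((facility > 0.70).sum())
```

Applied to the 428 x 90 placement response matrix, the same counts would correspond to the three facility bands listed above.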

Page 11: Ail apresentation(kumazawa)

Backing for generalization inference

• Multivariate generalizability theory (decision study of a persons × items design)

  - Grammar (k = 40, ρ = .85, Φ = .83)

  - Vocabulary (k = 40, ρ = .86, Φ = .84)

  - Reading (k = 10, ρ = .58, Φ = .55)

  - Total (k = 90, ρ = .92, Φ = .91)
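To show why the 10-item reading section is less dependable than the 40-item sections, here is a minimal univariate sketch of a decision study for a persons × items design. The study itself used multivariate generalizability theory, which this does not reproduce, and the variance components below are invented for illustration, not the study's estimates:

```python
def d_study(var_person, var_item, var_residual, k):
    """Generalizability (rho) and dependability (Phi) for k items.

    var_person:   person variance component
    var_item:     item variance component
    var_residual: person-by-item interaction confounded with error
    """
    relative_error = var_residual / k
    absolute_error = (var_item + var_residual) / k
    rho = var_person / (var_person + relative_error)
    phi = var_person / (var_person + absolute_error)
    return rho, phi

# Hypothetical variance components (assumptions, not from the paper).
VAR_P, VAR_I, VAR_PI_E = 0.04, 0.03, 0.18

rho_10, phi_10 = d_study(VAR_P, VAR_I, VAR_PI_E, k=10)
rho_40, phi_40 = d_study(VAR_P, VAR_I, VAR_PI_E, k=40)
print(f"k = 10: rho = {rho_10:.2f}, Phi = {phi_10:.2f}")
print(f"k = 40: rho = {rho_40:.2f}, Phi = {phi_40:.2f}")
```

Both coefficients rise as k grows because the error terms are divided by the number of items, and Φ never exceeds ρ because absolute error also includes the item variance.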

Page 12: Ail apresentation(kumazawa)

Backing for extrapolation inference

Difficulty level estimates

Level                    Difficulty   SE     Infit MS
Junior High grammar      -0.65        0.03   1.00
High School grammar       0.29        0.02   1.00
1000 word level vocab    -0.94        0.03   1.00
2000 word level vocab     0.15        0.03   1.00
3000 word level vocab     0.12        0.05   1.00
Level 3 reading           0.30        0.05   1.00
Level 1 reading           0.73        0.05   1.10

FACETS map

Cut point for Level 1: Level 1 reading

Cut point for Level 3: Junior High grammar and 1000 word level

[Figure: FACETS variable map. Columns: Measr (logit scale from 3 to -4), +students (* = 4), -items (* = 1), -levels, and cut points for Levels 1, 2, and 3. Cut points marked on the map: Level 1a (1.49), Level 1b (.77), Level 2 (.77 to -.70), Level 3a (-.70), Level 3b (-.99). Labels in the -levels column: Level 1 Reading near the Level 1b cut point; Basic HS Grammar, JACET2000, and JACET3000 within the Level 2 band; Jr H Grammar near the Level 3a cut point; JACET1000 near the Level 3b cut point.]
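The cut points recoverable from the FACETS map (1.49, .77, -.70, -.99) can be read as thresholds on the logit scale. The sketch below is one possible reading of the map, not the author's placement procedure: which side of each boundary a student falls on is an assumption, and `place` and `rasch_probability` are illustrative names. The second function uses the standard dichotomous Rasch model to show what it means for a cut point to sit next to an item group's difficulty.

```python
import math

# Thresholds read from the map; boundary handling (>=) is an assumption.
CUT_POINTS = [
    (1.49, "Level 1a"),
    (0.77, "Level 1b"),
    (-0.70, "Level 2"),
    (-0.99, "Level 3a"),
]

def place(measure):
    """Return the class level for a student's ability measure in logits."""
    for cut, label in CUT_POINTS:
        if measure >= cut:
            return label
    return "Level 3b"

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A student exactly at the Level 1 cut point (.77) answering Level 1
# reading items (difficulty 0.73), and a student at the Level 3 cut
# point (-.70) answering Junior High grammar items (difficulty -0.65).
p_level1_cut = rasch_probability(0.77, 0.73)
p_level3_cut = rasch_probability(-0.70, -0.65)
```

Both probabilities come out close to .50, which is what placing a cut point at an item group's difficulty implies: students at the cut have roughly even odds on those items.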

Page 13: Ail apresentation(kumazawa)

Backing for decision inference

Level 2 and Level 3 students’ (n = 51, 49) grammar pretest and posttest scores (k = 55)

                Grammar pretest (α = .85)    Grammar posttest (α = .92)
Class level     n     M       SD             n     M       SD
Level 2a        26    30.38   6.34           21    12.14    2.50
Level 2b        25    32.36   8.47           24    28.63    7.93
Level 2         51    31.35   7.45           45    20.93   10.24
Level 3c        25    20.80   5.09           22    26.82    5.21
Level 3d        24    19.88   4.29           23    26.78    5.95
Level 3         49    20.35   4.69           45    26.80    5.53

Pretest: Level 2 students scored higher

Posttest: Level 3 students scored higher

Level 2: 11 points down

Level 3: 6 points up
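As a check on the table, the level totals can be recovered as n-weighted means of the class rows, and the change from pretest to posttest is the difference between those means. This is a sketch using only the n and M values in the table; `pooled_mean` is an illustrative helper. Because pretest and posttest n differ, these are differences between group means, not paired gains.

```python
def pooled_mean(groups):
    """n-weighted mean of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n

# (n, M) pairs taken from the table: classes 2a/2b and 3c/3d.
level2_pre = pooled_mean([(26, 30.38), (25, 32.36)])
level2_post = pooled_mean([(21, 12.14), (24, 28.63)])
level3_pre = pooled_mean([(25, 20.80), (24, 19.88)])
level3_post = pooled_mean([(22, 26.82), (23, 26.78)])

level2_change = level2_post - level2_pre
level3_change = level3_post - level3_pre
print(f"Level 2: {level2_pre:.2f} -> {level2_post:.2f} ({level2_change:+.2f})")
print(f"Level 3: {level3_pre:.2f} -> {level3_post:.2f} ({level3_change:+.2f})")
```

The pooled means reproduce the Level 2 and Level 3 rows of the table. The computed changes are about -10.4 points for Level 2 and about +6.45 points for Level 3, which the slide annotates as "11 points down" and "6 points up".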

Page 14: Ail apresentation(kumazawa)

Validity argument

Interpretive argument

• Scoring inference

  - To what extent do examinees get placement items correct, and do high-scoring examinees get more placement items correct?

• Generalization inference

  - To what extent are placement items consistently sampled from a domain and sufficient in number to reduce measurement error?

• Extrapolation inference

  - To what extent does the difficulty of placement items match the objectives of a reading course?

• Decision inference

  - To what extent do placement decisions made to place examinees in their proper level of the course have an impact on washback in the course?

Validity argument

• Scoring inference

  - Because most items functioned well, the inference from observation to the observed score was valid

• Generalization inference

  - Because dependability was high and measurement error was small, the inference from the observed score to the universe score was valid

• Extrapolation inference

  - Because the difficulty of the items was appropriate for the objectives of the program, the inference from the universe score to the target score was valid

• Decision inference

  - Because Level 3 students were placed in the right level and were able to improve their grammar test scores, the inference from the target score to test use was valid

Page 15: Ail apresentation(kumazawa)

Conclusion

• “Validation is simple in principle, but difficult in practice. The argument-based framework provides a relatively pragmatic approach to validation” (Kane, 2012, p. 15).

William Jolly Bridge, Brisbane (wiki)

Page 16: Ail apresentation(kumazawa)

References

• Kane, M. (2006). Validation. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 17-64). Westport, CT: Greenwood Publishing.

• Kane, M. (2012). Validating score interpretations and uses. Language Testing, 29, 3-17. doi: 10.1177/0265532211417210

• Kumazawa, T. (2013). Evaluating validity for in-house placement test score interpretations and uses. JALT Journal, 35, 73-100.