How to Model and Test for the Mechanisms that make Measurement Systems Tick

IMEKO, Jena, Germany

Wednesday, August 31, 2011

A. Jackson Stenner

Chairman & CEO, MetaMetrics

[email protected]


Reader Ability

Temperature


Four well-researched constructs

Reader Ability

Text Complexity

Task Difficulty

Comprehension


Reading is a process in which information from the text and the knowledge possessed by the reader act together to produce meaning.

Anderson, R.C., Hiebert, E.H., Scott, J.A., & Wilkinson, I.A.G. (1985). Becoming a nation of readers: The report of the Commission on Reading. Urbana, IL: University of Illinois.


Reading is a process in which information from the text [Complexity] and the knowledge possessed by the reader [Ability] act together to produce meaning [Comprehension] as indexed by a specific task type [Difficulty].


An Equation

Conceptual:

$$\text{Comprehension} = \text{Reader Ability} - \text{Text Complexity} - \text{Task Difficulty}$$

Statistical:

$$\text{Raw Score} = \sum_i \frac{e^{(RA - TC_i - TD)}}{1 + e^{(RA - TC_i - TD)}}$$

RA = Reading Ability

TC = Text Calibrations

TD = Task Difficulty
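The statistical form of the equation can be read in two directions: forward, to predict a raw score from a known reader ability, and backward, to recover a reader ability from an observed raw score once the text calibrations are fixed. The following is a minimal sketch of both, assuming all quantities are expressed in logits (the slide does not state the unit), and using hypothetical function names and a hypothetical search range of -10 to 10 logits. It is an illustration of the equation, not the estimation procedure actually used in the Lexile Framework.

```python
import math


def expected_raw_score(ra, text_calibrations, td=0.0):
    """Sum of per-item success probabilities for reader ability `ra`.

    All arguments are assumed to be in logits (an assumption of this sketch).
    """
    return sum(
        1.0 / (1.0 + math.exp(-(ra - tc - td))) for tc in text_calibrations
    )


def estimate_reader_ability(raw_score, text_calibrations, td=0.0,
                            lo=-10.0, hi=10.0, tol=1e-9):
    """Invert the equation by bisection.

    The expected raw score increases strictly with RA, so there is a single
    RA that reproduces a given raw score. Only the calibrations of the items
    taken are needed; no data on other persons enter the calculation.
    The search range [lo, hi] is a hypothetical default.
    """
    if not 0 < raw_score < len(text_calibrations):
        raise ValueError("raw score must lie strictly between 0 and item count")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_raw_score(mid, text_calibrations, td) < raw_score:
            lo = mid  # predicted score too low: ability must be higher
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For example, `estimate_reader_ability(expected_raw_score(0.7, [-1.0, 0.0, 1.5]), [-1.0, 0.0, 1.5])` returns a value very close to 0.7, since the second function simply undoes the first.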


The Lexile Framework for Reading

A causal model relating reader ability, text complexity, task difficulty, and comprehension

Measures reader ability, text complexity, and task difficulty on a common scale, the Lexile scale

Allows educators to forecast the level of success a reader is likely to experience with a particular text when responding to a specific task requirement


Eight Features of the Causal Model Relating Text Complexity, Reader Ability, Task Difficulty and Comprehension

1. The model is individual centered. The focus is on explaining variation within person over time.

2. In this framework the measurement mechanism is well specified and can be manipulated to produce predictable changes in measurement outcomes (e.g. percent correct).

3. Item parameters are supplied by substantive theory and, thus, person parameter estimates are generated without reference to or use of any data on other persons or populations. Therefore, effects of the examinee population have been completely eliminated from consideration in the estimation of person parameters for reader ability.


Eight Features of the Causal Model cont’d.

4. The quantitivity hypothesis can be experimentally tested by evaluating the trade-off property for the individual case. A change in the person parameter can be offset or traded off for a compensating change in text complexity to hold comprehension constant. The trade-off is not just about the algebra.

5. When uncertainty in item difficulties is too large to ignore, individual item difficulties may be a poor choice to use as calibration parameters in causal models. As an alternative we recommend, when feasible, averaging over individual item difficulties to produce “ensemble” means. For example, empirical text complexities can be excellent dependent variables for testing causal theories.


Eight Features of the Causal Model cont’d.

6. Causal Rasch models are individual centered and are explanatory at both within-subject and between-subject levels. The attribute on which I differ from myself a decade ago is the same attribute on which I differ from my brother today.

7. When data fit a Rasch model, differences between person measures are objective. When data fit a causal Rasch model, absolute person measures (reader abilities) are objective (i.e. independent of instrument).

8. Causal Rasch models make possible the construction of generally objective growth trajectories. Each trajectory can be completely separated from the instruments used in its construction and from the performance of any other persons whatsoever.
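The trade-off property in feature 4 can be written out using the symbols of the statistical equation given earlier (RA, TC, TD). The symbol $\Delta$ is introduced here only for illustration: it stands for any equal change applied to both reader ability and text complexity.

$$(RA + \Delta) - (TC_i + \Delta) - TD = RA - TC_i - TD$$

Because the exponent is unchanged, the predicted success probability for each item is unchanged as well. This is only the algebraic statement of the prediction; the point of feature 4 is that the prediction can be checked experimentally for an individual reader.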


Let’s examine in detail each of the four constructs that make up the Lexile Framework for Reading

Text Complexity, Reader Ability, Task Difficulty, Comprehension


Student 1528: 7th Grade, Male, Hispanic, Paid Lunch

May 2007 – April 2011: 347 Encounters, 138,695 Words, 3,342 Items, 983 Minutes

Expected: 73.5%, Observed: 71.7%

[Figure: chart for Student 1528 with axis ticks at 1000, 1200, 1400 and 1600, and the labels "May 2016 (12th Grade)" and "Text Demands for College and Career"]


Mythology

Text Complexity: Theoretical 1300L, Empirical 1357L

Adapted from Oasis

Article courtesy of EBSCO Publishing

The study and interpretation of myth and the body of myths of a particular culture. Myth is a complex cultural phenomenon that can be approached from a number of viewpoints. As generally understood, a myth is a story or narrative that is traditional in a certain culture, having been passed down from early times and regarded as true. It may be said to ___(1)___ symbolically the origin of the basic elements and assumptions of a culture. Mythic narratives frequently revolve around the doings of gods or heroes, and may relate, for example, how the world began, how humans and animals came into being, or how certain customs, gestures, or forms of human activities ___(2)___. Almost all cultures possess or at one time possessed and lived in terms of myths.

(1) immerse / belittle / portray / contradict

(2) originated / adorned / handicapped / entwined


Figure 1: Plot of Theoretical Text Complexity versus Empirical Text Complexity for 475 articles

r = 0.952

r″ = 0.960

R²″ = 0.921

RMSE″ = 99.8L

“Mythology”

Reliability = .996

SEM = 12L


What could account for the 8% unexplained variance?

Missing Variables or Theory misspecification

Better Criterion Variable: task specificity hypothesis

Improved Proxies/Operationalizations

Expanded Error Model: Treat Item Type as Random

Rounding Error

Imperfections in Theory Implementation
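The 8% figure appears to be the complement of the variance explained in Figure 1. Assuming it refers to the reported value of 0.921 (an assumption; the slide does not say which statistic it is based on), the arithmetic is:

$$1 - 0.921 = 0.079 \approx 8\%$$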


Connect reading to postsecondary demands: www.lexile.com/toefl


Examine growth over time


Task Plane

Different task types may vary in average difficulty and in unit size. Adjustments for added easiness or hardness and adjustments for unit size (Humphry, 2011) can be made to bring new task types into the Lexile Frame of Reference.


[Figure: Task Plane, with labels "Added Easiness", "0", "Added Hardness", "Reference Value (Native Item)" and "Unit Size"]


How Temperature and Pressure Relate Under Constant Volume

| Temperature | Volume | Pressure |
|---|---|---|
| 2000 K | 20 Liters | 20.0 atm |
| 1000 K | 20 Liters | 10.0 atm |
| 500 K | 20 Liters | 5.0 atm |
| 250 K | 20 Liters | 2.5 atm |
| 125 K | 20 Liters | 1.25 atm |
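The pattern in the temperature and pressure values above is that pressure is proportional to absolute temperature when volume is held fixed, so the ratio of pressure to temperature is the same in every row. A minimal sketch that checks this on the tabulated values follows; the variable and function names are hypothetical, and the proportionality rule is the standard constant-volume gas relation rather than anything stated on the slide.

```python
import math

# (temperature in kelvin, pressure in atm) at a constant 20 liters,
# copied from the slide's table
rows = [(2000, 20.0), (1000, 10.0), (500, 5.0), (250, 2.5), (125, 1.25)]

# Pressure-to-temperature ratio for each row
ratios = [pressure / temperature for temperature, pressure in rows]

# True when every row shares the same ratio (proportionality holds)
constant_ratio = all(math.isclose(r, ratios[0]) for r in ratios)


def pressure_at(temperature_k, reference=rows[0]):
    """Predict pressure at a new temperature from one reference row,
    assuming volume stays fixed."""
    ref_temperature, ref_pressure = reference
    return ref_pressure * temperature_k / ref_temperature
```

Halving the temperature halves the pressure, which is the kind of manipulable, predictable relation the talk goes on to claim for reader ability, text complexity and comprehension.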


Comprehension Rates for Fixed Reader Ability

Comprehension Rates for Readers of the Same Ability with Texts of Different Complexity or How Reader Ability and Comprehension Rate Relate Under Varying Text Complexity

| Reader Ability | Text Complexity | Text Titles | Comprehension Rates |
|---|---|---|---|
| 1000L | 500L | The Magic School Bus, Inside the Earth (Cole) | 96% |
| 1000L | 750L | The Martian Chronicles (Bradbury) | 90% |
| 1000L | 1000L | The Reader’s Digest | 75% |
| 1000L | 1250L | The Call of the Wild (London) | 50% |
| 1000L | 1500L | On Equality Among Mankind (Rousseau) | 25% |


Comprehension Rates for Fixed Text Complexity

Comprehension Rates for Readers of Different Ability with Texts of the Same Complexity or How Reader Ability and Comprehension Rate Relate Under Constant Text Complexity

| Reader Ability | Classroom Textbook | Comprehension Rates |
|---|---|---|
| 500L | 1000L | 25% |
| 750L | 1000L | 50% |
| 1000L | 1000L | 75% |
| 1250L | 1000L | 90% |
| 1500L | 1000L | 96% |
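The comprehension rates tabulated for fixed reader ability and for fixed text complexity depend only on the difference between reader ability and text complexity, and they are consistent with a logistic curve. The sketch below is one way to reproduce them. Its two constants are back-solved from the tables themselves (75% when reader and text match, 50% when the text is 250L above the reader); they are assumptions of this illustration, not published Lexile constants, and task difficulty is treated as fixed.

```python
import math

# Assumption: 75% comprehension when reader ability equals text complexity,
# i.e. a logit of ln(0.75 / 0.25) = ln 3 at a difference of zero.
LOGIT_AT_MATCH = math.log(3)

# Assumption: 50% comprehension (logit 0) when the text is 250L harder,
# so 250L must correspond to ln 3 logits.
LEXILES_PER_LOGIT = 250 / math.log(3)


def comprehension_rate(reader_lexile, text_lexile):
    """Forecast comprehension (0 to 1) from the reader-text difference."""
    logit = LOGIT_AT_MATCH + (reader_lexile - text_lexile) / LEXILES_PER_LOGIT
    return 1.0 / (1.0 + math.exp(-logit))


# Fixed reader (1000L), texts of increasing complexity
fixed_reader = [round(100 * comprehension_rate(1000, text))
                for text in (500, 750, 1000, 1250, 1500)]

# Fixed text (1000L), readers of increasing ability
fixed_text = [round(100 * comprehension_rate(reader, 1000))
              for reader in (500, 750, 1000, 1250, 1500)]
```

With these two constants, `fixed_reader` comes out as 96, 90, 75, 50, 25 and `fixed_text` as 25, 50, 75, 90, 96, matching the slide values after rounding to whole percentages.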


Testing the Lexile Theory

How closely does Observed Comprehension (success rate) correspond to what the Lexile theory predicts?


Oasis: Usage Report by Reader Lexile


Oasis: Usage Report By Category of Article


To causally explain a phenomenon [a measurement outcome] is to provide information about the factors [person processes and instrument mechanisms] on which it depends and to exhibit how it depends on those factors. This is exactly what the provision of counterfactual information … accomplishes: we see what factors some explanandum M [measurement outcome, raw score] depends on (and how it depends on those factors) when we have identified one or more variables such that changes in these (when produced by interventions) are associated with changes in M (Woodward, 2003, p. 204).


How Many Ways Can We Say X Causes Y?

X “elicited a greater” Y

X “impacts” Y

X “accounts for” Y

Y “is the result of” X

Y “because of” X

X “has led to” Y

Y “stemmed from” X

X “fosters” Y

X “triggers” Y

X “has been linked to” Y

X “didn’t diminish” Y

Y “depends on” X

X “largely motivates” Y

X “proved critical to” Y

X “changes” Y

X “affects” Y


Contact Info:

A. Jackson Stenner

CEO, MetaMetrics

[email protected]