aligning assessments to monitor growth in math achievement: a validity study jack b. monpas-huber,...

Aligning Assessments to Monitor Growth in Math Achievement: A Validity Study

Jack B. Monpas-Huber, Ph.D.Director of Assessment & Student Information

Washington State Assessment ConferenceDecember 2008

Core Themes1. Technical quality of locally developed

district assessments as outcome measure for program evaluation

• Growth modeling in HLM

2. Fidelity of local district instrumentation to state instrumentation

• Sampling same domain• Built from same maps and item specs• Measuring same thing?

-0.20 0.90 2.00 3.10 4.20

FRL = 0

FRL = 1

Possible Questions:

1. What does growth in math look like over the course of one year? Do the data match our expectations and theory of action?

2. Why do some students have steeper growth curves than others?

3. What causes growth? Does math coaching explain variation in the slopes of students’ growth curves?

Evaluating Program with Growth Curves

Background of this ProjectIssue #1:WASL math achievement seems to decline after 3rd grade

Working Explanations:

1. Reflection of curriculum and instruction?

2. Artifact of WASL standard setting

2007 WASL MATH ACHIEVEMENT

Gr3 Gr4 Gr5 Gr6 Gr7 Gr8 Gr10

Spokane State

Background of this ProjectIssue #2:District assessments ready for action

Written CurriculumSTANDARDS

Taught Curriculum

Tested CurriculumASSESSMENT of Learning

EVOLOVING PURPOSES (AND VALIDATION) OF ASSESSMENTS• Monitor learning across the system• Inferences about fidelity to curriculum, instructional

effectiveness• Curriculum development, professional development• Predictive of WASL performance

Design for Exploratory StudyWhat’s happening to our students from

Grade 5 to Grade 7?Longitudinal: Follow the same students from Grade 5 to Grade 7

Stratified: Approximately 10 students per classroom

Repeated measurements: What can we learn from these data sets?…..

– 2007 Grade 4 WASL– Fall SASL– Winter SASL– 5 End-of-unit district assessments– 2008 Grade 5 WASL

10 waves

Research QuestionsWhat can we validly infer from WASL-like

district benchmark assessments about….1. Fidelity to curriculum: Are teachers teaching

the district curriculum?2. Rigor of curriculum: Is it rigorous enough?

Are we preparing our kids adequately?3. Growth toward state standard: Are our district

assessments showing us true achievement of state standards?

Results: Performance on District Assessments

FALL SASL

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

UNIT 2

0 1 2 3 4 5 6 7 8 9 10

UNIT 4

0 1 2 3 4 5 6 7 8 9 10

UNIT 5

0 1 2 3 4 5 6 7

UNIT 3

0 1 2 3 4 5 6 7 8 9 10 11

UNIT 6

0 1 2 3 4 5 6 7 8 9 10

Winter SASL

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17

2007 GRADE 5 WASL

5 9 13 17 21 25 29 33 37 41 45 49 53

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

Research QuestionsFurther Questions about Growth toward State

Standard:• Are our district assessments showing us true achievement

of state standards?• Are our districts assessments aligned to WASL and

measuring the same thing?• How can I “link” scores from district assessments to WASL

so that they all sit on the same scale?• Could difficulty parameters for local items help teachers

see the relative difficulty of different skills?

Linking District Assessments to WASL

Equating, Scaling and Linking Literature• Linear Equating• Equipercentile Equating• Item Response Theory (IRT)

• Rasch Model (Winsteps) places item difficulties and person abilities on the same scale

• “Common person” single group design• Concurrent calibration

-0.20 0.90 2.00 3.10 4.20

FRL = 0

FRL = 1

Grade 4WASL

FallSASL

Unit 2

Unit 3

Unit 4

Random sample of 5 students

Results: Tentative Growth Curves

Two ChallengesReliability• A function of test length and redundancy

of items• District assessments fairly short• HLM raised doubts about reliability of

growth curves

Dimensionality• Rasch model assumes one dimension• Each district assessment measures

somewhat different content

Broader Issues for District Assessments

Locally developed WASL-like district benchmark assessmentsAdvantages: Local capacity building, content validityChallenge: Time and expense of development, technical

quality, scaling and equating

Outside Instruments, e.g., Measures of Academic Progress (MAP)Advantage: Technical quality, growth scaleChallenge: Content validity

aligning assessments to monitor growth in math achievement: a validity study jack b. monpas-huber,...

Documents

(huber 1) itec 4000 christianne huber winter 2016 selected...

huber katalog 2008

get the huber habit - calendar the huber - take one …

easycbm: benchmarking and progress monitoring system jack b....

jeff huber portfoilio

rti and district assessments jack b. monpas-huber, ph.d....

introduction to the new washington state achievement index...

predictive validity evidence for dibels: correlation to wasl...

ethan huber portfolio

jack b. monpas-huber, ph.d. director of assessment and...

lars zullig huber

jeff huber recommendation

huber solutions for mbr applications · huber solutions for...

gerry huber heritage research group gerry huber heritage...

wastewater technology innovation since 1961 · 2017. 11....

Шнековая решетка huber rotamat® ro9 ·...

mpc minichiller® mpc unichiller® - huber-online.com mpc...

jack b. monpas-huber, ph.d. director of assessment & student...

comparison of half- and full-day kindergarten on...

monetary puzzlement huber