

Aligning Assessments to Monitor Growth in Math Achievement: A Validity Study

Jack B. Monpas-Huber, Ph.D.
Director of Assessment & Student Information

Washington State Assessment Conference
December 2008

Core Themes

1. Technical quality of locally developed district assessments as outcome measure for program evaluation
• Growth modeling in HLM

2. Fidelity of local district instrumentation to state instrumentation
• Sampling same domain
• Built from same maps and item specs
• Measuring same thing?

[Figure: HLM growth plot of MATH (y-axis, about 300 to 508) against TIME (x-axis, -0.20 to 4.20), with separate lines for FRL = 0 and FRL = 1]

Possible Questions:

1. What does growth in math look like over the course of one year? Do the data match our expectations and theory of action?

2. Why do some students have steeper growth curves than others?

3. What causes growth? Does math coaching explain variation in the slopes of students’ growth curves?

Evaluating Program with Growth Curves
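One way to make the three questions above concrete is a two-level linear growth model of the kind typically fit in HLM. The slides do not give the model specification, so the form below is an assumption: a straight-line trajectory per student, with FRL as the only student-level predictor. A math coaching indicator, which question 3 asks about, would enter the slope equation in the same way as FRL.

```latex
\begin{align*}
\text{Level 1 (occasion $t$ within student $i$):}\quad
  & \mathrm{MATH}_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{TIME}_{ti} + e_{ti} \\
\text{Level 2 (initial status):}\quad
  & \pi_{0i} = \beta_{00} + \beta_{01}\,\mathrm{FRL}_{i} + r_{0i} \\
\text{Level 2 (growth rate):}\quad
  & \pi_{1i} = \beta_{10} + \beta_{11}\,\mathrm{FRL}_{i} + r_{1i}
\end{align*}
```

Under this sketch, question 1 concerns $\beta_{10}$ (the average growth rate), question 2 concerns the variance of $r_{1i}$ (how much slopes differ between students), and question 3 concerns the coefficient on a predictor in the slope equation, here $\beta_{11}$.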

Background of this Project
Issue #1: WASL math achievement seems to decline after 3rd grade

Working Explanations:

1. Reflection of curriculum and instruction?

2. Artifact of WASL standard setting?

[Figure: 2007 WASL Math Achievement. Percent meeting standard (0 to 100) by grade (Gr3, Gr4, Gr5, Gr6, Gr7, Gr8, Gr10), Spokane vs. State]

Background of this Project
Issue #2: District assessments ready for action

Written Curriculum: STANDARDS

Taught Curriculum

Tested Curriculum: ASSESSMENT of Learning

EVOLVING PURPOSES (AND VALIDATION) OF ASSESSMENTS
• Monitor learning across the system
• Inferences about fidelity to curriculum, instructional effectiveness
• Curriculum development, professional development
• Predictive of WASL performance

Design for Exploratory Study

What’s happening to our students from Grade 5 to Grade 7?

Longitudinal: Follow the same students from Grade 5 to Grade 7

Stratified: Approximately 10 students per classroom

Repeated measurements: What can we learn from these data sets?
• 2007 Grade 4 WASL
• Fall SASL
• Winter SASL
• 5 End-of-unit district assessments
• 2008 Grade 5 WASL

10 waves

Research Questions

What can we validly infer from WASL-like district benchmark assessments about...

1. Fidelity to curriculum: Are teachers teaching the district curriculum?
2. Rigor of curriculum: Is it rigorous enough? Are we preparing our kids adequately?
3. Growth toward state standard: Are our district assessments showing us true achievement of state standards?

Results: Performance on District Assessments

[Figure: score frequency histograms (x-axis: score; y-axis: frequency) for Fall SASL (scores 0 to 19), Unit 2 (0 to 10), Unit 3 (0 to 11), Unit 4 (0 to 10), Unit 5 (0 to 7), Unit 6 (0 to 10), Winter SASL (0 to 17), and 2007 Grade 5 WASL (5 to 53)]

Research Questions

Further Questions about Growth toward State Standard:
• Are our district assessments showing us true achievement of state standards?
• Are our district assessments aligned to WASL and measuring the same thing?
• How can I “link” scores from district assessments to WASL so that they all sit on the same scale?
• Could difficulty parameters for local items help teachers see the relative difficulty of different skills?

Linking District Assessments to WASL

Equating, Scaling and Linking Literature
• Linear Equating
• Equipercentile Equating
• Item Response Theory (IRT)
• Rasch Model (Winsteps) places item difficulties and person abilities on the same scale
• “Common person” single group design
• Concurrent calibration
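As a rough illustration of the simplest method in the list above, linear equating in a single-group ("common person") design maps a score on one test to the score on the other test that sits the same number of standard deviations from the mean. The sketch below is not the procedure used in the study (the slides point to a Rasch calibration in Winsteps). The function name and the five score pairs are made up for illustration.

```python
from statistics import mean, pstdev


def linear_equate(x_scores, y_scores):
    """Return a function that maps test-X scores onto the test-Y scale.

    Assumes a single-group design: the same students took both tests,
    so the two score lists describe the same group.
    """
    mu_x, mu_y = mean(x_scores), mean(y_scores)
    sd_x, sd_y = pstdev(x_scores), pstdev(y_scores)
    if sd_x == 0:
        raise ValueError("test-X scores have zero variance")
    slope = sd_y / sd_x
    intercept = mu_y - slope * mu_x
    # y = mu_y + (sd_y / sd_x) * (x - mu_x)
    return lambda x: intercept + slope * x


# Hypothetical data: five students' district raw scores and WASL scale scores.
district_raw = [4, 6, 8, 10, 12]
wasl_scale = [360, 380, 400, 420, 440]

to_wasl = linear_equate(district_raw, wasl_scale)
print(round(to_wasl(8), 1))
```

Linear equating only matches means and standard deviations. Equipercentile equating and IRT-based linking relax that restriction at the cost of needing more data.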

[Figure: growth curves of MATH against TIME for a random sample of 5 students, with measurement occasions labeled Grade 4 WASL, Fall SASL, Unit 2, Unit 3, and Unit 4; FRL = 0 and FRL = 1 shown separately]

Results: Tentative Growth Curves

Two Challenges

Reliability
• A function of test length and redundancy of items
• District assessments fairly short
• HLM raised doubts about reliability of growth curves

Dimensionality
• Rasch model assumes one dimension
• Each district assessment measures somewhat different content
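The link between reliability and test length noted under Reliability can be illustrated with the Spearman-Brown prophecy formula. The slides do not name this formula. It is brought in here as a standard classical-test-theory illustration, and it assumes the added or removed items are parallel to the existing ones. The reliability value of 0.60 is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predict reliability after changing test length by length_factor.

    length_factor = 2 means doubling the number of (parallel) items;
    length_factor = 0.5 means halving it.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical short assessment with reliability 0.60.
short_test = 0.60
doubled = spearman_brown(short_test, 2)    # twice as many items
halved = spearman_brown(short_test, 0.5)   # half as many items
print(round(doubled, 2), round(halved, 2))
```

Under these assumptions, a short test loses reliability quickly as items are removed, which is consistent with the concern about short district assessments.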

Broader Issues for District Assessments

Locally developed WASL-like district benchmark assessments
Advantages: Local capacity building, content validity
Challenges: Time and expense of development, technical quality, scaling and equating

Outside Instruments, e.g., Measures of Academic Progress (MAP)
Advantages: Technical quality, growth scale
Challenge: Content validity
