colorado assessment summit_teacher_eval

Considerations when using tests for teacher evaluation

Presenter - John Cronin, Ph.D. Contacting us: NWEA Main Number: 503-624-1951 E-mail: [email protected] This PowerPoint presentation and recommended resources are available at our Slideshare site http://www.slideshare.net/JohnCronin4/colorado-assessment-summitteachereval

Upload: john-cronin

Post on 03-Jul-2015

Category:

Education


DESCRIPTION

John Cronin's presentation

TRANSCRIPT

Page 2: Colorado assessment summit_teacher_eval

Key Colorado requirements related to testing

• Assessment constitutes 50% of the evaluation.

• Statewide summative assessments for subjects in which available. Districts will be on their own for other subjects.

• Use of the Colorado Growth Model with statewide assessment.

• A measure of individually attributed or collectively attributed student growth.

• Local measure must be credible, valid (aligned), reliable, and inferences from the measure must be supportable by evidence and logic.

• The law requires that the measures should support consistent inferences.

• Rating of ineffective or partially effective can lead to loss of non-probationary status.

• If a value-added model is used the model must be transparent enough to permit external evaluation.

Page 3: Colorado assessment summit_teacher_eval

Unique characteristics of the Colorado approach

• Student progress counts for 50% of the evaluation.

• Teachers are evaluated on both a “catch up” and “keep up” metric (at least on TCAP)

• The Colorado Growth Model will be used to evaluate progress (at least on TCAP)

Page 4: Colorado assessment summit_teacher_eval

A finding of effectiveness or ineffectiveness is

more defensible when it is arrived at by:

1. Two or more assessments of different designs.

2. Two or more models of different designs.

3. As many cases as possible.

It is not good to choose tests or models for local

assessment in hopes that they will mimic the

state assessment.

Page 5: Colorado assessment summit_teacher_eval

If evaluators do not differentiate their ratings, then all differentiation comes from the test.

Page 6: Colorado assessment summit_teacher_eval

If performance ratings aren’t consistent with school growth, that will probably be public information.

Page 7: Colorado assessment summit_teacher_eval

Results of Tennessee Teacher Evaluation Pilot

[Bar chart: share of teachers (0% to 60%) at each rating level, 1 to 5; series: Value-added result, Observation Result]

Page 8: Colorado assessment summit_teacher_eval

Results of Georgia Teacher Evaluation Pilot

[Chart: Evaluator Rating; categories: Ineffective, Minimally Effective, Effective, Highly Effective]

Page 9: Colorado assessment summit_teacher_eval

Reliability of evaluation weights in predicting stability of student growth gains year to year

Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study

Columns: Weighting model | Reliability coefficient (relative to state test value-added gain) | Proportion of test variance explained

Model 1 – State test 81%, Student surveys 17%, Classroom observations 2% | .51 | 26.0%

Model 2 – State test 50%, Student surveys 25%, Classroom observations 25% | .66 | 43.5%

Model 3 – State test 33%, Student surveys 33%, Classroom observations 33% | .76 | 57.7%

Model 4 – Classroom observations 50%, State test 25%, Student surveys 25% | .75 | 56.2%

Page 10: Colorado assessment summit_teacher_eval

Reliability of a variety of teacher observation implementations

Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study

Columns: Observation by | Reliability coefficient (relative to state test value-added gain) | Proportion of test variance explained

Principal – 1 | .51 | 26.0%

Principal – 2 | .58 | 33.6%

Principal and other administrator | .67 | 44.9%

Principal and three short observations by peer observers | .67 | 44.9%

Two principal observations and two peer observations | .66 | 43.6%

Two principal observations and two different peer observers | .69 | 47.6%

Two principal observations, one peer observation, and three short observations by peers | .72 | 51.8%
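The "proportion of test variance explained" column appears to be the square of the reliability coefficient next to it. That reading is an assumption drawn from the numbers, not something the slide states. A minimal Python sketch of the arithmetic, using the coefficients listed for the observation implementations:

```python
# Reliability coefficients as listed on the observation slide.
reliabilities = [0.51, 0.58, 0.67, 0.66, 0.69, 0.72]


def variance_explained(r: float) -> float:
    """Share of variance explained by a correlation-type coefficient r (r squared)."""
    return r * r


for r in reliabilities:
    print(f"r = {r:.2f} -> {variance_explained(r):.1%} of variance explained")
```

Under this reading, even the most intensive observation design listed (.72) leaves about half of the variance unexplained.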

Page 11: Colorado assessment summit_teacher_eval

How tests are used to evaluate teachers and principals

1. Testing

2. Metric (Growth or Gain Score)

3. Analysis (Value Added Effect Size and/or ranking)

4. Evaluation (Performance Rating)

Page 12: Colorado assessment summit_teacher_eval

Issues in the use of growth measures

Instructional alignment

Tests used for teacher evaluation must align to the teacher’s instructional responsibilities.

Page 13: Colorado assessment summit_teacher_eval

Common problems with instructional alignment

• Using school level math and reading results in the evaluation of music, art, and other specials teachers.

• Using general tests of a discipline (reading, math, science) as a major component of the evaluation of high school teachers delivering specialized courses.

Page 14: Colorado assessment summit_teacher_eval

Florida Teachers Sue Over Evaluation System

New York Times, April 17, 2013

Seven Florida teachers have brought a federal lawsuit to protest job evaluation policies that tether individual performance ratings to the test scores of students who are not even in their classes. The suit, which was filed Tuesday in conjunction with three local affiliates of the National Education Association in Federal District Court for the Northern District of Florida in Gainesville, says Florida’s two-year-old evaluation system violates teachers’ rights of due process and equal protection. Under a 2011 law, schools and districts must evaluate teachers in part based on how much their students learn, as measured by standardized tests. But since Florida, like most states, administers only math and reading tests and only in selected grades, many teachers do not teach tested subjects. One of the plaintiffs, a first-grade teacher, was rated on the basis of test scores of students in a different school in her district, and another, who teaches vocational classes to aspiring health care workers, was rated based on test scores of students in grades and subjects she had never taught. “This lawsuit highlights the absurdity of the current evaluation system,” said Andy Ford, president of the Florida Education Association.

Page 15: Colorado assessment summit_teacher_eval

Expect consistent inconsistency!

Page 16: Colorado assessment summit_teacher_eval

Inconsistency occurs because of

• Differences in test design.

• Differences in testing conditions.

• Differences in models being applied to evaluate growth.

Page 17: Colorado assessment summit_teacher_eval

The reliability problem – Inconsistency in testing conditions

[Diagram: Test Retest; Test 1 and Test 2, each given at Time 1 and Time 2]

Page 18: Colorado assessment summit_teacher_eval

The reliability problem – Inconsistency in testing conditions

[Diagram: three repeated grids of Test 1 and Test 2, each given at Time 1 and Time 2]

Page 19: Colorado assessment summit_teacher_eval

The problem with spring-spring testing

[Timeline from 3/11 to 3/12: the spring-to-spring window spans Teacher 1, Summer, and Teacher 2]

Page 20: Colorado assessment summit_teacher_eval

Characteristics of value-added metrics

• Value-added metrics are inherently NORMATIVE.

• If below average = partially effective, then half of an average staff will be rated partially effective.

• Value-added metrics can’t measure progress of the larger group over time.

• Extreme performance is more likely to have alternate explanations.
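The normative point above can be illustrated with a short Python sketch. All scores here are hypothetical and the percentile-rank helper is an illustrative stand-in, not any particular value-added model: when every teacher improves by the same amount, a purely relative ranking does not change, and half of the group still falls below the middle.

```python
import random


def percentile_ranks(scores):
    """Percent of scores strictly below each score (a purely relative measure)."""
    n = len(scores)
    return [100.0 * sum(other < s for other in scores) / n for s in scores]


random.seed(0)  # fixed seed so the illustration is repeatable
year1 = random.sample(range(150, 350), 100)  # 100 distinct hypothetical scores
year2 = [s + 5 for s in year1]               # every teacher improves by 5 points

ranks1 = percentile_ranks(year1)
ranks2 = percentile_ranks(year2)

below_middle_1 = sum(r < 50 for r in ranks1)
below_middle_2 = sum(r < 50 for r in ranks2)

print(ranks1 == ranks2)                 # rankings are unchanged by the uniform gain
print(below_middle_1, below_middle_2)   # half of the group is "below average" both years
```

The uniform gain is invisible to the ranking, which is why a normative metric cannot show progress of the larger group over time.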

Page 21: Colorado assessment summit_teacher_eval

New York City

• Margins of error can be very large

• Increasing n doesn't always decrease the margin of error

• The margin of error in math is typically smaller than in reading

Page 23: Colorado assessment summit_teacher_eval

Instability at the tails of the distribution

“The findings indicate that these modeling choices can significantly influence outcomes for individual teachers, particularly those in the tails of the performance distribution who are most likely to be targeted by high-stakes policies.”

Ballou, D., Mokher, C. and Cavalluzzo, L. (2012) Using Value-Added Assessment for Personnel Decisions: How Omitted Variables and Model Specification Influence Teachers’ Outcomes.

[Charts: LA Times Teacher #1, LA Times Teacher #2]

Page 24: Colorado assessment summit_teacher_eval

Possible racial bias in models

“Significant evidence of bias plagued the value-added model estimated for the Los Angeles Times in 2010, including significant patterns of racial disparities in teacher ratings both by the race of the student served and by the race of the teachers (see Green, Baker and Oluwole, 2012). These model biases raise the possibility that Title VII disparate impact claims might also be filed by teachers dismissed on the basis of their value-added estimates. Additional analyses of the data, including richer models using additional variables mitigated substantial portions of the bias in the LA Times models (Briggs & Domingue, 2010).”

Baker, B. (2012, April 28). If it’s not valid, reliability doesn’t matter so much! More on VAM-ing & SGP-ing Teacher Dismissal.

Page 25: Colorado assessment summit_teacher_eval

Inconsistency among the Colorado Growth Model and other value-added approaches.

Page 26: Colorado assessment summit_teacher_eval

Issues with the Colorado Growth Model

• When applied to MAP it discards the advantages of a cross-grade scale and robust growth norms.

• It is a descriptive and not a causal model.

• As currently applied it does not control for factors outside the teacher’s influence that may affect student growth.

Page 27: Colorado assessment summit_teacher_eval

A brief commentary on the Colorado Growth Model

Its limitations

• It does not support inference.

• It does not take advantage of the useful characteristics of a vertical scale.

• It uses only prior scores and past testing history to evaluate growth.

Page 28: Colorado assessment summit_teacher_eval

A brief commentary on the Colorado Growth Model

Other limitations

• The model can’t be used for cross-state comparisons.

• The model is problematic for assessing long-term trends.

Page 29: Colorado assessment summit_teacher_eval

Measurement Issues

Moving from the model to the teacher rating

Page 30: Colorado assessment summit_teacher_eval

Translating ranked data to ratings - principles

• There is no “science” per se around translating a ranking to a rating. If you call a bottom 40% teacher ineffective that is a judgment.

• The rating process can be politicized.

• The process is easy to over-engineer.

Page 31: Colorado assessment summit_teacher_eval

New York Rating System

• 60 points assigned from classroom observation

• 20 points assigned from state assessment

• 20 points assigned from local assessment

• A score of 64 or less is rated ineffective.
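The point arithmetic above can be sketched in Python. The function names are hypothetical, and only the "64 or less is ineffective" cut comes from the slide; the other rating bands are not modeled here.

```python
def composite_score(observation: int, state: int, local: int) -> int:
    """Sum the three components: up to 60 + 20 + 20 = 100 points."""
    if not (0 <= observation <= 60 and 0 <= state <= 20 and 0 <= local <= 20):
        raise ValueError("component outside its point range")
    return observation + state + local


def is_ineffective(total: int) -> bool:
    """Apply the cut stated on the slide: 64 or less is rated ineffective."""
    return total <= 64


# A teacher with a perfect observation score but 2 points on each assessment.
total = composite_score(observation=60, state=2, local=2)
print(total, is_ineffective(total))
```

In this example a teacher with every available observation point still lands at 64 and is rated ineffective, so very low assessment points can outweigh the 60-point observation component.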

Page 32: Colorado assessment summit_teacher_eval

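[Table: combined score by growth-measure points (columns, 0 to 40) and observational points (rows, 0 to 60; the first number in each row is the row label)]

Column bands: Ineffective (Growth Measures), Developing (Growth Measures), Effective (Growth Measures), Highly Effective (Growth Measures)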

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

Ineffective (Observational)

0 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

1 2 3 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6

2 2 4 5 6 6 6 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9

3 2 5 6 7 7 8 8 9 9 9 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12

4 3 5 7 8 9 9 10 10 11 11 11 12 12 12 12 13 13 13 13 13 13 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 15

5 3 6 8 9 10 11 11 12 12 13 13 14 14 14 14 15 15 15 15 16 16 16 16 16 16 16 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18

6 3 6 8 10 11 12 13 13 14 14 15 15 16 16 16 17 17 17 17 18 18 18 18 18 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 21

7 3 7 9 11 12 13 14 15 15 16 16 17 17 18 18 18 19 19 19 20 20 20 20 20 21 21 21 21 21 22 22 22 22 22 22 22 23 23 23 23 23

8 3 7 10 11 13 14 15 16 17 17 18 18 19 19 20 20 20 21 21 21 22 22 22 23 23 23 23 23 24 24 24 24 24 24 25 25 25 25 25 25 25

9 3 8 10 12 14 15 16 17 18 18 19 20 20 21 21 22 22 23 23 23 24 24 24 24 25 25 25 25 26 26 26 26 26 27 27 27 27 27 27 28 28

10 3 8 11 13 14 16 17 18 19 20 20 21 22 22 23 23 24 24 25 25 25 26 26 26 27 27 27 27 28 28 28 28 29 29 29 29 29 29 30 30 30

11 3 8 11 13 15 17 18 19 20 21 22 22 23 24 24 25 25 26 26 27 27 27 28 28 28 29 29 29 30 30 30 30 31 31 31 31 31 32 32 32 32

12 4 8 12 14 16 17 19 20 21 22 23 24 24 25 26 26 27 27 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 33 33 34 34 34 34

13 4 9 12 14 16 18 20 21 22 23 24 25 26 26 27 28 28 29 29 30 30 31 31 31 32 32 33 33 33 34 34 34 34 35 35 35 35 36 36 36 36

14 4 9 12 15 17 19 20 22 23 24 25 26 27 27 28 29 30 30 31 31 32 32 33 33 33 34 34 35 35 35 36 36 36 37 37 37 37 38 38 38 38

15 4 9 13 15 18 19 21 23 24 25 26 27 28 29 29 30 31 31 32 33 33 34 34 35 35 35 36 36 37 37 37 38 38 38 39 39 39 40 40 40 40

Developing (Observational)

16 4 9 13 16 18 20 22 23 25 26 27 28 29 30 31 31 32 33 33 34 35 35 36 36 37 37 37 38 38 39 39 39 40 40 40 41 41 41 42 42 42

17 4 9 13 16 19 21 23 24 25 27 28 29 30 31 32 33 33 34 35 35 36 37 37 38 38 39 39 39 40 40 41 41 42 42 42 43 43 43 44 44 44

18 4 10 14 17 19 21 23 25 26 28 29 30 31 32 33 34 35 35 36 37 37 38 38 39 40 40 41 41 41 42 42 43 43 44 44 44 45 45 45 46 46

19 4 10 14 17 20 22 24 26 27 28 30 31 32 33 34 35 36 36 37 38 39 39 40 40 41 42 42 43 43 43 44 44 45 45 46 46 46 47 47 47 48

20 4 10 14 17 20 22 24 26 28 29 31 32 33 34 35 36 37 38 38 39 40 41 41 42 42 43 43 44 45 45 45 46 46 47 47 48 48 48 49 49 49

21 4 10 14 18 21 23 25 27 29 30 31 33 34 35 36 37 38 39 40 40 41 42 42 43 44 44 45 45 46 46 47 47 48 48 49 49 50 50 50 51 51

22 4 10 15 18 21 23 26 27 29 31 32 34 35 36 37 38 39 40 41 42 42 43 44 44 45 46 46 47 47 48 48 49 49 50 50 51 51 52 52 52 53

23 4 10 15 18 21 24 26 28 30 31 33 34 36 37 38 39 40 41 42 43 43 44 45 46 46 47 48 48 49 49 50 50 51 51 52 52 53 53 54 54 54

24 4 11 15 19 22 24 27 29 31 32 34 35 36 38 39 40 41 42 43 44 45 45 46 47 48 48 49 50 50 51 51 52 52 53 53 54 54 55 55 56 56

25 4 11 15 19 22 25 27 29 31 33 34 36 37 39 40 41 42 43 44 45 46 47 47 48 49 50 50 51 52 52 53 53 54 54 55 55 56 56 57 57 58

26 4 11 16 19 23 25 28 30 32 34 35 37 38 39 41 42 43 44 45 46 47 48 49 49 50 51 51 52 53 53 54 55 55 56 56 57 57 58 58 59 59

27 4 11 16 20 23 26 28 30 32 34 36 37 39 40 42 43 44 45 46 47 48 49 50 50 51 52 53 53 54 55 55 56 57 57 58 58 59 59 60 60 61

28 4 11 16 20 23 26 29 31 33 35 37 38 40 41 42 44 45 46 47 48 49 50 51 52 52 53 54 55 55 56 57 57 58 59 59 60 60 61 61 62 62

29 4 11 16 20 24 26 29 31 34 35 37 39 40 42 43 45 46 47 48 49 50 51 52 53 54 54 55 56 57 57 58 59 59 60 61 61 62 62 63 63 64

30 4 11 16 20 24 27 30 32 34 36 38 40 41 43 44 45 47 48 49 50 51 52 53 54 55 56 56 57 58 59 59 60 61 61 62 62 63 64 64 65 65

Effective (Observational)

31 4 11 17 21 24 27 30 32 35 37 39 40 42 43 45 46 47 49 50 51 52 53 54 55 56 57 57 58 59 60 61 61 62 63 63 64 64 65 66 66 67

32 4 11 17 21 25 28 30 33 35 37 39 41 43 44 46 47 48 50 51 52 53 54 55 56 57 58 59 59 60 61 62 62 63 64 64 65 66 66 67 68 68

33 4 12 17 21 25 28 31 33 36 38 40 42 43 45 46 48 49 50 52 53 54 55 56 57 58 59 60 61 61 62 63 64 64 65 66 66 67 68 68 69 69

34 4 12 17 21 25 28 31 34 36 38 40 42 44 46 47 49 50 51 53 54 55 56 57 58 59 60 61 62 63 63 64 65 66 66 67 68 68 69 70 70 71

35 4 12 17 22 25 29 32 34 37 39 41 43 45 46 48 49 51 52 53 55 56 57 58 59 60 61 62 63 64 64 65 66 67 68 68 69 70 70 71 72 72

36 4 12 17 22 26 29 32 35 37 39 41 43 45 47 49 50 52 53 54 55 57 58 59 60 61 62 63 64 65 66 66 67 68 69 69 70 71 72 72 73 74

37 4 12 17 22 26 29 32 35 38 40 42 44 46 48 49 51 52 54 55 56 58 59 60 61 62 63 64 65 66 67 68 68 69 70 71 71 72 73 74 74 75

38 4 12 18 22 26 30 33 36 38 40 43 45 46 48 50 52 53 55 56 57 58 60 61 62 63 64 65 66 67 68 69 69 70 71 72 73 73 74 75 75 76

39 4 12 18 22 26 30 33 36 39 41 43 45 47 49 51 52 54 55 57 58 59 61 62 63 64 65 66 67 68 69 70 71 71 72 73 74 75 75 76 77 77

40 4 12 18 23 27 30 33 36 39 41 44 46 48 50 51 53 55 56 57 59 60 61 63 64 65 66 67 68 69 70 71 72 73 73 74 75 76 77 77 78 79

41 4 12 18 23 27 31 34 37 39 42 44 46 48 50 52 54 55 57 58 60 61 62 63 65 66 67 68 69 70 71 72 73 74 75 75 76 77 78 78 79 80

42 5 12 18 23 27 31 34 37 40 42 45 47 49 51 53 54 56 58 59 60 62 63 64 66 67 68 69 70 71 72 73 74 75 76 76 77 78 79 80 80 81

43 5 12 18 23 27 31 34 37 40 43 45 47 49 51 53 55 57 58 60 61 63 64 65 66 68 69 70 71 72 73 74 75 76 77 78 78 79 80 81 82 82

44 5 12 18 23 28 31 35 38 41 43 46 48 50 52 54 56 57 59 60 62 63 65 66 67 69 70 71 72 73 74 75 76 77 78 79 80 80 81 82 83 84

45 5 13 19 24 28 32 35 38 41 44 46 48 51 53 54 56 58 60 61 63 64 66 67 68 69 71 72 73 74 75 76 77 78 79 80 81 82 82 83 84 85

Highly Effective (Observational)

46 5 13 19 24 28 32 35 39 41 44 47 49 51 53 55 57 59 60 62 63 65 66 68 69 70 71 73 74 75 76 77 78 79 80 81 82 83 83 84 85 86

47 5 13 19 24 28 32 36 39 42 45 47 49 52 54 56 58 59 61 63 64 66 67 69 70 71 72 74 75 76 77 78 79 80 81 82 83 84 85 85 86 87

48 5 13 19 24 29 32 36 39 42 45 47 50 52 54 56 58 60 62 63 65 66 68 69 71 72 73 74 76 77 78 79 80 81 82 83 84 85 86 87 87 88

49 5 13 19 24 29 33 36 40 43 45 48 50 53 55 57 59 61 62 64 66 67 69 70 71 73 74 75 77 78 79 80 81 82 83 84 85 86 87 88 89 89

50 5 13 19 24 29 33 37 40 43 46 48 51 53 55 57 59 61 63 65 66 68 69 71 72 74 75 76 77 79 80 81 82 83 84 85 86 87 88 89 90 90

51 5 13 19 25 29 33 37 40 43 46 49 51 54 56 58 60 62 64 65 67 69 70 72 73 74 76 77 78 79 81 82 83 84 85 86 87 88 89 90 91 92

52 5 13 19 25 29 33 37 41 44 47 49 52 54 56 58 61 62 64 66 68 69 71 72 74 75 77 78 79 80 82 83 84 85 86 87 88 89 90 91 92 93

53 5 13 19 25 30 34 37 41 44 47 50 52 55 57 59 61 63 65 67 68 70 72 73 75 76 77 79 80 81 82 84 85 86 87 88 89 90 91 92 93 94

54 5 13 20 25 30 34 38 41 44 47 50 53 55 57 60 62 64 66 67 69 71 72 74 75 77 78 80 81 82 83 85 86 87 88 89 90 91 92 93 94 95

55 5 13 20 25 30 34 38 41 45 48 50 53 56 58 60 62 64 66 68 70 71 73 75 76 78 79 80 82 83 84 85 87 88 89 90 91 92 93 94 95 96

56 5 13 20 25 30 34 38 42 45 48 51 54 56 58 61 63 65 67 69 70 72 74 75 77 78 80 81 82 84 85 86 87 89 90 91 92 93 94 95 96 97

57 5 13 20 25 30 35 38 42 45 48 51 54 56 59 61 63 65 67 69 71 73 74 76 78 79 81 82 83 85 86 87 88 90 91 92 93 94 95 96 97 98

58 5 13 20 26 30 35 39 42 46 49 52 54 57 59 62 64 66 68 70 72 73 75 77 78 80 81 83 84 85 87 88 89 90 92 93 94 95 96 97 98 99

59 5 13 20 26 31 35 39 43 46 49 52 55 57 60 62 64 66 68 70 72 74 76 77 79 81 82 83 85 86 88 89 90 91 92 94 95 96 97 98 99 100

60 5 13 20 26 31 35 39 43 46 49 52 55 58 60 63 65 67 69 71 73 75 76 78 80 81 83 84 86 87 88 90 91 92 93 95 96 97 98 99 100 101

Page 34: Colorado assessment summit_teacher_eval

Unintended Consequences?

• Many principals and teachers (including good ones) will seek schools or teaching assignments that they think will improve their results.

• Principals and teachers may game the system, inadvertently or intentionally.

• Many teachers will seek opportunities to avoid grades with standardized tests.

• Ranking metrics can discourage cooperation among principals and teachers – finding ways to reward teamwork and cooperation is important.

Page 35: Colorado assessment summit_teacher_eval

Case Study #1 - Mean value-added performance in mathematics by school – fall to spring

[Bar chart: mean value-added performance by school; vertical axis from -8.00 to 6.00]

Page 36: Colorado assessment summit_teacher_eval

Case Study #1 - Mean spring and fall test duration in minutes by school

[Bar chart: test duration in minutes (0.00 to 90.00) by school; series: Spring term, Fall term]

Page 37: Colorado assessment summit_teacher_eval

Case Study #1 - Mean value-added growth by school and test duration

[Bar chart: mean value-added growth (-10.00 to 8.00) by school; series: Students taking 10+ minutes longer spring than fall, All other students]

Page 38: Colorado assessment summit_teacher_eval

Case Study # 2

Differences in fall-spring test durations

[Pie chart, Mathematics: slices of 15%, 25%, and 60% across the categories Spring < Fall, Spring = Fall, Spring > Fall]

Differences in growth index score based on fall-spring test durations

[Bar chart, Mathematics: Growth Index (0.0 to 6.0) for Spring < Fall, Spring = Fall, Spring > Fall]

Page 39: Colorado assessment summit_teacher_eval

Case Study # 2

How much of summer loss is really summer loss?

Differences in spring-fall test durations

[Pie chart: slices of 42%, 33%, and 25% across the categories Fall < Spring, Fall = Spring, Fall > Spring]

Differences in raw growth by spring-fall test duration

[Bar chart: raw growth (-5.0 to 0.0) for Fall < Spring, Fall = Spring, Fall > Spring]

Page 40: Colorado assessment summit_teacher_eval

Case Study # 2

Differences in fall-spring test duration (yellow-black) and differences in growth index scores (green) by school

[Combined chart by School: Growth Index (axis 0.0 to 10.0) plotted with Fall test duration and Spring test duration in Minutes (axis 0 to 200)]

Page 41: Colorado assessment summit_teacher_eval

Negotiated goals – Student Learning Objectives

• Negotiated goals (SLOs) are likely to be necessary in some subjects.

• It is difficult to set fair and reasonable goals for improvement absent norms or context.

• It is likely that some goals will be absurdly high and others way too low.

Page 42: Colorado assessment summit_teacher_eval

Ways to evaluate the attainability of a goal

• Prior performance

• Performance of peers within the system

• Performance of a norming group

Page 43: Colorado assessment summit_teacher_eval

One approach to evaluating the attainment of goals.

Students in La Brea Elementary School show mathematics growth equivalent to only 2/3 of the average for students in their grade.

Level 4 – (Aspirational) – Students in La Brea Elementary School will improve their mathematics growth equivalent to 1.5 times the average for their grade.

Level 3 – (Proficient) Students in La Brea Elementary School will improve their mathematics growth equivalent to the average for their grade.

Level 2 – (Marginal) Students in La Brea Elementary School will improve their mathematics growth relative to last year.

Level 1 – (Unacceptable) Students in La Brea Elementary School do not improve their mathematics growth relative to last year.
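The four levels can be written as a small decision rule. This is a sketch under stated assumptions: `growth_ratio` is the school's growth divided by the grade-average growth (so the La Brea baseline is 2/3), the function name is hypothetical, and treating the 1.5 and 1.0 thresholds as inclusive is a choice the slide does not specify.

```python
def slo_level(growth_ratio: float, last_year_ratio: float) -> int:
    """Map this year's growth (as a ratio to the grade average) to a goal level."""
    if growth_ratio >= 1.5:
        return 4  # Aspirational: 1.5 times the grade average
    if growth_ratio >= 1.0:
        return 3  # Proficient: equal to the grade average
    if growth_ratio > last_year_ratio:
        return 2  # Marginal: better than last year, still below average
    return 1      # Unacceptable: no improvement over last year


baseline = 2 / 3  # last year's growth relative to the grade average
for ratio in (1.6, 1.0, 0.8, 0.6):
    print(f"growth ratio {ratio:.2f} -> level {slo_level(ratio, baseline)}")
```

Anchoring Levels 3 and 4 to a norm and Levels 1 and 2 to prior performance gives the goal external context while still crediting improvement.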

Page 44: Colorado assessment summit_teacher_eval

Is this goal attainable?

62% of students at John Glenn Elementary met or exceeded proficiency in Reading/Literature last year. Their goal is to improve their rate to 82% this year. Is the goal attainable?

[Bar chart: Oregon schools – change in Reading/Literature proficiency 2009-10 to 2010-11 among schools that started with 60% proficiency rates; vertical axis 0 to 400]

Growth > -30%: 362

> -20%: 351

> -10%: 291

> 0%: 173

> 10%: 73

> 20%: 14

> 30%: 3
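One way to answer the question is to look at how often comparable schools achieved a gain that large. The Python sketch below reads the chart's bar labels as cumulative counts of schools exceeding each change threshold, and treats the "> -30%" bar as the full comparison group. Both readings are assumptions about the chart, and only the thresholds needed for this question are included.

```python
# Schools exceeding each change threshold (percentage points), read from the chart labels.
schools_exceeding = {-30: 362, -20: 351, -10: 291, 0: 173, 10: 73, 20: 14}

total_schools = schools_exceeding[-30]  # assumed to be the whole comparison group


def share_exceeding(threshold: int) -> float:
    """Fraction of comparison schools whose change exceeded the threshold."""
    return schools_exceeding[threshold] / total_schools


goal_gain = 82 - 62  # the goal requires a 20-point gain
print(f"Any improvement: {share_exceeding(0):.1%}")
print(f"Gain above {goal_gain} points: {share_exceeding(goal_gain):.1%}")
```

Under these assumptions fewer than 1 in 25 comparable schools gained more than 20 points in a year, which suggests the goal is very ambitious.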

Page 45: Colorado assessment summit_teacher_eval

Is this goal attainable and rigorous?

45% of the students at La Brea Elementary showed average growth or better last year. Their goal is to improve that rate to 50% this year. Is their goal reasonable?

[Bar chart: Students with average or better annual growth in Repus school district; vertical axis 0% to 100%]

Page 46: Colorado assessment summit_teacher_eval

The selection of metrics matters

Students at La Brea Elementary School will show growth equivalent to 150% of grade level.

Students at Etsaw Middle School will show growth equivalent to 150% of grade level.

Page 47: Colorado assessment summit_teacher_eval

Scale score growth relative to NWEA’s growth norm in mathematics

[Chart: Scale Score Growth (-1.0 to 7.0); horizontal axis 2 to 9; series: Growth Index]

Page 48: Colorado assessment summit_teacher_eval

Percent of a year’s growth in mathematics

[Chart: Percent of a Year’s Growth (0% to 200%); horizontal axis 2 to 9; series: Mathematics]
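Why the choice of metric matters can be shown with a short Python sketch. The norm values below are hypothetical, chosen only to reflect typical growth being smaller in later grades; they are not NWEA norms. The same "150% of a year's growth" goal then demands very different scale-score gains, and the same scale-score surplus looks very different as a percentage.

```python
# Hypothetical typical annual growth in scale-score points (NOT actual norms).
norm_growth = {"elementary": 10.0, "middle": 4.0}


def required_growth(target_ratio: float, norm: float) -> float:
    """Scale-score growth needed to reach a 'percent of a year's growth' target."""
    return target_ratio * norm


def growth_index(observed: float, norm: float) -> float:
    """Observed growth minus the norm, in scale-score points."""
    return observed - norm


def percent_of_year(observed: float, norm: float) -> float:
    """Observed growth as a fraction of the norm."""
    return observed / norm


for level, norm in norm_growth.items():
    needed = required_growth(1.5, norm)
    print(f"{level}: 150% goal needs {needed:.1f} points "
          f"(growth index {growth_index(needed, norm):+.1f})")

# The same +3.0 growth index expressed as a percent of a year's growth.
for level, norm in norm_growth.items():
    print(f"{level}: +3.0 index = {percent_of_year(norm + 3.0, norm):.0%} of a year")
```

With these illustrative norms, an identical 150% goal asks the elementary school for 5 extra points and the middle school for only 2, so goals written as percentages are not comparable across grade spans.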

Page 49: Colorado assessment summit_teacher_eval

Assessing the difficult to measure

• Encourage use of performance assessment and rubrics.

• Encourage outside scoring

– Use of peers in other buildings, professionals in the field, contest judges

• Make use of resources

– Music educator, art educator, vocational professional associations

– Available models – AP art portfolio.

– Use your intermediate agency

– Work across buildings

• Make use of classroom observation.

Page 50: Colorado assessment summit_teacher_eval

Possible legal issues

• Title VII of the Civil Rights Act of 1964 – Disparate impact of sanctions on a protected group.

• State statutes that provide tenure and other related protections to teachers.

• Challenges to a finding of “incompetence” stemming from the growth or value-added data.

Page 51: Colorado assessment summit_teacher_eval

Recommendations

• Embrace the formative advantages of growth measurement as well as the summative.

• Create comprehensive evaluation systems with multiple measures of teacher effectiveness (Rand, 2010)

• Select measures as carefully as value-added models.

• Use multiple years of student achievement data.

• Understand the issues and the tradeoffs.

Page 52: Colorado assessment summit_teacher_eval

Presenter - John Cronin, Ph.D.

Contacting us:

NWEA Main Number: 503-624-1951

E-mail: [email protected]

This PowerPoint presentation and recommended resources are available at our Slideshare site http://www.slideshare.net/JohnCronin4/colorado-assessment-summitteachereval

Thank you for attending this event