Conceptualizing Performance Standards for Alternate Assessments
Steve Ferrara, American Institutes for Research
Suzanne Swaffield, South Carolina Department of Education
Lorin Mueller, American Institutes for Research
Presentation at the Eighth Annual Maryland Conference: Alternate Assessment, October 11-12, 2007
Eighth Annual Maryland Conference 2
Overview
    Intended meaning and interpretation of performance standards
        Definitions of Proficient (etc.)
        Students with the most significant cognitive disabilities
    Status and growth standards
        Lots to consider; we haven’t addressed all considerations
    Comments on standard setting methods
        Methods must match the assessment design and intended inferences and uses of scores
        Articulation of standards
General principles
    Whatever matters for grade-level achievement assessments matters for assessments of alternate achievement standards
        Conception, design, development, analysis, psychometric evaluation, standard setting
    Achievement (aka performance) standards
        A coherent system
            Content standards, assessment tasks, score reporting scale, PLDs, cut scores on the scale
Definitions of Proficient (etc.)
Features and considerations for definitions of Proficient
    Are appropriate for all participating students
    Are aligned with the extended standards and the assessment
    Are reasonable and rigorous
    Differentiate expectations across performance levels and grade bands
    Are articulated across grade bands
    Relate sensibly to modified and grade-level achievement standards
    Reflect input from stakeholders
Achievement construct definitions represented by the assessment and the PLDs
    Reading
        E.g., Decoding and comprehending only? Listening and comprehending?
    Writing
        E.g., Physical act only? Creating a permanent record?
    Mathematics conceptual understandings and skills
    Science conceptual understandings and skills
Approaches to defining and differentiating levels of performance in PLDs
    Descriptions of performance that are moderately explicit about assessment tasks
    Descriptions of performance that are highly explicit about assessment tasks
    Descriptions of the amount of understanding and skill
    Descriptions of quality, frequency, or consistency of performance of specific skills
    Descriptions of amount of achievement progress in relation to alternate content standards
Moderately explicit references to assessment tasks
    ELA, Proficient
        The student interacts purposefully with literacy materials and demonstrates some reading strategies. When a story is read or signed, the student knows what the story is about and can answer who, what, where, and why questions; make predictions based on cause and effect; and use prior knowledge to relate to the story.
Highly explicit references to assessment tasks
    Reading, Proficient (and under revision)
        The student identifies signs and symbols; identifies letter sound relationships; blends sounds to make words; identifies a detail using pictures, symbols, or words from a story read aloud; identifies own name in print; and displays an understanding of print directionality.
Amount of achievement progress in relation to alternate content standards
    ELA, Proficient
        If there is evidence of progress [in relation to the grade-level content standards] in three data collection periods and increased complexity in two of three periods, the student progress score is Proficient.
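The progress rule quoted above can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not the scoring procedure of any actual assessment: the `PeriodEvidence` record and the `is_proficient_progress` function are hypothetical names, and treating "two of three periods" as "at least two" is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PeriodEvidence:
    """Hypothetical record for one data collection period."""
    progress: bool              # evidence of progress was documented
    increased_complexity: bool  # task complexity increased in this period


def is_proficient_progress(periods: Sequence[PeriodEvidence]) -> bool:
    """Return True when the quoted rule is satisfied for three periods.

    Assumption: "two of three periods" is read as "at least two".
    """
    if len(periods) != 3:
        raise ValueError("the rule is stated for exactly three data collection periods")
    progress_in_all = all(p.progress for p in periods)
    complexity_count = sum(p.increased_complexity for p in periods)
    return progress_in_all and complexity_count >= 2


if __name__ == "__main__":
    example = [
        PeriodEvidence(progress=True, increased_complexity=True),
        PeriodEvidence(progress=True, increased_complexity=False),
        PeriodEvidence(progress=True, increased_complexity=True),
    ]
    print(is_proficient_progress(example))
```

Under this reading, a student with documented progress in all three periods but increased complexity in only one period would not be scored Proficient.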
Very different
    In how they
        Define Proficient
        Differentiate Proficient from other levels
        Relate to grade-level PLDs
    In what it means to say a student has achieved the Proficient level
Growth standards
    SC-Alt growth standards project supported by a MARS grant
        Suzanne Swaffield, SDE
        Scott Marion, Marianne Perie, Center for Assessment
        AIR
    From Ferrara, S. (2007). Standards for proficient achievement growth for South Carolina’s alternate assessment, SC-Alt. In S. Davies (Organizer), Vertical Integration of Benchmarks and Standards: Including Alternate Assessments in Evaluating Growth. Presentation at the National Conference on Large-Scale Assessment, Nashville.
Students with the most significant cognitive disabilities
    Communication level is pre-symbolic
    Instructional focus likely to be on awareness of surroundings and others, focusing on the activity or task at hand
    May not reach Proficient or get out of the lowest performance level during their school careers
    Status standards are not appropriate and relevant
        Fairness and validity concerns
    What to do about that is an open question
Growth and status standards
    Status standards
        One score point on an alternate assessment score scale represents Proficient performance
        Not a big problem for most alternate portfolio assessments
        Limitation for other alternate assessment approaches: some students may not get there during their school career
    Growth standards
        A fixed amount of achievement growth on the alternate assessment score scale represents Proficient performance
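The contrast between the two kinds of standards can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an actual scoring rule: the function names are hypothetical, and the cut values and student scores are invented for the example.

```python
def meets_status_standard(score: float, status_cut: float) -> bool:
    """Status: the current score reaches a fixed point on the score scale."""
    return score >= status_cut


def meets_growth_standard(prior_score: float, current_score: float,
                          growth_cut: float) -> bool:
    """Growth: the gain since the prior administration reaches a fixed amount."""
    return (current_score - prior_score) >= growth_cut


# Invented values for illustration only; they are not actual cut scores.
STATUS_CUT = 500.0
GROWTH_CUT = 20.0

if __name__ == "__main__":
    prior, current = 380.0, 405.0
    print(meets_status_standard(current, STATUS_CUT))         # status check on 405 vs. 500
    print(meets_growth_standard(prior, current, GROWTH_CUT))  # growth check on a gain of 25
```

With these invented numbers the student falls short of the status cut but meets the growth cut, which is the situation a growth standard is meant to recognize.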
The concept of Proficient achievement growth
    A growth (or difference) score on the SC-Alt score scale
    Learning progressions to guide, illuminate, and support development of the growth PLDs and identification of growth scores that will represent Proficient growth
Achievement growth for students with significant cognitive disabilities
    Significant challenges
        How much growth in academic achievement can reasonably be expected in one year?
        What do learning progressions look like for students with significant cognitive disabilities?
        Should we consider different expectations and progressions for different groups of students?
    Approaches to responding to these questions
        Expert opinion: academics, teachers
        Test score analysis
        Systematic data collection focused on learning progressions
Gain scores
    Descriptive Statistics for 2006-2007 Gain Scores
    Grade Band    N      Mean    SD
    3-5           236    4.64    56.76
    6-8           348    2.28    48.81
    10            76     7.13    48.45
    Note. 2006 field test, 2007 operational administration. Score scale mean and SD are approximately 500 and 80.
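To put the mean gains in context, they can be expressed in units of the score scale SD and paired with a standard error of the mean. The sketch below is a minimal Python illustration using only the figures in the table. The scale SD of 80 is the approximate value from the table note, the names `GAIN_STATS` and `summarize` are hypothetical, and the standard error formula (SD divided by the square root of N) assumes independent observations.

```python
import math

SCALE_SD = 80.0  # approximate score scale SD, taken from the table note

# Grade band -> (N, mean gain, SD of gains), copied from the table.
GAIN_STATS = {
    "3-5": (236, 4.64, 56.76),
    "6-8": (348, 2.28, 48.81),
    "10": (76, 7.13, 48.45),
}


def summarize(stats, scale_sd=SCALE_SD):
    """Express each mean gain in scale-SD units and compute its standard error."""
    summary = {}
    for band, (n, mean_gain, sd_gain) in stats.items():
        summary[band] = {
            "mean_gain_in_scale_sd": mean_gain / scale_sd,
            "standard_error": sd_gain / math.sqrt(n),
        }
    return summary


if __name__ == "__main__":
    for band, row in summarize(GAIN_STATS).items():
        print(f"{band}: {row['mean_gain_in_scale_sd']:.3f} scale SD, "
              f"SE of mean gain = {row['standard_error']:.2f}")
```

On these figures, every mean gain is less than one tenth of the approximate scale SD, while the SDs of the gains are large, so individual gains vary far more than the averages suggest.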
Learning progressions
    Committees of regular and special education teachers drafted learning progressions for extended content standards
    The learning progressions idea comes from task analysis in special education (and elsewhere)
    This is similar to the NRC call to develop models of learning and development (Knowing What Students Know) as part of the assessment design, development, and validation process
Learning progression: Measurement, grade band 3-5
    Attend to/manipulate object to investigate length and weight
    Match objects by one attribute (length, height, weight, volume)
    Sort/classify objects by one attribute (length, height, weight, volume)
    Identify instruments used for measurement (i.e., ruler, scale, clock, thermometer, calendar)
    Match instrument to its function (check all the student knows)
    Use non-standard units to measure (e.g., use paperclips to measure length)
    Use instruments for measurement (check all the student can use)
    Match coins to coins
    Match coins to pictures of coins
    Sort coins
    Identify coins
    Match each coin to its value
Data collection
    Teachers will collect evidence of student progress in assigned LPs (N=9) in three windows this school year
    Students
        Low and moderate gains (i.e., 1-40 scale score points) between 2006 and 2007
        Pre-symbolic and early symbolic (N=55)
    Standardized data collection with flexibility
        LP matrices to record nature of evidence and level of support
Definition of “Proficient Growth”
    Use status PLDs, student gain scores, and evidence of growth on LPs to write definitions of “Proficient Growth”
        Expect to see vertical and horizontal growth
    Intended uses of the definition
        Communicate expectations for the growth of achievement of students in a school year
        Guide interpretation of performance of students on SC-Alt
        Use in conjunction with status standards (not for AYP)
        Set a growth “cut score”
“Proficient Growth” (cont.)
    In standard setting
        The definition of Proficient growth will describe reasonable expectations for growth in each LP
        Each LP defines an extended standard
    Reasonable expectations for growth described in the PLDs and represented by the cut scores will have conceptual, empirical, and judgmental bases
    Different growth expectations for most significantly cognitively disabled students?
Standard setting methods and alternate assessment designs
Portfolio assessments
    Evidence of Proficient performance is in collections of evidence
    Proficient often is defined by improvement over the data collection periods in terms of quality or frequency of performance and level of support
    Body of Work is an obvious choice
        Profile methods may be worth considering
    Other widely used methods probably precluded by limited score scales
Rating scales and other scaled assessments
    Longer score scales enable consideration of a range of methods (e.g., Bookmark, Angoff, ID Matching)
    Choice of an appropriate method should consider the score scale, the rating/assessment tasks, and the intended interpretations
    For example:
        Requirements for setting performance standards for a rating scale with and without supporting collections of evidence may differ
Growth standards
    The need to consider evidence to illustrate how much learning is reasonable to expect probably precludes item-based methods (e.g., Bookmark, Angoff, ID Matching)
    The small amount of evidence that is feasible to collect probably precludes Body of Work
    We may try the up-and-down method
Bottom line
    We typically go with what’s tried and true
    Can’t hurt (and it’s wise) to be a little more thoughtful
        About matching the method with the assessment design and intended interpretations of student performance
        About burden and cost
    Summary of methods and test formats and a lot more: Perie (2007)
        http://www.naacpartners.org/products/Files/setting_alternate_achievement_standards.pdf
Thanks for listening!
Steve Ferrara, [email protected]
Suzanne Swaffield, [email protected]
Lorin Mueller, [email protected]