
Scoring Provincial Large-Scale Assessments

María Elena Oliveri, University of British Columbia

Britta Gundersen-Bryden, British Columbia Ministry of Education

Kadriye Ercikan, University of British Columbia


Objectives

Describe and discuss:

– Five steps used to score provincial large-scale assessments (LSAs)

– Advantages and challenges associated with diverse scoring models (e.g., centralized versus decentralized)

– Lessons learned in British Columbia when switching from a centralized to a decentralized scoring model


Scoring Provincial Large-Scale Assessments

LSAs are administered to:

• collect data to evaluate the efficacy of school systems

• guide policy-making

• inform decisions about improving student learning

An accurate scoring process, examined in relation to the purposes of the test and the decisions the assessment data are intended to inform, is key to obtaining useful data from these assessments.

Accuracy in Scoring

• Essential to having accurate & meaningful scores is the degree to which scoring rubrics: (1) appropriately and accurately identify relevant aspects of responses as evidence of student performance, (2) are accurately implemented, and (3) are consistently applied across examinees

• Uniformity in scoring LSAs is central to achieving comparability of students’ responses: it ensures that differences in results are attributable to differences among examinees’ performance rather than to biases introduced by differing scoring procedures

• A five-step process is typically used

Step One: “Test Design Stage”

• Design of test specifications

– That match the learning outcomes or construct(s) assessed

– That include particular weights & the number of items needed to assess each intended construct (see the blueprint sketch below)
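
To make the idea of weights and item counts concrete, here is a minimal sketch in Python (the constructs, weights, and item counts are hypothetical, not taken from any actual provincial specification) of a test blueprint and a simple consistency check:

```python
# Hypothetical test blueprint: each assessed construct gets a weight and an
# item count. The constructs and numbers below are illustrative only.
blueprint = {
    "reading comprehension": {"weight": 0.40, "items": 20},
    "writing (open-response)": {"weight": 0.35, "items": 2},
    "numeracy": {"weight": 0.25, "items": 13},
}

total_weight = sum(spec["weight"] for spec in blueprint.values())
total_items = sum(spec["items"] for spec in blueprint.values())

# Construct weights should account for the whole score scale.
assert abs(total_weight - 1.0) < 1e-9, "construct weights must sum to 1"

for construct, spec in blueprint.items():
    print(f"{construct}: {spec['items']} items, {spec['weight']:.0%} of the total score")
print(f"total items on the form: {total_items}")
```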


Step Two: “Scoring Open-Response Items”

Decide which model to use to score open-response items:

• Centralized models are directly supervised by provincial Ministries or Departments of Education in a central location

• Decentralized models often take place across several locations & are performed by a considerably greater number of teachers; they are typically used for scoring medium- to low-stakes LSAs


Step Three: “Preparing Training Materials”

• Identify common tools to train scorers, including:

– Exemplars of students’ work demonstrating each of the scale points in the scoring rubric

– Examples that illustrate potential biases arising in the scoring process (e.g., differences in scores given to hand-written vs. type-written essays; a simple check is sketched below)
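
As a hedged illustration (the scores below are hypothetical, not FSA data), this sketch shows the kind of check that motivates such training examples: comparing the average scores awarded to hand-written versus type-written essays on the same prompt.

```python
# Hypothetical rubric scores (1-4 scale) for the same prompt, split by the
# format of the response. The numbers are illustrative only.
from statistics import mean

scores_by_format = {
    "hand-written": [2, 3, 3, 2, 4, 3, 2, 3],
    "type-written": [3, 4, 3, 4, 4, 3, 3, 4],
}

for fmt, scores in scores_by_format.items():
    print(f"{fmt}: mean score {mean(scores):.2f} (n={len(scores)})")

# A sizeable gap between the two means is one signal that scorers may be
# reacting to presentation rather than to the quality of the writing.
gap = mean(scores_by_format["type-written"]) - mean(scores_by_format["hand-written"])
print(f"mean difference (type-written minus hand-written): {gap:.2f}")
```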


Step Four: “Training of Scorers”

• Training occurs prior to scoring and can recur during the session itself, especially if the session spans more than one day

• A “train the trainer” approach is often used: a small cadre of more experienced team leaders is trained first, and they then train the other scorers who will actually score the responses

• Team leaders often make final judgement calls on the assignment of scores for responses that differ from the exemplars

• This approach serves to reinforce common standards and consistency in the assignment of scores, leading to fair and accurate scores


Step Five: “Monitoring Scores”

• Includes checks for inter-marker reliability, wherein a sample of papers is re-scored to check consistency in scoring across raters (a simple agreement check is sketched below)

• May serve as a re-training or “re-calibration” activity, with raters discussing scores and the rationales for their scoring
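
A minimal sketch in Python (hypothetical scores, not an official procedure) of how such an inter-marker reliability check might be computed on a double-scored sample, reporting both exact agreement and chance-corrected agreement (Cohen's kappa):

```python
# A re-scored (double-scored) sample: each paper has a score from the original
# scorer and from a second rater (e.g., a team leader). Scores are on a 1-4
# rubric scale; the numbers below are hypothetical.
from collections import Counter

def exact_agreement(rater_a, rater_b):
    """Proportion of papers given identical scores by both raters."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical scores."""
    n = len(rater_a)
    observed = exact_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Expected agreement if each rater assigned scores independently at
    # their own observed rates, summed over all scale points.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

original = [3, 2, 4, 4, 1, 3, 2, 3, 4, 2]   # first scoring
recheck  = [3, 2, 3, 4, 1, 3, 2, 2, 4, 2]   # second rater, same papers
print(f"exact agreement: {exact_agreement(original, recheck):.2f}")
print(f"Cohen's kappa:   {cohens_kappa(original, recheck):.2f}")
```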


The Foundation Skills Assessment

• The Foundation Skills Assessment (FSA) will be used as a case study to illustrate advantages and challenges associated with switching from a centralized to a decentralized scoring model

• The FSA assesses Grade 4 and 7 students’ skills in reading, writing and numeracy

• Several changes were made to the FSA in 2008 in response to stakeholders’ demands for more meaningful LSAs that informed classroom practice


Changes to the FSA

• Earlier administration: moved from May to February

• Online administration of closed-response sections

• Parents or guardians received the child’s open-response test portions & a summary statement of reading, writing and numeracy skills

• Scoring model changed from a centralized to a decentralized model

• Ministry held “train the trainer” workshops to prepare school district personnel to organize and conduct local scoring sessions

• School districts could decide how to conduct scoring sessions:

– score individually, in pairs or in groups

– double-score only a few, some or all of the responses


Advantages of a Decentralized Model

• Professional Development

– A decentralized model allowed four times as many teachers to work with scoring rubrics and exemplars

– Educators were able to develop a deeper understanding of provincial standards and expectations for student achievement

– If scorers are educators, they may later apply knowledge of rubrics and exemplars in their classroom practice and school environments and consider the performance of their own students in a broader provincial context


Advantages of a Decentralized Model

• Earlier return of test results & earlier provision of feedback to teachers, students and the school

– More immediate feedback may lead to improved learning and better-guided teaching

– The data inform teachers about students’ strengths and areas for improvement in relation to provincial standards

– The data may be helpful in writing school plans and targeting the areas upon which particular schools may focus


Challenges of a Decentralized Scoring Model

• Increased difficulty associated with:

– Less time allocated to implementing cross-check procedures

– Decreased standardization of scoring instructions given to raters

– Increased costs (a higher number of teachers scoring)

– Reduced training time


Potential Solutions

• Provide teachers with adequate training time – e.g., one to two days of training prior to scoring the assessments

• Increase discussion among teachers, which may involve reviewing exemplars falling in between scale points in the rubric

• Have table leaders – e.g., teachers with prior scoring experience

• Re-group teachers to work through difficulties or uncertainties related to the scoring process


Final Note

• Closer collaboration between educators and Ministries and Departments of Education may lead to improved tests, as educators bring their professional experience of how students learn in the classroom to bear on test design itself

• Strong alignment between the overall purposes of the test, the test design and the scoring model used may add value to score interpretation and subsequent use of assessment results


Thank you
