
Scoring Provincial Large-Scale Assessments

María Elena Oliveri, University of British Columbia

Britta Gundersen-Bryden, British Columbia Ministry of Education

Kadriye Ercikan, University of British Columbia


Objectives

Describe and discuss:

– Five steps used to score provincial large-scale assessments (LSAs)

– Advantages and challenges associated with diverse scoring models (e.g., centralized versus decentralized)

– Lessons learned in British Columbia when switching from a centralized to a decentralized scoring model


Scoring Provincial Large-Scale Assessments

LSAs are administered to:

• collect data to evaluate the efficacy of school systems

• guide policy-making

• inform decisions about improving student learning

An accurate scoring process, examined in relation to the purposes of the test and the decisions the assessment data are intended to inform, is key to obtaining useful data from these assessments.

Accuracy in Scoring

• Essential to having accurate & meaningful scores is the degree to which scoring rubrics: (1) appropriately and accurately identify relevant aspects of responses as evidence of student performance, (2) are accurately implemented, and (3) are consistently applied across examinees

• Uniformity in scoring LSAs is central to achieving comparability of students’ responses: it ensures that differences in results are attributable to differences among examinees’ performance rather than to biases introduced by differing scoring procedures

• A five-step process is typically used

Step One: “Test Design Stage”

• Design of test specifications

– That match the learning outcomes or construct(s) assessed

– That include particular weights & the number of items needed to assess each intended construct (see the blueprint sketch below)
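
To make the idea of weights and item counts concrete, here is a minimal sketch in Python (the constructs, weights, and item counts are hypothetical, not taken from any actual provincial specification) of a test blueprint and a simple consistency check:

```python
# Hypothetical test blueprint: each assessed construct gets a weight and an
# item count. The constructs and numbers below are illustrative only.
blueprint = {
    "reading comprehension": {"weight": 0.40, "items": 20},
    "writing (open-response)": {"weight": 0.35, "items": 2},
    "numeracy": {"weight": 0.25, "items": 13},
}

total_weight = sum(spec["weight"] for spec in blueprint.values())
total_items = sum(spec["items"] for spec in blueprint.values())

# Construct weights should account for the whole score scale.
assert abs(total_weight - 1.0) < 1e-9, "construct weights must sum to 1"

for construct, spec in blueprint.items():
    print(f"{construct}: {spec['items']} items, {spec['weight']:.0%} of the total score")
print(f"total items on the form: {total_items}")
```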


Step Two: “Scoring Open-Response Items”

Decide which model to use to score open-response items:

• Centralized models are directly supervised by provincial Ministries or Departments of Education in a central location

• Decentralized models often take place across several locations & are performed by a considerably greater number of teachers; they are typically used for scoring medium- to low-stakes LSAs


Step Three: “Preparing Training Materials”

• Identify common tools to train scorers, including:

– Exemplars of students’ work demonstrating each of the scale points in the scoring rubric

– Examples that illustrate potential biases arising in the scoring process (e.g., differences in scores given to hand-written vs. type-written essays; a simple check is sketched below)
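
As a hedged illustration (the scores below are hypothetical, not FSA data), this sketch shows the kind of check that motivates such training examples: comparing the average scores awarded to hand-written versus type-written essays on the same prompt.

```python
# Hypothetical rubric scores (1-4 scale) for the same prompt, split by the
# format of the response. The numbers are illustrative only.
from statistics import mean

scores_by_format = {
    "hand-written": [2, 3, 3, 2, 4, 3, 2, 3],
    "type-written": [3, 4, 3, 4, 4, 3, 3, 4],
}

for fmt, scores in scores_by_format.items():
    print(f"{fmt}: mean score {mean(scores):.2f} (n={len(scores)})")

# A sizeable gap between the two means is one signal that scorers may be
# reacting to presentation rather than to the quality of the writing.
gap = mean(scores_by_format["type-written"]) - mean(scores_by_format["hand-written"])
print(f"mean difference (type-written minus hand-written): {gap:.2f}")
```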


Step Four: “Training of Scorers”

• Training occurs prior to scoring and can recur during the session itself, especially if the session spans more than one day

• A “train the trainer” approach is often used: a small cadre of more experienced team leaders is trained first, and they then train the other scorers who will actually score the responses

• Team leaders often make final judgement calls on the assignment of scores for responses that differ from the exemplars

• This approach serves to reinforce common standards and consistency in the assignment of scores, leading to fair and accurate scores


Step Five: “Monitoring Scores”

• Includes checks for inter-marker reliability, wherein a sample of papers is re-scored to check consistency in scoring across raters (a simple agreement check is sketched below)

• May serve as a re-training or “re-calibration” activity, with raters discussing scores and the rationales for their scoring
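
A minimal sketch in Python (hypothetical scores, not an official procedure) of how such an inter-marker reliability check might be computed on a double-scored sample, reporting both exact agreement and chance-corrected agreement (Cohen's kappa):

```python
# A re-scored (double-scored) sample: each paper has a score from the original
# scorer and from a second rater (e.g., a team leader). Scores are on a 1-4
# rubric scale; the numbers below are hypothetical.
from collections import Counter

def exact_agreement(rater_a, rater_b):
    """Proportion of papers given identical scores by both raters."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical scores."""
    n = len(rater_a)
    observed = exact_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Expected agreement if each rater assigned scores independently at
    # their own observed rates, summed over all scale points.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

original = [3, 2, 4, 4, 1, 3, 2, 3, 4, 2]   # first scoring
recheck  = [3, 2, 3, 4, 1, 3, 2, 2, 4, 2]   # second rater, same papers
print(f"exact agreement: {exact_agreement(original, recheck):.2f}")
print(f"Cohen's kappa:   {cohens_kappa(original, recheck):.2f}")
```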


The Foundation Skills Assessment

• The Foundation Skills Assessment (FSA) will be used as a case study to illustrate advantages and challenges associated with switching from a centralized to a decentralized scoring model

• The FSA assesses Grade 4 and 7 students’ skills in reading, writing and numeracy

• Several changes were made to the FSA in 2008 in response to stakeholders’ demands for more meaningful LSAs that informed classroom practice


Changes to the FSA

• Earlier administration: moved from May to February

• Online administration of closed-response sections

• Parents or guardians received the child’s open-response test portions & a summary statement of reading, writing and numeracy skills

• Scoring model changed from a centralized to a decentralized model

• Ministry held “train the trainer” workshops to prepare school district personnel to organize and conduct local scoring sessions

• School districts could decide how to conduct scoring sessions:

– score individually, in pairs or in groups

– double-score only a few, some or all of the responses


Advantages of a Decentralized Model

• Professional Development

– A decentralized model allowed four times as many teachers to work with scoring rubrics and exemplars

– Educators were able to develop a deeper understanding of provincial standards and expectations for student achievement

– If scorers are educators, they may later apply knowledge of rubrics and exemplars in their classroom practice and school environments and consider the performance of their own students in a broader provincial context


Advantages of a Decentralized Model

• Earlier return of test results & earlier provision of feedback to teachers, students and the school

– More immediate feedback may lead to improved learning and better-guided teaching

– The data inform teachers about students’ strengths and areas for improvement in relation to provincial standards

– The data may be helpful in writing school plans and targeting the areas upon which particular schools may focus


Challenges of a Decentralized Scoring Model

• Increased difficulty associated with:

– Less time allocated to implementing cross-check procedures

– Decreased standardization of scoring instructions given to raters

– Increased costs (a higher number of teachers scoring)

– Reduced training time


Potential Solutions

• Provide teachers with adequate training time – e.g., one to two days of training prior to scoring the assessments

• Increase discussion among teachers, which may involve reviewing exemplars falling in between scale points in the rubric

• Have table leaders – e.g., teachers with prior scoring experience

• Re-group teachers to work through difficulties or uncertainties related to the scoring process


Final Note

• Closer collaboration between educators and Ministries and Departments of Education may lead to improved tests, as educators bring their professional experience of how students learn in the classroom to bear on test design itself

• Strong alignment between the overall purposes of the test, the test design and the scoring model used may add value to score interpretation and subsequent use of assessment results


Thank you
