Gerard Seinhorst
Netherlands Defence Language Centre
BILC STANAG 6001 Testing Workshop 2017
Skopje, Macedonia
Some considerations on
STANAG 6001 Tester Training
Outline
• The importance of tester training
• The effects of training - Research findings
• General and role-specific tester training
• References
Introduction
‘If we do not know what raters are doing, then we do not know what their ratings mean’
Connor-Linton (1995)
Causes of rater variability
• severity / leniency
• background / mother tongue
• interpretation of rating scale
• application of rating criteria
• central tendency
• halo / ordering effect
• inconsistencies / randomness
• ambiguous questions
• bias
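Two of the causes listed above, severity/leniency and central tendency, can be made visible with simple descriptive statistics. The sketch below is only an illustration: the rater names, the scores, and the helper `rater_profile` are hypothetical and do not come from the presentation. It compares each rater's scores with the panel mean for the same performances and reports how widely each rater spreads their scores.

```python
from statistics import mean, pstdev

# Hypothetical ratings: three raters scored the same five performances
# (illustrative numbers only, not real test data).
ratings = {
    "rater_a": [2, 3, 1, 4, 2],
    "rater_b": [3, 4, 2, 5, 3],  # always one level above rater_a
    "rater_c": [2, 2, 2, 3, 2],  # scores cluster in the middle
}

def rater_profile(ratings):
    """Return each rater's mean offset from the panel mean and score spread.

    A positive offset suggests leniency, a negative one severity.
    A small spread relative to other raters may hint at central tendency.
    """
    n_items = len(next(iter(ratings.values())))
    # Mean score awarded by the whole panel for each performance.
    panel_mean = [mean(r[i] for r in ratings.values()) for i in range(n_items)]
    profile = {}
    for name, scores in ratings.items():
        offset = mean(s - m for s, m in zip(scores, panel_mean))
        profile[name] = {
            "mean_offset": round(offset, 2),
            "spread": round(pstdev(scores), 2),
        }
    return profile

profile = rater_profile(ratings)
```

With these made-up numbers, rater_b sits about 0.73 of a level above the panel mean, while rater_c has the smallest spread (0.4 against roughly 1.02 for the other two). Such figures only flag candidates for discussion in a norming session; they do not show on their own which rater is right.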
Interaction in assessment
[Diagram showing the interacting elements of an assessment: Rater, Criteria, Rating, Performance, Ratable sample, Task, Test taker, Interlocutor, Conditions. Adapted from McNamara (1997)]
The role of the interlocutor
• Shift of focus in language testing – from maximizing reliability to optimizing validity
• Potential of the interlocutor to affect the quality of the test taker’s performance
  - differences in the way that interlocutors interact with test takers
  - tester personality - bias
  - tester stance
  - different elicitation techniques
• Impact on the quality of test taker’s performance affecting the validity and fairness of tests and the rating given
The importance of training
• Perform the required task to a common standard
  - gain knowledge of assessment methodology and testing principles
  - reach a common understanding/interpretation of rating scale
  - achieve consistency in the application of the rating criteria
  - minimize tester idiosyncrasies and construct-irrelevant variance
  - follow standardized testing procedures
  - enhance alignment of ratings
• Increase/maintain professionalism and quality
  - make informed decisions
  - selection and qualification/certification of testers
• Ensure that the inferences made on the basis of the test results are valid, accurate, and fair
Research findings
• Tester training leads to higher inter-rater reliability, but not necessarily to higher intra-rater reliability
• Tester training effects do not persist; raters tend to become more lenient over time
• NNS raters tend to be more severe re. grammar/vocab errors, but more lenient re. interference of L1 accent
• Experience is no guarantee for accurate ratings
• Rating criteria are applied more reliably when they are accompanied by benchmark performances
• Item writing guidelines are more effective when they are supported by examples of ‘strong’ and ‘weak’ items and statistical data
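The difference between inter-rater and intra-rater reliability mentioned in these findings can be illustrated with a small calculation. The sketch below is an assumption-laden example, not the method used in the cited research: it uses exact agreement and unweighted Cohen's kappa, and all ratings and function names are hypothetical. Inter-rater agreement compares two raters on the same performances; intra-rater agreement compares one rater's first and second rating of the same performances.

```python
from collections import Counter

def exact_agreement(a, b):
    """Proportion of performances that received identical ratings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(a)
    p_o = exact_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement expected from each rating's marginal frequencies.
    p_e = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if p_e == 1:
        return 1.0  # both sets use a single identical level throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical levels awarded to the same ten performances.
rater_1_first = [1, 2, 2, 3, 3, 2, 1, 3, 2, 2]
rater_2_first = [1, 2, 3, 3, 3, 2, 2, 3, 2, 1]
rater_1_second = [1, 2, 2, 3, 2, 2, 1, 3, 2, 2]  # rater 1, later re-rating

inter_rater = cohens_kappa(rater_1_first, rater_2_first)
intra_rater = cohens_kappa(rater_1_first, rater_1_second)
```

For these invented ratings, inter-rater kappa is about 0.53 and intra-rater kappa about 0.83. Unweighted kappa treats every disagreement as equally serious; for ordinal proficiency levels a weighted variant is often preferred, since it treats adjacent-level disagreements as less serious than larger ones.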
Research findings: Implications
• Tester background (mother tongue, gender, teaching experience) does not play a decisive role in becoming a good examiner
• Training needs to be followed up at regular intervals to ensure that standards are maintained. Ideally, each testing session should be preceded by a renorming/recalibration session to reduce interlocutor idiosyncrasies and rater variability
• Training cannot be expected to remove all variability. The number of test occasions has a far greater impact on reliability than tester training or employing multiple raters
• Practice is the key to becoming proficient in item writing, conducting speaking tests and rating performances
Language Tester Roles
ADMINISTRATOR
DEVELOPER
INTERLOCUTOR
RATER
Language Tester Roles: DEVELOPER
• test specs
• CTA alignment
• item/prompt writing
• moderation
• pretesting
• test/item analysis
• validation
Language Tester Roles: ADMINISTRATOR
• scheduling
• scoring procedures
• test admin protocols
• test security
• retesting policy
• reporting test results
• test certificates
• reproduction/storage
Language Tester Roles: INTERLOCUTOR
• speaking test protocol
• elicitation techniques
• ratable speech sample
• tester behaviour
• dealing with non-standard performances
Language Tester Roles: RATER
• norming/benchmarks
• rating criteria
• consistency/reliability
• analytic and holistic rating
• adjudication
Types of Tester Training
• GENERAL TESTER TRAINING
• TEST DEVELOPER / ADMINISTRATOR TRAINING
• INTERLOCUTOR / RATER TRAINING
Roles: ADMINISTRATOR, DEVELOPER, INTERLOCUTOR, RATER
General Tester Training
Important Topics (not exhaustive)
• general testing principles
• test types
• dichotomies in testing
• characteristics of STANAG 6001 testing
• familiarization with the scale
• construct definition
• test purpose and format
• samples of benchmark performances
• CTA requirements
• examples of ‘good’ and ‘bad’ items
• key concepts
Test Developer/Administrator Training
Important Topics (not exhaustive)
• scheduling
• test design
• cheating & test fraud
• piloting & pretesting
• test specifications
• item / prompt writing
• item banking
• test techniques
• CTA alignment
• test development process
• test & item analysis
• item review & moderation
• examiner handbook
• item types
• test security
• test admin protocols
• test certificates
• uniform test conditions
Interlocutor/Rater Training
Important Topics (not exhaustive)
• tester certification
• test structure
• rating factors
• tester stance
• elicitation techniques
• ‘floor’ and ‘ceiling’
• practice speaking tests
• ratable sample
• CTA expectations
• benchmark performances
• tester behaviour
• variety of topics
• bias / halo effect
• linguistic breakdown
• rating protocol
• feedback to test takers
• renorming sessions
• uniform test conditions
• Code of Ethics
References / Further reading
Association of Language Testers in Europe (ALTE). (2005). ALTE Materials for the guidance of test item writers. 1995, updated 2005. http://www.alte.org/attachments/files/item_writer_guidelines.pdf
Brown, A. Interlocutor and rater training. In: Fulcher, G., and Davidson, F. (Eds.) The Routledge Handbook of Language Testing. Routledge, 413-425.
Connor-Linton, J. (1995). Looking behind the curtain: What do L2 composition ratings really mean? TESOL Quarterly, 29, 762-65.
Fulcher, G., and Davidson, F. (2007). Language testing and assessment: An advanced resource book. Routledge.
Hogan, T.P., and Murphy, G. (2007). Recommendations for preparing and scoring constructed-response items: What the experts say. Applied Measurement in Education, 20(4), 427-41.
McNamara, T.F. (1997). “Interaction” in second language performance assessment: Whose performance? Applied Linguistics, 18(4), 446-65.
Schedl, M., and Malloy, J. (2014). Writing Items and Tasks. In: Kunnan, J. (Ed.) The Companion to Language Assessment. John Wiley & Sons, 788-804.
Van Moere, A. (2014). Raters and Ratings. In: Kunnan, J. (Ed.) The Companion to Language Assessment. John Wiley & Sons, 1340-57.