1
The New Adaptive Version of the Basic English Skills Test
Oral Interview
Dorry M. Kenyon
Funded by OVAE Contract: ED-00-CO-0130
The BEST Plus
2
Overview
1. Why the BEST Plus?
2. What does the BEST Plus look like?
3. What is its research base?
4. How can the BEST Plus be used?
4
The original BEST Oral Interview
Developed early 1980s
Assessed basic functional oral English language skills for adult immigrants and refugees
Designed for program use
Began to be widely used for accountability purposes
5
0 1 2 3
1. Where is he?
2. In <#3 in 1B>, where did you buy your food?
3. Is shopping in <#3 in 1B> and <#4 in 1B> the same? How is it different/the same?
6
The BEST Plus
A performance-based assessment (individually administered face-to-face oral interview)
Assesses functional oral language skills (interpersonal communication) of adult ESL learners using everyday language
Designed with current assessment needs in mind
7
Goals in developing the BEST Plus
Respond to adult ESL program needs for assessment and accountability
– Produce a test that is short and practical
– Assess learner language for a variety of purposes and stakeholders
– Increase accuracy in measuring oral proficiency
– Provide “multiple forms” for pre- and post-testing
8
Overview
1. Why the BEST Plus?
2. What does the BEST Plus look like?
3. What is its research base?
4. How can the BEST Plus be used?
9
BEST Plus components (computer-based version)
Test items appear on the computer screen (instead of in a test booklet)
If an item requires a visual, examinees view the visual on the computer screen (instead of a picture cue booklet)
Test administrators enter scores directly into the computer (instead of on a score sheet)
14
BEST Plus components (print-based version)
Three forms
Within each form, locator test + three level tests
– SPL 1-4
– SPL 4-6
– SPL 6-10
Materials
– Picture booklet
– Test booklet (scripts and score sheet combined)
16
Scoring on 3 components of proficiency
Listening Comprehension = How well did the examinee understand the setup and question?
Language Complexity = How did the examinee organize and elaborate the response?
Communication = How clearly did the examinee communicate meaning?
17
Ability estimation
After each question, the program estimates the examinee’s ability based on scores awarded on the current and all previous questions.
With each estimation, the accuracy of the measurement increases.
Goal: To ‘level off’ in estimation with an acceptable level of accuracy.
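The idea of re-estimating ability after every scored question can be sketched as follows. The slides do not state which measurement model BEST Plus uses, so everything here is an assumption made for illustration: a simple partial-credit-style model in which every score step shares the item's difficulty, a standard-normal prior, a fixed grid of ability values, and the hypothetical names `category_probs`, `update_estimates`, and `items_needed`. Each scored item is given as (score, maximum score, difficulty).

```python
import math

# Hypothetical grid of ability values (in logits) from -4.0 to 4.0.
GRID = [i / 10 for i in range(-40, 41)]


def category_probs(theta, difficulty, max_score):
    """P(score = x | theta) for x = 0..max_score under an assumed
    partial-credit-style model with every step at the same difficulty."""
    weights = [math.exp(x * (theta - difficulty)) for x in range(max_score + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def update_estimates(scored_items):
    """Return a list of (ability estimate, standard error), one entry per
    scored item, each based on that item and all previous items."""
    # Assumed standard-normal prior over ability (unnormalized).
    posterior = [math.exp(-0.5 * t * t) for t in GRID]
    results = []
    for score, max_score, difficulty in scored_items:
        # Multiply in the likelihood of the newly awarded score.
        posterior = [p * category_probs(t, difficulty, max_score)[score]
                     for p, t in zip(posterior, GRID)]
        norm = sum(posterior)
        posterior = [p / norm for p in posterior]
        mean = sum(p * t for p, t in zip(posterior, GRID))
        var = sum(p * (t - mean) ** 2 for p, t in zip(posterior, GRID))
        results.append((mean, math.sqrt(var)))
    return results


def items_needed(scored_items, target_se):
    """Number of items scored when the standard error first reaches
    target_se, or None if it never does."""
    for n, (_, se) in enumerate(update_estimates(scored_items), start=1):
        if se <= target_se:
            return n
    return None
```

In this sketch the standard error stands in for "accuracy": as more scores are entered it tends to shrink, and `items_needed` shows one way a test could stop once an acceptable value is reached.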
18
Path through the computer-adaptive BEST Plus
Following a fixed “warm-up,” examinees are asked questions drawn from several thematically-based “folders.”
After hearing each response, the test administrator enters a score for each component.
After each set of scores is entered, the computer updates its estimate of the examinee’s ability, and chooses folders and questions as appropriate.
The test ends when one of three conditions is met.
Users can instantly receive a full score report.
19
Path through the print-based BEST Plus
Administer and score Locator questions (the fixed “warm-up” items + 2 high end discriminators)
Total score on Locator and choose level test based on chart
Administer level test
Total raw score and find approximate SPL range
Enter raw scores into computer BEST Plus Score Management software to obtain full score report
20
Overview
1. Why the BEST Plus?
2. What does the BEST Plus look like?
3. What is its research base?
4. How can the BEST Plus be used?
21
Rigorous development procedures
• Feasibility study (1999-2000)
• Initial development (2000-2001)
• Pilot, small scale field test, initial reliability study (2001)
• Revisions (2001-2002)
• Pilot, full scale field test, reliability study, standard setting study (2002)
• Finalization of training materials, ancillary materials, further refinements (2003)
22
Full involvement of stakeholders
• OVAE oversight
• Technical Working Group (TWG), composed of researchers, state directors, and local program directors and practitioners
• Item writers, all experienced adult ESL teaching professionals
• Instructors and students in the field
23
Example: Full scale field test participants
• 9 states (DC, DE, FL, IL, MA, MD, OR, PA, VA)
• 23 programs
• 41 administrators
• 2420 examinees
24
Example: Reliability study 2002
• 32 adult ESL students
• Two testing rooms (A, B)
– Administrator (project staff)
– Observer/Co-Scorer (project staff)
– Observer/Co-Scorer (novice scorer)
• Each student was tested, then immediately retested in second room
25
Average interrater agreement
Within administration (same room)

Total Score    Room A (3 raters)    Room B (3 raters)
2002           .98                  .97
27
Example: Some initial validity evidence
• Analyses of ancillary data collected from program records during the field test, including test scores less than six months old
• Standard setting study
28
Correlations with program placement
Range of Correlation    Number of Programs    Percentage
.80 or above            7                     30.4%
.70 to .79              9                     39.1%
.60 to .69              3                     13.1%
.50 to .59              3                     13.1%
Below .50               1                     4.3%
TOTALS                  23                    100%
30
Example: Standard setting study
11 judges
30 student performances
Performances (about 6 min each) arranged from lowest to highest
Judgment made: “Which SPL is best characterized by this performance?”
Judges were able to complete this task relating the SPL descriptors to the observed performances
31
Overview
1. Why the BEST Plus?
2. What does the BEST Plus look like?
3. What is its research base?
4. How can the BEST Plus be used?
32
The BEST Plus Score Report
Information includes:
– BEST Plus Scale Score
– SPL level
– NRS level
– Diagnostic information
33
Uses of the BEST Plus
Accountability
– National Reporting System (NRS), as scores on the BEST Plus relate to the 6 NRS levels for Speaking and Listening
– Program Evaluation
34
Standard setting outcome (SPLs)
SPL Scale Score Range
0 Below 330
1 330-400
2 401-417
3 418-438
4 439-472
5 473-506
6 507-540
7 541-598
8 599-706
9 707-795
10 Above 795
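The SPL table above amounts to a lookup from scale score to level. A minimal sketch of that lookup, assuming whole-number scale scores (so "Above 795" starts at 796); the function name `spl_from_scale_score` is hypothetical and not part of any BEST Plus software:

```python
from bisect import bisect_right

# Lowest scale score for SPL 1 through SPL 10, read from the table above.
# Assumption: scale scores are whole numbers, so "Above 795" begins at 796.
SPL_LOWER_BOUNDS = [330, 401, 418, 439, 473, 507, 541, 599, 707, 796]


def spl_from_scale_score(scale_score):
    """Return the SPL (0-10) whose range contains scale_score."""
    # Counting how many lower bounds the score has reached gives the SPL.
    return bisect_right(SPL_LOWER_BOUNDS, scale_score)
```

For example, `spl_from_scale_score(450)` falls in the 439-472 band and returns 4.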
35
Standard setting outcome (NRS)
NRS Level                 Related SPL    BEST Plus Scale Scores
Beginning ESL Literacy    0-1            Below 401
Beginning ESL             2-3            401-438
Low Intermediate ESL      4              439-472
High Intermediate ESL     5              473-506
Low Advanced ESL          6              507-540
High Advanced ESL         7 or more      Above 540
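The NRS table can be read two ways, from a scale score or from an SPL, and both should agree. A minimal sketch under the assumption of whole-number scale scores; the function names are hypothetical:

```python
from bisect import bisect_right

NRS_LEVELS = [
    "Beginning ESL Literacy",
    "Beginning ESL",
    "Low Intermediate ESL",
    "High Intermediate ESL",
    "Low Advanced ESL",
    "High Advanced ESL",
]

# Lowest scale score and lowest SPL of each level after the first,
# read from the table above.
SCORE_LOWER_BOUNDS = [401, 439, 473, 507, 541]
SPL_LOWER_BOUNDS = [2, 4, 5, 6, 7]


def nrs_from_scale_score(scale_score):
    """Return the NRS level name for a BEST Plus scale score."""
    return NRS_LEVELS[bisect_right(SCORE_LOWER_BOUNDS, scale_score)]


def nrs_from_spl(spl):
    """Return the NRS level name for an SPL from 0 to 10."""
    return NRS_LEVELS[bisect_right(SPL_LOWER_BOUNDS, spl)]
```

A scale score of 450 and an SPL of 4 both map to "Low Intermediate ESL".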
37
Diagnostic score report information
SPL                 N      Average Listening    Average Complexity    Average Communication
0                   118    .37                  .22                   .43
1                   312    .87                  .52                   1.12
2                   120    1.13                 .73                   1.49
3                   169    1.25                 .80                   1.72
4                   336    1.43                 .95                   2.03
5                   318    1.60                 1.15                  2.33
6                   270    1.73                 1.31                  2.55
7                   340    1.85                 1.50                  2.77
8                   317    1.91                 1.91                  2.88
9                   96     1.95                 2.34                  2.95
10                  24     1.98                 2.85                  2.99
Maximum Possible           2.00                 4.00                  3.00
38
Example (diagnostic information)
SPL = 5              Listening    Language Complexity    Communication
Examinee             1.20         1.57                   2.20
Average for SPL 5    1.60         1.15                   2.33
Relative to other SPL 5s, current examinee is:
• Low in listening
• High in complexity
• Average in communication
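The comparison in this example can be sketched in a few lines. The slides do not say how far from the SPL average a score must be to count as "low" or "high", so the `tolerance` of 0.25 below is purely an assumption chosen for illustration, as is the function name `compare_to_spl_average`:

```python
# Component averages for SPL 5, taken from the example above.
SPL5_AVERAGES = {
    "Listening": 1.60,
    "Language Complexity": 1.15,
    "Communication": 2.33,
}


def compare_to_spl_average(examinee, averages, tolerance=0.25):
    """Label each component Low, High, or Average relative to the SPL average.

    tolerance is a hypothetical cutoff; the source does not give one.
    """
    labels = {}
    for component, average in averages.items():
        difference = examinee[component] - average
        if difference > tolerance:
            labels[component] = "High"
        elif difference < -tolerance:
            labels[component] = "Low"
        else:
            labels[component] = "Average"
    return labels
```

With the examinee scores from the example (1.20, 1.57, 2.20), this labels listening as Low, language complexity as High, and communication as Average.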