Tips for good test design
Terminology brush-up
Evaluation, assessment, and testing
Types of tests
Cornerstones of testing: usefulness, validity, reliability, practicality, washback, authenticity, transparency
Alignment
Evaluation, assessment, and test
Evaluation – concerned with the overall program; considers all aspects of teaching and learning (Genesee, 2001, as cited in Coombe et al., 2007)
Assessment – a variety of ways of collecting information on learners’ achievement or ability
Test – a type of assessment tool
Usefulness
Any language test must be developed with a specific purpose, a particular group of test-takers, and a specific language use in mind. (Bachman and Palmer, 1996, as cited in Coombe et al., 2007)
Validity
The extent to which the test measures what it purports to measure. Types include content, construct, and face validity.
Reliability
Consistency of test scores: the format and content of the questions and the time given to students to take the exam must be consistent. Other things being equal, the more items on a test, the more reliable it tends to be.
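The link between test length and reliability is often illustrated with the Spearman-Brown prediction formula. The slides do not cite it; it is brought in here as an assumption, and it only holds if the added items are comparable in quality to the existing ones. A minimal sketch, with a hypothetical function name and made-up figures:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after multiplying test length by length_factor.

    Assumes the added items are comparable to the existing ones.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a 20-item quiz with reliability 0.60, doubled to 40 items.
predicted = spearman_brown(0.60, 2)
print(round(predicted, 2))  # 0.75
```

The gain flattens as the test grows, so adding items has to be weighed against practicality.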
Practicality
Are teachers able to develop, administer, and mark the test within the available time and with the available resources?
Authenticity
Tests should reflect authentic, real-life uses of the language and
use authentic or authentic-like materials as much as possible.
Transparency
The availability of clear, accurate information to students about testing: outcomes to be evaluated, formats used, weighting of items, time allowed, grading criteria.
Alignment
[Diagram: learning objectives, assessments, and instructional activities, connected by alignment. Source: Eberly Center for Teaching Excellence, Carnegie Mellon University]
Before writing the test – make a blueprint / test specifications
Write down all the learning outcomes you want to test.
Decide how you are going to assess each outcome: what type of item will you use? Multiple-choice? Fill-in-the-blanks? Constructed response? Make sure the item types are aligned with the instructional strategies.
Balance types of items.
Decide how much each test item is going to be worth. What percentage of the test will each item represent?
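One way to keep a blueprint honest is to write it down as a small table and let the percentages be computed rather than estimated. The sketch below is only an illustration: the class name, the outcomes, the item counts, and the point values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlueprintRow:
    outcome: str        # learning outcome being tested
    item_type: str      # e.g. multiple-choice, fill-in-the-blanks, constructed response
    num_items: int      # how many items assess this outcome
    points_each: float  # points awarded per item


def weights(rows):
    """Return each outcome's share of the total score, as a percentage."""
    total = sum(r.num_items * r.points_each for r in rows)
    if total <= 0:
        raise ValueError("the blueprint must be worth more than zero points")
    return {
        r.outcome: round(100 * r.num_items * r.points_each / total, 1)
        for r in rows
    }


# Hypothetical blueprint for a unit test.
blueprint = [
    BlueprintRow("Listening for the main idea", "multiple-choice", 5, 2),
    BlueprintRow("Simple past in context", "fill-in-the-blanks", 10, 1),
    BlueprintRow("Writing a short e-mail", "constructed response", 1, 20),
]

for outcome, pct in weights(blueprint).items():
    print(f"{outcome}: {pct}%")
```

With these made-up figures the single writing task carries half of the score, which is the kind of imbalance a blueprint is meant to expose before the test is written.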
Decisions about what to include on the test
If you teach the four skills, the four skills should be represented in your assessment system. If speaking is assessed in a separate oral test, include the other three skills on the written test.
Balance grammar, language functions, and vocabulary (grammatical competence, pragmatic competence, and discourse competence).
Think about how the test will reflect the types of activities you do in your classroom. Think about how the test will affect the teaching, that is, the washback effect.
Testing listening and reading
Choose texts that are neither too easy nor too hard for the students; apply the i+1 principle (input just beyond their current level).
Choose text genres that match the genres presented in the instructional activities.
Try to use texts that are as authentic as possible.
Include items that test listening and reading for the main idea and listening and reading for details.
If the listening or reading text is somewhat difficult, use easier items. If it is easier, use more challenging items. Grade the task (in the sense of adapting it to the level) according to the input.
Make sure the comprehension items themselves contain no unknown words or structures. You are testing comprehension of the text, not of the comprehension items!
Go beyond multiple-choice or True or False
Other types of items are:
- Listen and fill in the blanks
- Correct wrong information
- Read and complete a summary
- Listen or read and fill in the chart / fill out a form (information transfer)
- Number events in the correct order
- Relate the text to a picture
- Listen and draw
- Insert sentences into the reading
Be careful!
Make sure students can’t answer the questions just by common sense, without listening or reading.
Make sure the questions are not ambiguous or based on subjective interpretation.
Avoid “not stated” options in listening items: students have no time to process both what the text says and what it leaves out.
Make sure the pace of the recording matches that of the listening texts used in class.
Arrange the listening items in the same order that the information appears in the listening.
Only assess the listening and reading strategies that were taught during the period, as this is an achievement test, not a proficiency one.
Avoid tricky items!
Grammar, functions, and vocabulary
Whenever possible, contextualize the item: use a dialogue or a paragraph rather than isolated sentences (discourse competence).
When you contextualize, choose topics that are relevant/familiar to students.
Balance selected-response and constructed-response items. The more advanced the level being tested, the more constructed-response items the test should have.
Types of vocabulary items
- Fill in the blanks with words from a word bank; always include at least one extra word.
- Match the word with the sentence it can complete: always include an extra item on the right, not on the left, to prevent confusion when correcting (more numbers than parentheses).
- Complete the sentences with words, given the first letter of each word. Students have to recall the words themselves; useful for more common words (active vocabulary).
- Crossword puzzles: active vocabulary
- Multiple-choice
- Odd man out
Types of grammar / language function items
- Multiple-choice
- Editing
- Fill in the blanks in sentences, dialogues, or paragraphs
- Complete the dialogue
- Answer questions
- Sentence transformation (from active to passive; from direct to indirect speech)
- Write sentences based on a chart or graph
Be careful!
Avoid exceptions / rare cases.
Keep in mind what you consider passive knowledge and active knowledge when deciding on the item type.
Don’t make your paragraphs and dialogues wordier than necessary.
Don’t make the items too mechanical, with almost the same answer in all sentences.
Stick to your instructional strategies, e.g., if students didn’t do an exercise mixing two tenses, don’t include one on the test!
Provide a correction key for reliability purposes: what should receive partial credit, and what should not?
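A correction key can be as simple as a table that lists, for each item, which answers earn full credit and which earn partial credit, so that every marker applies the same decisions. The sketch below covers one hypothetical sentence-transformation item; the target sentence, the point values, and the decision to ignore capitalization and final punctuation are all assumptions made for illustration.

```python
# Hypothetical key for one active-to-passive item worth 2 points.
# Prompt: "Ana wrote the letter."  Target: "The letter was written by Ana."
ANSWER_KEY = {
    "the letter was written by ana": 2.0,  # full credit
    "the letter was wrote by ana": 1.0,    # passive structure right, participle wrong
    "the letter is written by ana": 1.0,   # passive structure right, tense wrong
}


def normalize(response: str) -> str:
    """Lowercase, collapse extra spaces, and drop final punctuation."""
    return " ".join(response.lower().split()).rstrip(".!?")


def score(response: str) -> float:
    """Look the response up in the key; anything not listed earns no credit."""
    return ANSWER_KEY.get(normalize(response), 0.0)


print(score("The letter was written by Ana."))  # 2.0
print(score("the letter  was wrote by Ana"))    # 1.0
print(score("Ana wrote the letter."))           # 0.0
```

Answers that turn up during marking and are not yet in the key should be added to it with an agreed score, so that later papers are marked the same way.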
Controversial issues
Should timed, one-shot writings (paragraphs and essays) be part of a test?
Should we provide a glossary of some unknown words for the reading?
How many times should the listening be played?