
Improving Multiple Choice Questions

    CTL Number 8 November 1990

Although multiple choice questions are widely used in higher education, they have a mixed reputation among faculty because they seem inherently inferior to essay questions. Many teachers believe that multiple choice exams are only suitable for testing factual information, whereas essay exams test higher-order cognitive skills. Poorly-written multiple choice items elicit complaints from students that the questions are "picky" or "confusing." But multiple choice exams can test many of the same cognitive skills that essay tests do, and, if the teacher is willing to follow the special requirements for writing multiple choice questions, the tests will be reliable and valid measures of learning. In this article, we have summarized some of the best information available about multiple choice exams from the practices of effective teachers at UNC and from the testing and measurement literature.

Validity and Reliability

The two most important characteristics of an achievement test are its content validity and reliability. A test's validity is determined by how well it samples the range of knowledge, skills, and abilities that students were supposed to acquire in the period covered by the exam. Test reliability depends upon grading consistency and discrimination between students of differing performance levels. Well-designed multiple choice tests are generally more valid and reliable than essay tests because (a) they sample material more broadly; (b) discrimination between performance levels is easier to determine; and (c) scoring consistency is virtually guaranteed.

The validity of multiple choice tests depends upon systematic selection of items with regard to both content and level of learning. Although most teachers try to select items that sample the range of content covered by an exam, they often fail to consider the level of the questions they select. Moreover, since it is easy to develop items that require only recognition or recall of information, they tend to rely heavily on this type of question. Multiple-choice tests in the teachers' manuals that accompany textbooks are often composed exclusively of recognition or recall items.
Cognitive Levels

Teachers are aware of the fact that there are different levels of learning, but most of us lack a convenient scheme for using the concept in course planning or testing. Psychologists have elaborate systems for classifying cognitive levels, but a simple three-level scheme is sufficient for most test planning purposes.

We will designate the three categories recall, application, and evaluation. At the lowest level, recall, students remember specific facts, terminology, principles, or theories, e.g., recollecting the equation for Boyle's Law or the characteristics of Impressionist painting. At the median level, application, students use their knowledge to solve a problem or analyze a situation, e.g., solving a chemistry problem using the equation for Boyle's Law or applying ecological formulas to determine the outcome of a predator-prey relationship. The highest level, evaluation, requires students to derive hypotheses from data or exercise informed judgment, e.g., choosing an appropriate experiment to test a research hypothesis or determining which economic policy a country should pursue to reduce inflation. Since it is difficult to construct multiple choice questions at the complex application level, we have included an example at the end of this article.

Analyzing your course material in terms of these three categories will help you design exams that sample both the range of content and the various cognitive levels at which the students must operate. Performing this analysis is an essential step in designing multiple choice tests that have high validity and reliability. The test matrix, sometimes called a table of specifications, is a convenient tool for accomplishing this task.
Center for Teaching and Learning, University of North Carolina at Chapel Hill

The Test Matrix

The test matrix is a simple two-dimensional table that reflects the importance of the concepts and the emphasis they were given in the course. At the initial stage of test planning, you can use the matrix to determine the proportion of questions that you need in each cell of the table. For example, the chart below reflects an instructional unit in which the students spent 20% of their time and effort at the recall and simple application levels on Topic I. On the other hand, they spent 25% of their effort on the complex application level of Topic II.

                 Cognitive Levels
    Topics    Recall    Application    Evaluation
    I          10%         10%            5%
    II          5%         10%           25%
    III         5%         10%           20%
This matrix has been simplified to illustrate the principle of proportional allocation; in practice, the table will be more complex. The greater the detail and precision of the matrix, the greater the validity of the exam.

As you develop questions for each section of the test, record the question numbers in the cells of the matrix.

The table below shows the distribution of 33 questions on a test covering a unit in a psychology course.

    Topics                                   Recall   Application   Evaluation   % of test
    A. Identity crisis vs. role confusion;    2, 9     4, 21, 33     16             18%
       achievement motivation
    B. Adolescent sexual behavior;            5, 8     1, 13, 26     11             18%
       transition of puberty
    C. Social isolation and self-esteem;      14, 6    3, 20         25             15%
       person perception
    D. Egocentrism; adolescent idealism       7, 29    12, 31        10, 15, 27     21%
    E. Law and maintenance of the             17       22            18              9%
       social order
    F. Authoritarian bias; moral              19       30            24              9%
       development
    G. Universal ethical principle            28       23            32              9%
       orientation
                                              33%      40%           27%

In practice, you may have to leave some of the cells blank because of limitations in the number of questions you can ask in a test period. Obviously, these elements of the course should be less significant than those you do choose to test.
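To make the proportional-allocation step concrete, the short sketch below (an illustration we add here, not part of the original handout) converts a percentage matrix like the three-topic chart above into per-cell question counts; the function name and rounding rule are only assumptions for the example.

```python
# A sketch of the proportional-allocation step: converting a test matrix of
# percentages into per-cell question counts. The matrix is the three-topic
# chart from the text above.

test_matrix = {                      # topic -> {cognitive level: percent of test}
    "Topic I":   {"recall": 10, "application": 10, "evaluation": 5},
    "Topic II":  {"recall": 5,  "application": 10, "evaluation": 25},
    "Topic III": {"recall": 5,  "application": 10, "evaluation": 20},
}

def allocate_questions(matrix, total_questions):
    """Round each cell's percentage into a question count.

    Rounded counts may not sum exactly to total_questions, so adjust a
    cell or two by hand after running this.
    """
    return {
        topic: {level: round(total_questions * pct / 100)
                for level, pct in levels.items()}
        for topic, levels in matrix.items()
    }

for topic, counts in allocate_questions(test_matrix, 33).items():
    print(topic, counts)
```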

    Guidelines for Writing Questions

Constructing good multiple choice items requires plenty of time for writing, review, and revision. If you write a few questions after class each day when the material is fresh in your mind, the exam is more likely to reflect your teaching emphases than if you wait to write them all later. Writing questions on three-by-five index cards or in a word-processing program will allow you to rearrange, add, or discard questions easily.

The underlying principle in constructing good multiple choice questions is simple: the questions must be asked in a way that neither rewards test-wise students nor penalizes students whose test-taking skills are less developed. The following guidelines will help you develop questions that measure learning rather than skill in taking tests.

General Issues

1. All questions should stand on their own. Avoid using questions that depend on knowing the answers to other questions on the test. Also, check your exam to see if information given in some items provides clues to the answers on others.

2. Randomize the position of the correct responses. One author says placing responses in alphabetical order will usually do the job. For a more thorough randomization, use a table of random numbers.
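If you assemble exams with a computer rather than a table of random numbers, a few lines of code can do the same job; the fragment below is only a sketch of the idea, with made-up options.

```python
import random

# Illustrative only: shuffle the options of one question and report where the
# correct answer ends up, so the key can be recorded for scoring.
options = ["mitochondrion", "ribosome", "nucleus", "chloroplast"]
correct = "mitochondrion"

random.shuffle(options)                       # random ordering of the responses
answer_position = options.index(correct) + 1  # 1-based position for the answer key

for i, option in enumerate(options, start=1):
    print(f"{i}. {option}")
print("Correct response is option", answer_position)
```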

Writing the Stem

The stem of a multiple-choice item poses a problem or states a question. The basic rule for stem-writing is that students should be able to understand the question without reading it several times and without having to read all the options.

1. Write the stem as a single, clearly-stated problem. Direct questions are best, but incomplete statements are sometimes necessary to avoid awkward phrasing or convoluted language.

2. State the question as briefly as possible, avoiding wordiness and undue complexity. In higher-level questions the stem will normally be longer than in lower-level questions, but you should still be brief.

3. State the question in positive form because students often misread negatively phrased questions. If you must write a negative stem, emphasize the negative words with underlining or all capital letters. Do not use double negatives, e.g., "Which of these is not the least important characteristic of the Soviet economy?"

Writing the Responses

Multiple-choice questions usually have four or five options to make it difficult for students to guess the correct answer. The basic rules for writing responses are (a) students should be able to select the right response without having to sort out complexities that have nothing to do with knowing the correct answer and (b) they should not be able to guess the correct answer from the way the responses are written.

1. Write the correct answer immediately after writing the stem and make sure it is unquestionably correct. In the case of "best answer" responses, it should be the answer that authorities would agree is the best.

2. Write the incorrect options to match the correct response in length, complexity, phrasing, and style. You can increase the believability of the incorrect options by including extraneous information and by basing the distractors on logical fallacies or common errors, but avoid using terminology that is completely unfamiliar to students.

3. Avoid composing alternatives in which there are only microscopically fine distinctions between the answers, unless the ability to make these distinctions is a significant objective in the course.

4. Avoid using "all of the above" or "both A & B" as responses, since these options make it possible for students to guess the correct answer with only partial knowledge.

5. Use the option "none of the above" with extreme caution. It is only appropriate for exams in which there are absolutely correct answers, like math tests, and it should be the correct response about 25% of the time in four-option tests.

6. Avoid giving verbal clues that give away the correct answer. These include: grammatical or syntactical errors; key words that appear only in the stem and the correct response; stating correct options in textbook language and distractors in everyday language; using absolute terms, e.g., "always," "never," "all," in the distractors; and using two responses that have the same meaning.

Analyzing the Responses

After the test is given, it is important to perform a test-item analysis to determine the effectiveness of the questions. Most machine-scored test printouts include statistics for each question regarding item difficulty, item discrimination, and frequency of response for each option. This kind of analysis gives you the information you need to improve the validity and reliability of your questions.

Item difficulty is denoted by the percentage of students who answered the question correctly, and since the chance of guessing correctly is 25%, you should rewrite any item that falls below 30%. Testing authorities suggest that you should strive for items that yield a wide range of difficulty levels, with an average difficulty of about 50%. Also, if you expect a question to be particularly difficult or easy, results that vary widely from your expectation merit investigation.

To derive the item discrimination index, the class scores are divided into upper and lower halves and then compared for performance on each question. For an item to be a good discriminator, most of the upper group should get it right and most of the lower group should miss it. A score of .50 on the index reflects a maximal level of discrimination, and test experts suggest that you should reject items below .30 on this index. If equal numbers of students in each half answer the question correctly, or if more students in the lower half than in the upper half answer it correctly (a negative discrimination), the item should not be counted in the exam.

Finally, by examining the frequency of responses for the incorrect options under each question, you can determine if they are equally distracting. If no one chooses a particular option, you should rewrite it before using the question again.
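If your scoring service does not provide these statistics, they are straightforward to compute yourself. The sketch below is our own illustration on made-up response data: it reports item difficulty, a discrimination index computed from upper and lower halves of the class (defined here so that .50 is the maximum, as described above), and the frequency of each option.

```python
# Illustrative item analysis on made-up data (not from the original article).
# Each entry is one student's results: (total test score, {item: chosen option}).
students = [
    (31, {"Q1": "B", "Q2": "C"}),
    (28, {"Q1": "B", "Q2": "A"}),
    (25, {"Q1": "B", "Q2": "C"}),
    (20, {"Q1": "A", "Q2": "C"}),
    (15, {"Q1": "C", "Q2": "C"}),
    (12, {"Q1": "D", "Q2": "C"}),
]
answer_key = {"Q1": "B", "Q2": "C"}

def analyze(students, key):
    # Rank students by total score and split the class into upper and lower halves.
    ranked = sorted(students, key=lambda s: s[0], reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[half:]

    for item, correct in key.items():
        right = sum(1 for _, answers in ranked if answers[item] == correct)
        difficulty = right / len(ranked)          # proportion answering correctly

        right_upper = sum(1 for _, answers in upper if answers[item] == correct)
        right_lower = sum(1 for _, answers in lower if answers[item] == correct)
        # Difference divided by the whole class, so .50 is the maximum value.
        discrimination = (right_upper - right_lower) / len(ranked)

        # Frequency of each chosen option, to check whether distractors work.
        frequencies = {}
        for _, answers in ranked:
            frequencies[answers[item]] = frequencies.get(answers[item], 0) + 1

        print(item, f"difficulty={difficulty:.2f}",
              f"discrimination={discrimination:.2f}", frequencies)

analyze(students, answer_key)
```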
