Do THIS, Not THAT When Writing Multiple-Choice Questions

Learn The Best Way To Write Multiple-Choice Questions

We often have to make sure that people can perform as needed before we let them perform on the job. For example, we don’t want to find out that bank tellers cannot count out the correct amount when working with customers. We certainly don’t want to assess people’s ability to put out a fire at work by setting a fire and seeing what they can do.

Assessment often needs to be practical, but it must also give us the information we need. In situations like fires or hazardous waste spills, it’s common to assess a simulated performance instead of a real performance.

Research shows that multiple-choice questions can, when well-written, simulate performance, especially if we are asking about decisions and steps to take. Well-written multiple-choice questions have a lot of advantages as an assessment method, including:

Easier scoring (and less bias in scoring) than other assessment formats
The ability to assess more content in a shorter amount of time
The capacity to analyze how well a question performs (using item statistics in a testing tool or LMS) so we can fix problems
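
To make the idea of item statistics concrete, here is a minimal Python sketch of two commonly used ones: item difficulty (the proportion of test takers who answered correctly) and an upper-lower discrimination index (how much better high scorers did on the item than low scorers). The function names, the 27% group size, and the sample scores are illustrative assumptions, not the output or method of any particular testing tool or LMS.

```python
def item_difficulty(item_scores):
    """Proportion of test takers who got the item right (1 = correct, 0 = wrong)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination index for one item.

    Compares the proportion correct among the highest total scorers with the
    proportion correct among the lowest total scorers. The 27% group size is
    an assumed convention; adjust it to suit your own data.
    """
    n = max(1, round(len(total_scores) * fraction))
    # Rank test takers from lowest to highest total test score.
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower = [item_scores[i] for i in ranked[:n]]
    upper = [item_scores[i] for i in ranked[-n:]]
    return sum(upper) / n - sum(lower) / n


# Hypothetical data: ten test takers, one item, plus their total test scores.
item_scores = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
total_scores = [9, 8, 8, 7, 6, 5, 5, 4, 3, 2]

print("Difficulty:", item_difficulty(item_scores))
print("Discrimination:", discrimination_index(item_scores, total_scores))
```

A difficulty value near 1.0 suggests the item may be too easy (or too easy to guess), and a discrimination value near zero or below suggests that strong and weak performers are answering it about equally well, which is a signal to review the wording.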

Notice that I said “well-written multiple-choice questions.” According to research, most aren’t. Too many multiple-choice questions have a host of flaws, and those flaws damage assessments, confuse participants, and cause a lot of other problems for people and organizations.

Five major flaws of multiple-choice questions are listed below:

They are unclear or otherwise poorly written.
They are too easy to guess.
They test recall of content, not use of knowledge.
They don't measure what they intend to measure.
They end up testing something else, rather than evaluating to see if the test taker knows the content.

Let’s look at some examples of multiple-choice questions with typical flaws.

Example 1

You are using your outdoor wood-burning fire pit at your home for the first time. You should not operate the fire pit: (Select the best answer)

A) With seasoned wood
B) Under an overhang
C) With non-combustible materials near the fire pit
D) Using a screen to protect from sparks

The correct answer is B. You should not operate your wood-burning fire pit under an overhang. Overhanging structures or trees can easily ignite from flying sparks. Answers A, C, and D are safety measures for using an outdoor fire pit. Because they are things you should do, people who miss the word “not” are likely to select one of them and get the question wrong.

Research shows that negatively worded questions are harder to understand and easier to mess up when answering. A recent review of high-stakes nursing exams lists negatively worded questions as the second most common question-writing flaw.

Do this: Instead of writing negatively worded multiple-choice questions, phrase them positively. So, instead of asking, "Which of these should you not do…," consider using this prompt, "Which of these is an approved safety measure…"

Example 2

Here’s another example of a poorly written multiple-choice question. Can you tell what’s wrong with it?

If sparks from the wood-burning fire pit ignite materials outside of the fire pit, you should: (Select the correct answer)

A) Unintended consequences of an outdoor fire
B) Any chairs should be moved away from the fire
C) Put out the fire with water from a garden hose or a fire blanket

The biggest problem with this question is that the wording gives away the answer, as only C satisfactorily completes the sentence. It’s obviously the correct answer as a result. And here’s a related problem: because two of the three choices do not complete the stem (the question being asked or problem to be solved), the three answer choices dwindle down to one. Not okay.

Do this: Make sure that wording does not give away the answer to the question.

Example 1 is unclear. Example 2 is too easy to guess. These problems make your assessments less valid. And invalid assessments are giant problems waiting to happen.

Validity is one of the most important, if not the most important, issues when writing multiple-choice questions. Validity means that the assessment measures what it says it measures. Valid assessments can measure deeper levels of learning and create deeper processing, which improves the ability to apply.

Because poorly written multiple-choice questions are a widespread problem, I decided to do something about it. My first #deeperlearningatwork online course is on writing multiple-choice questions. You can find out more about it under “Courses” on my website.

References:

Abedi, J. (2006). Language issues in item development. In S. M. Downing and T. M. Haladyna (Eds.), Handbook of Test Development.
Chiavaroli, N. (2017). Negatively-worded multiple choice questions: An avoidable threat to validity. Practical Assessment, Research & Evaluation, 22(3).
Downing, S.M. (2005). The effects of violating standard item writing principles on tests and students: The consequences of using flawed test items on achievement examinations in medical education. Advances in Health Sciences Education: Theory and Practice, 10(2), 133–143.
Haladyna, T.M. & Downing, S.M. (1989). A taxonomy of multiple-choice item writing rules. Applied Measurement in Education, 2(1), 37–50.
Haladyna, T.M. & Downing, S.M. (1989). Validity of a taxonomy of multiple-choice item writing rules. Applied Measurement in Education, 2(1), 51–78.
Hopkins, K.D. (1998). Educational and psychological measurement and evaluation. Needham Heights, MA: Allyn & Bacon.
Lemons, P.P. & Lemons, J.D. (2013). Questions for assessing higher-order cognitive skills: It's not just Bloom’s. CBE-Life Sciences Education, 12(1), 47–58.
Marsh, E. J. & Cantor, A. D. (2014). Chapter 02: Learning from the test: Dos and don’ts for using multiple-choice tests, in McDaniel, M. A., Frey, R. F., Fitzpatrick, S. M., & Roediger, H. L. (Eds.), Integrating Cognitive Science with Innovative Teaching in STEM Disciplines, Washington University, Saint Louis, Missouri.
Roediger, H. L., III, & Marsh, E. J. (2005). The positive and negative consequences of multiple-choice testing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 1155–1159.
Schuwirth, L. W. T. & van der Vleuten, C. P. M. (2004). Different written assessment methods: what can be said about their strengths and weaknesses? Medical Education, 38, 974–979.
Shrock, S. A. & Coscarelli, W. C. C. (1989). Criterion-referenced test development. Reading, MA: Addison-Wesley.
Tarrant, M., Knierim, A., Hayes, S.K., & Ware, J. (2006). The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments. Nurse Education Today, 26(8), 662–671.