MULTIPLE CHOICE ITEMS

Introduction
Burton, Sudweeks, Merrill and Wood (1991) note that educational testing experts have identified an increase in poorly constructed multiple-choice items. In most cases these items are so defective that the correct answer is obvious, debatable, obscure, or missing altogether, and the examinee is forced to wonder what the test writer had in mind when the item was constructed. In addition to confusing and frustrating students, poorly written test questions yield scores of dubious value that are inappropriate to use as a basis for evaluating student achievement (Burton, Sudweeks, Merrill and Wood, 1991). Well-written multiple-choice questions, by contrast, should not confuse students, and they yield scores that are appropriate for determining the extent to which students have achieved the intended educational objectives. Multiple-choice questions are commonly used because they are easy to grade and students are familiar with their structure. Most multiple-choice questions involve only recognition or recall of facts; however, questions can be written to test higher-level thinking skills, such as problem-solving, although these are harder to write.
Most poorly-written multiple-choice test questions are characterized by at least one of the following three weaknesses:
1. They attempt to measure an objective for which they are not well-suited
2. They contain clues to the correct answer
3. They are worded ambiguously
Multiple choice is a form of assessment in which examinees are asked to select the best possible answer (or answers) from a list of choices supplied by the examiner. All multiple-choice questions are composed of one question with multiple possible answers, including the correct answer and several incorrect answers called distractors (Burton, Sudweeks, Merrill and Wood, 1991). Students select the correct answer by circling the associated number or letter, or by filling in the associated circle on a machine-readable response sheet. Because students can generally respond to these questions quite quickly, they are often used to test students' knowledge of a broad range of content. Creating these questions can be time consuming, however, because it is often difficult to generate several plausible distractors.
All standard multiple-choice test items consist of two (2) basic parts (Burton, Sudweeks, Merrill and Wood, 1991):
1. A problem (stem) – The text of the question.
2. A list of suggested solutions (alternatives) – The choices provided after the stem, from which the examinee must choose. These have two sub-parts:
The key – the correct answer in the list of options.
The distractors – the incorrect answers in the list of options.
The stem is the introduction (beginning) of the item that presents a problem to be solved, a question asked of the respondent, or an incomplete statement to be completed, as well as any other relevant information. In assessing higher-order thinking (application), the stem can consist of multiple parts. The stem can include extended or ancillary material such as a vignette, a case study, a graph, a table, or a detailed description which has multiple elements to it (Brown and Pendlebury 2007). The stem can include any amount of information that will increase the validity of the learning objective, but should remain a question (Gronlund 2010). In a counselling examination, for example, the examinee may be asked, in reference to a previously presented case study, "What is the most likely diagnosis?"
In the alternatives, only one answer can be keyed as correct, except in multiple-response questions, in which more than one answer is correct. An MCQ is usually graded on the basis of one mark for each correct answer and zero for each incorrect answer (Gronlund 2010). In some courses, the examiner may also award partial credit for unanswered questions and penalize students for incorrect answers, to discourage guessing. For example, in medical examinations at the University of the West Indies (UWI) and on the SAT, a quarter point is deducted from the test taker's score for each incorrect answer.
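The familiar formula-scoring correction behind that quarter-point penalty can be sketched in a few lines of Python (the language and function name are illustrative choices, not from the source): deducting 1/(k-1) of a point per wrong answer, where k is the number of options, gives a quarter point exactly when k = 5.

```python
def corrected_score(num_right, num_wrong, k):
    """Formula scoring: deduct 1/(k-1) point per wrong answer, where k
    is the number of options; unanswered items neither add nor deduct."""
    return num_right - num_wrong / (k - 1)

# Five-option items, as in the quarter-point penalty described above:
print(corrected_score(40, 8, 5))  # 38.0
```

Under this rule a student who guesses at random on five-option items expects to gain nothing, which is the point of the correction.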
Examiners should construct stems that are clear and parsimonious, and options that are explicit and unequivocal, so that the key will be selected by the examinee who has achieved the learning objective. Distractors that are plausible competitors of the answer will be evidenced by the frequency with which they are chosen.
The purpose of the distractors is to appear as plausible solutions to the problem for those students who have not achieved the objective being measured by the test item. Conversely, the distractors must appear as implausible solutions for those students who have achieved the objective. Only the answer should appear plausible to these students (Burton, Sudweeks, Merrill and Wood, 1991). Plausible distractors are based on teachers anticipating wrong answers or common misconceptions.
1. Calculate the median of the following numbers: 27, 100, 15, 67, 27, 12, 44, 81, 75, 48
The examinee must recall the definition of the median and then apply that definition to the list of numbers. The median is the number at the midpoint of the distribution (46). A common mistake is to confuse the definitions of median, mean and mode. The mean (49.6) is one of the distractors, the mode (27) is another, and the sum (496, the numerator of the mean and thus a partially correct computation) is a third. This question has one correct choice that corresponds to the objective and three distractors, all of which are plausible to the learner who has not met the learning objective (calculate the median).
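The key and the three distractors for this item can be checked with Python's standard statistics module (a sketch for verification, not part of the original exercise):

```python
from statistics import mean, median, mode

numbers = [27, 100, 15, 67, 27, 12, 44, 81, 75, 48]

print(median(numbers))  # 46.0 -> the key
print(mode(numbers))    # 27   -> distractor (median/mode confusion)
print(mean(numbers))    # 49.6 -> distractor (median/mean confusion)
print(sum(numbers))     # 496  -> distractor (the mean's numerator)
```

With an even count of values, `median` averages the two middle numbers (44 and 48), which is exactly the step a learner who has only memorized "the middle number" is likely to miss.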
An assessment of distractors can be used to improve classroom instruction, student performance and the accountability of teachers. Assessment should provide educators with information about what a student knows and what additional instruction or intervention the student requires to attain the desired learning outcomes (Linn and Gronlund, 2000; Nitko, 2004).
Multiple-choice items written using the Distractor rationale taxonomy may reveal a student’s breakdown in understanding through his or her incorrect answers (Pearson, 2010). An assessment system that incorporates this methodology can indicate a student’s instructional needs in a subject area and thereby contribute to the development of a focused intervention plan.
Contemporary authorities in educational assessment have suggested extending the functional role of distractors to include a new purpose: identifying the nature of a student’s misunderstanding. Nitko (2004) observes that the Distractor selected provides a diagnostic insight into the difficulties the student is experiencing and Popham (2000) also recognizes the potential of distractors to represent the categories of incorrect responses that students make. This allows teachers to follow up with additional classroom instruction based on the most common errors made by students.
E. L. Thorndike developed an early multiple-choice test; however, Frederick J. Kelly is credited with first using such items as part of a large-scale assessment in 1915, in the Kansas Silent Reading Test. The first all multiple-choice, large-scale assessment was the Army Alpha, used to assess the intelligence of World War I military recruits (Isaacs, 1994).
SOME OF THE MORE COMMON TYPES OF MULTIPLE CHOICE QUESTIONS
SINGLE CORRECT ANSWER
In items of the single-correct-answer variety, all but one of the alternatives are incorrect; the remaining alternative is the correct answer. The student is directed to identify the correct answer
What concept is defined as 'a brief sample of behavior obtained under standard conditions and scored according to a fixed set of rules that provide a numeric score' (Douglas, 2013)?
CLASSIFICATION
The examinee classifies a person, object, or condition into one of several categories designated in the stem:
Rev. Dr. Lewin Williams was characterized as a_________________________ Theologian based on Rev. Ropers (2012) taxonomy.
BEST ANSWER
The alternatives differ in their degree of correctness. Some may be completely incorrect and some correct, but one is clearly more correct than the others. This best alternative serves as the answer, while the other alternatives function as distractors. The student is directed to identify the best answer:
What is chiefly responsible for the increase in student registration at the University of Texas during the last four years (C. Harrison, 2013)?
A. Compulsory education for Pastors in the Missionary Church.
B. Increased desire for education among church workers.
C. The coming of Jesus Christ, which has greatly increased religious values.
D. The increased salary that comes with more qualified church workers
The examinee indicates which option does not belong with the others
Which of the following names does not belong with the others?
A. Carlene Davis
B. Prodigal Son
C. Tommy Lee
NEGATIVE
The student is directed to identify either the alternative that is an incorrect answer, or the alternative that is the worst answer. Any of the other multiple-choice varieties can be converted into this negative format.
For most educational objectives, a student's achievement is more effectively measured by identifying a correct answer rather than an incorrect answer. The ability to identify an incorrect answer does not necessarily imply knowledge of the correct answer. For this reason, items of the negative variety are not recommended for general use. Occasionally, negative items are appropriate for objectives dealing with health or safety issues, where knowing what not to do is important. For example, when your clothes are on fire, it is equally important to know what to do and what not to do.
In these situations, negative items must be carefully worded to avoid confusing the student. The negative word should be placed in the stem, not in the alternatives, and should be emphasized by using underlining, italics, bold face, or CAPITALS. In addition, each of the alternatives should be phrased positively to avoid forming a confusing double negative with the stem:
Which of the following is NOT an assumption of Testing and Measurement (Frankson 2013)?
A. Test-related behavior predicts non-test-related behavior.
B. Testing and assessment can be conducted in a biased manner
C. Various sources of data enrich the assessment process
D. Various sources of error are always part of the assessment process
The examinee must decide the correct consequence of one or more conditions being present in the stem:
If the true variance of a test increases but the error variance remains constant, which of the following will occur?
A. Reliability will increase
B. Reliability will decrease
C. Observed variance will decrease
D. Neither reliability nor observed variance will change
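The classical-test-theory relationship behind this item can be sketched briefly in Python (the language and function name are mine, added for illustration): reliability is the proportion of observed-score variance that is true-score variance, so holding error variance fixed while true variance grows must raise both reliability and observed variance.

```python
def reliability(true_var, error_var):
    """Classical test theory: reliability is the ratio of true-score
    variance to observed-score variance (true + error)."""
    return true_var / (true_var + error_var)

# Error variance held constant at 4 while true variance grows:
print(reliability(12, 4))  # 0.75
print(reliability(36, 4))  # 0.9
# Observed variance also rises (16 -> 40), so only option A is correct.
```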
The examinee uses the two or more conditions or statements listed in the stem to draw a conclusion
Given that Mary's raw score on a test is 60, the test mean is 59, and the standard deviation is 2, what is Mary's z score?
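The computation this item expects is a one-liner; a Python sketch (function name mine) makes the expected answer explicit:

```python
def z_score(raw, mean, sd):
    """Number of standard deviations by which raw lies above the mean."""
    return (raw - mean) / sd

print(z_score(60, 59, 2))  # 0.5
```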
MULTIPLE RESPONSE
In items of the multiple-response variety, two or more of the alternatives are keyed as correct answers; the remaining alternatives serve as distractors. The student is directed to identify each correct answer.
This variety of item can be scored in several different ways. Scoring on an all-or-none basis (one point if all the correct answers and none of the distractors are selected, and zero points otherwise), and scoring each alternative independently (one point for each correct answer chosen and one point for each distractor not chosen) are commonly used methods. Both methods, however, have distinct disadvantages. With the first method, a student who correctly identifies all but one of the answers receives the same score as a student who cannot identify any of the answers. The second method produces scores more representative of each student’s achievement, but most computer programs currently used with scoring machines do not include this method as an option. As a result, items of the multiple-response variety are not recommended.
Since an item of multiple-response variety is often simply a series of related true-false questions presented together as a group, a good alternative that avoids the scoring problems mentioned above is to rewrite it as a multiple true-false item.
Which of the following is/are the purpose(s) of the Rorschach?
A. Determine sexual needs
B. Examine emotional functioning
C. Determine sexuality
D. Predict Intelligence
E. Assess emotional adjustment
F. Analyze vocational choices
G. Determine academic skills
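The two scoring methods described above can be sketched in Python against the Rorschach item, assuming purely for illustration that B and E are the keyed answers (the source does not give a key):

```python
def score_all_or_none(selected, keys):
    """One point only if exactly the keyed answers are selected."""
    return 1 if selected == keys else 0

def score_per_alternative(selected, keys, distractors):
    """One point per key chosen, plus one per distractor avoided."""
    return len(keys & selected) + len(distractors - selected)

keys = {"B", "E"}                 # assumed keyed answers (illustrative)
distractors = {"A", "C", "D", "F", "G"}

student = {"B"}                   # found one of the two keys
print(score_all_or_none(student, keys))                   # 0
print(score_per_alternative(student, keys, distractors))  # 6
```

The sketch shows the flaw noted above: under all-or-none scoring, a student who found one of the two keys scores the same zero as a student who found none, while per-alternative scoring credits six of the seven judgments.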
COMBINED RESPONSE
In items of the combined-response variety, one or more of the alternatives are correct answers; the remaining alternatives serve as distractors. The student is directed to identify the correct answer or answers by selecting one of a set of letters, each of which represents a combination of alternatives. Items of the combined-response variety are lower in reliability, lower in discrimination, higher in difficulty, and equal in validity when compared with similar items of the single-correct-answer and best-answer varieties (Albanese, 1990; Haladyna & Downing, 1989b). They have also been found to be lower in reliability, higher in difficulty, and equal in validity when compared with similar multiple true-false items (Frisbie, 1990).
This variety shares the disadvantage of all-or-none scoring with the multiple-response variety and has the added disadvantage of providing clues that help students with only partial knowledge detect the correct combination of alternatives. A student can identify a combination as the correct response simply by knowing that alternatives 1 and 4 are both correct. Because of these disadvantages, items of combined-response variety are not recommended.
Like the multiple-response variety, an item of the combined-response variety is often simply a series of related true-false questions presented together as a group. A good alternative that avoids the scoring and cluing problems mentioned above is to rewrite it as a multiple true-false item.
What are the main ways that a person can contract the HIV virus (MOH, 2014)?
1. Engaging in Unprotected Oral Sex
2. Having Multiple Partners
3. Kissing and Hugging someone with HIV
4. Giving Blood.
The correct answer is:
A. 1, 2, and 3.
B. 1 and 2.
C. 2 and 4.
D. 1 only.
The examinee decides whether one, all or none of the two or more conditions or statements listed in the stem is (are) correct:
Is it true that (1) Alfred Binet was the father of intelligence testing, and (2) his first intelligence test was published in 1916?
A. Both 1 and 2
B. 1 but not 2
C. Not 1 but 2
D. neither 1 nor 2
A 45-year-old asthmatic woman who has lived all her life in Westmoreland presents with a goitre of four years' duration and clinical features suggestive of hypothyroidism. Likely diagnoses include:
A. Iodine deficiency
B. Drug-induced goitre
C. Thyroid cancer
D. Autoimmune thyroiditis
In the above question, the examiner is assessing application; this approach may be used for testing knowledge and judgement in many subjects. When grouped together, a series of true/false questions on a specific topic or scenario can test a more complex understanding of an issue. They can be structured to lead a student through a logical pathway (Brown 1997) as in the above example which simulates a medical diagnosis. Such questions may also be useful to the lecturer for diagnostic purposes, because they can reveal part of the thinking process employed by the student in order to solve the given problem.
ASSERTION-REASON
The assertion-reason item combines elements of the multiple-choice and true/false question types, and allows the examiner to test more complicated issues requiring a higher level of learning. The question consists of two statements, an assertion and a reason. The student must first determine whether each statement is true. If both are true, the student must next determine whether the reason correctly explains the assertion. There is one option for each possible outcome (Frisbie, 1990).
Each question below consists of an assertion and a reason. Indicate your answer from the alternatives below by circling the appropriate letter
A. True, True; the reason is a correct explanation.
B. True, True; the reason is NOT a correct explanation.
C. True, False
D. False, True
E. False, False
1. The blood sugar level falls rapidly after hepatectomy.
The glycogen of the liver is the principal source of blood sugar.
2. Increased government spending increases inflation under all conditions.
Government spending is not offset by any form of production.
3. Chloroform has a dipole moment.
The chloroform molecule is tetrahedral.
Assertion-reason tests can be used to explore cause and effect and identify relationships.
When writing assertion-reason questions, the following points should be considered:
1. The reason should be a free-standing sentence so that it can be considered separately from the assertion.
2. Avoid using minor reasons. These can result in an ambiguous question.
3. Repeat options A-E in full for each question.
4. Use all five options as keys equally
RELATIONS AND CORRELATES
The examinee determines the relationship between concepts 1 and 2 and indicates which of the concepts (a, b, c, d, etc) listed in the options is related to concept 3 in the same way that concepts 1 and 2 are related:
1. Mean is to standard deviation as median is to:
A. Average Deviation
B. Semi-Interquartile range
C. Correlation coefficient
GUIDELINES FOR WRITING MULTIPLE-CHOICE ITEMS
The primary objective in planning a test is to outline the actual course content that the test will cover and its alignment with course objectives. In developing good multiple-choice items, three tasks need to be considered: writing stems, writing options, and ongoing item development.
1. Writing a good multiple choice question, no matter what level of knowledge is being tested, begins with good course objectives. Course objectives need to be written in measurable terms.
2. Test for significant learning outcomes. The questions should be designed to test the learning objectives of the course, not trivia associated with the subject matter, and should be recognizably relevant to the goals of the course.
3. Present practical or real-world situations to the students aiming for application, analysis or evaluation.
4. Test higher-level cognitive domains. Rote memorization of facts, laws, and definitions has its place in the overall scheme; however, at least 90% of the test should be devoted to higher levels of cognition.
5. Before writing the stem, identify the one point to be tested by that item. In general, the stem should not pose more than one problem, although the solution to that problem may require more than one step.
6. Test for the intended intellectual skills; a question should not inadvertently become a "test within a test". Example: How many permutations are possible in overtime for Super Bowl XLVII? A student who knows how to calculate permutations and combinations (the skill being tested in the context of a Statistics course at JTS) will not be able to answer this question if he or she has never played American Football. Keep the vocabulary consistent with the students' level of understanding.
7. Pay special attention to the language used; the level of the language should be within reach of the students, as not every examinee's home language will be English. Use correct grammar throughout and avoid jargon, unless you are specifically testing terminology. Second-language students take longer to read and understand a question, and misreading the question may lead to a wrong answer.
8. Be sensitive to cultural and gender issues. Avoid turns of phrase and figures of speech that could reasonably be construed as racist or sexist, or which may have a cultural or religious bias.
9. Ask a knowledgeable colleague with expertise in the content area of the exam to review the items for possible ambiguities, redundancies or other structural difficulties. This peer-review process is critical in providing constructive criticism and improving composed items. Writing is a difficult task; preparing good multiple-choice items is a scholarly activity that demands time, clarity of thought, and precision in expression. Students read test items more carefully than they read anything else, so all flaws and imperfections will be exposed.
10. Instruct students to select the "best answer" rather than the "correct answer". This acknowledges that the distractors may have an element of truth to them, and it discourages arguments from students who claim that their answer is correct as well. Such questions also tend to be more difficult and discriminating than questions that merely ask for a fact.
11. The time allotted for the exam should be sufficient for students to review and revise their answers.
These guidelines should be observed when constructing the stems and options of high-quality multiple-choice items (Aiken, 2006). This list is not meant to be exhaustive but outlines some of the major guidelines:
1. Either a question or an incomplete statement may be used as the stem, but the question format is preferred. Place blanks in incomplete statement stems at the end. Construct the stem to be a complete standalone question, avoiding stereotyped phraseology, as rote responses are usually based on verbal stereotypes.
2. State the specific problem of the question or incomplete statement clearly, simply and as concretely as possible in the stem, at a reading level appropriate for the examinees, but avoid taking questions or statements verbatim from textbooks. Avoid vague generalizations and do not include irrelevant information. It is essential that students know exactly what is expected of them: BE CLEAR. Without sacrificing clarity, be as concise and focused as possible. The purpose is to measure students' knowledge, reasoning, and ability, not to engage in verbal gamesmanship. The idea is to discriminate levels of understanding, not to trap the unwary. Write questions that cannot be misunderstood, not merely questions that can be understood.
3. Avoid including non-functional information or words that do not contribute to the basis for choosing among the options. Often an introductory statement is included to enhance the appropriateness or significance of an item but does not affect the meaning of the problem in the item. All superfluous phrases should be excluded:
The flag of Jamaica, which was adopted on August 6, 1962, when the country gained independence from Britain, consists of three colours. What are they?
Irrelevant material should not be used to make the answer less obvious. This tends to place too much importance on reading comprehension as a determiner of the correct option
Poor: "The presence and association of the male seems to have profound effects on female physiology in domestic animals. Research has shown that in cattle the presence of a bull has the following effect:"
Better: "Research has shown that the presence of a bull has which of the following effects on cows?"
4. Don't include superfluous information in the options. This is another manifestation of the desire to teach while testing, and the additional information is likely to appear in the correct answer. Examinees prefer less to read and more direct questions.
5. Place as much of the item as possible in the stem. The stem should contain most of the wording in order to reduce the reading load; it is inefficient to repeat the same words in every option, and examinees have less difficulty with shorter options. For example, if the point of an item is to associate a term with its definition, the preferred format is to present the definition in the stem and several terms as options, rather than to present the term in the stem and several definitions as options.
6. Employ opinion questions sparingly; when they are used, cite the authority or source of the opinion.
7. Four or five options are typical, but good items having only two or three options can also be written. With students in the lower grades, three options are preferable to four or five. Four well-constructed options are sufficient, as the minimal improvement to the item from a hard-to-come-by fifth option is rarely worth the effort of constructing it. Empirical evidence suggests that a test of 10 items, each with four options, is likely a better test than a test of nine items with five options each.
8. There is no psychometric advantage to having a uniform number of options, especially if doing so results in options that are so implausible that no one or almost no one marks them. Several valid and important questions demand only three options:
After receiving pre-marital counselling the relationship between two persons will (Douglas 2013):
B. Stay about the same,
9. If the options have a natural order, such as dates or ages, it is advisable to arrange them accordingly; otherwise arrange the options in random or alphabetical order (if alphabetizing does not give clues to the correct answer).
10. Use familiar language. The question should use the same terminology that was used in the course. Avoid using unfamiliar expressions or foreign language terms, unless measuring knowledge of such language is one of the goals of the question. Students are likely to dismiss distractors with unfamiliar terms as incorrect.
11. Write in the active voice. The active voice is used in a clause whose subject expresses the agent of the main verb; that is, the subject does the action designated by the verb. A sentence whose agent is marked as the grammatical subject is called an active sentence. In contrast, a sentence in which the subject has the role of patient or theme is called a passive sentence, and its verb is expressed in the passive voice.
12. Make all options approximately equal in length and complexity, grammatically correct, and appropriate in relation to the stem. However, do not let the stem give away the correct option through verbal associations or other clues. Avoid irrelevant clues to the correct option. Grammatical construction, for example, may lead students to reject options that are grammatically incorrect as the stem is stated. Perhaps more common and subtle, though, is the problem of common elements in the stem and in the answer. Consider the following item:
What led to the formation of the States’ Rights Party?
A. The level of federal taxation
B. The demand of states for the right to make their own laws
C. The industrialization of the South
D. The corruption of federal legislators on the issue of state taxation
13. Make all options plausible to examinees who do not know the correct answer, but make only one option correct or best. Popular misconceptions or statements that are only partially correct make good distractors. Options should be independent of one another.
14. In constructing each distractor, formulate a reason why an examinee who does not know the correct answer might select that distractor.
15. Avoid, or at least minimize, the use of negative expressions such as "not" in either the stem or the options. If this cannot be done, the negative words should always be highlighted by underlining or capitalisation. Negatives in the stem usually require that the answer be a false statement. Because students are likely in the habit of searching for true statements, this may introduce an unwanted bias.
16. A certain amount of novelty, and even humour, is appropriate and may serve to interest and motivate examinees, but ambiguous or tricky stems and options should not be used.
17. Use "none of the above", "all of the above", or "more than one of the above" sparingly, and avoid specific determiners such as "always" or "never". Recognition of one wrong option eliminates "all of the above", and recognition of two right options identifies it as the answer, even if the other options are completely unknown to the student. Some instructors use items with "all of the above" as yet another way of extending their teaching into the test: it just seems so good to have the students affirm, say, all of the major causes of some phenomenon. With this approach, "all of the above" is the answer to almost every item containing it, and the students soon figure this out. "None of the above" may be used as the final option, especially if the answer requires computation. Its use makes the question harder and more discriminating, because the uncertain student cannot focus on a set of options that must contain the answer. Of course, "none of the above" cannot be used if the question requires selection of the best answer, and it should not be used following a negative stem. It is also important that "none of the above" be the answer to a reasonable proportion of the questions containing it. Specific determiners in distractors are often a desperate effort to produce another, unneeded distractor; a statement is made incorrect by the inclusion of words like "all" or "never", e.g., "All humans have 46 chromosomes." Students learn to classify such statements as distractors even when otherwise ignorant.
18. Place the options in stacked (paragraph) format rather than in tandem (back to back), using numbers to designate items and letters for options. Format the questions vertically, not horizontally (i.e., list the choices vertically).
19. Prepare the right number of items for the grade or age level to be tested, making each item independent of other items.
20. Construct each item to assess a single written objective. Items that are not written with a specific objective in mind often end up measuring lower-level objectives exclusively, or covering trivial material that is of little educational worth.
21. Make the difficulty level of each item such that the percentage of examinees who answer it correctly is approximately halfway between the chance (random guessing) percentage and 100 percent, i.e. 50(k+1)/k percent, where k is the number of options per item. The ideal question will be answered correctly by 60-65% of the tested population. This level of difficulty maximizes discrimination on exams. In the sciences at least, it can be an adventure to write items that are this easy. Instructors tend to overestimate student abilities, and many item writers use their own capabilities as a yardstick.
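Guideline 21's difficulty target can be checked numerically for common option counts; the short Python sketch below (the function name is mine, added for illustration) takes the midpoint between the chance level and 100 percent:

```python
def ideal_difficulty(k):
    """Percent-correct target halfway between the chance level
    (100/k percent for k options) and 100 percent: 50(k+1)/k."""
    return (100 / k + 100) / 2

for k in (2, 3, 4, 5):
    print(k, ideal_difficulty(k))
# Four options give 62.5, consistent with the 60-65% target above.
```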
22. Avoid typographical errors and overlapping responses. A test-wise but ignorant student will select Edward Seaga in the example below because he represents the intersection of several categories: Prime Minister, framer of the Constitution, and cultural icon. Some item writers consciously or unconsciously construct items of this type, with the intersection invariably the correct answer.
A. Edward Seaga
B. David Coore
C. Portia Simpson Miller
D. Bob Marley
An examiner can compensate for ambiguities and mis-phrasings when grading numerical problems and essays; for multiple-choice items, however, there must be a rigid application of grammar and logic for questions to be useful. It takes considerable time and thought to construct a good multiple-choice item. Writing well-phrased stems with plausible foils is hardly ever easy. The guidelines presented here must be supplemented with practical experience. Following the construction of the item stem, the likely more difficult task of generating options presents itself.
The challenge of ensuring that a question is grammatically correct is reinforced by the excerpt "Rules of English" published in the Chronicle of Higher Education (May 19, 1982), in which each rule deliberately violates itself:
1. Don’t use no double negatives.
2. Make each pronoun agree with their antecedent.
3. Join clauses good, like a conjunction should.
4. When dangling watch them participles.
5. About them sentence fragments.
6. Verbs has to agree with their subject.
7. Just between you and I, case is important, too.
8. Don’t write run-on sentences they are hard to read.
9. Don’t use commas, which are not necessary.
10. Try to not ever split infinitives.
11. Its important to use your apostrophe’s correctly.
12. Proof read your writing to see if any words out.
ADVANTAGES AND LIMITATIONS OF MULTIPLE-CHOICE ITEMS
There are several advantages to multiple-choice tests. Well-written MCQs are very effective assessment techniques. Reliability improves with larger numbers of items on a test, and with good sampling and care over case specificity, overall test reliability can be further increased. Multiple-choice tests often require less time to administer for a given amount of material than tests requiring written responses, which allows a more comprehensive evaluation of the candidate's extent of knowledge.
Multiple-choice questions lend themselves to the development of objective assessment items: because this style of test does not require a teacher to interpret answers, test-takers are graded purely on their selections, creating a lower likelihood of teacher bias in the results. Factors irrelevant to the assessed material (such as handwriting and clarity of presentation) do not come into play, so the candidate is graded purely on knowledge of the topic. Multiple-choice tests are the strongest predictors of overall student performance compared with other forms of evaluation, such as in-class participation, case exams, written assignments, and simulation games.
They are, however, not a panacea; they have advantages and limitations just as any other type of test item does. Teachers need to be aware of these characteristics in order to use multiple-choice items effectively.
1. Versatility. Multiple-choice test items are appropriate for use in many different subject-matter areas, and can be used to measure a great variety of educational objectives. They are adaptable to various levels of learning outcomes, from simple recall of knowledge to more complex levels, such as the student’s ability to:
• Apply principles to new situations
• Comprehend concepts and principles
• Discriminate between fact and opinion
• Interpret cause-and-effect relationships
• Interpret charts and graphs
• Judge the relevance of information
• Make inferences from given data
• Solve problems
The difficulty of multiple-choice items can be controlled by changing the alternatives, since the more homogeneous the alternatives, the finer the distinction the students must make in order to identify the correct answer. Multiple-choice items are amenable to item analysis, which enables the teacher to improve the item by replacing distractors that are not functioning properly. In addition, the distractors chosen by the student may be used to diagnose misconceptions of the student or weaknesses in the teacher’s instruction.
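The item-analysis statistics mentioned above can be sketched as follows: a difficulty index (proportion correct), an upper-lower discrimination index, and counts of which distractors are actually drawing examinees. The function and its interface are illustrative, not a standard API:

```python
from collections import Counter

def item_analysis(responses, key, total_scores):
    """responses: each examinee's chosen option for one item;
    key: the correct option; total_scores: each examinee's total test score.
    Returns (difficulty p, upper-lower discrimination D, distractor counts).
    Distractors chosen by almost no one are not functioning and are
    candidates for replacement."""
    n = len(responses)
    p = sum(r == key for r in responses) / n
    # Rank examinees by total score; compare the top and bottom groups
    # (the 27% grouping used here is a common convention, not the only one).
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    g = max(1, round(0.27 * n))
    upper = sum(responses[i] == key for i in ranked[:g]) / g
    lower = sum(responses[i] == key for i in ranked[-g:]) / g
    return p, upper - lower, Counter(r for r in responses if r != key)
```

A positive discrimination index means strong students answer the item correctly more often than weak students; an index near zero (or negative) flags an item worth rewriting.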
2. Validity. In general, it takes much longer to respond to an essay test question than it does to respond to a multiple-choice test item, since the composing and recording of an essay answer is such a slow process. A student is therefore able to answer many multiple-choice items in the time it would take to answer a single essay question. This feature enables the teacher using multiple-choice items to test a broader sample of course content in a given amount of testing time. Consequently, the test scores will likely be more representative of the students’ overall achievement in the course.
3. Reliability. Well-written multiple-choice test items compare favorably with other test item types on the issue of reliability. They are less susceptible to guessing than are true-false test items, and therefore capable of producing more reliable scores. Their scoring is more clear-cut than short-answer test item scoring because there are no misspelled or partial answers to deal with. Since multiple-choice items are objectively scored, they are not affected by scorer inconsistencies as are essay questions, and they are essentially immune to the influence of bluffing and writing ability factors, both of which can lower the reliability of essay test scores.
4. Efficiency. Multiple-choice items are amenable to rapid scoring, which is often done by scoring machines. This expedites the reporting of test results to the student so that any follow-up clarification of instruction may be done before the course has proceeded much further. Essay questions, on the other hand, must be graded manually, one at a time.
The most serious disadvantage is the limited types of knowledge that can be assessed by multiple choice tests. Multiple choice tests are best adapted for testing well-defined or lower-order skills. Problem-solving and higher-order reasoning skills are better assessed through short-answer and essay tests.
Multiple choice tests are often chosen, not because of the type of knowledge being assessed, but because they are more affordable for testing a large number of students.
Another disadvantage of multiple choice tests is possible ambiguity in the examinee’s interpretation of the item. Failing to interpret information as the test maker intended can result in an “incorrect” response, even if the taker’s response is potentially valid. The term “multiple guess” has been used to describe this scenario because test-takers may attempt to guess rather than determine the correct answer. A free response test allows the test taker to make an argument for their viewpoint and potentially receive credit.
In addition, even if students have some knowledge of a question, they receive no credit for knowing that information if they select the wrong answer and the item is scored dichotomously. Free-response questions, by contrast, may allow an examinee to demonstrate partial understanding of the subject and receive partial credit. Additionally, if more questions on a particular topic are asked to create a larger sample, then statistically the examinee's level of knowledge of that topic will be reflected more accurately in the number of correct answers and in the final results.
Another disadvantage of multiple choice examinations is that a student who is incapable of answering a particular question can simply select a random answer and still have a chance of receiving a mark for it. It is common practice for students with no time left to give all remaining questions random answers in the hope that they will get at least some of them right.
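The payoff from blind guessing described above is easy to quantify: each random guess succeeds with probability 1/k for k options. A small sketch (the function name and interface are hypothetical):

```python
def expected_guess_score(n_items: int, n_options: int,
                         mark_per_item: float = 1.0) -> float:
    """Expected marks earned by guessing randomly on n_items questions,
    each with n_options choices and one correct answer."""
    return n_items * mark_per_item / n_options

# A student who randomly answers 10 remaining four-option questions
# earns 2.5 marks on average.
print(expected_guess_score(10, 4))  # 2.5
```

This expectation is the rationale behind formula scoring (penalizing wrong answers), which some examiners use to discourage blind guessing.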
Additionally, it is important to note that questions phrased ambiguously may cause test-taker confusion. It is generally accepted that multiple choice questions allow for only one answer, where the one answer may encapsulate a collection of previous options. However, some test creators are unaware of this and might expect the student to select multiple answers without being given explicit permission, or providing the trailing encapsulation options. Of course, untrained test developers are a threat to validity regardless of the item format.
1. Versatility. Since the student selects a response from a list of alternatives rather than supplying or constructing a response, multiple-choice test items are not adaptable to measuring certain learning outcomes, such as the student’s ability to:
• Articulate explanations
• Display thought processes
• Furnish information
• Organize personal thoughts
• Perform a specific task
• Produce original ideas
• Provide examples
Such learning outcomes are better measured by short answer or essay questions, or by performance tests.
2. Reliability. Although they are less susceptible to guessing than true-false test items, multiple-choice items are still affected to a certain extent. This guessing factor reduces the reliability of multiple-choice item scores somewhat, but increasing the number of items on the test offsets the reduction.
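The offsetting effect of adding items can be estimated with the Spearman-Brown prophecy formula; the text does not name the formula, so treating it as the mechanism here is an assumption:

```python
def spearman_brown(r: float, length_factor: float) -> float:
    """Predicted reliability after lengthening a test by length_factor
    (e.g. 2.0 = doubling the number of comparable items), given the
    current reliability r. Spearman-Brown prophecy formula."""
    return (length_factor * r) / (1.0 + (length_factor - 1.0) * r)

# Doubling a test with reliability 0.70 raises it to about 0.82.
print(round(spearman_brown(0.70, 2.0), 2))  # 0.82
```

The formula assumes the added items are comparable in quality and content to the existing ones; padding a test with weak items will not deliver the predicted gain.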
3. Difficulty of Construction. Good multiple-choice test items are generally more difficult and time-consuming to write than other types of test items. Coming up with plausible distractors requires a certain amount of skill. This skill, however, may be increased through study, practice, and experience.
Deciding When Multiple-Choice Items Should Be Used
In order for scores to accurately represent the degree to which a student has attained an educational objective, it is essential that the form of test item used in the assessment be suitable for the objective. Multiple-choice test items are often advantageous to use, but they are not the best form of test item for every circumstance. In general, they are appropriate to use when attainment of the educational objective can be measured by having the student select his or her response from a list of several alternative responses.
One of the reasons why some teachers dislike multiple-choice items is that they believe these items are only good for measuring simple recall of facts. This misconception is understandable, because multiple-choice items are frequently used to measure lower-level objectives, such as those based on knowledge of terms, facts, methods, and principles. The real value of multiple-choice items, however, is their applicability in measuring higher-level objectives, such as those based on comprehension, application, and analysis.
Checklist for Reviewing Multiple-Choice Items
1. Has the item been constructed to assess a single written objective?
2. Is the item based on a specific problem stated clearly in the stem?
3. Does the stem include as much of the item as possible, without including irrelevant material?
4. Is the stem stated in positive form?
5. Are the alternatives worded clearly and concisely?
6. Are the alternatives mutually exclusive?
7. Are the alternatives homogeneous in content?
8. Are the alternatives free from clues as to which response is correct?
9. Have the alternatives “all of the above” and “none of the above” been avoided?
10. Does the item include as many functional distractors as are feasible?
11. Does the item include one and only one correct or clearly best answer?
12. Has the answer been randomly assigned to one of the alternative positions?
13. Is the item laid out in a clear and consistent manner?
14. Are the grammar, punctuation, and spelling correct?
15. Has unnecessarily difficult vocabulary been avoided?
16. If the item has been administered before, has its effectiveness been analyzed?
Item writing checklist
1. Is the item clear and concise?
2. Did you use the active voice?
3. Did you avoid “ould” words (could, should, would)?
4. Is the difficulty level acceptable?
5. Does the stem pose a question or an incomplete thought?
6. If you used blanks, are they at the end of the stem?
7. Does the stem focus on a significant or important aspect?
8. Did you emphasize the NEGATIVES?
9. Have you avoided keying the answer in the stem?
10. Are the distractors plausible?
11. Is there only one arguable correct response?
12. Are the foils homogeneous?
13. Did you avoid overlapping foils?
14. Are numerical foils in either ascending or descending order?