CHAPTER 3 Literature study: multiple choice questions (MCQs) in e-assessment

Chapter 2 was a literature study that commenced with a discussion on assessment in general, before focusing specifically on e-assessment. Related definitions, features, components and benefits, as well as the disadvantages and constraints associated with e-assessment, were presented. Thereafter, a brief discussion was presented on international adoption patterns associated with e-assessment. One of the key features of e-assessment, as outlined in Section 2.5.3, is that it relies primarily on multiple choice questions (MCQs) in the assessments designed.
Hence, this chapter concentrates on this feature – MCQs. Section 3.1 is dedicated to detailed aspects of these questions. In Sections 3.1.1 and 3.1.2 respectively, the benefits and drawbacks of MCQs are outlined. Since MCQs are sometimes regarded as suitable only for lower-level thinking, Section 3.1.3 discusses how MCQs can be created to test higher order thinking skills (HOTS). A comprehensive discussion of the varying types of MCQs, together with examples of each, is presented in Sections 3.1.4 and 3.1.5.
Section 3.2 shifts the focus to the criteria essential for inclusion in a framework for evaluating e-assessment systems under consideration for adoption. This framework subsequently provides the foundation for the fourth study in the series of studies conducted in this research (see Chapter 6, Section 6.1), which contribute to the development of SEAT and, finally, e-SEAT.

Multiple choice questions (MCQs)

e-Assessments are dominated by standard multiple choice questions (MCQs), which generally have the format of a prompt followed by a small set of responses from which students are expected to select the best option (Scalise & Gifford, 2006). The prompt/test item consists of a question or stem, the correct answer, and a set of distractors. MCQs have proven to be efficient in measuring students’ achievement and are adopted internationally, both for assessment and diagnostics (Mitkov, Ha & Karamanis, 2006). The reason for the widespread adoption of MCQs is threefold: they can efficiently cover the educational content; they have a high degree of reliability; and they are easy to score (Kadhi, 2004; Roberts, 2006). The previous chapter pointed out that e-assessment can play valuable roles in both formative and summative assessment, and this is particularly true for MCQs.
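By way of illustration, the short Python sketch below shows one possible representation of the item structure just described (stem, key and distractors) together with the dichotomous automated marking that makes MCQs easy to score. The example item and all names are hypothetical and are not drawn from any of the cited studies.

```python
from dataclasses import dataclass

@dataclass
class MCQItem:
    """A standard MCQ test item: a stem, the single correct answer (key),
    and a set of plausible but definitely wrong distractors."""
    stem: str
    key: str
    distractors: list[str]

def auto_mark(item: MCQItem, selected: str) -> int:
    """Dichotomous automated marking: 1 mark if the selected option is the key, else 0."""
    return 1 if selected == item.key else 0

# Hypothetical item, invented purely for illustration.
item = MCQItem(
    stem="Which data structure offers O(1) average-time lookup by key?",
    key="Hash table",
    distractors=["Linked list", "Binary search tree", "Stack"],
)
print(auto_mark(item, "Hash table"))   # 1
print(auto_mark(item, "Stack"))        # 0
```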
MCQs offer the best assessment tool where student numbers are large and educators have limited resources (Bani-Ahmad & Audeh, 2010), since a large number of tests can be corrected automatically (Costagliola & Fuccella, 2009) within a short period of time (Mendes, Curto & Coheur, 2011; Ventouras et al., 2010). MCQs were adopted early in medical assessments and are used frequently in computer programming courses (Byrnes & Ellis, 2006). Medical education pioneered electronic testing because, as students progress in their medical studies, they spend more time on practical procedures off-campus and it becomes increasingly difficult to gather them in a single venue for a test. More recently, MCQs have been widely adopted in medical, dental and allied health education due to their ability to assess a broad range of knowledge in a short period of time, and are thus favoured over methods such as short answer, essay and oral examination formats (Ware, Kattan, Siddiqui & Mohammed, 2014).
Multiple choice testing is recommended for testing factual recognition, which represents only the basic level of professional competence. As a consequence, standard multiple choice testing has limited applicability in modern competency-based education (Karl et al., 2011) and in assessing critical thinking. Nevertheless, standard MCQs are commonly adopted in many domains and are used as primary assessment tools (Beullens, van Damme, Jaspaert & Janssen, 2002). However, newer and improved MCQ formats and techniques have emerged (Osika, 2006; Prestera, Clariana & Peck, 2005) and are addressed in detail in Section 3.1.5, along with examples.
If constructed properly and written well, MCQs can be an effective tool for assessing students’ skills and knowledge; written badly, they can be confusing and demotivating (Alton, 2009). Educators generally find MCQs difficult to create, despite the range of MCQ styles that can be adopted, and this difficulty is often due to inexperience. Furthermore, the process of generating plausible, yet definitely wrong, distractors is complex and time-consuming (Mitkov & Ha, 2003).
MCQs offer various advantages in addition to the benefits of e-assessment outlined in Section 2.5.7. Section 2.5.7 considered the benefits of e-assessment in general, while the next section, Section 3.1.1, is dedicated to MCQs in particular. Similarly, Section 3.1.2 addresses the distinct drawbacks of MCQs.

Benefits of MCQs

MCQs offer the possibility of covering a broad set of topics (Mendes, Curto & Coheur, 2011), since the questions are usually short.
Although writing good-quality MCQs for e-assessment can be time-consuming, the benefits are not restricted to student performance and automated marking in routine assessment; time is also saved because academics are released from creating and arranging aegrotat or supplementary assessments for students who missed a test or examination for a valid reason.
MCQs can be drawn from standardised item banks, which, in some disciplines and topics, are freely available; however, these questions may need to be contextualised to meet the students’ local linguistic features and the concepts being assessed (Luckett & Sutherland, 2000). In most cases, question banks are custom-built by the educators presenting a particular module.
Item analysis and item response theory (IRT) allow educators to evaluate the quality of their MCQs in terms of difficulty and discriminative capacity (Costagliola & Fuccella, 2009). Provided that MCQs have been correctly formulated by the educator, selection of the correct option requires specialised knowledge on the part of students, a grasp of detail, quick responses, and decision-making skills, taking into account that, in some cases, specified time durations might be predetermined for answering the set of questions (Ventouras et al., 2010). Well-designed MCQs offer the further cognitive advantages of lending themselves to the verification of knowledge, comprehension of concepts, and achievement of course objectives (Costagliola & Fuccella, 2009).
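To make the two indices mentioned above concrete, the sketch below computes item difficulty and discrimination from a 0/1 response matrix using classical test theory rather than full IRT. The data, the upper/lower 27% grouping convention and the function name are illustrative assumptions, not taken from the studies cited.

```python
import numpy as np

def item_analysis(responses: np.ndarray) -> list[dict]:
    """Classical item analysis for a dichotomously scored response matrix.

    responses: shape (n_students, n_items), 1 = correct, 0 = incorrect.
    Returns, per item, the difficulty index (proportion correct) and the
    discrimination index (difference in proportion correct between the
    top and bottom 27% of students ranked by total score).
    """
    n_students, n_items = responses.shape
    totals = responses.sum(axis=1)
    order = np.argsort(totals)                     # students ranked by total score
    k = max(1, int(round(0.27 * n_students)))      # conventional 27% group size
    bottom, top = order[:k], order[-k:]

    results = []
    for j in range(n_items):
        difficulty = responses[:, j].mean()
        discrimination = responses[top, j].mean() - responses[bottom, j].mean()
        results.append({"item": j,
                        "difficulty": round(float(difficulty), 2),
                        "discrimination": round(float(discrimination), 2)})
    return results

# Fabricated 6-student x 3-item matrix, purely to show the calculation.
data = np.array([[1, 1, 0],
                 [1, 0, 0],
                 [1, 1, 1],
                 [0, 0, 0],
                 [1, 1, 1],
                 [0, 1, 0]])
print(item_analysis(data))
```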
After the assessment, the student may print the results and see his/her final score and mistakes, provided adequate feedback is given (Ventouras et al., 2010). Students appreciate timely feedback that is well presented and simple to understand; such feedback assists their revision for further study (Malau-Aduli et al., 2013).
Scoring/marking of assessments of MCQ format is objective (Ventouras et al., 2010), since it is free from bias and distortional effects, such as emotional judgments (Costagliola & Fuccella, 2009). Ventouras et al. (2010) indicate that multiple choice questions (MCQs) provide higher reliability and are as valid as constructed-response questions.
Essentially, they are useful for quickly identifying a student’s understanding of a field, but are also useful for revision purposes via formative assessment, because of the speed and accuracy with which they can be assessed, especially through automated marking systems (Elstein, 1993; Farthing, Jones & McPhee, 1998).

Drawbacks associated with MCQs

Despite their widespread adoption, the value of MCQs in educational spheres has often been criticised, as they are viewed as tests of factual recall, as mentioned in Section 3.1. Many MCQs do not assess application of knowledge for problem solving, partly because of the way in which questions are constructed. Due to the somewhat artificial way in which they are marked, whereby a single answer needs to be selected, this form of testing is considered to be unnatural (Engelbrecht & Harding, 2003). In the real world, one is never faced with a problem and five possible solutions, with one viewed as exclusively correct. Furthermore, MCQs have been widely criticised as being artificial and unrelated to authentic practice, since reasoning and problem-solving skills are difficult to measure in MCQ format. According to Fenderson et al. (1997: 526), they tend to focus on ‘recall of trivia’.
Although MCQs have been adopted since the early 1900s, considerable skill, care and practice is required when writing them, to avoid confusing students with unanswerable questions or poor alternative answers (Alton, 2009). Engelbrecht and Harding (2003) point out that if a question is not clear to students, they may select an incorrect option due to a misleading question and not due to insufficient knowledge. One of the main challenges in constructing an MCQ test item is the selection of plausible distractors that will better distinguish confident test takers from unconfident ones (Mitkov et al., 2006).
Incorrect selection is not always the result of a student’s lack of knowledge or understanding, or confusion. It is possible that students have a correct understanding but make minor errors in selection (Fenderson et al., 1997). Hence, some researchers hold that MCQs should make provision for partial credit, without treating minor and major mistakes as equal. A word of caution though – ‘if an assessment always provides credit for partially correct answers, students can pass the entire module without having understood any concepts fully’ (Engelbrecht & Harding, 2003: 57).
Despite the convenience offered by the MCQ format, such assessments can be regarded as low on validity if they assess trivial knowledge only (Luckett & Sutherland, 2000).
MCQ assessments sometimes encourage ‘poor attitudes toward learning and incorrect inferences about its purposes … for example, that there is only one right answer, that the right answer resides in the head of the educator or test maker, and that the job of the student is to get the answer by guessing’ (Bleske-Rechek, Zeug & Webb, 2007: 94). Some theorists argue that MCQs presume that complex skills can be decomposed and decontextualised, which is not always possible. Rather, MCQs rely on well-structured problems with algorithmic solutions. Thus students have the impression that knowledge is additive rather than integrative (Bleske-Rechek, Zeug & Webb, 2007; Scalise & Gifford, 2006).
MCQs pose a limitation on the kinds of questions that can be adopted. The nature of MCQs means that the student is judged solely on the correctness of the answer he/she chooses and not on the method used to reach that answer (Ventouras et al., 2010). Moreover, they do not allow the educator to investigate in depth whether the topic addressed by a specific question has been fully understood (Ventouras et al., 2010).
Guessing allows a student to gain marks in the final score by answering questions by chance, without possessing knowledge of the material being examined. By guessing, it is usually possible to get some questions right; without negative marking, students gain marks for correct answers and lose none for omissions (Ventouras et al., 2010). Certain systems offer an option for mixed scoring – that is, combined negative and positive marking – whereby students gain marks for correct answers and lose marks for incorrect answers. A study by Ventouras et al. (2010) showed that students are less willing to answer such questions than MCQs based only on positive scoring rules (Bleske-Rechek, Zeug & Webb, 2007). Thus mixed-scoring rules might have a ‘hampering’ effect on students, dissuading them from tackling a question for which they may possess an intermediate level of knowledge.
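The contrast between the two scoring rules can be illustrated with a short worked sketch. It uses the classical correction-for-guessing penalty of 1/(k-1) marks per wrong answer; actual penalty weights vary between institutions and systems, so the figures below are assumptions for illustration only.

```python
def positive_only_score(num_correct: int, mark_per_item: float = 1.0) -> float:
    """Positive-only scoring: marks gained for correct answers, nothing lost otherwise."""
    return num_correct * mark_per_item

def mixed_score(num_correct: int, num_wrong: int, num_options: int,
                mark_per_item: float = 1.0) -> float:
    """Mixed (negative and positive) marking with a 1/(k-1) penalty per wrong answer,
    where k is the number of options per question. Omitted items score zero."""
    penalty = mark_per_item / (num_options - 1)
    return num_correct * mark_per_item - num_wrong * penalty

# A student who guesses blindly on 20 four-option items expects about 5 correct and 15 wrong:
print(positive_only_score(5))   # 5.0 -- marks gained purely by chance
print(mixed_score(5, 15, 4))    # 0.0 -- the expected advantage of blind guessing is removed
```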
Writing good MCQs is not an easy task. Creating good-quality questions that assess more than simple learning outcomes can be a time-consuming and labour-intensive process for educators (Mendes, Curto & Coheur, 2011; Pittenger & Lounsbery, 2011). In addition, updating the questions in an MCQ bank requires much time and effort, and often results in inconsistent quality when questions are provided by different educators (Pittenger & Lounsbery, 2011).
Multiple choice assessment approaches are often criticised for not facilitating active learning, because they provide students with a list of choices rather than requiring them to actively identify the correct choice and explain or justify why it is best, as they would be required to do in real life or in a written assessment (Pittenger & Lounsbery, 2011).

MCQs for higher order thinking skills (HOTS)

Research shows that it is possible to test higher-order thinking through well-developed and researched MCQs, but this requires considerable skill, practice and time on the part of the educator (Luckett & Sutherland, 2000; Mitkov & Ha, 2003). As a result of the extra time taken to write high-quality MCQs, some educators tend to administer fewer e-assessments than anticipated in their planning (Catley, 2004).
e-Assessment is often thought of as utilising only quiz tools (Mostert et al., 2012), implying that it is best suited to recall-type multiple choice questions. However, the application of MCQs to assess higher order thinking skills (HOTS) is becoming increasingly common. Although in most cases MCQs focus on factual knowledge, they may be discerningly designed to assess HOTS as well as knowledge. Mostert, De Bruyn and Pretorius (2012: 18) indicate that at the University of Pretoria (UP) in South Africa, ‘it has been proven over a period of approximately 20 years … that e-assessment can be used effectively to enhance student learning through assessing on higher level cognitive levels of Bloom’s taxonomy, in different disciplines’.
Essentially, MCQs, in whatever format, have proven to be an efficient tool for measuring students’ achievement, but are best used in combination with other assessment methods (Luckett & Sutherland, 1997; Mitkov et al., 2006).

Types of questions supported by e-assessment tools

This section briefly discusses the most common types of questions supported by e-assessment tools. This serves as an introduction to the comprehensive discussion on the many MCQ types described and illustrated with examples in Section 3.1.5.
In general assessment, questions can be classified into two categories, namely, Constructed Response Questions (CRQs) and Provided Response Questions (PRQs).
• CRQs require students to construct their own response to questions posed. This category includes open-ended written questions, essays, projects, short-answer questions (paper-based or online), free-response and paper assignments, that is to say, the forms used in traditional assessments.
• PRQs allow students to choose between a selection of given responses; hence PRQs are best suited to e-assessment. The main criticism is that the rigidity of the marking allocations in PRQs does not allow for flexibility; for example, it is not possible to allocate a proportionate mark.
The most common question types adopted with PRQs are multiple choice questions (MCQs), multiple response questions (MRQs), matching questions, and hotspot questions. However, a wider variety of question types can be used in online tests, including: multiple choice, true/false, true/false with reason, matching, ordering, fill-in-the-blanks, completing and correcting code, and writing new code (in the context of computer programming) (Byrnes & Ellis, 2006; Costagliola & Fuccella, 2009; Souali et al., 2011). As explained in the introduction to Section 3.1, these newer and improved formats and techniques are addressed in detail in Section 3.1.5, along with examples to illustrate each type.
Most of the questions included in online tests are characterised by a closed stimulus and response; tests comprising only such questions are called objective tests (Costagliola & Fuccella, 2009). As stressed by many authors, e-assessment supports a variety of objective question types that can be adopted (Alessi & Trollip, 2001; Byrnes & Ellis, 2006; Costagliola & Fuccella, 2009; Engelbrecht & Harding, 2003; Fielding & Bingham, 2003; Horton & Horton, 2003; Khedo, 2005; Lambert, 2004; Maurice & Day, 2004; Sim et al., 2004; Souali et al., 2011). The various formats of MCQs are discussed and illustrated in Section 3.1.5.
The most common question type adopted in e-assessment is straight multiple choice – already described in this chapter – where the student has to choose the correct option from a list of possible answers (Souali et al., 2011). Students respond by pointing and clicking the mouse on the selected answer, by moving objects around the screen, by entering numbers or characters via the keyboard, or by pointing gestures on tablets. The software can react with an appropriate result and, frequently, textual feedback. Some sophisticated programs select subsequent questions based on the answers the student gave previously (adaptive testing) – setting more demanding questions if answers are correct, or easier questions on the same topic if answers are incorrect (Souali et al., 2011). The incorporation of such questions can strengthen e-assessment, as addressed in Section 3.1.3, which reports that MCQs can also be adopted for assessing HOTS. Since e-assessment implies the use of computers to deliver assessments, as well as to mark and analyse students’ responses, it should therefore support MCQs as well as short-answer response questions (Mostert et al., 2012).
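A minimal sketch of the adaptive principle just described is given below. It uses a simplistic move-up/move-down rule over a small hypothetical question bank; production systems typically rely on IRT-based ability estimates rather than this rule, so the bank, levels and function are assumptions made purely for illustration.

```python
import random

# Hypothetical question bank keyed by difficulty level (1 = easiest ... 5 = hardest).
# Each entry would be a full MCQ item on the same topic; strings stand in for items here.
bank = {level: [f"topic question at difficulty {level} (#{i})" for i in range(3)]
        for level in range(1, 6)}

def next_question(current_level: int, last_answer_correct: bool) -> tuple[int, str]:
    """Very simple adaptive rule: move one difficulty level up after a correct
    answer and one level down after an incorrect answer, staying within 1..5."""
    if last_answer_correct:
        new_level = min(current_level + 1, 5)
    else:
        new_level = max(current_level - 1, 1)
    return new_level, random.choice(bank[new_level])

level = 3                                                     # start at medium difficulty
level, q = next_question(level, last_answer_correct=True)     # a harder question follows
print(level, q)
level, q = next_question(level, last_answer_correct=False)    # an easier question follows
print(level, q)
```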
The key to successful implementation of an e-assessment system as part of any institution’s assessment strategy is threefold. Firstly, the system’s distinct features should be utilised to their fullest capacity. Secondly, good-quality questions of varying types should be created (Mostert et al., 2012). Finally, satisfactory questions should be stored in a database for reuse.

Different formats of MCQs supported by e-assessment tools

In the following subsections, a brief description is given of each of the varying question types supported by e-assessment tools, followed by an example.

Multiple choice questions or multiple response questions

Both multiple choice questions (MCQs) and multiple response questions (MRQs) display a list of answers from which students have to select the most appropriate answer. MCQs require students to select one option from the list of alternatives provided. e-Assessment systems allow unique ways of presenting these questions, since questions can be shuffled/randomised so that each student is presented with the questions in a different order while taking the same assessment (Mostert et al., 2012). Although these are the simplest to answer, they are often assumed to assess merely low-level learning objectives, such as addressing common errors in understanding and testing memorisation of meaningful facts and concepts (Miller, 2012). See Figure 3.1 for an example.
MRQs allow students to select multiple options when more than one is correct: students may choose one, a combination of a few, all, or none of the alternatives (Miller, 2012). An MRQ is similar to an MCQ, but allows more than one option to be selected. This type of question requires students to think more than they would for an MCQ before responding; thus the cognitive level of MRQs is higher than that of MCQs (Miller, 2012). See the example in Figure 3.2.
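The sketch below illustrates two of the mechanics mentioned above: per-student shuffling of the same set of questions, and one common (all-or-nothing) convention for marking an MRQ. The seeding of the shuffle by student identifier and the all-or-nothing rule are illustrative assumptions; partial-credit schemes for MRQs also exist.

```python
import random

questions = ["Q1", "Q2", "Q3", "Q4"]          # stand-ins for full MCQ/MRQ items

def shuffled_paper(student_id: str, items: list[str]) -> list[str]:
    """Give every student the same questions in a different, but reproducible, order
    by seeding the shuffle with the student's identifier."""
    rng = random.Random(student_id)
    paper = items.copy()
    rng.shuffle(paper)
    return paper

def mark_mrq(selected: set[str], correct: set[str]) -> int:
    """All-or-nothing MRQ marking: full credit only if the selected set of options
    exactly matches the correct set."""
    return 1 if selected == correct else 0

print(shuffled_paper("student-001", questions))   # e.g. ['Q3', 'Q1', 'Q4', 'Q2']
print(shuffled_paper("student-002", questions))   # a different order for another student
print(mark_mrq({"A", "C"}, {"A", "C"}))           # 1
print(mark_mrq({"A"}, {"A", "C"}))                # 0
```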

Extended Matching Items (EMIs) or Extended Matching Questions (EMQs)

Extended Matching Items (EMIs), or Extended Matching Questions (EMQs), are a variant of multiple choice questions in which the student selects the best answer to a question from a list of ten to twenty options, each of which may be used once, more than once, or not at all. This variant is widely adopted in the medical education field (Beullens, van Damme, Jaspaert & Janssen, 2006). EMIs aim to test the application of knowledge rather than the simple recall of isolated facts. EMI questions take the form of small problems or short cases called vignettes. ‘In three to six sentences a medical case is described giving various details such as the patient’s symptoms and the results of lab tests, and the student is asked to arrive at a diagnosis’ (Wood, 2003: 2). There may be several questions about the vignette, and each answer is chosen from a long list rather than from just five options. Thus, having read and understood the vignette and the information given in it, the student selects the best answer to each question from the option list (Fenderson et al., 1997; Wood, 2003). EMIs were originally developed to assess the diagnostic pattern-recognition skills of physicians, but are now used more widely in medical education (Case & Swanson, 1993). Although EMIs have been used mainly in the discipline of medicine, they lend themselves to other subject areas as well (Wood, 2003) and are currently incorporated in the assessment of a wide variety of subjects.
EMIs, also known as EMQs (Extended Matching Questions), retain the advantages of MCQ tests, such as objectivity and automated marking, but also offer the following unique advantages:
• The question format aids in specifying the examination content, for instance, in the medical domain, each major ailment could be used as a theme (Beullens et al., 2002; Case, Swanson & Ripkey, 1994).
• A set of questions on the same theme, facilitates the development of several content-parallel test forms (Beullens et al., 2002; Case, Swanson & Ripkey, 1994).
• EMIs require students to solve small authentic problems rather than recall isolated facts (Beullens et al., 2002; Case, Swanson & Ripkey, 1994; Fenderson, et al. 1997). Thus educators are better able to distinguish well-prepared students from the marginal students. Furthermore, they are valuable in testing core knowledge, because the provision of cues is minimised (Fenderson et al., 1997).
• The structure of EMIs facilitates item writing: the option list flows naturally from the theme and the items from the option list. The homogeneous options and parallel items reduce technical flaws made by academics in phrasing items (Beullens et al., 2002; Case, Swanson & Ripkey, 1994). They are thus easier to prepare than traditional multiple choice tests, because there is no need for plausible distractors. They emphasise real-world problem-solving skills, and are less likely to concentrate on unimportant/low-level aspects (Fenderson et al., 1997).
• The long option list allows inclusion of all relevant options (Beullens et al., 2002; Case, Swanson & Ripkey, 1994). Thus they prevent students from answering by elimination, rather than by actually knowing the answer (Fenderson et al., 1997).
An example of two EMI questions in Computing that relate to the same vignette is provided in Figure 3.3.
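As a structural sketch only (it does not reproduce Figure 3.3), the snippet below shows how an EMI set might be represented: one theme, one long shared option list, a vignette, and several questions whose keys are all drawn from that list. The Computing content is invented purely to illustrate the structure described above.

```python
from dataclasses import dataclass

@dataclass
class EMI:
    """Extended Matching Item set: one theme, one long option list,
    and several vignette-based questions that all draw on that list."""
    theme: str
    options: list[str]                    # typically 10-20 options, reusable across questions
    vignette: str                         # short case/problem description
    questions: list[tuple[str, str]]      # (question text, key from the option list)

# Hypothetical example, invented to show the structure only.
emi = EMI(
    theme="Sorting algorithms",
    options=["Bubble sort", "Insertion sort", "Merge sort", "Quick sort",
             "Heap sort", "Counting sort", "Radix sort", "Selection sort",
             "Shell sort", "Bucket sort"],
    vignette=("A developer must sort a very large file of records that does not "
              "fit in memory, and also a nearly-sorted in-memory list of 50 items."),
    questions=[
        ("Which algorithm best suits the very large external file?", "Merge sort"),
        ("Which algorithm best suits the nearly-sorted 50-item list?", "Insertion sort"),
    ],
)
assert all(key in emi.options for _, key in emi.questions)   # keys must come from the shared list
```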

Drop down items

Drop-down items are used when a large number of options is available for the student to select from. The options are not immediately visible when the question is opened; instead, all the options to be displayed are added to a drop-down list. This question type expects the student to open the list and select only one choice as the answer. Several questions might be asked, each of which has its correct answer in the list. An example of this type is presented in Figure 3.4, showing several different questions; note that the full list is not visible in the figure.
