Assessing Higher-Order Thinking in Large Introductory Science Classes

Richard F. Yuretich

Department of Geosciences

University of Massachusetts

611 N. Pleasant St.

Amherst, MA 01003-9297

Engaging students deeply enough in the subject matter to stimulate critical thinking and higher-order reasoning is a challenge in many teaching situations, but the difficulties are often most acute in introductory-level science courses. The large lecture-hall classes common at colleges and universities complicate the effort even further. Active-learning methods, such as cooperative in-class activities and on-line quizzes or homework with rapid evaluation and feedback, help to promote higher-level reasoning. In addition, multiple-choice exams can be modified to include questions requiring analysis, synthesis, and evaluation of diagrams, situations, and data. Exams that include cooperative components align the assessments with the instructional strategies and provide further opportunities to exercise critical thinking. Data from student performance, surveys, and interviews confirm the efficacy of these techniques.

Introduction

Many science and mathematics classes at large universities are taught as lectures in auditoriums to large numbers of students, often more than 200. Even at smaller institutions, introductory-level classes of 50 to 100 students are not unusual. Such classes evolved from a perception that information transfer from professor to student is essentially independent of the setting in which it occurs. Large lectures are viewed favorably by administrators as efficient instructional vehicles, and departments find that such classes help boost the FTE count in their programs. In many cases, these large lectures are divided into smaller discussion or laboratory sessions that are supposed to promote greater engagement on the part of the student. However, laboratories are often decoupled from the lecture, so the connections between the two do not carry across. In addition, many colleges and universities are scaling back the laboratory requirement in science to help address budgetary difficulties, and the only opportunity for supervised learning may be during class time. We may deplore these trends on educational principles, but the reality of modern higher education is that the large lecture is here to stay.

At the same time, many faculty members will assert that one of their goals in teaching is to encourage students to “reason like scientists” or “think critically,” qualities that require nurturing, encouragement, personal guidance, and careful assessment. Clearly, large classes pose special challenges to teaching and learning at the higher levels of “critical thinking.” Is it possible to adjust our instructional strategies so that students in large lecture classes can move beyond learning “just the facts”?

Bloom [1] organized the types of learning behaviors into a hierarchical classification (Table 1). Lectures tend to focus on the first (knowledge) and second (comprehension) levels, which are relatively easy to attain in this mode. Assessment of students’ abilities at these cognitive levels can be achieved easily through traditional examinations. If the instructional goals include learning at the higher levels, those encompassed by the generic term “critical thinking,” then the assessment methods need to be aligned with these goals.

Table 1. Bloom’s taxonomy of learning levels and some skills demonstrated at each level.

Competence / Skills Demonstrated
Knowledge (Information) / list, define, tell, describe, identify, show, label, collect, examine, tabulate, quote, name, who, when, where, etc.
Comprehension / summarize, describe, interpret, contrast, predict, associate, distinguish, estimate, differentiate, discuss, extend
Application / apply, demonstrate, calculate, complete, illustrate, show, solve, examine, modify, relate, change, classify, experiment, discover
Analysis / analyze, separate, order, explain, connect, classify, arrange, divide, compare, select, explain, infer
Synthesis / combine, integrate, modify, rearrange, substitute, plan, create, design, invent, what if?, compose, formulate, prepare, generalize, rewrite
Evaluation / assess, decide, rank, grade, test, measure, recommend, convince, select, judge, explain, discriminate, support, conclude, compare, summarize

The results and experiences I summarize here are based on more than five years of concerted experimentation with various teaching methods in a large-enrollment course in introductory oceanography at the University of Massachusetts [2]. Each semester there are more than 600 students enrolled in two lecture sections taught by the same instructor. Each section meets for 75 minutes twice weekly, and the two classes are taught back-to-back in the schedule. There is usually one teaching assistant for each section, and there are no laboratory or discussion sessions. This is a general-education course intended primarily for first- or second-year students who are not majoring in science or math. In fact, this may be the only course in the physical sciences that these students take in their college careers. The challenge has been to make this course an effective learning instrument for the majority of students enrolled, who come from a wide variety of backgrounds and preparation in science, and to engage them in the type of careful reasoning that characterizes scientific investigation.

Methods

In-class Exercises. Lecturing alone does not usually promote higher-level information processing; consequently, other active-learning strategies during class are encouraged [3]. In-class exercises are one way that critical-thinking skills can be introduced into large classes. Questions can be designed so that, through discussion and cooperative-learning methods, students must process the information before reaching a conclusion. Figure 1 is an example of an exercise used in the oceanography course, which requires students to synthesize and evaluate information that they have gathered from the readings and lectures [4]. In the oceanography class, students do these exercises as “think-pair-share”: they contemplate the answers individually, discuss them with their neighbors, and then the entire class reviews the answers together. In this particular exercise, the students have watched a short video segment about the Gulf Stream, so the questions also serve to focus their attention on the substantive parts of the video.

[Figure 1. In-class think-pair-share exercise accompanying a short video segment on the Gulf Stream.]

On-line Interactive Quizzes. There are many supported platforms now available for students to complete on-line homework assignments or quizzes. These systems have the flexibility to ask more involved questions than just multiple-choice, and to grade them with feedback to the students. The ability to ask more questions, give feedback, and then have the students repeat the quiz or homework provides excellent opportunities for critical thinking, in-depth analysis, and higher-order processing of data.

In the example provided (Fig. 2), students use a variety of critical-thinking skills. Questions 1 and 2 emphasize comprehension: students must interpret the information on the graph. Question 4 also requires graphical interpretation, but students must relate and connect the diagram to the process of tidal cycling; accordingly, it tests their abilities at analysis and application. Questions 3 and 5 involve quantitative reasoning, and rather than selecting from a list of choices, the student must enter the correct words or phrases in the answer box. If the answers are not correct, the program gives some guidance on possible answers. Calculation questions are especially valuable on-line because the numerical values change with each succeeding attempt. If a student gets an incorrect answer the first time, the feedback will display the correct answer; when he or she takes the quiz again, the numerical values in the question will have changed. For example, in question 3 of Fig. 2, the wavelength (L) will be a different value during a subsequent iteration, and the student must solve the formula to obtain the desired answer. Question 5 involves more than just “plug and chug” into an existing formula: the student must understand the basic principle of the tidal cycle and the different kinds of tides in order to obtain the correct answer. Again, if a wrong answer is given, the re-take will come back with a different time for the high tide. Here students are integrating ideas and calculations in order to formulate an answer, skills that match the “Synthesis” level of Bloom’s taxonomy (Table 1).
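
The exact formulas behind questions 3 and 5 are not reproduced here, so the following is only a minimal sketch, in Python, of the re-randomization idea. It assumes, as stand-ins, the general wave-speed relation C = L/T for question 3 and a semidiurnal tidal interval of about 12 hours 25 minutes for question 5; the function names and value ranges are illustrative and are not taken from the actual quiz platform.

```python
import random
from datetime import datetime, timedelta

def wave_speed_question():
    # Hypothetical "question 3"-style item: L and T are re-drawn on every
    # attempt, so a student who saw the correct answer once must still
    # work the formula C = L / T (speed = wavelength / period).
    L = random.randint(50, 400)   # wavelength in meters (new value each re-take)
    T = random.randint(5, 20)     # wave period in seconds
    prompt = f"A wave has wavelength L = {L} m and period T = {T} s. What is its speed in m/s?"
    return prompt, round(L / T, 1)

def high_tide_question():
    # Hypothetical "question 5"-style item: assumes a semidiurnal tide, so
    # successive high tides are separated by roughly half a lunar day
    # (12 h 25 min); the stated time of high tide changes on each re-take.
    first_high = datetime(2002, 3, 1, random.randint(0, 11), random.choice([0, 15, 30, 45]))
    next_high = first_high + timedelta(hours=12, minutes=25)
    prompt = f"High tide occurs at {first_high:%I:%M %p}. When is the next high tide?"
    return prompt, f"{next_high:%I:%M %p}"

# Each call generates a fresh variant, so a re-take shows new numbers:
print(wave_speed_question())
print(high_tide_question())
```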

Re-inventing the Traditional Examination. Multiple-choice examinations are often called “objective” tests, with the implication that they are useful only for assessing lower-order, fact-based knowledge. In large classes, machine-scored exams may be the only practical means of routine assessment, so if we are truly devoted to engaging students in higher-order learning, then the multiple-choice exam needs to be adapted to these goals. Two strategies work:

1.  Writing questions that require application, analysis, synthesis, or evaluation. These are not as difficult to compose as they might seem, as illustrated in the following examples:

On the following “continent,” choose the letter corresponding to the place where coastal upwelling is most likely to occur:

[Diagram: an idealized rectangular “continent,” north at the top, flanked by subtropical gyres in the oceans to the west and east. Candidate locations: A on the north coast, B on the west coast, C on the east coast, and D on the south coast.]

This question tests the students’ abilities to apply and analyze. They must have a basic knowledge of the facts about ocean circulation and the Coriolis effect, and of how these interact in the coastal region. Then they need to analyze the patterns and apply them to an abstract situation that they have not seen before.

In the situation illustrated in the diagram in Fig. 3, what will happen over the course of time?

A)  Sand will accumulate at locations 1 and 2.

B)  Sand will erode from locations 1 and 2.

C)  Sand will accumulate at location 1 and erode at location 2.

D)  Sand will erode at location 1 and accumulate at location 2.

In this question, the students need to synthesize and evaluate. They must first interpret the diagram as a representation of a coastline with groins, and then determine the prevailing direction of longshore drift. Finally, they must evaluate the impact of the groins on the movement of beach sand and decide what the likely outcome will be.

2.  Changing the nature of the multiple-choice exam so that students can learn from the examination process.

This may seem like a tall order, but it can be accomplished with the strategy known as the pyramid exam [5], which we have adapted successfully for use in our large introductory oceanography course. The essence of the pyramid exam is that students re-take an exam one or more times, working in successively more collaborative settings to complete the test. In the original design, a very difficult exam is repeated several times throughout the course until the entire class works together to solve the final, most challenging problems. We use an adaptation called the two-stage cooperative exam, in which students take a multiple-choice exam twice. The first round is a traditional test, and students record their answers on optical scanning forms. After they hand in their answers, however, they are given new answer sheets and re-take the test, discussing their answers with other students. Both stages occur during the same class period. For grading purposes, we take 75% of the individual score and add it to 25% of the cooperative score to arrive at a grade, as illustrated below. The cooperative exam raises the class average by 3 to 5 points, but more importantly, it turns the much-maligned multiple-choice exam into a learning experience. Because students discuss the questions, the answers, and the logic or principles behind them, they are analyzing, synthesizing, and evaluating the topics, thereby employing higher-order learning skills.
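
As a concrete illustration of the weighting, the short sketch below computes the composite grade. The 75/25 split is the one described above; the sample scores are hypothetical.

```python
def two_stage_grade(solo_score, coop_score):
    # Composite grade for the two-stage cooperative exam: 75% of the
    # individual (solo) score plus 25% of the cooperative score.
    return 0.75 * solo_score + 0.25 * coop_score

# Hypothetical student: 72 on the solo stage, 88 on the cooperative stage.
print(two_stage_grade(72, 88))  # 54.0 + 22.0 = 76.0
```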

Evaluating the Assessments

The critical question remains, as always, “How do we know that these methods are effective?” The evidence in this case comes from multiple measures: analysis of student performance, surveys, and interviews.

Exam Performance. Numerical scores on the exams have improved over the years the course has been taught. The most recent class outperformed its predecessors by a wide margin on all in-class examinations save the first one (Table 2). The differences among the exams are significant at the 99% level, owing to the large sample sizes and the ensuing degrees of freedom. Cooperative exams were administered in 1998 and 2001, and the table compares results from the solo portions of those exams only. There is a progressive increase in scores during the entire period that the active-learning techniques were being introduced, but the incremental increase in exam scores for the most recent semester is the largest. Although the University contends that the student body is more capable now than in the past, I cannot say that this is obvious from my own experiences in the classroom. I therefore conclude that the modifications to the teaching of the course have had an impact on student learning.

Table 2. Comparison of examination results from several years in the oceanography course. 1998 and 2001 data are from the solo portion only of the collaborative exam; 1993 and 1996 are traditional exams. The highest grade in each row is marked with an asterisk (*).

Exam / 2001 / 1998 / 1996 / 1993
Exam 1 / 70.4 ± 15.8 / 71.1 ± 13.8 / 73.1 ± 13.2* / 71.5 ± 14.6
Exam 2 / 79.9 ± 11.6* / 77.8 ± 12.5 / 71.5 ± 14.5 / 68.5 ± 13.4
Exam 3 / 74.5 ± 13.0 / 70.1 ± 12.8 / 75.8 ± 12.6* / 75.0 ± 11.9
Exam 4 / 80.4 ± 12.0* / 75.8 ± 13.0 / ---- / ----
Final Exam / 80.9 ± 11.5* / 77.9 ± 12.4 / 71.9 ± 12.4 / 74.8 ± 12.1
Overall / 77.1 ± 13.6* / 74.5 ± 13.3 / 73.0 ± 13.3 / 71.6 ± 15.3
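
The significance claim above can be checked directly from the summary statistics in Table 2. The sketch below applies Welch's t-test to the Overall rows for 2001 and 1993; the per-class sample sizes are not reported with the table, so the n = 600 used here is an assumption based on the enrollment figure given earlier.

```python
from scipy.stats import ttest_ind_from_stats

# Overall means and standard deviations from Table 2. The sample sizes
# (n = 600 per offering) are assumed from the enrollment figure quoted
# earlier; they are not reported with the table.
t_stat, p_value = ttest_ind_from_stats(
    mean1=77.1, std1=13.6, nobs1=600,  # 2001 overall
    mean2=71.6, std2=15.3, nobs2=600,  # 1993 overall
    equal_var=False,                   # Welch's test (unequal variances)
)
print(f"t = {t_stat:.2f}, p = {p_value:.2e}")  # p falls far below 0.01
```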

A detailed comparison of equivalent final-exam questions shows that the greatest improvement occurred when active-learning strategies were incorporated for the first time in 1998 (Fig. 4). Although students’ abilities to answer all questions improved, of particular note are the positive changes on Comprehension (C), Application (App), and Synthesis (S) questions. A prior analysis of the data showed equal improvement in the class’s answers to questions from earlier in the semester and to more recent topical material, indicating that students’ retention of the subject matter had been enhanced [2]. The 2001 class showed incremental improvement over 1998, although there are no obvious trends related to the type of question as classified according to Bloom’s taxonomy. There are differences among the individual questions, but the teaching strategies have evidently matured to the point where large changes would not be expected.