A Concentration Analysis of Student Responses
on the 1995 Version of the Force Concept Inventory

Nicole DiGironimo

University of Delaware

2007 AERA Annual Meeting - Poster Session Conference Paper

Introduction

There have been substantial efforts made towards improving basic physics courses, especially since Halloun and Hestenes' (1985a) survey of calculus-based and non-calculus-based physics students at the University of Arizona. Their findings were consistent with other studies questioning the effectiveness of traditional methods of teaching physics, suggesting that traditional teaching methods were not successful at imparting physics knowledge to students (Arons, 1997; Hake, 1998; Halloun & Hestenes, 1985a; Hestenes, 1979). Currently, the best-known and most widely used student testing device in physics education research is the Force Concept Inventory (FCI), a multiple-choice test designed to categorize students' understanding of basic Newtonian physics concepts (Hestenes et al., 1992). However, like any standardized testing instrument, the exam's design and use have drawn critiques from some researchers (Griffiths, 1997; Huffman & Heller, 1995), and these critiques prompted the development of a variety of methods for analyzing FCI data. This paper adds to the literature base by reporting on the application of one previously developed analysis method to the newest version of the FCI.

A Short History of the Force Concept Inventory

The Mechanics Diagnostic Test (MDT) was first published in 1985 (Halloun & Hestenes, 1985a). The purpose of the exam was to identify the various levels of student knowledge present in any college introductory physics course. It accomplished this goal with open-answer questions that required the students to utilize their physics skills. Subsequent analysis and coding of the student responses to the open-ended questions determined recurring student alternative conceptions. Halloun and Hestenes (1985b), disappointed with the haphazard misconceptions literature of the time, recognized a need for a comprehensive taxonomy of alternative conceptions of Newtonian physics. Their review of the literature and MDT results uncovered elements of Aristotelian and Impetus theories present in student thinking, and they used these findings to build categories of common alternative conceptions (Hestenes et al., 1992). These categories were used to construct a multiple-choice version of the MDT, designed to provide alternative conception distracters to the students in order to draw out the students' physics conceptions. To establish reliability and validity, various versions of the multiple-choice MDT were administered to over 1000 college students, as well as to physics professors and graduate students. The results of these tests were positive, and physics education researchers began using this effective evaluation tool immediately.

Successes with the multiple-choice version of the MDT led to the development of the FCI in 1992, another multiple-choice test with distracters that represented the common alternative student conceptions. Only a few improvements were made to the MDT to create the FCI; half of the FCI questions were MDT questions. Most of the changes were to the language used in the test questions. Validity and reliability tests were not repeated for the FCI because the FCI scores were comparable with the MDT scores and the FCI was designed as an improvement to the MDT (Hestenes et al., 1992; Savinainen & Scott, 2002). Although the FCI, like the MDT, probed students' beliefs about Newtonian concepts of force and motion, its main purpose was to "evaluate the effectiveness of instruction" (Hestenes & Halloun, 1995, p. 502). As education researchers' interests evolved from basic instruction improvement to student conceptual understanding, cognition, and epistemology, the analysis and use of the FCI evolved as well. In 1995, the current version of the FCI was developed as a revision to the 1992 version; it has 30 multiple-choice items, compared to 29 on the original (Halloun et al., 1995).

Theoretical Framework

Research shows that students enter an introductory physics course with pre-defined physics beliefs (Halloun & Hestenes, 1985a). The literature also indicates that instructor and textbook authority alone are not enough for students to dismiss their common-sense physics alternative conceptions (Halloun & Hestenes, 1985a; Pintrich et al., 1993). These pre-defined beliefs, or concepts, form a system used to explain the physical world. The theoretical framework used in this paper to conceptualize student belief systems follows Ioannides and Vosniadou's (2002) framework theory and diSessa and Sherin's (1998) concept system.

The framework theory is defined as a relatively well-established theory, or "an explanatory system with some coherence" (Ioannides & Vosniadou, 2002, p. 4), about the physical world that begins forming when we are very young and is complete by the time we start school. Ioannides and Vosniadou state that the "framework theory is based on everyday observations and information provided by the culture, as this information is interpreted by the human cognitive system" (Ioannides & Vosniadou, 2002, p. 4). Ioannides and Vosniadou's study involved young children's ideas about 'force' and the authors claimed that,

if there is a framework theory that guides children's interpretation of the word force, then we should expect children to answer questions about force in a relatively uniform and internally consistent manner. If not, we should expect logically inconsistent responses guided by a multiplicity of fragmented interpretations of the meaning of force. (Ioannides & Vosniadou, 2002, p. 5)

This definition is especially useful for analyzing the FCI. As FCI data are analyzed, patterns in the students' answers will provide insight into their common ideas about Newtonian physics. If a student, or a group of students, consistently answers questions that probe the same physics topic correctly, then, using a framework theory foundation, one could conclude that a coherent belief system is present. Researchers commonly refer to incorrect pre-defined belief systems as misconceptions or alternative conceptions. Notably, the most common alternative conceptions, Aristotelian and Impetus theory, were advocated by scientists in pre-Newtonian times (Halloun & Hestenes, 1985b). Instructors, therefore, should not only have a way to identify their students' alternative conceptions but should also take all alternative conceptions seriously. Each alternative conception should be considered a valid student hypothesis, and physics courses should be structured to evaluate the alternative conceptions by scientific procedures. Structuring a course in this way can provide students with the experimental proof, scientific reasoning, and time needed to revise their beliefs (Chinn & Brewer, 1993; Halloun & Hestenes, 1985a; Ioannides & Vosniadou, 2002).

diSessa and Sherin (1998) provided insight into the cognitive aspects of this research. Their paper tackled the difficult task of defining 'concept' in the context of conceptual change and understanding. Their definition did not describe a 'concept' as a singular idea or as a small group of ideas; rather, like Ioannides and Vosniadou, they described the model of a concept as "more like a knowledge system" (diSessa & Sherin, 1998, p. 15). It is a student's comprehension of Newtonian topics that defines his or her basic knowledge system, or concept system, of Newtonian physics.

A concept system derived from personal experience and very little formal training will differ distinctly from the Newtonian knowledge system of a trained physicist (Bransford et al., 1999). diSessa and Sherin claimed that, "instead of stating that one either has or does not have a concept, we believe it is necessary to describe specific ways in which a learner's concept system behaves like an expert's - and the ways and circumstances in which it behaves differently" (diSessa & Sherin, 1998, pp. 15-16).

All basic physics courses cover the Newtonian theory of physics. Newtonian theory enables us to identify the basic elements in the conceptualization of motion. The kinematical elements are position, distance, motion, time, velocity, and acceleration. The dynamical elements are inertia, force, resistance, vacuum, and gravity. These topics were chosen for inclusion in the FCI for their ability to illuminate the differences between Aristotelian, Impetus, and Newtonian thinkers. It is this difference between expert (Newtonian) and novice understanding that the FCI attempts to bring to light through its well-designed distracters; this difference is also evident in the results of the analyzed FCI exams in this study.

Purposes of this Study

Two of the standard methods used to reveal useful information from the FCI scores were developed and implemented by Hake (1998) and by Bao and Redish (2001). Bao and Redish's method is called a Concentration Analysis, which measures student response distributions on multiple-choice exams. They applied their method to the 1992 version of the FCI. The purpose of this study was to use the concentration analysis on the 1995 version of the FCI.

The theory behind the concentration analysis is that, if students have well-defined ideas about the subject being tested, and if the multiple-choice options represent these common alternative conceptions as distracters, then student responses should be concentrated on the distracters appropriate to the physics concept defined in each student's mind (Bao & Redish, 2001; Ioannides & Vosniadou, 2002). As already stated, the FCI is intended to create this exact situation; the way in which a student responds to each question should yield some information about his or her alternative conceptions, or lack thereof.

After applying the concentration analysis to the FCI exams, the main purpose of this study was to use the concentration analysis data to investigate the students' responses in search of meaningful patterns. Bao and Redish (2001) claimed their concentration analysis could determine whether students who take the FCI possess common correct or incorrect physics concepts and, therefore, allow one to determine whether the FCI is effective in detecting the students' physics concepts. This study set out to authenticate the first part of Bao and Redish's claim; the latter claim was beyond the scope of this study.

Methodology: The Concentration Analysis

To understand the concentration analysis, first consider an example in which 100 students answer the same multiple-choice question, choosing among choices A, B, C, D, and E. Bao and Redish (2001) maintained that the student responses will correspond to one of three types of outcomes, illustrated in Table I.

  1. A type I response pattern represents an extreme case where all the responses are evenly distributed across all of the choices.
  2. A type II pattern represents a more typical situation where there is a higher distribution on some choices than on others.
  3. A type III pattern is another extreme case where every student has selected the same answer, presumably, although not necessarily, the correct answer.

Table I

Possible Distributions for a Multiple-Choice Question

                          Choices
Type of Pattern    A     B     C     D     E
I                  20    20    20    20    20
II                 35    10    50    0     5
III                0     0     100   0     0

Note. Adapted from "Concentration Analysis: A Quantitative Assessment of Student States," by L. Bao and E. F. Redish, 2001, American Journal of Physics, 69(7), p. S45.


The concentration factor, C, is a function of student responses. This function takes values in the interval [0, 1], where 1 represents a Type III perfectly correlated pattern and 0 represents a Type I pattern. The concentration factor is calculated for each exam question as

C = \frac{\sqrt{m}}{\sqrt{m} - 1} \left( \frac{\sqrt{\sum_{i=1}^{m} n_i^2}}{N} - \frac{1}{\sqrt{m}} \right)

where m represents the number of choices for the question (for the FCI, this number is always 5), N is the number of students who answered the question, and n_i is the number of students who selected choice i. Student response patterns are formed by combining the question's concentration factor with the question's score, the percentage of students who answered that question correctly. Like the concentration factor, the score is a continuous value with a range of [0, 1]. Bao and Redish created a coding scheme, illustrated in Table II, to label the student response patterns.
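To make the computation concrete, the following minimal Python sketch (not part of the original papers; the function name is ours) implements the concentration factor as defined above and evaluates it on the three example distributions from Table I.

```python
import math

def concentration_factor(counts):
    """Concentration factor C (Bao & Redish, 2001) for one question.

    counts -- list of response counts, one entry per answer choice
              (for the FCI, always five choices).
    Returns a value in [0, 1]: approximately 0 for an even spread over
    all choices (Type I) and 1 when every student selects the same
    choice (Type III).
    """
    m = len(counts)        # number of answer choices
    N = sum(counts)        # number of students who answered
    root_m = math.sqrt(m)
    return (root_m / (root_m - 1)) * (
        math.sqrt(sum(n * n for n in counts)) / N - 1 / root_m
    )

# The three example distributions from Table I:
print(concentration_factor([20, 20, 20, 20, 20]))  # ~0.0  -> Type I
print(concentration_factor([35, 10, 50, 0, 5]))    # ~0.31 -> Type II
print(concentration_factor([0, 0, 100, 0, 0]))     # 1.0   -> Type III
```

As the Type II example shows, a moderate concentration (here about 0.31) falls between the two extreme patterns, which is exactly the behavior the coding scheme below is designed to capture.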

Table II

Coding Scheme for Score and Concentration Factor

Score (S)    Level    Concentration Factor (C)    Level
0.0 - 0.4    L        0.0 - 0.2                   L
0.4 - 0.7    M        0.2 - 0.5                   M
0.7 - 1.0    H        0.5 - 1.0                   H

Note. Adapted from "Concentration Analysis: A Quantitative Assessment of Student States," by L. Bao and E. F. Redish, 2001, American Journal of Physics, 69(7), p. S50.
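In code, the coding scheme of Table II amounts to two simple threshold functions. The sketch below is an illustration, not from the original papers; because Table II leaves boundary handling unspecified, band edges are assumed here to belong to the higher band.

```python
def code_score(s):
    """Code a question's score S into L/M/H per Table II."""
    if s < 0.4:
        return "L"
    if s < 0.7:
        return "M"
    return "H"

def code_concentration(c):
    """Code a question's concentration factor C into L/M/H per Table II."""
    if c < 0.2:
        return "L"
    if c < 0.5:
        return "M"
    return "H"

# Example: a question with score 0.85 and concentration 0.74
print(code_score(0.85), code_concentration(0.74))  # H H
```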

Although they are illuminating, the codes for the score and concentration factor carry no great weight on their own. Table III shows how combining the codes for the score and concentration factor provides the student response patterns for each multiple-choice question. Table III also indicates how each response pattern can be used to interpret the students' understanding of physics, their concept system. These response patterns are the intended products of the concentration analysis.

Table III

Student Response Patterns and Interpretation of the Patterns

Pattern Type   Response Pattern   Interpretation of the pattern
One-Peak       HH                 One correct concept system
               LH                 One dominant incorrect concept system
Two-Peak       LM                 Two incorrect concept systems
               MM                 Two concept systems (one correct and one incorrect)
Non-Peak       LL                 Three or more concept systems represented somewhat evenly
Note. Adapted from "Concentration Analysis: A Quantitative Assessment of Student States," by L. Bao and E. F. Redish, 2001, American Journal of Physics, 69(7), p. S50.
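Combining the two codes is then a matter of concatenating the score code and the concentration code and looking the result up against Table III. The sketch below is again hypothetical (the coding one-liners repeat the threshold functions above for self-containment); note that Table III names only five of the nine possible combinations, so labels such as MH and ML, which occur in this study's data, have no tabulated interpretation.

```python
def code_score(s):
    return "L" if s < 0.4 else ("M" if s < 0.7 else "H")

def code_concentration(c):
    return "L" if c < 0.2 else ("M" if c < 0.5 else "H")

# Interpretations from Table III
INTERPRETATIONS = {
    "HH": "One correct concept system",
    "LH": "One dominant incorrect concept system",
    "LM": "Two incorrect concept systems",
    "MM": "Two concept systems (one correct and one incorrect)",
    "LL": "Three or more concept systems represented somewhat evenly",
}

def response_pattern(score, concentration):
    """Combine the Table II codes into a response-pattern label."""
    return code_score(score) + code_concentration(concentration)

# Example: question Q2 of this study (S = 0.24, C = 0.16; see Table V)
label = response_pattern(0.24, 0.16)  # "LL"
print(label, "-", INTERPRETATIONS.get(label, "not interpreted in Table III"))
```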

The purpose of Bao and Redish's study was to introduce and evaluate the concentration analysis method. The results from their study supported three conclusions regarding the effectiveness of using a concentration analysis to investigate students' ideas about physics. The first conclusion was that a concentration analysis can help detect erroneous student concept systems, especially when combined with student interviews. The fundamental strength of this analysis is its ability to find patterns in the students' thinking. If a student consistently chooses distracters that represent a particular alternative physics concept, then the instructor or researcher can draw some conclusions about the student's physics understanding (Bao & Redish, 2001; Ioannides & Vosniadou, 2002). When the analysis is followed by interviews, the students' concept systems can be identified more easily.

The second conclusion was that a concentration analysis helps to identify test questions with ineffective distracters. This outcome would be signaled by an LL response pattern, which indicates that none of the available distracters is particularly attractive to a majority of the students. This could happen if none of the distracters reflects a common student concept; however, a great deal of research went into the development of the FCI, and it is unlikely that any of the test questions lack the necessary distracters. Two other, more likely, explanations are that there is no common student concept system for the context of the question, or that all the choices correspond well with student concept systems and the students are drawing on all of them equally. Either possibility would indicate that the group of students lacks a strong understanding of the subject matter: either the students are guessing at the answer (producing a nearly random distribution of student responses) or they lack the experience needed to properly categorize the problem and solve it using the appropriate methods. Bao and Redish suggested that LL responses indicate a need for additional research (Bao & Redish, 2001).

The last conclusion Bao and Redish made was a purely practical one: the results of a concentration analysis could be used in test construction. For all of the reasons already mentioned, a concentration analysis provides useful information about how students interpret exam questions. For an exam to function as a diagnostic instrument, a concentration analysis should reveal high concentrations paired with either low or high scores; such results indicate effective distracters and coherent student concept systems.

Sample

The target population for this study is all students who could potentially take the FCI; however, the sample was drawn from an accessible population (Gall et al., 2003). The sample was an entire introductory physics course at a private, urban university. However, only 22 of the 41 students enrolled in the course chose to participate in the voluntary study, and one student took the FCI but did not sign the consent form and, accordingly, was not included in the analysis. The 21 analyzed responses are equivalent to 51.2% of the class.

The test was taken voluntarily and anonymously; therefore, no demographics are available for the sample. However, some information is known about the entire class. There were 12 freshmen (29.3%), 24 sophomores (58.5%), 3 juniors (7.3%), and 2 seniors (4.9%) in the introductory physics course. The majors represented by the students include mechanical engineering, computer science, teacher education, mathematics, and economics. The largest group (22%) of the students were computer science majors, and 80.5% of the class was male. This information could be used in an analysis of the results, but it was not used in this study.

Results and Discussion

This section contains two tables of relevant data from the study. Table V lists the score, concentration, and response pattern for each question on the FCI; the revealing response patterns are color-coded in Table V to increase its utility. The actual distributions of the students' responses for each FCI question are in Table IV. The following sections of this paper investigate several interesting themes discovered in the data. First, there were seven instances of LL response patterns in the FCI data; discussing the possible meanings behind LL student response patterns is usually illuminating, although that may not prove true for this study. Second, as indicated in the literature on expert and novice understanding (Bransford et al., 2000), the data demonstrate examples of students miscategorizing FCI questions. The third and fourth themes discovered within the data are examples of sample-wide understandings and misunderstandings of particular physics topics. The practical implications of these themes are discussed later in this paper. Finally, the students' total scores on the FCI are presented with a discussion of the meaning and implications of these data.

Table V

Score and Concentration for each FCI Question

Question   Q1    Q2    Q3    Q4    Q5    Q6    Q7    Q8    Q9    Q10   Q11   Q12   Q13   Q14   Q15
S          0.67  0.24  0.57  0.71  0.24  0.86  0.71  0.60  0.38  0.57  0.33  0.67  0.43  0.62  0.57
C          0.51  0.16  0.30  0.58  0.15  0.75  0.52  0.38  0.18  0.30  0.21  0.51  0.26  0.38  0.34
Pattern    MH    LL    MM    HH    LL    HH    HH    MM    LL    MM    LM    MH    MM    MM    MM

Question   Q16   Q17   Q18   Q19   Q20   Q21   Q22   Q23   Q24   Q25   Q26   Q27   Q28   Q29   Q30
S          0.85  0.14  0.24  0.38  0.43  0.35  0.76  0.42  0.67  0.33  0.19  0.48  0.67  0.86  0.29
C          0.74  0.76  0.16  0.28  0.14  0.11  0.60  0.19  0.51  0.16  0.15  0.26  0.48  0.76  0.36
Pattern    HH    LH    LL    LM    ML    LL    HH    ML    MH    LL    LL    MM    MM    HH    LM
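As a quick cross-check on the discussion above, tallying the pattern row of Table V (a small verification sketch, not part of the original analysis) reproduces the seven LL instances noted earlier:

```python
from collections import Counter

# Pattern row of Table V, Q1 through Q30, transcribed in order
patterns = [
    "MH", "LL", "MM", "HH", "LL", "HH", "HH", "MM", "LL", "MM",
    "LM", "MH", "MM", "MM", "MM", "HH", "LH", "LL", "LM", "ML",
    "LL", "HH", "ML", "MH", "LL", "LL", "MM", "MM", "HH", "LM",
]
print(Counter(patterns))
# Counter({'MM': 8, 'LL': 7, 'HH': 6, 'MH': 3, 'LM': 3, 'ML': 2, 'LH': 1})
```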

Table VI