Chapter 6.6

The Logic and Logic Model of Technology Evaluation

Yong Zhao

Michigan State University

East Lansing, Michigan, USA

Bo Yan

Blue Valley School District

Overland Park, Kansas, USA

Jing Lei

Syracuse University

Syracuse, New York, USA

Abstract: After decades of investment in information technology, the promised revolution in education has yet to come. With growing accountability pressure, evaluation of educational technology programs is laden with the hope of demonstrating that technology does make a positive impact on education, especially on student achievement. Driven by a flawed question, however, this approach to technology evaluation is shown to be simplistic and inadequate for understanding the complex and dynamic relationship between technology and education. This chapter further introduces recent developments in evaluation theory and method as well as the latest perspectives on technology in education, and discusses their implications for technology evaluation. Based on this discussion, a logic model is proposed to serve as a prototype for evaluating technology programs, with adaptations and adjustments subject to local context.

Key words: Effectiveness, Program evaluation, Logic model

1. Introduction

Evaluation is an unavoidable task for schools as they embark on the journey of adopting technology programs. The various stakeholders all want answers to their questions concerning the program. The funding agency, be it the community, a government agency, a private business, or a charitable organization, wants to know if its investment is worthwhile. The school governing board and parents want to know if the program will help or hinder the education students receive. The teachers and other staff want to know how the program might affect their work. And of course the students, although often overlooked in many evaluation efforts, are entitled to know if and how the program might lead to more effective and efficient learning.

But evaluation is not an easy task. The multiplicity of stakeholders, the complexity of the context in which the program is implemented, and the ill-defined nature of outcomes complicate the matter tremendously. Different stakeholders have different views of “the value or worth” of the same program, views that are usually affected by various factors of the school context and the program itself. Adding to that complexity is the fact that technology is often asked to produce many outcomes simultaneously, some of which are not well defined or lack precise measures. Consequently, evaluation has to serve many purposes (U.S. Department of Education, 1998):

-Provide information to funding agencies so they can determine whether to continue the funding or invest in similar programs in the future.

-Provide information to school leaders so they can decide whether to continue the implementation and engage in similar programs in the future.

-Provide information to teachers and school staff so they can decide whether or how they might support and participate in the program.

-Provide information to program staff so they can take actions to improve the program.

-Provide information for future evaluation efforts.

-Provide information to the general public.

Depending on its purpose, evaluation can be formative or summative. Formative evaluation is intended to provide ongoing information about whether things are proceeding as planned, whether expected progress is being made, and whether any changes need to be made, while the purpose of summative evaluation is to provide information about the program’s overall merit or worth.

A recent evaluation study of technology in education exemplifies the various issues surrounding evaluation. In April 2007, the U.S. Department of Education released a report entitled “Effectiveness of Reading and Mathematics Software Products: Findings from the First Student Cohort” (Dynarski et al., 2007). This document reports findings of an evaluation study that intended to assess the effects of 15 computer software products designed to teach first and fourth grade reading and sixth grade math. It found that “test scores in treatment classrooms that were randomly assigned to use products did not differ from test scores in control classrooms by statistically significant margins.”

As soon as it was publicly released, the report caught the attention of the media. A day after the release, for example, the Washington Post published a report about the study with the eye-catching title “Software's Benefits On Tests In Doubt: Study Says Tools Don't Raise Scores” (Paley, 2007). The influential education magazine Education Week published an extensive story on the study and the reactions to it a week later under the title “Major Study on Software Stirs Debate: On whole, school products found to yield no net gains” (Trotter, 2007).

The software industry and supporters of educational technology quickly responded to the study as well, challenging the study from every possible angle. For example, the Consortium for School Networking (CoSN), the International Society for Technology in Education (ISTE), and the State Educational Technology Directors Association (SETDA) issued a joint statement two days after the release, cautioning readers of the report "to scrutinize the findings carefully, as even the U.S. Department of Education states that the study 'was not designed to assess the effectiveness of educational technology across its entire spectrum of uses.'" The Software & Information Industry Association (SIIA) also responded with a statement about the study: "As this study recognizes, proper implementation of education software is essential for success. Unfortunately, it appears the study itself may not have adequately accounted for this key factor, leading to results that do not accurately represent the role and impact of technology in education." (Nagel, 2007).

The treatment this report received reflects the very issues involved in evaluating educational technology. First, there is much demand for serious evaluation of technology in schools. Uncertainty about the effectiveness of technology still exists despite (perhaps because of) years of heavy investment and widespread euphoric claims about the power of technology to transform education. The public, policy makers, and education leaders are in desperate need of sound evidence to guide their investment decisions. Second, evaluation of the impact of educational technology is, to state the obvious, not easy. As critics of the study pointed out, there are many factors affecting the effectiveness of any given technology. It is very difficult, if not impossible, to isolate the effects of the technology itself from those of its uses and of the people who use it. Moreover, technology is a catch-all word that can include all sorts of hardware and software, so it is easy, but unwise, to over-generalize the results of one or two applications to all technologies. Furthermore, the impact of technology may not necessarily show up in test scores. Thus selecting appropriate outcome measures becomes another issue when evaluating the implementation and impact of technology.

2. A Critical Appraisal of the Evaluation Literature

In the short history of studying technology in education, there has never been a lack of evaluation studies intended to gauge the impact of technology on education. No matter what form technology took (from Skinner’s teaching machine to radio and television) and in what way it was employed (face-to-face or distance learning), few technologies and technology applications have escaped evaluation. Modern information technologies, represented by personal computers, the Internet, and a variety of mobile electronic devices, are no exception. Globally, the movement to infuse information technologies into schools has lasted more than a decade. This continuous investment has been accompanied by all sorts of evaluation efforts, which have increasingly become an inseparable part of many educational technology programs.

From 1995 to 1998, the European Commission funded the Targeted Socio-economic Research Programme, which focused specifically on studying the results of projects involved in ICT-supported learning innovations (Barajas, 2003). Case studies were conducted in OECD countries to examine the impact of information technology on teaching and school improvement (Organisation for Economic Co-operation and Development [OECD], 2000). The Second Information Technology in Education Study Module 2 (SITES M2), a project of the International Association for the Evaluation of Educational Achievement (IEA) that involved research teams from 28 countries in Europe, North America, Asia, Africa, and South America, took an in-depth look at how curriculum is influenced by technology (Kozma, 2003). In the United States, with the federal government’s push for scientific research, a growing number of studies have been conducted or are under way that randomly assign participants to a control group, in which people do business as usual, or a treatment group, in which technology is integrated into the curriculum (Dynarski et al., 2007). Although nations vary greatly with respect to educational systems, technological infrastructure, and resources for technology integration, the evaluation efforts are strikingly similar in terms of what questions they intend to answer and how evaluations are approached and conducted. Across the globe, people seem to care about similar issues, and evaluations are engaged in addressing the same concerns.

These evaluation efforts, plus countless smaller scale studies conducted by researchers in universities and research institutions, constitute a significant part of the rich repertoire of evaluation on the relationship between technology, learning, and teaching. A review of this literature suggests that while previous evaluation efforts have certainly contributed to our understanding of the role and impact of technology in education, there are a number of fundamental problems that must be addressed if we wish to make evaluation more useful practically and more insightful theoretically.

2.1 Flawed Question

In spite of the fact that a wide variety of questions are explored by evaluators, evaluation of technology in education is largely driven by the so-called “works” question, which often takes the form of “does technology work?” or “is technology effective?” Though phrased differently, these questions are in essence all concerned with one thing: do students using technology perform better on a certain outcome measure than those who do not? Studies trying to tackle this question often employ experimental or quasi-experimental designs and later become the data sources of meta-analyses (Kulik, 2003; Murphy, Penuel, Means, Korbak, & Whaley, 2001; Pearson, Ferdig, Blomeyer, & Moran, 2005). “Works” studies have flooded the educational technology literature, especially in one of its subfields, distance learning, where the research is dominated by studies trying to answer whether a significant difference exists between distance learning and face-to-face learning (Russell, 1999). The “works” research directly addresses policy makers’ concern about the efficacy of investing money in technology. In addition, it is often perceived to be methodologically easy to conduct. However, a careful examination reveals that it is not only theoretically impoverished but also methodologically problematic.

Epistemologically, the question of “does technology work” assumes that technology either produces expected outcomes on some pre-conceived outcome measures or it does not. This notion is troublesome in two respects. First, it assumes that the effect of technology is largely invariant, or at least consistent, regardless of the population, the specific technologies, and the context in which they are used. However, research syntheses (Kulik, 2003; Pearson et al., 2005; Waxman, Lin, & Michko, 2003; Zhao, 2005) have repeatedly observed large variation in outcomes within technology treatment groups, variation that can sometimes be even larger than the between-group variation. These results have led researchers to believe that a dichotomous answer is not attainable and that the effect of technology depends on how it is used and for what purposes. Second, in contrast to the varied effects and diverse goals of technology integration, outcomes in the “works” research have mostly been defined and operationalized as standardized test scores. On the one hand, this narrow view deemphasizes other expected outcomes that deserve attention; on the other, it prevents us from examining unexpected outcomes that could be equally or even more important.
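To make the first point concrete, consider a minimal sketch with entirely hypothetical scores (the numbers, group sizes, and variable names below are invented for illustration and are not drawn from any study cited above). It shows how the spread of outcomes within a technology-using group can dwarf the difference between the group means, which is the pattern the syntheses report:

```python
# Hypothetical illustration of within-group vs. between-group variation.
import statistics

# Invented post-test scores for illustration only.
treatment = [52, 88, 61, 95, 47, 79, 70, 90]   # classrooms using the software
control   = [64, 71, 68, 75, 66, 72, 69, 73]   # business-as-usual classrooms

between_group_gap = statistics.mean(treatment) - statistics.mean(control)
within_treatment_sd = statistics.stdev(treatment)
within_control_sd = statistics.stdev(control)

print(f"Difference between group means: {between_group_gap:.1f}")   # 3.0
print(f"Spread within treatment group (SD): {within_treatment_sd:.1f}")  # ~18.1
print(f"Spread within control group (SD):   {within_control_sd:.1f}")    # ~3.7
# In this made-up example the within-treatment spread is several times the
# mean difference, so a single yes/no verdict on the mean gap conceals most
# of what is happening across classrooms.
```

The point is not the particular numbers but the structure: when variation within the treatment group swamps the average difference, asking only “does it work” obscures the more informative question of which uses worked, for whom, and under what conditions.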

Methodologically, the question of “does technology work” cannot be successfully answered because it in fact encompasses two distinct questions whose answers confound each other. Decades of research suggest that technology can lead to improved learning outcomes if used appropriately (Educational Testing Service, 1998; Fouts, 2000; Wagner et al., 2005). At the same time, there is evidence showing that appropriate technology uses are associated with the conditions under which technology is used (Dexter, Anderson, & Ronnkvist, 2002; Mann, 1999; Noeth & Volkov, 2004; Penuel, 2006; Zhao, Pugh, Sheldon, & Byers, 2002). Taking these two important findings together, it is not hard to see that a significant difference in the “works” research is the result of two conditions: first, that certain technology uses are supported in a school environment, and second, that those uses are effective in producing desired outcomes. Thus, “does technology work” is really asking: 1) are certain technology uses effective, which has been extensively studied in the past; and 2) can those effective technology uses survive in a certain school environment, which has recently drawn a growing amount of attention. Therefore, a significant effect means not only that a certain use is effective but also that the use is achievable in a particular school environment. Similarly, a non-significant result could mean that the use was realized but not effective, or that the use is effective but was not achievable.
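This confound can be illustrated with a minimal sketch under a simplifying assumption (the function and all numbers below are hypothetical; the sketch assumes that classrooms that never realize the intended use gain nothing from being assigned to the treatment). The expected treatment-control gap is then roughly the implementation rate multiplied by the effect under faithful implementation, so two very different failures can produce the same near-zero result:

```python
# Hypothetical sketch: a null "works" finding cannot distinguish an
# ineffective use from an effective use that was rarely realized.

def observed_effect(effect_if_implemented: float, implementation_rate: float) -> float:
    """Expected treatment-control difference (in SD units) when classrooms
    that do not implement the intended use gain nothing from assignment."""
    return implementation_rate * effect_if_implemented

# Scenario A: the use is effective (0.4 SD) but rarely realized (20% of classrooms).
print(round(observed_effect(0.4, 0.2), 3))   # 0.08 -- looks like "no effect"

# Scenario B: the use is widely realized (90%) but barely effective (0.05 SD).
print(round(observed_effect(0.05, 0.9), 3))  # 0.045 -- also looks like "no effect"
```

Under these assumed numbers, both scenarios yield a trivially small observed difference, even though the first calls for better implementation support and the second calls for abandoning or redesigning the use, which is exactly why the single “works” comparison cannot, by itself, guide action.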

2.2 Simplistic Approach to a Complex Phenomenon

The predominance of this flawed research question has several unhealthy ramifications for the evaluation of educational technology. One of them is that researchers are often confined to a simplistic approach to studying educational technology. After years of exploration, it has been widely recognized that technology integration is a continuing process, which usually involves setting goals, laying out an action plan, implementing the plan, and making adjustments during implementation. The effect of a technology program depends on what happens in the integration process, which has been identified as non-linear (Molas-Gallart & Davies, 2006), dynamic (Mandinach, 2005), and potentially complex and messy (Zhao et al., 2002).

One of the major weaknesses exhibited in many evaluation studies is the obsession with assessing outcomes only at the end. When attention is placed on final results, what happened during the process is often ignored. As a result, all the complexity and richness embedded in the process are lost in the examination of a technology program. Analysis of final outcomes, no matter how sophisticated and valid, can only provide a technical account of how the numbers that symbolize program effectiveness were obtained. What is needed for decision making, however, is a cultural story of what brought us to those results. For example, when evidence unfavorable to technology presents itself, it is important to know whether the technology was appropriately used and what physical and social conditions led to the appropriate or inappropriate uses. Without cultural annotations to these questions, a non-significant finding is evidence without context and can be misleading.

The complexity of technology use as a social phenomenon lies not only in the dynamic process but also in the fact that a technology program often pursues multiple goals. Educational technology researchers have long been aware of the important role objectives and goals play in technology integration (Noeth & Volkov, 2004). A clearly articulated goal can have significant impact on how technological resources are allocated, how technology is used, and how the effect of technology is evaluated. In practice, technology is employed to pursue a variety of goals. Depending on the goals of technology integration, outcome measures could be test scores or other measures of performance, college attendance, or job offers, as well as various skills such as higher-order thinking skills, communication skills, and research skills (Noeth & Volkov, 2004). They could also be perceptions of implementation benefits, attitudes toward learning, motivation, self-esteem, and engagement levels (Sivin-Kachala & Bialo, 2000).

Despite its legitimacy and importance, research that investigates whether technology improves student achievement is still limited. When student achievement is the locus of attention, most studies rely on performance on standardized tests as the indication of student achievement. Standardized tests have been extensively criticized as inadequate for this purpose (Popham, 1999). What has worried people even more is that standardized tests cannot reflect improvement in many of the traits that benefit from using technology, such as higher-order thinking skills, creativity, and broadened horizons. In addition, standardized tests are incapable of measuring unexpected and unwanted outcomes. As many researchers have pointed out, while empowering teachers and students by providing them with unprecedented opportunities, technology can also introduce many problems and complex issues into education (Lei, Conway, & Zhao, in press). For example, problems with classroom management, cheating and plagiarism, and access to inappropriate web content have been repeatedly reported in the media and in research studies. These negative effects of technology have significant implications for education. However, a narrow focus on student achievement in evaluation efforts has prevented exploration and exposition of these issues.

A comprehensive understanding of technology calls for diversified methods that address different questions, and all valid evidence should be taken equally seriously. However, greater, if not exclusive, weight has been placed on experimental and quasi-experimental designs, which are deemed to provide the most valid answers to the “works” question (Mageau, 2004). While randomized experiments are acknowledged as the gold standard for causal inference, there is a general consensus that traditional experimental designs are limited in dealing with the complexity of technology integration (Heinecke, Blasi, & Skerker, 2000; McNabb, Hawkes, & Rouk, 1999; Melmed, 1995). Alternative methods are essential if program effectiveness is to be determined and understood (Slayton & Llosa, 2005).

2.3 Inconsequential Policy Implications

Evaluation or research on the impact of technology does not seem to have had much influence on policy. The failure or success of past technologies does not seem to have affected policy decisions about future investment in technology. For example, the heavy investment in connecting schools to the Internet and putting more computers in schools in the 1990s was made despite the failed attempts to introduce other technologies such as film, radio, and television. Today, although schools have more technology than was believed necessary to realize the dream scenario used to justify the billions of dollars invested in educational technology (McKinsey & Company, 1995), it would be quite difficult to find many meaningful differences between how and what students study in school today and how and what they studied in school in 1996, the year when major investment in connecting schools to the Internet began in the U.S. The third National Educational Technology Plan, released by the U.S. Department of Education in 2004, quotes Education Secretary Rod Paige: