The Use of Quasi-experimental Design in Urban and Regional Policy Research and Political Economy

By

Thomas E. Lambert, PhD

Assistant Professor of Public Administration

MPA Program

Northern Kentucky University

Highland Heights, KY 41099

Phone: 859-572-5324

and

Michael Bewley, PhD

President and CEO

Enalysis

9007 Laughton Lane

Louisville, KY 40222

Phone: 502-727-1201

Abstract

There are times when, due to a lack of data or the impossibility of randomly assigning cases, a researcher is limited in the use of the usual statistical and experimental methods to assess a particular intervention or “treatment” given to subjects or to a target group or region. An assessment technique often used in such circumstances is quasi-experimental design, in which, although random assignment does not occur, threats to validity are reduced by comparing cases that are as similar as possible. One group becomes a quasi-experimental group that has received some form of “treatment,” whereas another serves as a comparison group that has not. Such a research design is necessary when certain economic events occur or when economic development projects or new policies are undertaken in urban and regional economies, and no two sub-regions are exactly the same for the purposes of evaluating the effect of the events, projects, or policies. Quasi-experimental design offers a solution for assessing the impacts of different urban and regional phenomena.

JEL codes: B4 Economic Methodology; R1 General Regional Economics

Keywords: research methods, urban and regional economic policy

Introduction

Often, in the name of economic development or local economic revitalization, tax incentives and/or regulatory relief are granted to existing business firms within a region so as to spur business expansion, investment, and hiring, or at the least, to keep an existing business from removing its capital, laying off its workers, and relocating to another jurisdiction. Likewise, such incentives and/or relief can be offered to potential new firms considering locating in a jurisdiction (or in a specific, targeted part of a jurisdiction) in order to entice these firms into the region. Again, the goal of the local government(s) is to generate local economic growth through direct job creation and investment by current or new firms.

Economic development incentives and/or regulatory relief (hereafter, “incentives”) offered by municipal or state governments, or both, represent a form of tax expenditure or cost whose burden is shifted to other taxpayers. Therefore, good public policy requires or warrants some type of accountability for the effectiveness of the incentives (Rivlin 1971, Kettl 2011). Common tools used in public policy to assess programs and their financial effectiveness are benefit-cost analysis and cost-effectiveness analysis (Weimer and Vining 2011, Dunn 2012, Kraft and Furlong 2012), as well as various statistical techniques such as regression and tests of differences of means. Some authors believe that program evaluation is a low priority for most policy makers for various reasons: a lack of research funding, the difficulty of getting programs approved, much less implemented and later assessed, and so on (Kettl 2011, Kraft and Furlong 2012). These obstacles, among others, make rigorous program evaluation difficult.
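
To make the benefit-cost calculation mentioned above concrete, the following minimal sketch (in Python) discounts hypothetical annual benefit and cost streams from an incentive program and reports a net present value and benefit-cost ratio. All dollar figures, the five-year horizon, and the 5 percent discount rate are illustrative assumptions, not values drawn from any study cited here.

    # A toy benefit-cost calculation: discount hypothetical annual flows.
    # All figures and the discount rate are illustrative assumptions.
    benefits = [0.0, 120_000.0, 150_000.0, 160_000.0, 170_000.0]  # annual benefits
    costs = [400_000.0, 20_000.0, 20_000.0, 20_000.0, 20_000.0]   # annual costs
    r = 0.05  # assumed real discount rate

    # Present value of each stream, discounting year t by (1 + r)^t
    pv_benefits = sum(b / (1 + r) ** t for t, b in enumerate(benefits))
    pv_costs = sum(c / (1 + r) ** t for t, c in enumerate(costs))
    print(f"net present value:  {pv_benefits - pv_costs:,.0f}")
    print(f"benefit-cost ratio: {pv_benefits / pv_costs:.2f}")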

Experimental methods similar to those in the natural sciences are often used when the random assignment of subjects, target populations, or regions into experimental and control groups is possible; such methods are useful for evaluating the efficacy of public policy programs (i.e., “treatments,” or “X,” such as economic development incentives offered to firms) in a way that allows for comparisons between the two groups. A “pre-test” assessment, inventory of data, or set of observations (O1) establishes a baseline measurement, and random assignment of participants forms equivalent groups whose only difference during the experiment is the treatment or stimulus received by the experimental group. This eliminates threats to internal validity (discussed below), so that differences between groups noted in a post-test (O2) at the end of the experiment can be attributed to the experimental stimulus, and hence causality can be inferred. This is the two-group pretest and posttest design (Cook and Campbell 1979, Chapter 3):

Experimental group: O1 X O2

------

Control group: O1 O2
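
As an illustration of how such a design might be analyzed, the sketch below simulates a two-group pretest/posttest experiment and compares the groups' gain scores (O2 minus O1) with a two-sample t-test. All numbers are simulated for illustration; a real analysis would use observed data.

    # A minimal sketch of analyzing the two-group pretest/posttest design above.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n = 200
    # Simulated baseline observations (O1) for both randomly assigned groups
    pre_treat = rng.normal(50, 10, n)
    pre_ctrl = rng.normal(50, 10, n)
    true_effect = 3.0  # assumed treatment effect, for illustration only
    # Post-test (O2): both groups drift upward over time; only one receives X
    post_treat = pre_treat + rng.normal(2, 5, n) + true_effect
    post_ctrl = pre_ctrl + rng.normal(2, 5, n)
    # Compare gain scores (O2 - O1) across groups
    gain_t = post_treat - pre_treat
    gain_c = post_ctrl - pre_ctrl
    t_stat, p_val = stats.ttest_ind(gain_t, gain_c)
    print(f"estimated effect = {gain_t.mean() - gain_c.mean():.2f}, p = {p_val:.4f}")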

Many different examples of experimental design exist. In farming, new pesticides or growing techniques could be tested with crops in one area of a farm while the traditional method is used in another area. In education and the military, experimental design could be employed in pilot tests of new curricula or other school or training innovations. Students or trainees with the same backgrounds and capabilities could be randomly assigned to experimental (new curriculum) or control (placebo or old curriculum) groups to see if a new training or teaching technique is more efficacious than the one currently in use (Cook and Campbell 1979). In experimental economics, human subjects are often randomly assigned to experimental and control groups in order to gauge the impact of some type of stimulus on participants’ or groups’ decision making with respect to markets, competitive games, or cooperation, among other situations (Smith 2008). For example, to mimic decision making under uncertainty in market situations, one group of participants may be asked to bid on a certain item (stock, real estate, etc.) while being told certain basic information about the item, whereas another group may be told the same information plus one additional fact hypothesized to sway purchase decisions, such as that a piece of property was once owned by a famous person or has certain sentimental value, even though this should connote no additional value in a competitive, fully rational market. In economic research that involves public policy evaluation, experimental design has been used to test the effects of changes in unemployment compensation, workmen’s compensation benefits, and different social insurance programs, as well as the impact of negative income taxes, alcohol taxation on drinking behavior, and the earnings of immigrants (Meyer 1995, Institute for Research on Poverty 2013).

When it comes to human subjects, experimental design is not without controversy. Being denied access to an innovative training program or curriculum because one is assigned to a placebo or control group can have harmful educational or career effects on control group members. Even if participants are warned of such adverse consequences and grant informed consent, ethical and moral issues abound in the use of experimental methods with human subjects. For this reason, such methods are often avoided when the stakes are relatively high in human subject research (Babbie 2013).

Also, random assignment is often not possible or practical when it comes to human subjects or other entities such as geographic regions, not only because of ethical considerations, but also because the need for program evaluation is not well thought out prior to program implementation. Often in urban and regional economic development, political considerations and priorities determine which regions are targeted for economic development incentives, because those regions may have a disproportionate number of unemployed or poor residents, or people living in blighted neighborhoods with high crime rates (Blair 1995). Because of the urgency of solving such problems, a similar region or locale that does not receive such incentives is not designated a priori as a control group for later program evaluation; or, if all regions below a certain per capita income level are eligible and all receive incentives, it may be difficult if not impossible to find a similar region or locale without incentives for purposes of comparison later. It would also not be just for policy makers to create a program and then deny it to certain regions merely for the sake of an experiment. However, the lack of random assignment and of exactly matched control groups poses a potential threat to validity in later studies, since regions receiving tax incentives or stimulus may not exactly match whatever comparison groups are chosen later for assessment.

In a study of enterprise zones in New Jersey, Boarnet and Bogart (1996) had a somewhat quasi-experimental situation in which 28 different cities within the state had applied for or were eligible for enterprise zone (EZ) status. Some applied, were accepted for program incentives and EZ designation, and chose to participate (7 zones); some were not accepted by state officials despite applying (14 areas); and some were eligible but chose not to participate (7 areas). Whether a region participated was therefore somewhat akin to a random process, and after performing econometric analyses, the authors did not find that EZ programs had much impact. Similarly, using zip-code-level data, Bondonio and Greenbaum (2007) drew a sample of EZ programs from 10 states and the District of Columbia over a 10-year period and compared various economic trends and outcomes to those of comparable non-EZ zip codes. Pre-test and post-test measurements and regression were used in a study that tried to come as close to an experimental design as possible, although the authors admit that randomized experiments are impossible in EZ evaluations.
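
The general pre-test/post-test comparison-group approach these studies describe can be sketched as a difference-in-differences regression on simulated zip-code data, as below. The variable names, sample sizes, and effect size are illustrative assumptions and do not reproduce the actual specifications of Boarnet and Bogart (1996) or Bondonio and Greenbaum (2007).

    # A hedged sketch of a difference-in-differences regression on simulated
    # zip-code panel data; all values are illustrative assumptions.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n_zips = 300
    treated = rng.integers(0, 2, n_zips)   # 1 = EZ zip code, 0 = comparison
    base = rng.normal(100, 15, n_zips)     # pre-period employment index
    effect = 4.0                           # assumed program effect
    rows = []
    for z in range(n_zips):
        # Pre-period (post = 0) and post-period (post = 1) observations;
        # both groups share a common trend (+6); only EZ zips get the effect.
        rows.append((z, treated[z], 0, base[z] + rng.normal(0, 5)))
        rows.append((z, treated[z], 1, base[z] + 6 + effect * treated[z] + rng.normal(0, 5)))
    df = pd.DataFrame(rows, columns=["zip", "treated", "post", "employment"])
    model = smf.ols("employment ~ treated * post", data=df).fit()
    print(model.params["treated:post"])    # estimated EZ effect

The interaction term does the work here: it measures how much more the EZ zip codes changed than the comparison zip codes, under the (strong) assumption that the two groups would otherwise have followed parallel trends.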

In research that mimics quasi-experimental design to a certain degree, the federal government allows airports in the US to “opt out” of having US Department of Homeland Security Transportation Security Administration (TSA) personnel screen luggage and passengers, provided that the private security companies hired instead meet TSA security and performance requirements. Twenty-five airports are approved for participation, and as of 2012, 16 were actively participating (GAO 2012). The comparisons and differences between the private and government security screening efforts have sparked much debate, not only about airport security privatization but also about the accuracy of the methods used for the comparisons (Frank Kernodle Associates 2006).

Another series of papers using quasi-experimental design to evaluate local urban and regional economic policy involved studies of the Louisville-Jefferson County, Kentucky EZ by Lambert (1997), Cummings and Lambert (1997), Lambert and Coomes (2001), and Lambert and Nelson III (2002). These research efforts were made possible by grants from the Louisville Board of Aldermen and were instrumental in the policy debate and in influencing decisions in the state of Kentucky with regard to EZ policies and their continuance (Office of the State Budget Director 2002, Richards-Hill 2003). They are summarized below in order to illustrate the implementation of quasi-experimental design. First, however, a discussion of external and internal validity is needed.

Validity—External and Internal

The validity of an experiment or quasi-experiment concerns whether sound conclusions can be drawn from it; threats to validity are events or occurrences that could undermine such conclusions. One wants to know whether what is being tested is also being measured accurately before, during, and after the experiment. There are two types of validity: internal and external. Internal validity refers to whether the experiment is set up appropriately and has appropriate controls to prevent outside interference that could confound experimental results and their interpretation (Babbie 2013). Threats to internal validity include the following:

•  Maturation. Developments within subjects that occur as a function of the passage of time. For example, if an experiment lasts a few years, such as a curriculum innovation in a school, most participants may improve their performance regardless of exposure to the new curriculum.

•  Selection. Biases that may arise in the selection of control or comparison groups. Random assignment of group membership is a safeguard against this threat.

•  Sample mortality or subject attrition. The loss of experimental or test subjects as time passes.

•  Testing. The effects of taking a test on the outcomes of a subsequent test. Improvements in test scores may come about not so much from treatment effects as from participants becoming better at taking the same test repeatedly.

•  Instrumentation. A change in the assessment instrument, for example, giving a pre-test and then a completely different post-test that is easier or more difficult than the pre-test.

•  History. Between the introduction of the treatment and the measurement of outcomes, other events can intervene. For example, in local and regional policy, new governmental programs (job training for residents, grants for small businesses in the area) can be introduced while other incentives are underway.

•  Regression toward the mean. If subjects with extreme scores are selected for either group, their later scores will tend to drift back toward the average regardless of treatment, so an apparent improvement in the experimental group may simply reflect this drift (a brief simulation appears after this list).
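
The following minimal simulation illustrates the regression-toward-the-mean threat under the assumption that each test score equals a stable ability component plus independent noise: the lowest scorers on a first test score noticeably better on a second test even though no treatment occurs.

    # A minimal simulation of regression toward the mean, assuming each score
    # is a stable ability component plus independent noise on each occasion.
    import numpy as np

    rng = np.random.default_rng(1)
    ability = rng.normal(100, 10, 10_000)
    test1 = ability + rng.normal(0, 10, 10_000)
    test2 = ability + rng.normal(0, 10, 10_000)
    # Select the worst decile on the first test; no treatment is applied
    worst = test1 < np.percentile(test1, 10)
    print(f"mean first score:  {test1[worst].mean():.1f}")
    print(f"mean second score: {test2[worst].mean():.1f}")  # higher, despite no treatment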

External validity is closely related to the concept of “reliability”: whether research findings can be replicated in other experiments, perhaps by other experimenters, in other tests with different subjects or different regions. The biggest threat to external validity is therefore the inability to replicate findings anywhere else, often because experiments using human participants rely on small samples obtained from a single geographic location or with idiosyncratic features (e.g., volunteers). These samples are often not representative of the larger population about which inferences are desired. Because of this, one cannot be sure that conclusions drawn about cause-effect relationships actually apply to people in other geographic locations or to people without these features. With regard to regional policy, it is hard or impossible to replicate treatments or incentives and extend inferences about local and state government policies to other jurisdictions in the US, so any inferences about causality have to be limited to certain similar jurisdictions. For this reason, studies of EZ policies and incentives have mostly been limited to one state (e.g., California (Dowall 1996) or New Jersey) or one city–county area (e.g., Louisville and Jefferson County, Kentucky).

For quasi-experiments, selection is the greatest threat to internal validity. Therefore, great care must be exercised in choosing control or comparison groups so that they match the treatment group as closely as possible. External validity is also an obvious concern, and careful interpretation requires that the limits of a study's outcomes be stated explicitly. This perhaps limits the inferences that quasi-experimentation can support, and hence its usefulness, but when a true experiment is not possible, it is a viable alternative.
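
As one illustration of what choosing as close a match as possible can look like in practice, the sketch below selects, for each treated region, the untreated region nearest to it on a few standardized covariates. The covariates, data, and nearest-neighbor rule are illustrative assumptions; actual matching studies often use richer methods, such as propensity scores.

    # A hedged sketch of nearest-neighbor matching on standardized covariates
    # to choose comparison regions; the data and covariates are illustrative.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(7)
    regions = pd.DataFrame({
        "per_capita_income": rng.normal(25_000, 4_000, 50),
        "unemployment_rate": rng.normal(7.0, 2.0, 50),
        "poverty_rate": rng.normal(18.0, 5.0, 50),
        "treated": [1] * 10 + [0] * 40,   # 10 regions received incentives
    })
    covars = ["per_capita_income", "unemployment_rate", "poverty_rate"]
    # Standardize so each covariate contributes comparably to the distance
    z = (regions[covars] - regions[covars].mean()) / regions[covars].std()
    treated_z = z[regions["treated"] == 1]
    pool_z = z[regions["treated"] == 0]
    # For each treated region, find the untreated region closest in covariate space
    for i, row in treated_z.iterrows():
        dists = ((pool_z - row) ** 2).sum(axis=1)
        print(f"treated region {i} -> comparison region {dists.idxmin()}")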