Quasi-Experimental Design
A quasi-experiment is one where the treatment variable is manipulated but the groups are not equated prior to manipulation of the independent variable. I shall discuss a few such designs here.
Pretest-Posttest Nonequivalent Groups Design
This design looks a lot like the randomized pretest-posttest design that I discussed earlier, but in this case the two groups have not been equated prior to treatment. Since one has not randomly assigned subjects to groups, one cannot assume that the populations being compared are equivalent on all things prior to the treatment, and accordingly internal validity is threatened. When a post-treatment difference between groups is observed, one cannot with great confidence attribute that effect to the treatment, since the groups may have had pre-existing differences that caused the observed post-treatment difference.
In the language of Campbell and Stanley, the threat here is selection and all of the various interactions involving selection. One can try to mitigate the problem by assigning subjects to groups (or selecting intact groups) in ways that make it likely that the groups do not differ greatly prior to the treatment, but one always worries about the unknown variables on which the groups might differ and which might affect the criterion variable.
Trochim has written several pages in which he presents hypothetical examples of this design and then critiques them with respect to which threats to internal validity seem of most concern and which seem unlikely. You should read those pages carefully to get a feel for the sort of thinking that is involved when trying to determine how suspicious one should be of conclusions drawn from research gathered through a pretest-posttest nonequivalent groups design.
So, why would one ever choose to employ this design? Frankly, one should not, if a better design (such as a randomized pretest-posttest design) is feasible, but sometimes you cannot accomplish random assignment, especially when dealing with human subjects (who don’t like to be told what treatments they are or are not going to get) in field settings.
Double-Pretest Nonequivalent Groups Design
This modification of the pretest-posttest nonequivalent groups design helps to control for a Selection x Maturation interaction by including a second pretest. If the groups are maturing at different rates, that difference may appear in the comparison between the first pretest and the second pretest.
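The check described above can be sketched in a few lines of Python. This is only an illustration: the scores are made up, and the function names (mean_gain, differential_maturation) are hypothetical. It compares how much each group changed between the two pretests, before any treatment was given.

```python
from statistics import mean

def mean_gain(pretest1, pretest2):
    """Average change from the first pretest to the second pretest."""
    return mean([second - first for first, second in zip(pretest1, pretest2)])

def differential_maturation(group_a, group_b):
    """Difference between two groups' pre-treatment gains.

    Each group is a (pretest1_scores, pretest2_scores) pair.
    """
    return mean_gain(*group_a) - mean_gain(*group_b)

# Hypothetical scores, invented for illustration only.
treatment = ([10, 12, 11, 13], [12, 14, 13, 15])        # each subject gains 2
control = ([10, 11, 12, 13], [10.5, 11.5, 12.5, 13.5])  # each subject gains 0.5

gap = differential_maturation(treatment, control)
print(f"Difference in pre-treatment gains: {gap:.2f}")
```

A gap near zero would suggest the groups were changing at similar rates before the treatment; a large gap is a warning that a Selection x Maturation interaction may be at work.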
Regression-Discontinuity Design
This design looks a lot like the pretest-posttest nonequivalent groups design, but the groups are nonequivalent by choice. The ‘C’ in the first column of the design notation indicates that the subjects are assigned to groups based on their score on the covariate (the pretest).
One usually starts by deciding what the treatment and criterion variables will be. For example, I may decide that my treatment variable will involve an online tutorial program in basic statistics and my criterion variable will be students’ performance in an undergraduate statistics class. I shall offer the program to some students (and entice them to use it) but not to other students. At the end of the semester I shall use scores on the comprehensive final examination as the criterion variable.
So, what do I use as the pretest? I could administer the comprehensive final examination twice, at the beginning of the semester (pretest) and the end of the semester (posttest), but that might just reveal that none of my students know any statistics before they take a statistics class. I might decide to use an alternative sort of pretest, such as a test of statistics aptitude (based largely on verbal and logical reasoning but with a little math thrown in too). I would want scores on this test to be highly predictive of final examination performance -- that is, students who do well on this aptitude test (those with good verbal and logical skills, even if they don’t know any statistics yet) will be highly likely to do well in the course, while those who do poorly on the aptitude test will likely have great difficulty with the course.
Now, how am I going to assign subjects to groups? I could just announce the availability of the tutorial program and let anybody use it. I would keep track of who used it and who did not, and at the end of the semester I would compare those two groups in terms of how they performed on the final examination. I would use the pretest scores as a covariate in an analysis of covariance (ANCOV). Conducted in this fashion, this would be an example of a pretest-posttest nonequivalent groups design. There almost certainly would be pre-treatment differences between those who elected to use the tutorial program and those who elected not to. While some researchers think that the use of ANCOV enables one statistically to remove such a confound, that is not exactly true. Interpretation of the results of an ANCOV where the experimental groups differ on the covariate is very tricky. Even if the ANCOV could remove the confound involving pre-treatment differences on the covariate, the groups probably differ on other important characteristics that could affect the results.
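To make the covariate adjustment concrete, here is a minimal sketch of the arithmetic behind a one-covariate ANCOV with two groups. The data are invented and the function name is hypothetical; a statistical package would also supply the significance tests. The adjusted difference is the raw difference in posttest means minus the pooled within-group slope times the difference in pretest means.

```python
from statistics import mean

def ancova_adjusted_difference(pre_t, post_t, pre_c, post_c):
    """Return (raw difference, covariate-adjusted difference, pooled slope)."""
    def sums(x, y):
        mx, my = mean(x), mean(y)
        ss_x = sum((xi - mx) ** 2 for xi in x)
        sp_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        return mx, my, ss_x, sp_xy

    mx_t, my_t, ss_t, sp_t = sums(pre_t, post_t)
    mx_c, my_c, ss_c, sp_c = sums(pre_c, post_c)
    slope = (sp_t + sp_c) / (ss_t + ss_c)      # pooled within-group slope
    raw = my_t - my_c                          # unadjusted posttest difference
    adjusted = raw - slope * (mx_t - mx_c)     # difference at equal pretest
    return raw, adjusted, slope

# Hypothetical data: tutorial users start with higher pretest scores, and
# posttest = 5 + 2 * pretest in both groups (no tutorial effect at all).
users_pre, users_post = [8, 9, 10, 11], [21, 23, 25, 27]
others_pre, others_post = [4, 5, 6, 7], [13, 15, 17, 19]

raw, adjusted, slope = ancova_adjusted_difference(
    users_pre, users_post, others_pre, others_post)
print(f"raw = {raw:.2f}, adjusted = {adjusted:.2f}, slope = {slope:.2f}")
```

In this contrived case the raw eight-point difference disappears after adjustment, but only because the groups differ on nothing except the measured covariate. The adjustment cannot reach characteristics that were never measured.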
So, I abandon the idea of using a pretest-posttest nonequivalent groups design. I decide instead to use a randomized pretest-posttest design. I use a random number generator to decide who gets to use the tutorial program and who does not. Not long after the start of the semester I start to get angry phone calls from concerned parents who are worried that their children will be at a disadvantage in the class because they are in the deprived (control) group. They explain that being so deprived could lead to them failing the class, or getting a grade that would keep them out of graduate school or cost them a scholarship and so on. I am risking those students’ futures for my stinking research and they are not going to stand for it. After a few of these parents (and/or the students) call up my chair, my dean, the chancellor, the Board of Governors, their legislative representatives, the local media, and so on, and their attorneys phone the university attorney about a lawsuit, I decide that maybe a randomized pretest-posttest design is not feasible.
I might then decide to try a switching-replications design, where half the class gets access to the tutorial program during the first but not the second half of the semester, and the other half gets access to it during the second but not the first half of the semester. I would, of course, measure performance in the class both on a midterm exam and a final exam. This switch might, however, not satisfy my critics. Those who have to wait until the second half of the semester might rightfully complain that by that point they will already be hopelessly lost in the class without access to the tutorial -- after all, if you don’t have a good understanding of the material covered during the first half of the semester (things like means, variance, probability, and inference), you are not likely to be able to understand much of what is taught during the second half of the semester. Also, those who have access to the tutorial during the first half of the semester are likely to complain bitterly about having it taken away from them during the second half of the semester. Maybe the switching-replications design is not feasible here either.
How can I possibly assign subjects to my treatment groups in a way that will not upset so many people? One interesting possibility is to apply the treatment only to those students who are most in need of it -- that is, provide the tutoring only to those students whose scores on the pretest indicate that they will probably have a lot of trouble with the course. There still might be some complaints. The more able students may argue that they too could benefit from the tutorial and it is not fair that they be denied just because they have higher aptitude -- but there is an American tradition of favoring the underdogs, of giving “special education” to those with low aptitude but not those with high aptitude, so social pressures would probably cause these whining brainiacs to shut up. I might also get some complaints from the students who were selected to get the tutorial -- “Why do I have to do this extra work when those other students don’t? It just isn’t fair!” Well, you know what they say about pleasing all the people all of the time.
I should add that there might be circumstances in which I would want to give the special treatment only to those who were most advantaged. For example, suppose that my tutorial program was designed to be useful only for those who would quickly master the basics of statistics and then be bored to death while I was repeating myself again and again trying to get the rest of the class to understand. A special program that was designed to capture and maintain those highly able students would best be applied only to them -- but doing this might be considered “politically incorrect,” elitist.
You might now be thinking that I have lost my mind, deliberately producing experimental groups that are nonequivalent on a covariate that is known to be highly correlated with the criterion variable. How am I going to eliminate threats to internal validity such as regression to the mean? As you will see, the answer to this question comes from how the analysis is done, and it is not done by simply comparing the two groups on the posttest.
Having chosen a treatment, a criterion variable, and a covariate (pretest), I now must decide on the cutoff (on the covariate) for deciding who gets the treatment and who does not. It might be reasonable to give the treatment to half of the students and not to the other half, but I might not have enough resources to do so. For example, staffing and equipment restrictions might allow me only to provide the treatment to the students in the bottom 25% of the distribution of scores on the pretest.
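As a rough sketch of the assignment rule just described, the Python below picks a cutoff that treats approximately the bottom fraction of pretest scores and then labels each subject ‘T’ or ‘C.’ The function names and the scores are hypothetical. Note that tied scores at the cutoff all receive the treatment, so the treated share can exceed the target.

```python
def cutoff_for_bottom_fraction(pretest_scores, fraction):
    """Highest pretest score that still falls in the bottom `fraction`."""
    ordered = sorted(pretest_scores)
    n_treated = max(1, int(len(ordered) * fraction))
    return ordered[n_treated - 1]

def assign_by_cutoff(pretest_scores, cutoff):
    """'T' for scores at or below the cutoff, 'C' for all others."""
    return ["T" if score <= cutoff else "C" for score in pretest_scores]

# Hypothetical pretest scores for eight students.
scores = [3, 9, 5, 12, 7, 10, 6, 8]
cutoff = cutoff_for_bottom_fraction(scores, 0.25)
groups = assign_by_cutoff(scores, cutoff)
print(cutoff, groups)
```

The important property is that assignment depends on nothing but the pretest score, which is what lets the later analysis model the selection rule exactly.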
To illustrate the use of the regression-discontinuity design, I have simulated data from a hypothetical project employing that design. I started by simulating 38 pairs of scores (pretest, posttest) randomly drawn from a population where Post = 7 + 1.35 Pre + error and ρ = .9. These data are available in the file RegD0.txt. This data file is a plain text file. Each line contains data for one subject. The first score is the letter ‘C’ or ‘T,’ indicating whether that subject was assigned to the control group or the treatment group. Following a blank space (the delimiter), the next score is the subject’s posttest score, and the next is the subject’s pretest score. I assigned to the treatment group all subjects whose score on the pretest was 6 or less. I defined the effect of the treatment to be exactly zero in the population.
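A simulation of this kind might be sketched as follows. This is not the procedure that produced RegD0.txt, and it will not reproduce those scores. The pretest mean and standard deviation, the error standard deviation, the whole-number pretest scores, and the seed are all assumptions chosen for illustration; only the intercept (7), the slope (1.35), the sample size (38), and the cutoff (6 or less) come from the text.

```python
import random

def simulate(n=38, cutoff=6, effect=0.0, pre_mean=6.5, pre_sd=2.2,
             error_sd=1.45, seed=1):
    """Simulate (group, posttest, pretest) rows under Post = 7 + 1.35*Pre + error.

    pre_mean, pre_sd, error_sd, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        pre = round(rng.gauss(pre_mean, pre_sd))   # assumed whole-number pretest
        group = "T" if pre <= cutoff else "C"      # assignment by cutoff only
        post = 7 + 1.35 * pre + rng.gauss(0, error_sd)
        if group == "T":
            post += effect                         # zero unless an effect is built in
        rows.append((group, post, pre))
    return rows

def to_text(rows):
    """One subject per line: group letter, posttest, pretest, space-delimited."""
    return "\n".join(f"{g} {post:.2f} {pre}" for g, post, pre in rows)

rows = simulate()
print(to_text(rows[:5]))
```

Calling simulate(effect=3.0) would build a three-point treatment effect into the treated group while leaving everything else unchanged.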
If you remember how to use the statistical package you learned in your statistics course, you can bring these data into that program and conduct a regression analysis on them. Using all 38 data points, you would find that the estimated regression parameters are: Post = 7.58 + 1.27 Pre + error, r = .85, and MSE = 2.13. A plot of the data with the regression line drawn in is shown in Figure 1 below. On this plot, data points for subjects in the treatment group are plotted with the symbol ‘T’ and those for subjects in the control group with the symbol ‘C.’
If you were to conduct two regression analyses on these data, one for the treatment group and one for the control group, you would find slightly different regression lines, but those differences would be due totally to sampling error, because the data for all subjects in both groups were randomly sampled from exactly the same population. I have estimated the separate regression lines and plotted them on the scatter plot which is shown in Figure 2 below. For the treatment group the separate regression estimates are Post = 8.09 + 1.17 Pre + error, r = .62, and MSE = 2.13 and for the control group Post = 6.33 + 1.43 Pre + error, r = .72, and MSE = 2.29. Looking at the plot, you should be able to see the difference between these two regression lines, but they don’t look very different from one another.
Figure 1. Pooled Regression Line for Predicting Post from Pre With No Treatment Effect.
Figure 2. Separate Regression Lines With No Treatment Effect.
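For readers without a statistical package at hand, the separate regression lines can be estimated with a short Python sketch like the one below. It assumes the file layout described earlier (group letter, posttest, pretest, separated by spaces). The six sample lines are invented and lie exactly on Post = 7 + 1.35 Pre, so they are not the RegD0.txt data; the function names are hypothetical.

```python
from statistics import mean

def parse(text):
    """Split 'group post pre' lines into per-group lists of (pre, post) points."""
    groups = {"T": [], "C": []}
    for line in text.splitlines():
        if not line.strip():
            continue
        group, post, pre = line.split()
        groups[group].append((float(pre), float(post)))
    return groups

def fit_line(points):
    """Ordinary least squares for Post = intercept + slope * Pre."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Invented sample in the described layout (not the real data file).
sample = """T 12.40 4
T 13.75 5
T 15.10 6
C 16.45 7
C 17.80 8
C 19.15 9"""

fits = {group: fit_line(points) for group, points in parse(sample).items()}
for group, (intercept, slope) in fits.items():
    print(f"{group}: Post = {intercept:.2f} + {slope:.2f} Pre")
```

With real data the two fitted lines would differ somewhat from sampling error alone, as they do in Figure 2.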
Next I re-simulated the data, but with one change -- I built in a three-point treatment effect when defining the population for those in the treatment group. These data are available in the file RegD1.txt. I have estimated the separate regression lines and plotted them on the scatter plot which is shown in Figure 3 below. For the treatment group the separate regression estimates are Post = 11.27 + 1.07 Pre + error, r = .82, and MSE = 1.35 and for the control group Post = 7.90 + 1.18 Pre + error, r = .82, and MSE = 1.25. Looking at the plot, you can clearly see the difference in these two regression lines. I extended the control group regression line into the treatment group area to show what we would expect the regression line to look like for the treatment group if there were no treatment effect.
Figure 3. Separate Regression Lines With a Three-Point Treatment Effect.
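The size of the discontinuity can be read off the reported regression estimates with a little arithmetic. The sketch below evaluates each pair of lines at a pretest score of 6; treating 6 as the comparison point is an assumption (it is the highest score that received the treatment), and the function name is hypothetical. The coefficients are the ones reported in the text.

```python
CUTOFF = 6  # assumption: compare the lines at the highest treated pretest score

def predict(intercept, slope, pre):
    """Predicted posttest score from a fitted regression line."""
    return intercept + slope * pre

# Data with no treatment effect: treatment line minus control line at the cutoff.
no_effect_gap = predict(8.09, 1.17, CUTOFF) - predict(6.33, 1.43, CUTOFF)

# Data with the three-point effect built in.
effect_gap = predict(11.27, 1.07, CUTOFF) - predict(7.90, 1.18, CUTOFF)

print(f"Gap with no treatment effect: {no_effect_gap:.2f}")
print(f"Gap with three-point effect:  {effect_gap:.2f}")
```

The gap is about 0.20 points when there is no treatment effect and about 2.71 points when the three-point effect is present, which is close to the value built into the population.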
The plot provides pretty convincing evidence that there is an effect of the treatment. The regression line for the treatment group is clearly higher than what we would expect it to be if the treatment were without effect. It would be hard to imagine how one of the threats to internal validity that we have already discussed could have created the observed discontinuity at exactly the cutoff point, but there is another threat that must be considered. What if the true relationship between the criterion variable and the pretest is curvilinear, but we have used a linear analysis? As illustrated in Chapter 11 of Trochim, this can lead one to conclude that there is a treatment effect when in fact there is not.
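The curvilinearity threat is easy to demonstrate with a contrived example. In the Python sketch below the posttest is an exact quadratic function of the pretest (Post = Pre squared, with no error and no treatment effect), yet fitting a separate straight line to each side of the cutoff produces a gap at the cutoff. The quadratic form, the pretest range, and the function names are assumptions made for illustration; this is not the example given in Trochim.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

CUTOFF = 6
pre_treated = list(range(0, CUTOFF + 1))    # pretest 0..6 receives the treatment
pre_control = list(range(CUTOFF + 1, 13))   # pretest 7..12 does not

# Curvilinear population with NO treatment effect: Post = Pre ** 2 exactly.
post_treated = [x ** 2 for x in pre_treated]
post_control = [x ** 2 for x in pre_control]

a_t, b_t = fit_line(pre_treated, post_treated)
a_c, b_c = fit_line(pre_control, post_control)

# Apparent "treatment effect": treated line minus control line at the cutoff.
spurious_gap = (a_t + b_t * CUTOFF) - (a_c + b_c * CUTOFF)
print(f"Apparent discontinuity at the cutoff: {spurious_gap:.2f}")
```

The two straight lines disagree at the cutoff by roughly 4.33 points even though nothing happened there, which is why the form of the pretest-posttest relationship should be checked before a discontinuity is interpreted as a treatment effect.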
Proxy-Pretest Design
In this design one gathers the pretest information after the experimental treatment has started. In other words, one finds an archival proxy for the pretest. For example, suppose I ask the following question: “Does completion of PSYC 2210 (experimental psychology) have an effect on a student’s knowledge of statistics?” Ideally I would measure the students’ statistical knowledge at the beginning of the semester, but suppose that the question did not occur to me until the middle of the semester. I might decide to use as a proxy-pretest students’ performance in their PSYC 2101 (statistics) class. My control group might consist of a group of students taking some other class (not 2210). For each student I would obtain a continuous measurement of the student’s performance in PSYC 2101 and, at the end of the semester, a continuous measurement of the student’s knowledge of statistics. ANCOV would be used, with the proxy-pretest serving as a covariate.
Separate Pre-Post Samples Design
In this design the sample of subjects that you use for the pretest is different from the sample of subjects that you use for the posttest. There are several variations of this design. Suppose that I want to evaluate the effect of a tutorial program in my statistics class. My colleague Suzie Q and I each taught statistics both this semester and the previous semester and we each gave our students a standardized test of statistics achievement at the end of the semester (as part of a departmental evaluation of the course). This semester I shall make the tutorial program available to all of the students in my class, but my colleague will not. Again, both teachers administer the statistics achievement test at the end of the semester.
Look at the design notation in the box above. Note that there are four nonequivalent groups. The first line represents my students last semester, when I did not make the tutorial program available. The second line represents my students this semester, when the tutorial program was made available. The third line represents my colleague’s students last semester, and the fourth line represents my colleague’s students this semester. The potential for selection problems is clearly large with this design.
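One simple way to summarize data from these four groups, offered here only as a sketch and not as the prescribed analysis, is to compare the semester-to-semester change in my classes with the change in my colleague’s classes. The scores and the function name below are hypothetical.

```python
from statistics import mean

def change_in_means(last_semester, this_semester):
    """Difference between this semester's and last semester's mean score."""
    return mean(this_semester) - mean(last_semester)

# Hypothetical achievement-test scores for the four nonequivalent groups.
mine_last = [70, 72, 74]        # my class, no tutorial
mine_this = [78, 80, 82]        # my class, tutorial available
colleague_last = [71, 73, 75]   # colleague's class, no tutorial
colleague_this = [72, 74, 76]   # colleague's class, no tutorial

my_change = change_in_means(mine_last, mine_this)
colleague_change = change_in_means(colleague_last, colleague_this)
difference = my_change - colleague_change
print(f"My change: {my_change:.1f}, colleague's change: {colleague_change:.1f}, "
      f"difference: {difference:.1f}")
```

Because all four groups consist of different students, a difference like this one could reflect who enrolled rather than what the tutorial did, which is the selection problem noted above.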
Nonequivalent Groups Switching Replications Design
Recall our earlier (Chapter 7) discussion of the randomized groups switching replications design. The quasi-experimental version of this design differs in that the comparison groups are not equated by randomization. For example, when evaluating the effect of my experimental statistics tutorial, I could make it available to one class of students during only the first half of the semester and to the other class during only the second half of the semester. This might reduce, somewhat, complaints about not getting the special treatment right away, unless the one class learns what is going on in the other class.