Chapter 5: Experimental Design

Introduction

The design of any experiment is of utmost importance because the experiment has the power to be the most rigorous type of research. The design, however, is always dependent on feasibility. The best approach is to control for as many confounding variables as possible in order to eliminate or reduce errors in the assumptions that will be made. It is also extremely desirable that any threats to internal or external validity be neutralized. In a perfect world, all research would do this and the results of research would be accurate and powerful. In the real world, however, this is rarely the case. We are often dealing with human subjects, whose involvement can in itself confound any study. We are also dealing with the constraints of time and situation, often resulting in less than perfect conditions in which to gather information.

There are three basic experimental designs, each containing subsets with specific strengths and weaknesses. These three basic designs are: (1) pre-experimental design; (2) quasi-experimental design; and (3) true experimental design. They are discussed below and, as you will discover, are addressed in order of increasing effectiveness.

Pre-Experimental Design

Pre-experimental designs are so named because they follow basic experimental steps but fail to include a control group. In other words, a single group is often studied but no comparison with an equivalent non-treatment group is made. Examples include the following:

The One-Shot Case Study. In this arrangement, subjects are presented with some type of treatment, such as a semester of college work experience, and then the outcome measure is applied, such as college grades. As in all experimental designs, the goal is to determine whether the treatment had any effect on the outcome. Without a comparison group, it is impossible to determine whether the outcome scores are any higher than they would have been without the treatment. And, without any pretest scores, it is impossible to determine whether any change within the group itself has taken place.

One Group Pretest Posttest Study. A benefit of this design over the previously discussed design is the inclusion of a pretest to determine baseline scores. To use this design in our study of college performance, we could compare college grades prior to gaining the work experience to the grades after completing a semester of work experience. We can now at least state whether a change in the outcome or dependent variable has taken place. What we cannot say is whether this change would have occurred even without the application of the treatment or independent variable. It is possible that mere maturation caused the change in grades and not the work experience itself.
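The pretest-posttest comparison described above can be sketched in a few lines of Python. This is only an illustration: the grade point averages are invented, and the function name mean_change is a hypothetical helper, not part of any standard procedure.

```python
from statistics import mean

# Hypothetical GPAs for the same five students, before and after
# one semester of work experience (values invented for illustration).
pretest = [2.8, 3.1, 2.5, 3.4, 3.0]
posttest = [3.0, 3.2, 2.9, 3.5, 3.1]


def mean_change(pre, post):
    """Average within-subject change from pretest to posttest."""
    if len(pre) != len(post):
        raise ValueError("each subject needs both a pretest and a posttest score")
    return mean(after - before for before, after in zip(pre, post))


change = mean_change(pretest, posttest)
print(f"Mean change in GPA: {change:+.2f}")
```

A positive value tells us only that grades rose. Because there is no comparison group, the calculation cannot separate the effect of the work experience from maturation or any other influence acting over the same semester.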

The Static Group Comparison Study. This design attempts to make up for the lack of a control group but falls short in showing whether a change has occurred. In the static group comparison study, two groups are chosen, one of which receives the treatment while the other does not. A posttest score is then determined to measure the difference, after treatment, between the two groups. As you can see, this study does not include any pretesting, and therefore any differences between the two groups prior to the study are unknown.

Table 5.1: Diagrams of Pre-Experimental Designs

Quasi-Experimental Design

Quasi-experimental designs fare better than pre-experimental studies in that they employ a means to compare groups. They fall short, however, on one very important aspect of the experiment: randomization.

Pretest Posttest Nonequivalent Group. With this design, a control group and an experimental group are compared; however, the groups are chosen and assigned out of convenience rather than through randomization. This might be the method of choice for our study on work experience, as it would be difficult to choose students in a college setting at random and place them in specific groups and classes. We might ask students to participate in a one-semester work experience program. We would then measure all of the students’ grades prior to the start of the program and then again after the program. Those students who participated would be our treatment group; those who did not would be our control group.

Time Series Designs. Time series designs refer to the pretesting and posttesting of one group of subjects at different intervals. The purpose might be to determine the long-term effect of treatment, and therefore the number of pretests and posttests can vary from one each to many. Sometimes there is an interruption between tests in order to assess the strength of treatment over an extended time period. When such a design is employed, the posttest is referred to as a follow-up.

Nonequivalent Before-After Design. This design is used when we want to compare two groups that are likely to be different even before the study begins. In other words, if we want to see how a new treatment affects people with different psychological disorders, the disorders themselves would create two or more nonequivalent groups. Once again, the number of pretests and posttests can vary from one each to many.

The obvious concern with all of the quasi-experimental designs results from the method of choosing subjects to participate in the experiment. While we could compare grades and determine whether there was a difference between the two groups before and after the study, we could not state whether this difference is related to the work experience itself or to some other confounding variable. It is certainly possible that those who volunteered for the study were inherently different in terms of motivation from those who did not participate. Whenever subjects are chosen for groups based on convenience rather than randomization, the reason for inclusion in the study itself confounds our results.
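To make this concern concrete, the following Python sketch compares two self-selected groups on both their starting grades and their change in grades. All of the numbers are invented for illustration, and the names volunteers, non_volunteers, and summarize are hypothetical.

```python
from statistics import mean

# Hypothetical GPAs (invented for illustration). Students chose their own
# group, so the groups were NOT formed by random assignment.
volunteers = {"pre": [3.2, 3.4, 3.1, 3.6], "post": [3.4, 3.5, 3.3, 3.7]}
non_volunteers = {"pre": [2.7, 2.9, 3.0, 2.6], "post": [2.8, 2.9, 3.1, 2.6]}


def summarize(group):
    """Return (pretest mean, mean gain) for one group."""
    gains = [post - pre for pre, post in zip(group["pre"], group["post"])]
    return mean(group["pre"]), mean(gains)


treat_baseline, treat_gain = summarize(volunteers)
ctrl_baseline, ctrl_gain = summarize(non_volunteers)

# How different were the groups before the program began?
baseline_gap = treat_baseline - ctrl_baseline
# How much more did the treatment group improve than the control group?
gain_gap = treat_gain - ctrl_gain

print(f"Difference at pretest: {baseline_gap:+.3f}")
print(f"Difference in gains:   {gain_gap:+.3f}")
```

In this made-up data the volunteers begin with higher grades and also improve more. The pretest lets us see that the groups were not equivalent at the start, but it cannot tell us whether the larger gain reflects the work experience or whatever led those students to volunteer in the first place.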

Table 5.2: Diagrams of Quasi-Experimental Designs

True Experimental Design

True experimental designs make up for the shortcomings of the two designs previously discussed. They employ both a control group and a means to measure the change that occurs in both groups. In this sense, we attempt to control for all confounding variables, or at least consider their impact, while attempting to determine whether the treatment is what truly caused the change. The true experiment is often thought of as the only research method that can adequately measure the cause and effect relationship. Below are some examples:

Posttest Equivalent Groups Study. Randomization and the comparison of both a control and an experimental group are utilized in this type of study. Each group, chosen and assigned at random, is presented with either the treatment or some type of control. Posttests are then given to each subject to determine whether a difference between the two groups exists. While this is approaching the best method, it falls short in its lack of a pretest measure. It is difficult to determine whether the difference apparent at the end of the study is an actual change from the possible difference at the beginning of the study. In other words, randomization does well to mix subjects, but it does not completely assure us that this mix is truly creating an equivalency between the two groups.

Pretest Posttest Equivalent Groups Study. Of those discussed, this method is the most effective in terms of demonstrating cause and effect, but it is also the most difficult to perform. The pretest posttest equivalent groups design provides for both a control group and a measure of change, and it also adds a pretest to assess any differences between the groups prior to the study taking place. To apply this design to our work experience study, we would select students from the college at random and then place the chosen students into one of two groups using random assignment. We would then measure the previous semester’s grades for each group to get a mean grade point average. The treatment, or work experience, would be applied to one group and a control would be applied to the other.
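The two randomization steps just described, random selection followed by random assignment, might be sketched in Python as follows. The roster of 200 student identifiers, the sample size of 40, and the function name select_and_assign are all assumptions made for the sake of the example.

```python
import random


def select_and_assign(population, sample_size, seed=None):
    """Randomly select subjects, then randomly split them into two groups."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    # Step 1: random selection from the whole population.
    selected = rng.sample(population, sample_size)
    # Step 2: random assignment. sample() already returns subjects in random
    # order; the shuffle is kept only to make the second step explicit.
    rng.shuffle(selected)
    half = sample_size // 2
    return selected[:half], selected[half:]


# Hypothetical roster of 200 students (identifiers invented for illustration).
students = [f"student_{i:03d}" for i in range(1, 201)]
treatment, control = select_and_assign(students, sample_size=40, seed=5)

print(len(treatment), "students in the work experience group")
print(len(control), "students in the control group")
```

Every student on the roster has the same chance of being selected, and every selected student has the same chance of landing in either group, so no characteristic of the student can influence which condition he or she receives.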

It is important that the two groups be treated in a similar manner to control for variables such as socialization, so we may allow our control group to participate in some activity such as a softball league while the other group is participating in the work experience program. At the end of the semester, the experiment would end and the next semester’s grades would be gathered and compared. If we found that the change in grades for the experimental group was significantly different from the change in the grades of our control group, we could reasonably argue that one semester of work experience, compared to one semester of non-work-related activity, results in a significant difference in grades.
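The chapter does not name a particular statistical test for deciding whether the two changes are "significantly different." As one possible illustration, the Python sketch below uses a permutation test, which we have chosen here only because it needs no outside libraries. The gain scores are invented, and the function name permutation_test is hypothetical.

```python
import random


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def permutation_test(gains_a, gains_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in mean gain.

    Returns the observed difference and an approximate p-value: the share of
    random relabelings whose difference is at least as large in magnitude.
    """
    rng = random.Random(seed)
    observed = average(gains_a) - average(gains_b)
    pooled = list(gains_a) + list(gains_b)
    n_a = len(gains_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels were arbitrary
        diff = average(pooled[:n_a]) - average(pooled[n_a:])
        # Small tolerance so that ties are not lost to rounding error.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical change in GPA for each student (invented for illustration).
work_gains = [0.3, 0.2, 0.4, 0.1, 0.3, 0.2]        # work experience group
softball_gains = [0.0, 0.1, -0.1, 0.1, 0.0, 0.1]   # softball league group

observed, p_value = permutation_test(work_gains, softball_gains)
print(f"Observed difference in mean gain: {observed:+.3f}")
print(f"Approximate p-value: {p_value:.4f}")
```

A small p-value means that a difference this large would rarely arise if the group labels made no difference. Because the groups were formed by random assignment, that result can reasonably be attributed to the treatment rather than to preexisting differences between the groups.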

Table 5.3: Diagrams of True Experimental Designs

Chapter Conclusion

The experiment, especially the true experimental design, is often the method of choice when attempting to determine a cause and effect relationship. Utilizing randomization and the pretesting and posttesting of both an experimental group and a control group allows us to control for more confounding variables than any other research method. These confounding variables, when not addressed, can often result in inaccurate findings.

Controlling for confounding variables is important in research and especially important in the experimental designs. This process helps us ensure that results are valid both internally and externally. The threats to internal validity, those that apply to the experimental situation itself, and to external validity, those relating to the generalizability of our results to the real world, are also issues of great concern to researchers. As the saying goes: garbage in, garbage out. If we start with a flawed design, we will end up with flawed results.

As the degree of control for each of the designs discussed increases, the difficulty in performing the research also increases. Feasibility is always an issue and even when the most stringent control is used, the mere fact that the subjects have agreed to participate in the experiment may have a negative effect on the study’s generalizability. Are volunteer subjects truly representative of the population at large? As you can see, there are varying degrees of experimental research, but there is no perfect experiment that controls for all possible variables and assures us of 100% generalizability.