Analysis of a Completely Randomized Design

Twelve samples of pancreatic tissue were tested for the effect of glucose on insulin release. The twelve samples were randomly divided into three groups of four. The three groups were subjected to low, medium, and high levels of glucose, designated 1, 2, and 3, respectively. A dot plot of the three groups is shown below.
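
The random division into groups can be sketched in a few lines of Python. This is only an illustration of a completely randomized assignment, not the procedure actually used in the study: the sample labels 1 to 12, the function name `assign_groups`, and the seed are all assumptions made for the example.

```python
import random


def assign_groups(sample_ids, n_groups, seed=None):
    """Randomly split sample_ids into n_groups groups of equal size."""
    ids = list(sample_ids)
    if len(ids) % n_groups != 0:
        raise ValueError("samples must divide evenly among the groups")
    rng = random.Random(seed)  # a private generator, so the global state is untouched
    rng.shuffle(ids)           # every ordering of the samples is equally likely
    size = len(ids) // n_groups
    # Group g (numbered from 1) receives the g-th block of the shuffled list.
    return {g + 1: sorted(ids[g * size:(g + 1) * size]) for g in range(n_groups)}


# Hypothetical labels 1..12; the seed is arbitrary and only makes the run repeatable.
groups = assign_groups(range(1, 13), n_groups=3, seed=2024)
for level, members in groups.items():
    print(f"glucose level {level}: samples {members}")
```

Because every sample is equally likely to land in any group, no systematic difference between the tissue samples can line up with the glucose level except by chance.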

By randomizing the assignment of the samples to the treatment groups, we guarantee that the independence assumption is met. The plot suggests that the other assumptions of ANOVA (normal distributions, equal variances) are plausible, although it is difficult to assess normality with only 4 data points per group. The one possible exception is an outlier in group 1 (low glucose) at about 3.5. Proceeding, the results of the Minitab one-way ANOVA are shown below. We asked for Tukey's multiple comparison procedure (which controls the experiment-wise, or family-wise, error rate) and for both a normal probability plot and a histogram of the residuals. The plots will be used to assess the assumption of normal distributions.

One-way Analysis of Variance

Analysis of Variance for insulin
Source     DF        SS        MS        F        P
glucose     2    10.297     5.148     9.31    0.006
Error       9     4.979     0.553
Total      11    15.276

Individual 95% CIs For Mean
Based on Pooled StDev
Level    N      Mean     StDev     --------+---------+---------+-----
1        4    2.2325    0.9514     (------*------)
2        4    3.4375    0.4605               (------*------)
3        4    4.5000    0.7366                       (------*------)
                                   --------+---------+---------+-----
Pooled StDev = 0.7438                     2.4       3.6       4.8
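
As a check on the arithmetic, the ANOVA table and the pooled standard deviation can be rebuilt from the group sizes, means, and standard deviations printed above. The sketch below uses only those printed summary values, since the raw data are not listed here. The closed-form P-value is an assumption that holds only because the numerator has 2 degrees of freedom; it is not a general F-distribution routine.

```python
import math

# Summary statistics copied from the Minitab output: level -> (n, mean, stdev).
summary = {1: (4, 2.2325, 0.9514), 2: (4, 3.4375, 0.4605), 3: (4, 4.5000, 0.7366)}

n_total = sum(n for n, _, _ in summary.values())
grand_mean = sum(n * m for n, m, _ in summary.values()) / n_total

# Between-group (glucose) and within-group (error) sums of squares.
ss_treat = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary.values())
ss_error = sum((n - 1) * s ** 2 for n, _, s in summary.values())

df_treat = len(summary) - 1        # 3 groups - 1 = 2
df_error = n_total - len(summary)  # 12 samples - 3 groups = 9

ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_stat = ms_treat / ms_error
pooled_sd = math.sqrt(ms_error)

# Upper-tail probability of F(2, df_error): P(F > f) = (1 + 2f/df_error)^(-df_error/2).
# This shortcut is valid only when the numerator degrees of freedom equal 2.
if df_treat != 2:
    raise ValueError("closed-form P-value requires 2 numerator degrees of freedom")
p_value = (1 + df_treat * f_stat / df_error) ** (-df_error / 2)

print(f"glucose  DF={df_treat}  SS={ss_treat:.3f}  MS={ms_treat:.3f}  F={f_stat:.2f}  P={p_value:.3f}")
print(f"Error    DF={df_error}  SS={ss_error:.3f}  MS={ms_error:.3f}")
print(f"Pooled StDev = {pooled_sd:.4f}")
```

The recomputed values agree with the Minitab table to the precision shown there, which confirms that the pooled standard deviation is simply the square root of the error mean square.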

Tukey's pairwise comparisons

Family error rate = 0.0500
Individual error rate = 0.0209

Critical value = 3.95

Intervals for (column level mean) - (row level mean)

              1          2

  2     -2.6740
         0.2640

  3     -3.7365    -2.5315
        -0.7985     0.4065
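
The Tukey intervals can be reproduced from numbers already printed in the output: the group means, the pooled standard deviation, and the critical value 3.95. The sketch below assumes the usual equal-sample-size form of the interval, (difference in means) ± q · s / √n, and takes q = 3.95 directly from Minitab rather than computing it from the studentized range distribution.

```python
import math
from itertools import combinations

# Values copied from the Minitab output above.
means = {1: 2.2325, 2: 3.4375, 3: 4.5000}
n_per_group = 4
pooled_sd = 0.7438
q_crit = 3.95  # studentized range critical value reported by Minitab

# With equal group sizes the half-width is the same for every pair:
# (q / sqrt(2)) * s * sqrt(2 / n) = q * s / sqrt(n).
half_width = q_crit * pooled_sd / math.sqrt(n_per_group)

intervals = {}
for i, j in combinations(sorted(means), 2):
    diff = means[i] - means[j]  # (column level mean) - (row level mean)
    intervals[(i, j)] = (diff - half_width, diff + half_width)

for (i, j), (lo, hi) in intervals.items():
    verdict = "excludes 0" if lo > 0 or hi < 0 else "contains 0"
    print(f"{i}-{j}: {lo:.4f} to {hi:.4f} ({verdict})")
```

Only the interval for groups 1 and 3 excludes 0, matching the table: the observed difference between those means exceeds the common half-width of about 1.47, while the differences for the other two pairs do not.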

Conclusions:

  1. The ANOVA resulted in a P-value of 0.006, which is significant at the usual 0.05 level of significance. We conclude that there is strong evidence that the mean insulin released by the tissue differs among the glucose levels.
  2. The individual 95% confidence intervals overlap for groups 1 and 2 and for groups 2 and 3. (Note: these confidence intervals are not adjusted for multiple comparisons. As a rough guide, when two of these intervals overlap substantially, a multiple comparisons method will usually not find a significant difference between those groups.) They suggest there may be a significant difference between groups 1 and 3 but not between 1 and 2, nor between 2 and 3.
  3. Tukey's simultaneous confidence intervals yield the following with 95% (simultaneous!) confidence, where 1-2 denotes the mean of group 1 minus the mean of group 2, and so on: -2.6740 < 1-2 < 0.2640, -3.7365 < 1-3 < -0.7985, and -2.5315 < 2-3 < 0.4065. Note that the only interval that does not contain 0 is the one for 1-3. Thus, there is strong evidence that the mean of group 3 is higher than the mean of group 1, but there is not strong evidence for any other differences. This leads to the seemingly contradictory conclusion that we do not have a significant difference between groups 1 and 2, nor between 2 and 3, but there is a significant difference between groups 1 and 3. The contradiction is only apparent: failing to find a significant difference does not show that two means are equal, only that the data cannot distinguish them.
  4. The normal probability plot of the residuals suggests that the small values of the residuals are not small enough: the plot bends down from the straight line on the left side. Note also that the axes are reversed from the usual Minitab normal probability plot. The histogram also shows a big lump on the left side. Together these suggest that the assumption of normal distributions may not be valid. As a general rule of thumb, ANOVA still works reasonably well even when the assumptions of equal variance and normal distribution are violated, in that the level of significance is still approximately accurate. The procedure is, however, not as powerful at detecting differences in the means as other procedures that rest on more accurate assumptions may be. Thus, we are reasonably sure that any differences we find are truly significant, but there may be differences that some other method could find. The most important assumption for ANOVA is that the samples are independent random samples. The use of a Completely Randomized Design guarantees this.
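
The individual 95% confidence intervals discussed in conclusion 2 can also be recomputed from the printed summary. This sketch assumes the standard form mean ± t · s / √n with the pooled standard deviation, and it assumes the tabled two-sided 95% critical value t = 2.262 for 9 degrees of freedom; that value comes from a t table, not from the Minitab output.

```python
import math
from itertools import combinations

# Values copied from the Minitab output.
means = {1: 2.2325, 2: 3.4375, 3: 4.5000}
n_per_group = 4
pooled_sd = 0.7438
t_crit = 2.262  # assumption: tabled t value, 9 error degrees of freedom, 95% two-sided

half_width = t_crit * pooled_sd / math.sqrt(n_per_group)
cis = {g: (m - half_width, m + half_width) for g, m in means.items()}


def overlaps(a, b):
    """Return True when the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


for g, (lo, hi) in cis.items():
    print(f"group {g}: {lo:.3f} to {hi:.3f}")

overlap = {(i, j): overlaps(cis[i], cis[j]) for i, j in combinations(sorted(cis), 2)}
for (i, j), flag in overlap.items():
    print(f"groups {i} and {j}: {'overlap' if flag else 'no overlap'}")
```

Under these assumptions the intervals for groups 1 and 2 overlap, as do those for groups 2 and 3, while the intervals for groups 1 and 3 are separated, which is the same pattern the Tukey comparisons report.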