Split Plot Analysis
Case Study
This case study on factors affecting the lifetimes of electronic components is used to introduce the basic ideas of split-plot analysis.
It starts with an inappropriate analysis that ignores the plot structure, notes why the structure makes that analysis inappropriate and proceeds towards an appropriate analysis.
The first step is to recognise the distinction between analysis of factors that vary at the whole plot level and analysis of factors that vary at split plot level, within the whole plots. The former may be implemented by a straightforward randomised blocks analysis. The latter is facilitated by designating the block effect to be a random effect, which produces separate levels of chance variation, one appropriate to the whole plot analysis and a second appropriate to the split plot analysis.
Understanding the distinctions between the different levels of chance variation, and their implications for interpreting the analysis of variance tables (specifically, the definition of the F-ratios calculated in the table), requires some knowledge of the expected values of the mean squares in the table. Minitab provides formulas for these expected mean squares which, when suitably interpreted, shed light on the matter. They also provide a basis for calculating the components of variance corresponding to the various sources of variation.
The basic interpretations of the analysis of variance table indicate the relevant summary graphs and tables that summarise the conclusions of the experiment in a format suitable for client reports.
Standard diagnostics are available for assessing the validity of the standard assumptions. In this case, all seems well.
However, there appears to be evidence of departure from assumptions usually associated with split plot designs. This evidence is examined and commented on.
The study
Electronic components are baked in an oven at a set temperature for a set time. Two factors thought to influence the lifetimes of the components were the oven temperature and the bake time. Trial settings for these factors were chosen as follows:
Oven Temperature (T), °F: 580, 600, 620, 640,
Baking Time (B), min: 5, 10, 15.
To save on costly runs, three components were baked together at each temperature, with one withdrawn at each of the set times. This plan was replicated 3 times. The results are shown in Table 1.
Sources of variation
The response variable,
Lifetime,
is affected by variation in 2 treatment factors,
Temperature of Oven (T), with 4 levels, 580, 600, 620, 640, and
Baking Time (B), with 3 levels, 5, 10, 15,
as well as variation between
Replicates (R),
and
chance variation.
Table 1 Results of accelerated lifetime tests for electronic components
Replicate / Temperature of Oven (°F) / Baking Time 5 min / Baking Time 10 min / Baking Time 15 min
1 / 580 / 217 / 233 / 175
1 / 600 / 158 / 138 / 152
1 / 620 / 229 / 186 / 155
1 / 640 / 223 / 227 / 156
2 / 580 / 188 / 201 / 195
2 / 600 / 126 / 130 / 147
2 / 620 / 160 / 170 / 161
2 / 640 / 201 / 181 / 172
3 / 580 / 162 / 170 / 213
3 / 600 / 122 / 185 / 180
3 / 620 / 167 / 181 / 182
3 / 640 / 182 / 201 / 199
A possible analysis
If this were a standard fully randomised experiment, with each of the 4×3 = 12 treatment combinations being run in random order, independently within each replication, then it would constitute a straightforward randomised blocks experiment, with replicates, presumably run at different times, acting as blocks.
A suitable model for this analysis would include all main effects and 2-factor interactions, including replicates as blocks, with the three way interaction of Temperature, Baking Time and Replication used to estimate random error, that is,
T + B + R + T*B + T*R + B*R + e.
Using Minitab to fit this model (entered in the Model window as above, without the e, which Minitab includes automatically) leads to
Analysis of Variance for Lifetime, using Adjusted SS for Tests
Source  DF   Seq SS   Adj SS  Adj MS      F      P
T        3  12494.3  12494.3  4164.8  17.16  0.000
B        2    566.2    566.2   283.1   1.17  0.344
R        2   1962.7   1962.7   981.4   4.04  0.045
T*B      6   2600.4   2600.4   433.4   1.79  0.185
T*R      6   1773.9   1773.9   295.7   1.22  0.362
B*R      4   7021.3   7021.3  1755.3   7.23  0.003
Error   12   2912.1   2912.1   242.7
Total   35  29331.0
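The sums of squares and F-ratios in this table can be checked by direct calculation from Table 1. The following is a minimal sketch in standard-library Python, not a description of how Minitab computes them; the names `raw`, `ss_cells`, `ss` and `f_ratio` are chosen here for illustration. It relies on the layout being balanced, so that each two-factor interaction sum of squares equals the two-way cell sum of squares minus the two main-effect sums of squares. P-values are not computed.

```python
# Lifetimes from Table 1: raw[replicate][temperature] lists the values
# for baking times 5, 10 and 15 minutes, in that order.
raw = {
    1: {580: [217, 233, 175], 600: [158, 138, 152],
        620: [229, 186, 155], 640: [223, 227, 156]},
    2: {580: [188, 201, 195], 600: [126, 130, 147],
        620: [160, 170, 161], 640: [201, 181, 172]},
    3: {580: [162, 170, 213], 600: [122, 185, 180],
        620: [167, 181, 182], 640: [182, 201, 199]},
}
times = [5, 10, 15]

# Flatten to {(R, T, B): lifetime}.
y = {(r, t, b): life
     for r, by_temp in raw.items()
     for t, lives in by_temp.items()
     for b, life in zip(times, lives)}

grand = sum(y.values()) / len(y)


def ss_cells(keep):
    """Between-cell sum of squares about the grand mean, for the cells
    defined by the key positions in `keep` (0 = R, 1 = T, 2 = B)."""
    groups = {}
    for key, life in y.items():
        groups.setdefault(tuple(key[i] for i in keep), []).append(life)
    return sum(len(g) * (sum(g) / len(g) - grand) ** 2
               for g in groups.values())


# Main effects.
ss = {"T": ss_cells((1,)), "B": ss_cells((2,)), "R": ss_cells((0,))}
# Two-factor interactions (valid because the layout is balanced).
ss["T*B"] = ss_cells((1, 2)) - ss["T"] - ss["B"]
ss["T*R"] = ss_cells((0, 1)) - ss["T"] - ss["R"]
ss["B*R"] = ss_cells((0, 2)) - ss["B"] - ss["R"]
# What remains of the total is the T*B*R interaction, used as Error.
ss_total = sum((life - grand) ** 2 for life in y.values())
ss["Error"] = ss_total - sum(ss.values())

df = {"T": 3, "B": 2, "R": 2, "T*B": 6, "T*R": 6, "B*R": 4, "Error": 12}
ms = {term: ss[term] / df[term] for term in ss}
f_ratio = {term: ms[term] / ms["Error"] for term in ss if term != "Error"}

for term in ss:
    f_text = f"{f_ratio[term]:7.2f}" if term in f_ratio else ""
    print(f"{term:6s}{df[term]:4d}{ss[term]:10.1f}{ms[term]:9.1f}{f_text}")
```

Every F-ratio here uses the same Error mean square as denominator, which is exactly the feature of this analysis that the following sections call into question.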
The principal conclusions from these (partial) results are that
the main effects of changing Temperature are highly statistically significant,
the main effects of changing Baking Time are not statistically significant,
there is no statistically significant interaction between the two factors.
Supplementary conclusions are that
blocking appears to have been effective, although not necessary to arrive at the principal conclusions,
there is a highly statistically significant interaction between Baking Time and Blocks (Replications).
The latter conclusion, if confirmed, is unexpected and needs explanation. (This issue will be returned to below).
An appropriate analysis for Temperature effects
The analysis shown above is inappropriate because, at each level of Temperature, the same oven set-up was used for the three Baking time levels. Thus,
in the case where the 12 treatment combinations are run in random order, as per the above analysis, the oven temperature must be re-set between each run (unless, by chance, 2 successive runs are at the same Temperature),
while
in the case of the actual experiment described at the outset, the oven temperature is set up once at each Temperature level, and 3 runs at different Baking times are completed before the oven is re-set.
This restricted set-up of the actual experiment means that the chance variation between the three runs within each Temperature set-up is likely to be less than the chance variation between runs at different Temperatures. In particular, for comparing variation between Temperature settings, variation within Temperature settings does not provide an appropriate measure of chance variation.
For the purpose of finding an appropriate measure of chance variation to use as a basis for assessing Temperature effect (a denominator for F), one solution is, first, to average the Lifetimes within each Temperature level and, second, to analyse the summarised data appropriately. Averaging the Lifetimes within each Temperature level ignores both the Baking Time effects and the chance variation around each within-Temperature mean, that is, the within-Temperature chance variation. Variation between runs at different temperatures, replicated, remains and is available as a basis for calculating an estimate of variance appropriate to serve as a reference for a measure of variation between Temperature levels, that is, a denominator for the corresponding F-ratio.
The original data table with the within-Temperature averages (means) added follows.
Replicate / Temperature of Oven (°F) / Baking Time 5 min / Baking Time 10 min / Baking Time 15 min / Mean
1 / 580 / 217 / 233 / 175 / 208.3
1 / 600 / 158 / 138 / 152 / 149.3
1 / 620 / 229 / 186 / 155 / 190.0
1 / 640 / 223 / 227 / 156 / 202.0
2 / 580 / 188 / 201 / 195 / 194.7
2 / 600 / 126 / 130 / 147 / 134.3
2 / 620 / 160 / 170 / 161 / 163.7
2 / 640 / 201 / 181 / 172 / 184.7
3 / 580 / 162 / 170 / 213 / 181.7
3 / 600 / 122 / 185 / 180 / 162.3
3 / 620 / 167 / 181 / 182 / 176.7
3 / 640 / 182 / 201 / 199 / 194.0
As the individual measurements are irrelevant to the analysis of Temperature effects, this may be reduced to
Replicate / Temperature of Oven (°F) / Mean
1 / 580 / 208.3
1 / 600 / 149.3
1 / 620 / 190.0
1 / 640 / 202.0
2 / 580 / 194.7
2 / 600 / 134.3
2 / 620 / 163.7
2 / 640 / 184.7
3 / 580 / 181.7
3 / 600 / 162.3
3 / 620 / 176.7
3 / 640 / 194.0
A more conventional display is
Temperature of Oven (°F) / Replicate 1 / Replicate 2 / Replicate 3
580 / 208.3 / 194.7 / 181.7
600 / 149.3 / 134.3 / 162.3
620 / 190.0 / 163.7 / 176.7
640 / 202.0 / 184.7 / 194.0
If we regard Replicates as Blocks, this may be regarded as a randomised blocks layout. The formal analysis of variance for randomised blocks was introduced in Lecture 1.2. For these data, the results from Minitab are as follows.
General Linear Model: Life versus R, T
Factor Type Levels Values
R fixed 3 1, 2, 3
T fixed 4 580, 600, 620, 640
Analysis of Variance for Life, using Adjusted SS for Tests
Source  DF   Seq SS   Adj SS   Adj MS      F      P
R        2   654.24   654.24   327.12   3.32  0.107
T        3  4164.77  4164.77  1388.26  14.09  0.004
Error    6   591.31   591.31    98.55
Total   11  5410.32
S = 9.92736
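The randomised blocks analysis of the within-Temperature means can be reproduced by hand in the same way. The sketch below, in standard-library Python, computes the whole plot means at full precision and then the Replicate, Temperature and Error sums of squares, the F-ratios and S. The variable names are illustrative only, and P-values are not computed.

```python
from math import sqrt

# Lifetimes from Table 1: raw[replicate][temperature] lists the values
# for baking times 5, 10 and 15 minutes, in that order.
raw = {
    1: {580: [217, 233, 175], 600: [158, 138, 152],
        620: [229, 186, 155], 640: [223, 227, 156]},
    2: {580: [188, 201, 195], 600: [126, 130, 147],
        620: [160, 170, 161], 640: [201, 181, 172]},
    3: {580: [162, 170, 213], 600: [122, 185, 180],
        620: [167, 181, 182], 640: [182, 201, 199]},
}

# One mean lifetime per (replicate, temperature) oven set-up, unrounded.
means = {(r, t): sum(lives) / len(lives)
         for r, by_temp in raw.items()
         for t, lives in by_temp.items()}
reps = sorted({r for r, _ in means})
temps = sorted({t for _, t in means})

grand = sum(means.values()) / len(means)
rep_mean = {r: sum(means[r, t] for t in temps) / len(temps) for r in reps}
temp_mean = {t: sum(means[r, t] for r in reps) / len(reps) for t in temps}

# Randomised blocks decomposition: Total = R + T + Error.
ss_r = len(temps) * sum((m - grand) ** 2 for m in rep_mean.values())
ss_t = len(reps) * sum((m - grand) ** 2 for m in temp_mean.values())
ss_total = sum((m - grand) ** 2 for m in means.values())
ss_error = ss_total - ss_r - ss_t

df_r, df_t = len(reps) - 1, len(temps) - 1
df_error = df_r * df_t
ms_error = ss_error / df_error
f_r = (ss_r / df_r) / ms_error
f_t = (ss_t / df_t) / ms_error
s = sqrt(ms_error)  # residual standard deviation, reported by Minitab as S

print(f"R      {df_r:2d} {ss_r:9.2f} {ss_r / df_r:9.2f} {f_r:6.2f}")
print(f"T      {df_t:2d} {ss_t:9.2f} {ss_t / df_t:9.2f} {f_t:6.2f}")
print(f"Error  {df_error:2d} {ss_error:9.2f} {ms_error:9.2f}")
print(f"S = {s:.5f}")
```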
The conclusions to be drawn from this analysis are broadly the same as the corresponding conclusions from the earlier, inappropriate, analysis, i.e.,
the effect of changing Temperature is highly statistically significant,
blocking appears to have been effective.
The corresponding F-ratios are somewhat smaller, and that corresponding to Replication (blocking) is not statistically significant at the conventional 5% level. The explanation for this will emerge shortly.
There is a much closer numerical correspondence between corresponding sums of squares and mean squares in the two analysis of variance tables; those in the first table are 3 times those in the second. For this purpose, note that the Error sum of squares in the second table corresponds to the R*T interaction in the first.
The factor of 3 arises because the within-Temperature means are means of 3 individual measurements. Recall that the variance of a mean of n measurements (the square of the standard error) is 1/n times the variance of an individual measurement; here n = 3. The mean squares in the second analysis of variance table are all "sample" variances based on the means.
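Written out, and treating the n measurements in a mean as independent with common variance, the relation is as follows. The symbols \sigma^2 (variance of an individual measurement) and \bar{y} (mean of n measurements) are introduced here for illustration; the numerical check uses the Temperature sums of squares printed in the two tables.

```latex
\operatorname{Var}(\bar{y}) = \frac{\sigma^2}{n},
\qquad
n = 3 \;\Rightarrow\; \operatorname{Var}(\bar{y}) = \frac{\sigma^2}{3},
\qquad
\frac{SS_T(\text{individual})}{SS_T(\text{means})}
  = \frac{12494.3}{4164.77} \approx 3 .
```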
Exercise: Confirm this numerical correspondence.
Note that, to ensure exact correspondence, the averages used in the second Minitab analysis must not be rounded as they are in the summary data tables above. To produce the second analysis of variance above, the means were calculated in Excel and copied with full accuracy into Minitab; using the rounded means from the data tables will not give an accurate correspondence.
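The effect of rounding can be seen with a short calculation. The following sketch, in standard-library Python with illustrative names, computes the Temperature sum of squares twice, once from full-precision means and once from means rounded to one decimal place as in the summary tables, and multiplies each by 3 for comparison with the value 12494.3 printed in the first analysis of variance table.

```python
# Lifetimes from Table 1: raw[replicate][temperature] lists the values
# for baking times 5, 10 and 15 minutes, in that order.
raw = {
    1: {580: [217, 233, 175], 600: [158, 138, 152],
        620: [229, 186, 155], 640: [223, 227, 156]},
    2: {580: [188, 201, 195], 600: [126, 130, 147],
        620: [160, 170, 161], 640: [201, 181, 172]},
    3: {580: [162, 170, 213], 600: [122, 185, 180],
        620: [167, 181, 182], 640: [182, 201, 199]},
}


def ss_temperature(cell_means):
    """Temperature sum of squares from {(replicate, temperature): mean}."""
    reps = sorted({r for r, _ in cell_means})
    temps = sorted({t for _, t in cell_means})
    grand = sum(cell_means.values()) / len(cell_means)
    temp_mean = {t: sum(cell_means[r, t] for r in reps) / len(reps)
                 for t in temps}
    return len(reps) * sum((m - grand) ** 2 for m in temp_mean.values())


exact = {(r, t): sum(lives) / len(lives)
         for r, by_temp in raw.items()
         for t, lives in by_temp.items()}
# Means as displayed in the summary tables (one decimal place).
rounded = {key: round(m, 1) for key, m in exact.items()}

ss_t_first_table = 12494.3  # Temperature SS printed in the first table
scaled_exact = 3 * ss_temperature(exact)
scaled_rounded = 3 * ss_temperature(rounded)

print(f"first table:           {ss_t_first_table:10.1f}")
print(f"3 x SS, exact means:   {scaled_exact:10.1f}")
print(f"3 x SS, rounded means: {scaled_rounded:10.1f}")
```

The same comparison can be made for the Replicate and Error lines by replacing `ss_temperature` with the corresponding sums of squares.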
Interpreting R*T interaction as Error
The correspondence between the R*T interaction sum of squares in the first analysis of variance table with the error sum of squares in the second recalls the assumption that the Block by Treatment interaction amounted to chance variation when analysing randomised blocks in Lecture 1.2 and Laboratory 1. Unfortunately, in both of those examples, there was graphical evidence to the contrary, a fact that was not evident to the original analysts.
In this case, there is some slight suggestion of Block*Treatment interaction, as shown in the following Interaction Plot.
The pattern of parallel replicate profiles evident at the higher temperatures for all three replicates is broken by Replicate 3 at the lower temperatures, with mean lifetime for Replicate 3 low at temperature 580 and high at 600.
These exceptions correspond to the deleted residuals at ±2.8, approximately, in the Diagnostic Plot below.
These residual values are probably not sufficiently exceptional to warrant exclusion. While including them may give a somewhat conservative estimate of standard deviation, excluding them would probably result in too small a standard deviation. Thus, it seems sensible to regard the variation in the diagnostic plot as due to chance. Equivalently, we accept the assumption of no Block*Treatment interaction.
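The two exceptional values can be checked numerically. The sketch below assumes that the deleted residuals in the diagnostic plot are studentised deleted residuals from the randomised blocks fit to the within-Temperature means, computed with the usual formula t = e × sqrt((n − p − 1) / (SSE(1 − h) − e²)). It also uses the fact that, in a balanced additive two-way layout, every cell has the same leverage h = p/n. The variable names are illustrative.

```python
from math import sqrt

# Lifetimes from Table 1: raw[replicate][temperature] lists the values
# for baking times 5, 10 and 15 minutes, in that order.
raw = {
    1: {580: [217, 233, 175], 600: [158, 138, 152],
        620: [229, 186, 155], 640: [223, 227, 156]},
    2: {580: [188, 201, 195], 600: [126, 130, 147],
        620: [160, 170, 161], 640: [201, 181, 172]},
    3: {580: [162, 170, 213], 600: [122, 185, 180],
        620: [167, 181, 182], 640: [182, 201, 199]},
}

means = {(r, t): sum(lives) / len(lives)
         for r, by_temp in raw.items()
         for t, lives in by_temp.items()}
reps = sorted({r for r, _ in means})
temps = sorted({t for _, t in means})
grand = sum(means.values()) / len(means)
rep_mean = {r: sum(means[r, t] for t in temps) / len(temps) for r in reps}
temp_mean = {t: sum(means[r, t] for r in reps) / len(reps) for t in temps}

# Ordinary residuals from the additive (no interaction) model.
resid = {(r, t): means[r, t] - rep_mean[r] - temp_mean[t] + grand
         for (r, t) in means}
sse = sum(e ** 2 for e in resid.values())

n = len(means)                  # 12 whole plot means
p = len(reps) + len(temps) - 1  # 6 fitted parameters
h = p / n                       # common leverage in a balanced layout

# Studentised deleted residuals (assumed to be what the plot shows).
deleted = {key: e * sqrt((n - p - 1) / (sse * (1 - h) - e ** 2))
           for key, e in resid.items()}

for key, t_value in sorted(deleted.items(), key=lambda kv: abs(kv[1]),
                           reverse=True)[:2]:
    print(f"Replicate {key[0]}, Temperature {key[1]}: {t_value:.2f}")
```

Under these assumptions, the two largest values in magnitude belong to Replicate 3 at 580 and at 600, in line with the pattern described for the interaction plot.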
Split Plot Analysis
The randomised block analysis provided us with information on the main effect of one treatment factor, Temperature. For this purpose, we focussed on experimental units which were combinations of smaller units. Within each of the combined units, Temperature did not vary between these smaller units, so that variation at that level was irrelevant for comparing Temperature levels and, therefore, for assessing Temperature effects.
However, the second treatment factor, Baking time, did vary between these smaller units. To get information on the second treatment factor, we need to make comparisons between such units. To do this, we need to make use of the split unit structure in the experiment.
The combined units are referred to as Whole units (or Whole plots in an agricultural setting). The smaller units are referred to as Split units (or split plots). A split plot design is essentially hierarchical, with part of the design, here the randomised blocks design for assessing Temperature, implemented at the whole plot level and the rest implemented at the split plot level.
Correspondingly, there are two basic Components of Variance, one at the Whole plot level and a second, typically smaller, at the split plot level. When interactions come into play, there will be other corresponding components of variance.
Replication (Blocking) as a Random Effects Factor
Minitab facilitates a split plot analysis by designating the blocking factor at the higher level to be a random effects factor.
In agricultural field trials, this is easily envisaged: the variation in fertility from one physical block of experimental plots to another may be substantial but unpredictable, and so may be regarded as the outcome of a selection from a Normal "population" of fertility levels, with an appropriate standard deviation, σ_B, where B stands for Blocking.