Answers for Unit 11

11.1.1. "One row at a time, cut the data from the table above and paste it into a Minitab worksheet: you will need columns labeled Count, Bottle, Tube, and Sample."

Here we show an alternative procedure, based on Minitab's set command, to get the data into the worksheet (indicated in color). The cut-and-paste method works fine, but there is no way to demonstrate it in these answers.

The rest of the commands, to make subscripts for the main effects, can be used whether the data are put into the worksheet by cut-and-paste or using the set command.

MTB > name c1 'Count'

MTB > set c1

DATA> 1 3 3 2 2 1 5 1 0 3 0 0

DATA> 1 4 2 4 1 1 5 1 1 4 0 1

DATA> 1 2 4 1 3 2 5 1 2 5 4 2

DATA> 1 2 3 1 2 0 3 0 2 1 0 0

DATA> 3 1 3 0 4 2 5 2 2 1 2 3

DATA> 2 3 6 0 6 1 5 0 2 3 1 1

DATA> end

MTB > name c2 'Bottle'

MTB > set c2

DATA> (1:2)36

DATA> end

MTB > name c3 'Tube'

MTB > set c3

DATA> 2(1:3)12

DATA> end

MTB > name c4 'Sample'

MTB > set c4

DATA> 6(1:12)

DATA> end
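The patterned-data shorthand used in the set commands above can be hard to read at first. The following Python sketch (not part of Minitab; the helper name minitab_pattern is made up for illustration) expands the three patterns under the assumption that a number before the parentheses repeats the whole sequence and a number after them repeats each value:

```python
def minitab_pattern(first, last, each=1, times=1):
    """Expand a Minitab-style pattern  times(first:last)each  into a list.

    Assumed reading: 'each' repeats every value in place, then 'times'
    repeats the whole resulting sequence.
    """
    one_pass = [v for v in range(first, last + 1) for _ in range(each)]
    return one_pass * times


bottle = minitab_pattern(1, 2, each=36)          # (1:2)36
tube = minitab_pattern(1, 3, each=12, times=2)   # 2(1:3)12
sample = minitab_pattern(1, 12, times=6)         # 6(1:12)

# Each of the 72 worksheet rows gets one (Bottle, Tube, Sample) label.
labels = list(zip(bottle, tube, sample))
print(len(labels), labels[0], labels[-1])
```

Under this reading, the first 12 counts entered in c1 belong to Bottle 1, Tube 1, the next 12 to Bottle 1, Tube 2, and so on, which agrees with the cell means reported in Problem 11.2.4.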

One way to make a similar table to the one in the unit is as follows:

MTB > table c3 c4 c2;

SUBC> data c1.

Tabulated statistics: Tube, Sample, Bottle

Results for Bottle = 1

Rows: Tube   Columns: Sample

     1  2  3  4  5  6  7  8  9 10 11 12

1    1  3  3  2  2  1  5  1  0  3  0  0
     1  2  3  1  2  0  3  0  2  1  0  0
2    1  4  2  4  1  1  5  1  1  4  0  1
     3  1  3  0  4  2  5  2  2  1  2  3
3    1  2  4  1  3  2  5  1  2  5  4  2
     2  3  6  0  6  1  5  0  2  3  1  1

Cell Contents:  Count  :  DATA

Results for Bottle = 2

Rows: Tube   Columns: Sample

     1  2  3  4  5  6  7  8  9 10 11 12

1    1  3  3  2  2  1  5  1  0  3  0  0
     1  2  3  1  2  0  3  0  2  1  0  0
2    1  4  2  4  1  1  5  1  1  4  0  1
     3  1  3  0  4  2  5  2  2  1  2  3
3    1  2  4  1  3  2  5  1  2  5  4  2
     2  3  6  0  6  1  5  0  2  3  1  1

Cell Contents:  Count  :  DATA

11.2.1. "Write the model for this experiment, including all restrictions and distributional assumptions. For uniformity, use b and i for bottles, t and j for tubes, and S and k for samples. If you were to try including the three-way interaction in the model, what would its subscripts have to be? What other term in the model has these same subscripts?"

The model for this three-way mixed ANOVA is as follows:

$Y_{ijk} = \mu + b_i + t_j + S_k + (bt)_{ij} + (bS)_{ik} + (tS)_{jk} + \varepsilon_{ijk}$

with $i = 1, 2$; $j = 1, 2, 3$; $k = 1, 2, \ldots, 12$;

and the distributional assumptions:

$\varepsilon_{ijk}$ iid $N(0, \sigma^2)$; $S_k$ iid $N(0, \sigma^2_S)$;

and $(bS)_{ik} \sim N(0, \sigma^2_{bS})$, $(tS)_{jk} \sim N(0, \sigma^2_{tS})$, all independently.

If a three-way interaction term were included in the model, its subscripts would be ijk, which are identical to those of the residual term.

11.2.2. "Perform the Minitab procedure that produces output identical to that shown in this section. Declare Sample as random, choose the restricted model, make the EMS table, and store the residuals. What happens in Minitab if you try to include the disallowed three-way interaction term in the Minitab model?"

The three-way ANOVA without the three-way interaction term is shown below. (We do not show storing of residuals, which might be used to make a normal probability plot later. A residual plot can also be selected via the menu path STAT > ANOVA > Balanced ANOVA.)

MTB > anova c1 = c2 | c3 | c4 - c2*c3*c4;

SUBC> random c4;

SUBC> ems.

ANOVA: Count versus Bottle, Tube, Sample

Factor Type Levels Values

Bottle fixed 2 1, 2

Tube fixed 3 1, 2, 3

Sample random 12 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

Analysis of Variance for Count

Source          DF       SS     MS     F      P
Bottle           1    0.347  0.347  0.14  0.715
Tube             2   14.528  7.264  7.01  0.004
Sample          11   93.486  8.499  3.54  0.036 x
Bottle*Tube      2    1.694  0.847  0.77  0.476
Bottle*Sample   11   27.153  2.468  2.23  0.052
Tube*Sample     22   22.806  1.037  0.94  0.559
Error           22   24.306  1.105
Total           71  184.319

x Not an exact F-test.

S = 1.05109 R-Sq = 86.81% R-Sq(adj) = 57.44%

                    Variance  Error  Expected Mean Square for Each
   Source          component   term  Term (using unrestricted model)
1  Bottle                         5  (7) + 3 (5) + Q[1,4]
2  Tube                           6  (7) + 2 (6) + Q[2,4]
3  Sample            1.01641      *  (7) + 2 (6) + 3 (5) + 6 (3)
4  Bottle*Tube                    7  (7) + Q[4]
5  Bottle*Sample     0.45455      7  (7) + 3 (5)
6  Tube*Sample      -0.03409      7  (7) + 2 (6)
7  Error             1.10480         (7)

* Synthesized Test.

Error Terms for Synthesized Tests

                                Synthesis of
   Source   Error DF  Error MS  Error MS
3  Sample       8.75     2.400  (5) + (6) - (7)

In the first line above, the model could have been written:

MTB > anova c1 = c2 c3 c4 c2*c3 c2*c4 c3*c4;

However, the following model specification incorrectly includes the three-way interaction:

MTB > anova c1 = c2 | c3 | c4;

If the three-way term is included, Minitab gives the following error message:

* ERROR * Zero or negative degrees of freedom.

Recall that the three-way interaction term is indistinguishable from the error term because there is only one replication per cell.
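The degrees-of-freedom bookkeeping behind this error message can be checked by hand. The short Python sketch below (an illustration only; variable names are our own) counts the DF used by the main effects and two-way interactions and shows that the three-way interaction would use up all of the DF that remain for error:

```python
from itertools import combinations
from math import prod

levels = {"Bottle": 2, "Tube": 3, "Sample": 12}
n_obs = prod(levels.values())      # 72 cells, one observation per cell
total_df = n_obs - 1               # 71


def term_df(factors):
    """DF of a main effect or interaction: product of (levels - 1)."""
    return prod(levels[f] - 1 for f in factors)


names = list(levels)
# All main effects and all two-way interactions.
terms = [c for r in (1, 2) for c in combinations(names, r)]
model_df = sum(term_df(c) for c in terms)

error_df = total_df - model_df                     # DF left for Error
three_way_df = term_df(names)                      # DF the three-way term needs
error_df_with_three_way = error_df - three_way_df  # DF left if it is included

print(model_df, error_df, three_way_df, error_df_with_three_way)
```

With one observation per cell, nothing is left over for error once the three-way term is in the model, which is what "Zero or negative degrees of freedom" reports.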

11.2.3. "Translate Minitab's notation in the EMS table into the symbols you used in your model in Problem 11.2.1.
Verify that each F-ratio in the ANOVA table uses the denominator suggested by the EMS table."

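   Source          Error Term  Expected Mean Square
1  Bottle          5           (7) + 3 (5) + Q[1,4] = $\sigma^2 + 3\sigma^2_{bS} + q_{b,bt}$
2  Tube            6           (7) + 2 (6) + Q[2,4] = $\sigma^2 + 2\sigma^2_{tS} + q_{t,bt}$
3  Sample          *           (7) + 2 (6) + 3 (5) + 6 (3) = $\sigma^2 + 2\sigma^2_{tS} + 3\sigma^2_{bS} + 6\sigma^2_S$
4  Bottle*Tube     7           (7) + Q[4] = $\sigma^2 + q_{bt}$
5  Bottle*Sample   7           (7) + 3 (5) = $\sigma^2 + 3\sigma^2_{bS}$
6  Tube*Sample     7           (7) + 2 (6) = $\sigma^2 + 2\sigma^2_{tS}$
7  Error                       (7) = $\sigma^2$

where each q stands for the corresponding Q[...] entry in Minitab's EMS table (a function of the fixed effects in the rows named by its subscripts); the q's are not defined in this course. The error term for Sample is marked * because Minitab tests it with a synthesized denominator rather than a single row of the table.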

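$F_b = MS_b / MS_{bS} = 0.347/2.468 = 0.14$

$F_t = MS_t / MS_{tS} = 7.264/1.037 = 7.00$

$F_S = MS_S / (MS_{bS} + MS_{tS} - MS_{error}) = 8.499/2.400 = 3.54$ (synthesized test statistic; see below)

$F_{bt} = MS_{bt} / MS_{error} = 0.847/1.105 = 0.77$

$F_{bS} = MS_{bS} / MS_{error} = 2.468/1.105 = 2.23$

$F_{tS} = MS_{tS} / MS_{error} = 1.037/1.105 = 0.94$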

The respective results from the Minitab output differ only by rounding error (for Tube, 7.00 here versus 7.01 in the output), because Minitab retains more decimal places in memory than are displayed in printouts.
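As a cross-check on the hand calculations, the F-ratios can be recomputed from the SS and DF columns of the ANOVA table instead of from the rounded MS values. This is only a sketch in Python using the printed (three-decimal) sums of squares, so small differences from Minitab's internal values remain possible:

```python
# SS and DF copied from the ANOVA table for the model with all two-way interactions.
ss = {"Bottle": 0.347, "Tube": 14.528, "Sample": 93.486,
      "Bottle*Tube": 1.694, "Bottle*Sample": 27.153,
      "Tube*Sample": 22.806, "Error": 24.306}
df = {"Bottle": 1, "Tube": 2, "Sample": 11,
      "Bottle*Tube": 2, "Bottle*Sample": 11,
      "Tube*Sample": 22, "Error": 22}

ms = {term: ss[term] / df[term] for term in ss}

# Denominators suggested by the EMS table; Sample uses (5) + (6) - (7).
denominator = {
    "Bottle": ms["Bottle*Sample"],
    "Tube": ms["Tube*Sample"],
    "Sample": ms["Bottle*Sample"] + ms["Tube*Sample"] - ms["Error"],
    "Bottle*Tube": ms["Error"],
    "Bottle*Sample": ms["Error"],
    "Tube*Sample": ms["Error"],
}

f_ratio = {term: ms[term] / denominator[term] for term in denominator}

for term, f in f_ratio.items():
    print(f"{term:14s} F = {f:.2f}")
```

Carrying the extra decimal places in the mean squares is enough to turn the hand-computed 7.00 for Tube into the 7.01 that Minitab prints.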

The synthesized error term for testing the Sample main effect is denoted as (5) + (6) - (7).

That is, $MS_{bS} + MS_{tS} - MS_{error} = 2.468 + 1.037 - 1.105 = 2.400$.

Notice that $E[MS_{bS} + MS_{tS} - MS_{error}] = (\sigma^2 + 3\sigma^2_{bS}) + (\sigma^2 + 2\sigma^2_{tS}) - \sigma^2 = \sigma^2 + 3\sigma^2_{bS} + 2\sigma^2_{tS}$, which agrees with $E(MS_S)$ except for its last term, $6\sigma^2_S$, making this synthesized MS (2.400) an "appropriate" denominator of the F-statistic for testing the Sample main effect.

It is appropriate in the sense that it has the correct mean when $\sigma^2_S = 0$, but it is not exactly distributed as a multiple of a chi-squared random variable.

Minitab gives an approximate DF = 8.75, meaning that the true distribution of the synthesized MS is approximately that of a (scaled) chi-squared random variable with that DF. The formula for getting such an approximate DF is beyond the scope of this course.
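For readers who are curious anyway, a value very close to 8.75 can be obtained from Satterthwaite's approximation for the DF of a linear combination of mean squares. The Python sketch below assumes that this is the approximation Minitab applies here (the output itself does not say so) and uses the SS and DF printed in the ANOVA table:

```python
# (coefficient, mean square, DF) for each row in the synthesis (5) + (6) - (7).
components = [
    (+1, 27.153 / 11, 11),   # (5) Bottle*Sample
    (+1, 22.806 / 22, 22),   # (6) Tube*Sample
    (-1, 24.306 / 22, 22),   # (7) Error
]

# Synthesized mean square: the signed sum of the component mean squares.
synth_ms = sum(c * ms for c, ms, _ in components)

# Satterthwaite: (sum c_i MS_i)^2 / sum((c_i MS_i)^2 / df_i)
approx_df = synth_ms ** 2 / sum((c * ms) ** 2 / d for c, ms, d in components)

print(f"synthesized MS = {synth_ms:.3f}, approximate DF = {approx_df:.2f}")
```

Note that the subtracted Error term still adds to the denominator of the DF formula, because its contribution is squared.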

11.2.4. "Make a normal probability plot of the residuals and give the P-value of the Anderson-Darling test. Also check for equality of variances among levels of bottle and tube."

The P-value of the Anderson-Darling test is 0.481. Since this value is greater than 0.05, the null hypothesis of normality is not rejected at the 5% significance level. The residuals are consistent with a normal distribution.

Because binomial data may yield unequal variances (especially from tube to tube, owing to the significance of the Tube effect) we explicitly test for differences in variances. None are found. This may be because the power of the tests involved is very poor.

However, the differences in average counts are not large:

MTB > Table c2 c3;

SUBC> means c1.

Tabulated statistics: Bottle, Tube

Rows: Bottle   Columns: Tube

          1      2      3    All

1     1.750  2.083  2.667  2.167
2     1.250  2.333  2.500  2.028
All   1.500  2.208  2.583  2.097

Cell Contents:  Count  :  Mean
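The table of means can be reproduced outside Minitab directly from the counts entered in Problem 11.1.1. The Python sketch below assumes the row order implied by the set commands there (the first three data rows are Tubes 1 to 3 of Bottle 1, the last three are Tubes 1 to 3 of Bottle 2):

```python
# Counts keyed by (bottle, tube); each list holds the 12 samples.
counts = {
    (1, 1): [1, 3, 3, 2, 2, 1, 5, 1, 0, 3, 0, 0],
    (1, 2): [1, 4, 2, 4, 1, 1, 5, 1, 1, 4, 0, 1],
    (1, 3): [1, 2, 4, 1, 3, 2, 5, 1, 2, 5, 4, 2],
    (2, 1): [1, 2, 3, 1, 2, 0, 3, 0, 2, 1, 0, 0],
    (2, 2): [3, 1, 3, 0, 4, 2, 5, 2, 2, 1, 2, 3],
    (2, 3): [2, 3, 6, 0, 6, 1, 5, 0, 2, 3, 1, 1],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


cell_mean = {key: mean(vals) for key, vals in counts.items()}
bottle_mean = {b: mean(v for (bb, _), vals in counts.items() if bb == b for v in vals)
               for b in (1, 2)}
tube_mean = {t: mean(v for (_, tt), vals in counts.items() if tt == t for v in vals)
             for t in (1, 2, 3)}
grand_mean = mean(v for vals in counts.values() for v in vals)

for b in (1, 2):
    row = "  ".join(f"{cell_mean[(b, t)]:.3f}" for t in (1, 2, 3))
    print(f"Bottle {b}:  {row}  {bottle_mean[b]:.3f}")
print("All:       " + "  ".join(f"{tube_mean[t]:.3f}" for t in (1, 2, 3))
      + f"  {grand_mean:.3f}")
```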

11.2.5. "Because none of the interactions in the model is significant (5% level), the issue that disorderly interaction might complicate the interpretation of the Bottle effect does not arise. Nevertheless, make the complete set of interaction (profile) plots. Comment as appropriate."

All but the Bottle * Tube plot show crossings of paths. Apparently, the lack of parallel structure throughout the display is due to random fluctuations (mainly in Samples) rather than to significant interaction effects. Lack of parallel structure seems most serious in the Bottle * Sample plot, which corresponds to the interaction that was nearly significant.

11.2.6. "Remove the non-significant interaction terms from the model and run the resulting ANOVA. (Or maybe there is one of them you'll want to keep.) Is the interpretation the same as for the full model?"

Removing the | symbols in the ANOVA command (to remove all the interactions) results in the following:

MTB > anova c1 = c2 c3 c4;

SUBC> random c4;

SUBC> ems.

ANOVA: Count versus Bottle, Tube, Sample

Factor Type Levels Values

Bottle fixed 2 1, 2

Tube fixed 3 1, 2, 3

Sample random 12 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

Analysis of Variance for Count

Source  DF       SS     MS     F      P
Bottle   1    0.347  0.347  0.26  0.612
Tube     2   14.528  7.264  5.45  0.007
Sample  11   93.486  8.499  6.38  0.000
Error   57   75.958  1.333
Total   71  184.319

S = 1.15438 R-Sq = 58.79% R-Sq(adj) = 48.67%

             Variance  Error  Expected Mean Square for Each
   Source   component   term  Term (using unrestricted model)
1  Bottle                  4  (4) + Q[1]
2  Tube                    4  (4) + Q[2]
3  Sample       1.194      4  (4) + 6 (3)
4  Error        1.333         (4)

In the original ANOVA, the F tests for Tube and Sample were significant. The F tests for Bottle and the interactions were not. If the interactions are omitted, their SS and DF are pooled into the error term, but the interpretation for the main effects is unchanged.
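The pooling can be verified by arithmetic on the original ANOVA table. In this Python sketch (an illustration using the printed, rounded sums of squares), the three interaction lines are added to the original Error line and the main-effect F-ratios are recomputed against the pooled mean square:

```python
# (SS, DF) for the lines that are absorbed into the error term.
pooled_lines = {
    "Bottle*Tube": (1.694, 2),
    "Bottle*Sample": (27.153, 11),
    "Tube*Sample": (22.806, 22),
    "Error": (24.306, 22),
}
# (SS, DF) for the main effects, unchanged by the pooling.
main_effects = {"Bottle": (0.347, 1), "Tube": (14.528, 2), "Sample": (93.486, 11)}

pooled_ss = sum(s for s, _ in pooled_lines.values())
pooled_df = sum(d for _, d in pooled_lines.values())
pooled_ms = pooled_ss / pooled_df

# With no interactions in the model, every main effect is tested against Error.
f_ratio = {name: (s / d) / pooled_ms for name, (s, d) in main_effects.items()}

print(f"Error: DF = {pooled_df}, SS = {pooled_ss:.3f}, MS = {pooled_ms:.3f}")
for name, f in f_ratio.items():
    print(f"{name:7s} F = {f:.2f}")
```

The pooled SS differs from Minitab's 75.958 only in the last printed digit, an effect of summing rounded values.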

It might also be argued that it is a bad idea to remove the nearly significant (P = 0.052) interaction,
Bottle * Sample. Here we investigate what happens if we leave it in. We see that in the absence of the
other two interaction terms, this interaction is now significant at the 5% level.

MTB > anova c1 = c2 c3 c4 c2*c4;

SUBC> random c4;

SUBC> ems.

ANOVA: Count versus Bottle, Tube, Sample

Factor Type Levels Values

Bottle fixed 2 1, 2

Tube fixed 3 1, 2, 3

Sample random 12 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

Analysis of Variance for Count

Source          DF       SS     MS     F      P
Bottle           1    0.347  0.347  0.14  0.715
Tube             2   14.528  7.264  6.85  0.002
Sample          11   93.486  8.499  3.44  0.026
Bottle*Sample   11   27.153  2.468  2.33  0.023   # NOW SIGNIFICANT
Error           46   48.806  1.061
Total           71  184.319

S = 1.03004 R-Sq = 73.52% R-Sq(adj) = 59.13%

                    Variance  Error  Expected Mean Square for Each
   Source          component   term  Term (using unrestricted model)
1  Bottle                         4  (5) + 3 (4) + Q[1]
2  Tube                           5  (5) + Q[2]
3  Sample             1.0051      4  (5) + 3 (4) + 6 (3)
4  Bottle*Sample      0.4691      5  (5) + 3 (4)
5  Error              1.0610         (5)

11.2.7. "The normal probability plot of Problem 11.2.4 shows that the data are not far from being normal. Even so, for binomial data differences in population means imply differences in population variances, and so a variance-stabilizing transformation may be appropriate. Variances for binomial counts can be stabilized by transforming the data as follows: first divide each count by 10 to get a proportion, then take the square root of each proportion, and finally take the arcsine of each result. (For short, this transformation is often called the 'arcsine' transformation.) In Minitab you can make this transformation by following the menu path CALC > Calculator and selecting the appropriate operations or (maybe more easily) with the command let c11 = asin(sqrt(c1/10)). Analyze the transformed data. Is the interpretation the same as for the untransformed data? Does the normal probability plot of residuals look substantially different than the one for the original count data?"
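The three steps of the transformation described in the problem can be sketched outside Minitab as follows. This Python version is only an illustration (the function name arcsine_transform is our own); it mirrors the let command quoted above and returns angles in radians:

```python
import math


def arcsine_transform(count, n_trials=10):
    """Arcsine square-root transform of a binomial count out of n_trials."""
    proportion = count / n_trials          # step 1: count -> proportion
    root = math.sqrt(proportion)           # step 2: square root
    return math.asin(root)                 # step 3: arcsine (radians)


# Example: the first row of counts from Problem 11.1.1.
first_row = [1, 3, 3, 2, 2, 1, 5, 1, 0, 3, 0, 0]
transformed = [arcsine_transform(c) for c in first_row]
print([round(x, 4) for x in transformed])
```

A count of 0 maps to 0 and a count of 10 would map to pi/2, so the transformed values always lie between those limits.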

Performing the arcsine transformation on the square root of the proportion of Count and repeating the original ANOVA (with all 2-way interactions) results in the following:

ANOVA: Count2 versus Bottle, Tube, Sample

Factor Type Levels Values

Bottle fixed 2 1, 2

Tube fixed 3 1, 2, 3

Sample random 12 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

Analysis of Variance for Count2

Source          DF       SS       MS     F      P
Bottle           1  0.01863  0.01863  0.35  0.566
Tube             2  0.36392  0.18196  6.24  0.007
Sample          11  1.87051  0.17005  3.00  0.048 x
Bottle*Tube      2  0.04678  0.02339  0.91  0.419
Bottle*Sample   11  0.58639  0.05331  2.06  0.071
Tube*Sample     22  0.64125  0.02915  1.13  0.389
Error           22  0.56809  0.02582
Total           71  4.09557

x Not an exact F-test.