Analysis of Variance

Chapter 9

9-1. H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$

H1: Not all four population means are equal. Possible configurations:

All 4 are different

2 equal; 2 different

3 equal; 1 different

2 equal; other 2 equal but different from first 2

9-2. ANOVA assumptions: normal populations with equal variance. Independent random sampling from the r populations.

9-3. The t-tests in a series of pairwise comparisons are dependent on each other. There is no control over the probability of a Type I error for the joint series of tests.
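To give a feel for how quickly the joint Type I error can grow, the sketch below counts the pairwise tests among r means and computes the chance of at least one false rejection. It assumes, for illustration only, that the tests are independent, which the answer above says they are not, so the number is a rough guide rather than an exact rate. The function name is hypothetical.

```python
from math import comb

def familywise_error(r, alpha=0.05):
    """Rough P(at least one Type I error) over all pairwise tests among r means.

    Assumes independent tests (an assumption made only for this sketch).
    """
    k = comb(r, 2)                      # number of pairwise comparisons
    return k, 1 - (1 - alpha) ** k

k, fwe = familywise_error(4)
print(k, round(fwe, 3))                 # 6 tests, roughly 0.265
```

With 4 populations there are already 6 pairwise tests, and the nominal 5% level per test no longer describes the series as a whole.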

9-4. r = 5, n1 = n2 = . . . = n5 = 21, n = 105

The df's of F are 4 and 100. Computed F = 3.6. The p-value is close to 0.01. Reject H0. There is evidence that not all 5 plants have equal average output.

F Distribution
$\alpha$ / 10% / 5% / 1% / 0.50%
(1-Tail) F-Critical / 2.0019 / 2.4626 / 3.5127 / 3.9634

9-5. r = 4, n1 = 52, n2 = 38, n3 = 43, n4 = 47

Computed F = 12.53. Reject H0. The average price per lot is not equal across all 4 cities. The rejection is very strong, since the critical point of F(3,176) for $\alpha = 0.01$ is approximately 3.9.

F Distribution
$\alpha$ / 10% / 5% / 1% / 0.50%
(1-Tail) F-Critical / 2.1152 / 2.6559 / 3.8948 / 4.4264

9-6. Originally, treatments referred to the different types of agricultural experiments being performed on a crop; today the term is used interchangeably to refer to the different populations in the study. Errors are the differences between the data points and their sample means.

9-7. Because the sum of all the deviations from a mean is equal to 0.

9-8. Total deviation = $x_{ij} - \bar{\bar{x}} = (\bar{x}_i - \bar{\bar{x}}) + (x_{ij} - \bar{x}_i)$ = treatment deviation + error deviation.

9-9. The sum of squares principle says that the sum of the squared total deviations of all the data points is equal to the sum of the squared treatment deviations plus the sum of all squared error deviations in the data.
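The principle can be checked numerically. The sketch below computes the three sums of squares for a small data set and confirms that they add up. The data are illustrative only, chosen so that the group means (6, 11.5, 2) agree with those quoted in Problem 9-25; they are not taken from the text, and the function name is hypothetical.

```python
def sums_of_squares(groups):
    """Return (SST, SSTR, SSE) for a list of samples, one list per treatment."""
    all_points = [x for g in groups for x in g]
    grand_mean = sum(all_points) / len(all_points)
    means = [sum(g) / len(g) for g in groups]
    # Total: every data point against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_points)
    # Treatment: each sample mean against the grand mean, weighted by n_i
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Error: every data point against its own sample mean
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return sst, sstr, sse

# Illustrative data (an assumption of this sketch)
groups = [[4, 5, 7, 8], [10, 11, 12, 13], [1, 2, 3]]
sst, sstr, sse = sums_of_squares(groups)
print(round(sst, 3), round(sstr, 3), round(sse, 3))
print(abs(sst - (sstr + sse)) < 1e-9)   # True: SST = SSTR + SSE
```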

9-10. An error is any deviation from a sample mean that is not explained by differences among populations. An error may be due to a host of factors not studied in the experiment.

9-11. Both MSTR and MSE are sample statistics subject to natural variation about their own means.

(If $\bar{x} > \mu_0$ we cannot immediately reject H0 in a single-sample case either.)

9-12. The main principle of ANOVA is that if the r population means are not all equal then it is likely that the variation of the data points about their sample means will be small compared to the variation of the sample means about the grand mean.

9-13. Distances among population means manifest themselves in treatment deviations that are large relative to error deviations. When these deviations are squared, added, and then divided by their df's, they give two variances. When the treatment variance is (significantly) greater than the error variance, population mean differences are likely to exist.

9-14. Within: error (data – sample mean)

Between (among): treatment (sample mean – grand mean)

Unexplained: error

Explained: treatment

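9-15. SST = SSTR + SSE, but the corresponding mean squares do not add: MST = SST/(n - 1) does not equal MSTR + MSE. A counterexample:

Let n = 21, r = 6, SST = 100, SSTR = 85, SSE = 15.

Then SST = SSTR + SSE = 85 + 15 = 100.

But $\text{MSTR} + \text{MSE} = \frac{85}{5} + \frac{15}{15} = 18 \neq 5 = \frac{100}{20} = \text{MST}$.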
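The arithmetic of this counterexample can be verified in a few lines. The label MST for SST/(n - 1) is an assumption of this sketch; the other values come from the counterexample itself.

```python
n, r = 21, 6
sst, sstr, sse = 100, 85, 15

mstr = sstr / (r - 1)     # treatment mean square, df = r - 1 = 5
mse = sse / (n - r)       # error mean square, df = n - r = 15
mst = sst / (n - 1)       # total mean square, df = n - 1 = 20

print(sstr + sse == sst)  # True: the sums of squares add
print(mstr + mse, mst)    # 18.0 5.0: the mean squares do not
```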

9-16. When the null hypothesis of ANOVA is false, the ratio MSTR/MSE is not the ratio of two independent, unbiased estimators of the common population variance $\sigma^2$; hence this ratio does not follow an F distribution.

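9-17. For each observation $x_{ij}$, we know that

(tot.) = (treat.) + (error): $x_{ij} - \bar{\bar{x}} = (\bar{x}_i - \bar{\bar{x}}) + (x_{ij} - \bar{x}_i)$

Squaring both sides of the equation:

$(x_{ij} - \bar{\bar{x}})^2 = (\bar{x}_i - \bar{\bar{x}})^2 + 2(\bar{x}_i - \bar{\bar{x}})(x_{ij} - \bar{x}_i) + (x_{ij} - \bar{x}_i)^2$

Now sum this over all observations (all treatments i = 1, . . . , r; and within treatment i, all observations j = 1, . . . , ni):

$\sum_{i=1}^{r}\sum_{j=1}^{n_i}(x_{ij} - \bar{\bar{x}})^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i}(\bar{x}_i - \bar{\bar{x}})^2 + 2\sum_{i=1}^{r}\sum_{j=1}^{n_i}(\bar{x}_i - \bar{\bar{x}})(x_{ij} - \bar{x}_i) + \sum_{i=1}^{r}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$

Notice that the first sum on the R.H.S. equals $\sum_{i=1}^{r} n_i(\bar{x}_i - \bar{\bar{x}})^2$, since for each i the summand does not vary over the ni values of j. Similarly, the second sum is $2\sum_{i=1}^{r}\left[(\bar{x}_i - \bar{\bar{x}})\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)\right]$. But for each fixed i, $\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i) = 0$, since this is just the sum of all deviations from the mean within treatment i. Thus the whole second sum in the long R.H.S. above is 0, and the equation is now

$\sum_{i=1}^{r}\sum_{j=1}^{n_i}(x_{ij} - \bar{\bar{x}})^2 = \sum_{i=1}^{r} n_i(\bar{x}_i - \bar{\bar{x}})^2 + \sum_{i=1}^{r}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$

which is precisely Equation (9-12).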
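The key step of the argument, that the cross-product term sums to zero, can be illustrated numerically. The sketch below evaluates only that term, factoring the treatment deviation out of each inner sum as in the derivation. The data and the function name are hypothetical.

```python
def cross_term(groups):
    """Twice the sum over all observations of (treatment dev.) * (error dev.)."""
    all_points = [x for g in groups for x in g]
    grand_mean = sum(all_points) / len(all_points)
    total = 0.0
    for g in groups:
        mean_i = sum(g) / len(g)
        # The treatment deviation is constant within group i, so it factors
        # out of the inner sum, leaving the sum of deviations from mean_i.
        total += 2 * (mean_i - grand_mean) * sum(x - mean_i for x in g)
    return total

# Made-up samples of unequal size
groups = [[3.0, 9.0, 6.0], [1.0, 2.0], [10.0, 14.0, 12.0, 8.0]]
print(abs(cross_term(groups)) < 1e-9)   # True
```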

9-18. (From Minitab):

Source / df / SS / MS / F
Treatment / 2 / 381127 / 190563 / 20.71
Error / 27 / 248460 / 9202
Total / 29 / 629587

The critical point for F(2,27) at $\alpha = 0.01$ is 5.49. Therefore, reject H0. The average range of the 3 prototype planes is probably not equal.

ANOVA Table ($\alpha = 5\%$)
Source / SS / df / MS / F / Fcritical / p-value
Between / 381127 / 2 / 190563.33 / 20.7084038 / 3.3541312 / 0.0000 / Reject
Within / 248460 / 27 / 9202.2222
Total / 629587 / 29

9-19. (From Minitab):

Source / df / SS / MS / F
Treatment / 3 / 0.1152 / 0.0384 / 1.47
Error / 28 / 0.7315 / 0.0261
Total / 31 / 0.8467

The critical point of F(3,28) for $\alpha = 0.01$ is 4.568. Therefore we cannot reject H0. There is no evidence of differences in the average price per barrel of oil from the four sources. The Rotterdam oil market may be efficient. The conclusion is valid only for Rotterdam, and only for Arabian Light. Also, non-rejection of a null hypothesis is a weak conclusion.

We need to assume independent random samples from normal populations with equal variance. Observations are time-dependent (days during February), so these assumptions could be violated. This is a limitation of the study. Another limitation is that February may be different from other months.

ANOVA Table ($\alpha = 5\%$)
Source / SS / df / MS / F / Fcritical / p-value
Between / 0.11516 / 3 / 0.0383865 / 1.46931301 / 2.9466847 / 0.2442
Within / 0.73151 / 28 / 0.0261254
Total / 0.84667 / 31

9-20. n1 = 20 n2 = 18 n3 = 21 SSE = 1,240 SSTR = 740

Source / df / SS / MS / F
Treatment / 2 / 740 / 370.00 / 16.71
Error / 56 / 1240 / 22.14
Total / 58 / 1980

16.71 > F(2,56) for $\alpha = 0.01$, which is about 4.98. Therefore reject H0. Not all sweater types last the same time, on average.
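The table entries follow mechanically from the sums of squares and the sample sizes. A minimal sketch of that computation, with a hypothetical function name, is:

```python
def one_way_anova(sstr, sse, sizes):
    """Build one-way ANOVA table entries from SSTR, SSE, and the sample sizes."""
    r = len(sizes)                 # number of treatments
    n = sum(sizes)                 # total sample size
    df_tr, df_err = r - 1, n - r
    mstr = sstr / df_tr
    mse = sse / df_err
    return {"df": (df_tr, df_err, n - 1), "MSTR": mstr, "MSE": mse, "F": mstr / mse}

table = one_way_anova(sstr=740, sse=1240, sizes=[20, 18, 21])
print(table["df"], round(table["MSTR"], 2), round(table["MSE"], 2), round(table["F"], 2))
# (2, 56, 58) 370.0 22.14 16.71
```

The computed F must still be compared with a critical point taken from an F table; the sketch does not compute critical values or p-values. Applied to Problem 9-22 (SSTR = 9756.3, SSE = 22399.8, sizes 50, 32, 28), the same function gives F of about 23.3.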

9-21. (From Minitab):

Source / df / SS / MS / F
Treatment / 2 / 91.0426 / 45.5213 / 12.31
Error / 38 / 140.529 / 3.69812
Total / 40 / 231.571

p-value = 0.0001. The critical point for F(2,38) at $\alpha = 0.05$ is 3.245. Therefore, reject H0. There is a difference in the length of time it takes to make a decision.

ANOVA Table ($\alpha = 5\%$)
Source / SS / df / MS / F / Fcritical / p-value
Between / 91.0426 / 2 / 45.521302 / 12.3093042 / 3.2448213 / 0.0001 / Reject
Within / 140.529 / 38 / 3.6981215
Total / 231.571 / 40

9-22. n1 = 50 n2 = 32 n3 = 28 SSE = 22,399.8 SSTO = 32,156.1

Source / df / SS / MS / F
Treatment / 2 / 9756.3 / 4878.15 / 23.3
Error / 107 / 22399.8 / 209.34
Total / 109 / 32156.1

23.3 > $F_{0.01}(2,107)$ = 4.82. Therefore, reject H0. Differences in average annualized returns do exist. The manager should shift the proportions of the fund.

9-23. r = 8 ni = 100 for all i = 1, . . . , 8 n = 800

SSTR = 45,210 SSTO = 92,340

Source / df / SS / MS / F
Treatment / 7 / 45210 / 6458.57 / 108.53
Error / 792 / 47130 / 59.51
Total / 799 / 92340

108.53 is much greater than the value of F .01(7,792), which is approximately 2.66. Reject H0. There is evidence that not all 8 brands have equal average consumer quality ratings.

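9-24. 95% C.I. for the mean responses, $\bar{x}_i \pm 1.96\sqrt{\text{MSE}/n_i}$, where the half-width works out to 6.96 for every resort:

Martinique: 75 ± 6.96 = [68.04, 81.96]

Eleuthera: 73 ± 6.96 = [66.04, 79.96]

Paradise Island: 91 ± 6.96 = [84.04, 97.96]

St. Lucia: 85 ± 6.96 = [78.04, 91.96]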

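9-25. Where do differences exist in the circle-square-triangle populations from Table 9-1, using Tukey? From the text: MSE = 2.125

triangles: n1 = 4, $\bar{x}_1 = 6$

squares: n2 = 4, $\bar{x}_2 = 11.5$

circles: n3 = 3, $\bar{x}_3 = 2$

For $\alpha = 0.01$, $q_{\alpha}(r, n-r) = q_{0.01}(3,8) = 5.63$. The smallest ni is 3:

$T = q_{\alpha}\sqrt{\text{MSE}/n_i} = 5.63\sqrt{2.125/3} = 4.738$

$|\bar{x}_1 - \bar{x}_2|$ = 5.5 > 4.738 sig.

$|\bar{x}_2 - \bar{x}_3|$ = 9.5 > 4.738 sig.

$|\bar{x}_1 - \bar{x}_3|$ = 4.0 < 4.738 n.s.

Thus: "$\mu_1 = \mu_3$"; "$\mu_2 > \mu_1$"; "$\mu_2 > \mu_3$"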
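The Tukey comparison above can be sketched in code as follows. The value of q is looked up in a studentized range table and passed in, not computed; the function name and the dictionary layout are assumptions of this sketch.

```python
from itertools import combinations
from math import sqrt

def tukey_pairs(means, q, mse, n_min):
    """Compare every pair of sample means against the Tukey criterion T.

    means: dict of group label -> sample mean
    q:     studentized range critical point (from a table)
    n_min: smallest sample size among the groups
    """
    t = q * sqrt(mse / n_min)
    results = {}
    for (a, mean_a), (b, mean_b) in combinations(means.items(), 2):
        diff = abs(mean_a - mean_b)
        results[(a, b)] = (diff, diff > t)   # True means significant
    return t, results

means = {"triangles": 6.0, "squares": 11.5, "circles": 2.0}
t, results = tukey_pairs(means, q=5.63, mse=2.125, n_min=3)
print(round(t, 3))                            # 4.738
for pair, (diff, significant) in results.items():
    print(pair, diff, "sig." if significant else "n.s.")
```

Only the triangles-circles pair falls below T, in line with the conclusion that the squares differ from both other populations.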

9-26. Find which prototype planes are different in Problem 9-18:

MSE = 9,202, ni = 10 for all i, $\bar{x}_A = 4{,}407$, $\bar{x}_B = 4{,}230$, $\bar{x}_C = 4{,}135$

For $\alpha = 0.05$, q(3,27) = approximately 3.51. $T = 3.51\sqrt{9{,}202/10} = 106.475$

$|\bar{x}_A - \bar{x}_B|$ = 177 > 106.475 sig.

$|\bar{x}_B - \bar{x}_C|$ = 95 < 106.475 n.s.

$|\bar{x}_A - \bar{x}_C|$ = 272 > 106.475 sig.

Prototype A is shown to have a higher average range than both B and C. Prototypes B and C have no significant difference in average range (all conclusions are at $\alpha = 0.05$).

Tukey test for pairwise comparison of group means
r / 3
n - r / 27
q0 / 3.51
T / 106.476

Pair / Result
A vs. B / Sig
A vs. C / Sig
B vs. C / n.s.

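9-27. Since H0 was not rejected in Problem 9-19, there are no significant differences. (Let's try anyway.) With MSE = 0.0261 and ni = 8 for each of the four sources, $T = q_{0.05}(4,28)\sqrt{\text{MSE}/n_i} = 3.87\sqrt{0.0261/8} = 0.22$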

|17.996 – 17.984| = 0.012

|17.996 – 18.136| = 0.140

|17.996 – 18.048| = 0.052

|17.984 – 18.136| = 0.152

|17.984 – 18.048| = 0.064

|18.136 – 18.048| = 0.088

All are < 0.22, thus not significant, as expected.

Tukey test for pairwise comparison of group means (U.K., Mexico, U.A.E., Oman)
r / 4
n - r / 28
q0 / 3.87
T / 0.22116

No pair of groups is flagged as significant.

9-28. $\bar{x}_I = 6.4$, $\bar{x}_P = 2.5$, $\bar{x}_S = 4.9$, MSE = 22.14

$q_{0.05}(3,56) = 3.4$. With the smallest ni = 18, $T = 3.4\sqrt{22.14/18} = 3.77$

I,P: |6.4 – 2.5| = 3.9 > 3.77

I,S: |6.4 – 4.9| = 1.5 < 3.77

P,S: |2.5 – 4.9| = 2.4 < 3.77

The only significant difference at 0.05 is I,P. There is evidence that $\mu_I > \mu_P$, but I,S and P,S are not significantly different.

9-29. MSE = 59.51, ni = 100 for all i, $\alpha = 0.01$

$q_{0.01}(8,792) = 4.99$, $T = 4.99\sqrt{59.51/100} = 3.85$

Absolute differences (the starred ones denoting that the difference is greater than 3.85, thus significant at $\alpha = 0.01$):

|M – G| = 1, |M – P| = 5*, |M – Z| = 17*, |M – S| = 11*, |M – Ph| = 12*, |M – Sl| = 13*,

|M – R| = 10*, |G – P| = 4*, |G – Z| = 16*, |G – S| = 10*, |G – Ph| = 11*, |G – Sl| = 12*,

|G – R| = 9*, |P – Z| = 12*, |P – S| = 6*, |P – Ph| = 7*, |P – Sl| = 8*, |P – R| = 5*, |Z – S| = 6*,

|Z – Ph| = 5*, |Z – Sl| = 4*, |Z – R| = 7*, |S – Ph| = 1, |S – Sl| = 2, |S – R| = 1, |Ph – R| = 2,

|Ph – Sl| = 1, |Sl – R| = 3.

Thus the brands fall into the following groups, with no significant difference in average ratings within a group at $\alpha = 0.01$: {M, G}, {P}, {Z}, and {S, Ph, Sl, R}. Every pair of brands drawn from two different groups differs significantly.

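9-30. $\alpha = 0.05$, ni = 31 for all i, MSE = 49.5

$\bar{x}_1 = 18$, $\bar{x}_2 = 11$, $\bar{x}_3 = 15$, $\bar{x}_4 = 14$

$T = q_{0.05}(4,120)\sqrt{\text{MSE}/n_i} = 3.68\sqrt{49.5/31} = 4.65$

Only investments 1 and 2 have significantly different annualized average returns at $\alpha = 0.05$: $|\bar{x}_1 - \bar{x}_2|$ = 18 – 11 = 7 > 4.65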

9-31. We cannot extend the results to planes built after the analysis. We used fixed effects here, not random effects. The 3 prototypes were not randomly chosen from a population of levels as would be required for the random effects model.

9-32. A randomized complete block design is a design with restricted randomization. Each block of experimental units is assigned to treatments with randomization of treatments within the block.

9-33. Fly all 3 planes on the same route every time. The route (flown by the 3 planes) is the block.

9-34. Look at the residuals. If the spread of the residuals is not equal, we probably have unequal $\sigma^2$, and the assumption of equal variances is violated. A histogram of the residuals will reveal normality violations.

9-35. Otherwise you are not randomly sampling from a population of treatments, and inference is not valid for the entire “population.”

9-36. No. Rotterdam (and Arabian Light) was not randomly chosen.

9-37. If the locations and the artists are chosen randomly, we have a random effects model.

9-38. 1. Testing for possible interactions among factor levels.

2. Efficiency.

9-39. Limitations and problems: (1) We don’t know the overall significance level of the 3 tests; (2) If we have 1 observation per cell then there are 0 degrees of freedom for error. Also, for a fixed sample size there is a reduction of the df for error.

9-40. 1. As more factors are included, df for error decreases.

2. As more factors are included, we lose control of $\alpha$, and the probability of at least one Type I error increases.

9-41. The table entry with the nearest denominator df is F(4,150) = 3.44 at $\alpha = 0.01$. Hence we can reject H0 (no interaction) with a p-value < 0.01, which is good evidence of interaction.

9-42. At $\alpha = 0.05$:

Location: F = 50.6, significant

Job type: F = 50.212, significant

Interaction: F = 2.14, n.s.

ANOVA Table ($\alpha = 5\%$)
Source / SS / df / MS / F / Fcritical / p-value
Location / 2520.988 / 2 / 1260.49 / 50.645 / 3.1239 / 0.0000 / Reject
Job Type / 2499.432 / 2 / 1249.72 / 50.212 / 3.1239 / 0.0000 / Reject
Interaction / 212.716 / 4 / 53.179 / 2.1367 / 2.4989 / 0.0850
Error / 1792 / 72 / 24.8889
Total / 7025.136 / 80

9-43. Sample sizes per cell:

 / ABC / CBS / NBC
Morning / 50 / 50 / 50
Evening / 50 / 50 / 50
Late Night / 50 / 50 / 50

Source / SS / df / MS / F
Network / 145 / 2 / 72.5 / 5.16
Newstime / 160 / 2 / 80 / 5.69
Interaction / 240 / 4 / 60 / 4.27
Error / 6200 / 441 / 14.06
Total / 6745 / 449

From the table: $F_{0.01}(4,400) = 3.36$ and $F_{0.01}(2,400) = 4.66$

Therefore, all are significant at $\alpha = 0.01$. There are interactions. There are Network main effects averaged over Newstime levels. There are Newstime main effects averaged over Network levels.

9-44. a. Levels of task difficulty: a – 1 = 1; therefore a = 2

b. Levels of effort: b – 1 = 1; therefore b = 2

c. There are no task difficulty main effects because p-value = 0.5357

d. There are effort main effects because p-value < 0.0001

e. There are no significant interactions, as p-value = 0.1649.

9-45. a. Explained is “Treatment”: Treat = Factor A + Factor B + (AB)

b. Levels of exercise price: a – 1 = 2; therefore a = 3