252solnF3 11/07/03 (Open this document in 'Page Layout' view!)

F. ANALYSIS OF VARIANCE

1. 1-Way Analysis of Variance

Text 11.1-11.6, 11.7**, 11.8 [11.1-11.7, 11.8*] (11.1-11.7, 11.8*) (Same problem, different numbers; both answers will be posted)

2. 2-Way Analysis of Variance

Text 11.15-11.18, 11.23, 11.29-11.32, 11.36 [11.15-11.18, 11.23, 11.28-11.30, 11.34] (11.15-11.18, 11.23, 11.28-11.30, 11.34), F1, F2, F4

3. More than 2-Way Analysis of Variance

F3

4. Kruskal-Wallis Test

Text 12.86-12.87, 12.89 [11.39-11.40, 11.42] (11.39-11.40, 11.42), Downing and Clark 18-12, 18-13 (in chapter 17 in D&C 3rd edition)

5. Friedman Test

Text 12.93-12.95 [11.46-11.48] (11.65-11.67 on CD), Downing and Clark 18-4, 18-6 (in chapter 17 in D&C 3rd edition)

Graded Assignment 4 (Will be posted)

This document includes Problem F3, all problems in Chapter 12 and the four problems in Downing and Clark.

------

3-Way ANOVA Problem.

Problem F3: 48 measurements describe the time it took a group of truckers to get from their terminal to a destination. The trip times were characterized by driver’s experience (Factor A, 2 levels), route (Factor B, 3 levels) and season (Factor C, 2 levels). For each combination of factors there are 4 measurements. Set up the ‘degrees of freedom’ column of an ANOVA table showing all interactions.

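Solution: If we multiply the levels of the factors together and then multiply by the number of measurements per cell, we find a total of $2 \times 3 \times 2 \times 4 = 48$ measurements, so the total degrees of freedom are $48 - 1 = 47$. Each main effect has (levels - 1) degrees of freedom, each interaction has the product of the degrees of freedom of its factors, and Within has $2 \times 3 \times 2 \times (4 - 1) = 36$.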

Source / SS / DF / MS / F /
Experience (A) / 500 / 1
Route (B) / 400 / 2
Season (C) / 300 / 1
Interaction (AB) / 50 / 2
Interaction (AC) / 60 / 1
Interaction (BC) / 70 / 2
Interaction (ABC) / ? / 2
Within / 100 / 36
Total / 1600 / 47

Question: I have put some numbers, pretty much at random, in the SS column. Are you ready to (i) calculate the missing number in the SS column, (ii) compute the MS column, (iii) get all the values in the F column by dividing the within (error) mean square into the other mean squares, (iv) look up the appropriate values of F on the table, and (v) list the seven hypotheses that would be tested by these F tests and say which ones should be rejected?
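Parts (i) to (iii) of the question can be checked with a short script. This is only a sketch using the made-up SS values from the table; the dictionary names are illustrative, and the F table lookup in part (iv) is left to the reader.

```python
from itertools import combinations

# Factor levels and measurements per cell, as stated in Problem F3.
levels = {"A": 2, "B": 3, "C": 2}
reps = 4

n_total = reps
for k in levels.values():
    n_total *= k                      # 2 * 3 * 2 * 4 = 48

# DF for a main effect is (levels - 1); for an interaction it is the
# product of the DF of the factors involved.
df = {}
for size in (1, 2, 3):
    for combo in combinations("ABC", size):
        d = 1
        for factor in combo:
            d *= levels[factor] - 1
        df["".join(combo)] = d
df["Within"] = (n_total // reps) * (reps - 1)   # 12 cells * 3 = 36
df["Total"] = n_total - 1                       # 47

# SS column from the table; the ABC entry is the missing one.
ss = {"A": 500, "B": 400, "C": 300, "AB": 50, "AC": 60, "BC": 70,
      "Within": 100, "Total": 1600}
ss["ABC"] = ss["Total"] - sum(v for k, v in ss.items() if k != "Total")

# Mean squares and F ratios (each MS divided by the Within MS).
ms = {k: ss[k] / df[k] for k in df if k != "Total"}
f_ratio = {k: ms[k] / ms["Within"] for k in ms if k != "Within"}

for source in ("A", "B", "C", "AB", "AC", "BC", "ABC"):
    print(source, ss[source], df[source],
          round(ms[source], 2), round(f_ratio[source], 2))
```

Each F ratio would then be compared with the table value of F for (DF of that source, 36) degrees of freedom.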



Kruskal-Wallis Test Problems

Exercise 12.86 [11.39 in 8th and 9th]: Solutions are repeated, edited, from the Instructor’s Solution Manual

11.39 For the 0.01 level of significance and 5 degrees of freedom, $\chi^2_{.01}(5) = 15.086$.

Exercise 12.87 [11.40 in 8th and 9th]: Assume that each group is too large for the Kruskal-Wallis table.

11.40 (a) Decision rule: If $H > \chi^2_{.01}(5) = 15.086$, reject H0.

(b) Decision: Since Hcalc = 13.77 is below the critical bound of 15.086, do not reject H0.

Exercise 12.88 [11.41 in 8th and 9th]: This wasn’t assigned, but the Minitab printout should give you some practice. NOBS means number of observations.

11.41 H0: $M_1 = M_2 = M_3$ (the three medians are equal). H1: At least one of the medians differs.

Decision rule: If $H > \chi^2_{.01}(2) = 9.210$, reject H0. Test statistic: H = 0.64

Decision: Since Hcalc = 0.64 is below the critical bound of 9.210, or because the p-value of 0.728 is above $\alpha = .01$, do not reject H0. There is insufficient evidence to show any real difference in the median reaction times for the three learning methods.

Minitab Output

Kruskal-Wallis Test

LEVEL NOBS MEDIAN AVE. RANK Z VALUE

1 9 10.00 11.6 -0.74

2 8 15.50 13.3 0.12

3 8 12.50 14.4 0.64

OVERALL 25 13.0

H = 0.64 d.f. = 2 p = 0.728

Exercise 12.89 [11.42 in 8th and 9th]:

11.42 (a) H0: $M_1 = M_2 = M_3 = M_4$, where 1 is Low, 2 is Normal, 3 is High and 4 is Very High.

H1: At least one of the medians differs.

First we rank the data. The data appear below in columns marked $x_1$ to $x_4$ and the ranks are in columns marked $r_1$ to $r_4$.

Row        x1 (Low)   r1    x2 (Normal)   r2    x3 (High)   r3    x4 (Very High)   r4
1             8.0     11        7.6        8       6.0       4         5.1          1
2             8.1     12        8.2       13       6.3       5         5.6          2
3             9.2     15        9.8       17       7.1       7         5.9          3
4             9.4     16       10.9       18       7.7       9         6.7          6
5            11.7     19       12.3       20       8.9      14         7.8         10
Rank sum              73                  76                39                     22
n                      5                   5                 5                      5



To check the ranking, note that the sum of the four rank sums is 73 + 76 + 39 + 22 = 210, and that the sum of the first $n = 20$ numbers is $\frac{n(n+1)}{2} = \frac{20(21)}{2} = 210$.

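Now, compute the Kruskal-Wallis statistic $H = \frac{12}{n(n+1)} \sum_i \frac{SR_i^2}{n_i} - 3(n+1) = \frac{12}{20(21)} \left[ \frac{73^2}{5} + \frac{76^2}{5} + \frac{39^2}{5} + \frac{22^2}{5} \right] - 3(21) = 74.9143 - 63 = 11.9143$, where $SR_i$ is the rank sum of column $i$ and $n_i$ is its size. If we look up this result in the Kruskal-Wallis table (Table 9), we find that the problem is too large for the table. If the size of the problem is larger than those shown in Table 9, use the $\chi^2$ distribution, with $df = c - 1$, where $c$ is the number of columns. Since there are $c = 4$ columns, we have 3 degrees of freedom. If we try to locate $H = 11.9143$ on the chi-squared table, we find that $\chi^2_{.01}(3) = 11.3449$ and $\chi^2_{.005}(3) = 12.8382$, so the p-value is between .01 and .005. In particular, if our significance level is 5%, compare $H = 11.9143$ with $\chi^2_{.05}(3) = 7.8147$. Since $H$ is larger than $\chi^2_{.05}(3)$, reject the null hypothesis.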
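The arithmetic for the Kruskal-Wallis statistic can be sketched in a few lines. The function name below is illustrative only; it takes the rank sums and column sizes from the table, applies the rank-sum check described above, and evaluates the usual formula without any correction for ties.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from column rank sums and column sizes (no tie correction)."""
    n = sum(sizes)
    # The rank sums must add up to 1 + 2 + ... + n = n(n+1)/2.
    if abs(sum(rank_sums) - n * (n + 1) / 2) > 1e-9:
        raise ValueError("rank sums do not add up to n(n+1)/2")
    weighted = sum(sr ** 2 / ni for sr, ni in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Battery data: rank sums 73, 76, 39, 22 with 5 observations per column.
h_battery = kruskal_wallis_h([73, 76, 39, 22], [5, 5, 5, 5])
print(round(h_battery, 4))
```

The result agrees with the H = 11.91 that Minitab reports for the same data.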

This data set was run on Minitab with the following results.

————— 11/7/2003 6:36:24 PM ————————————————————

Welcome to Minitab, press F1 for help.

MTB > Retrieve "C:\Berenson\Data_Files-9th\Minitab\BATFAIL.MTW".

Retrieving worksheet from file: C:\Berenson\Data_Files-9th\Minitab\BATFAIL.MTW

# Worksheet was saved on Tue Mar 31 1998

Results for: 252BATFAIL.MTW

MTB > print c1 c2

Data Display

Row Time Pressure

1 8.0 1

2 8.1 1

3 9.2 1

4 9.4 1

5 11.7 1

6 7.6 2

7 8.2 2

8 9.8 2

9 10.9 2

10 12.3 2

11 6.0 3

12 6.3 3

13 7.1 3

14 7.7 3

15 8.9 3

16 5.1 4

17 5.6 4

18 5.9 4

19 6.7 4

20 7.8 4

MTB > Kruskal-Wallis c1 c2.


Kruskal-Wallis Test: Time versus Pressure

Kruskal-Wallis Test on Time

Pressure N Median Ave Rank Z

1 5 9.200 14.6 1.79

2 5 9.800 15.2 2.05

3 5 7.100 7.8 -1.18

4 5 5.900 4.4 -2.66

Overall 20 10.5

H = 11.91 DF = 3 P = 0.008

The p-value of .008 is below the 5% significance level, so we reject the null hypothesis.
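A p-value like the one in the printout can be approximated without a table. The sketch below is not Minitab's method; it is a hand-rolled upper-tail probability for the chi-squared distribution with integer degrees of freedom, built from the standard recurrence that steps the degrees of freedom up by two. The function name is an assumption made for illustration.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-squared with df degrees of freedom > x), for integer df >= 1 and x >= 0."""
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and df >= 1")
    half = x / 2
    if df % 2:                       # odd df: start from df = 1
        q = math.erfc(math.sqrt(half))
        k = 1
    else:                            # even df: start from df = 2
        q = math.exp(-half)
        k = 2
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q


# H = 11.91 with 3 degrees of freedom, as in the Minitab output.
print(round(chi2_upper_tail(11.91, 3), 3))
```

Feeding in a tabled critical value should return its tail area; for example, 15.0863 with 5 degrees of freedom gives a value very close to .01.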

(b) According to the Instructor’s Solution Manual, there is sufficient evidence to show there is a significant difference in the four pressure levels with respect to median battery life. The warranty policy should exploit the highest median battery life and explicitly specify that such median battery life level can only be warranted when the batteries are operated under normal pressure level.


252solnF3 4/15/02

Downing and Clark, Chapter 17, Application 12: The benefits paid to employees of three yo-yo manufacturers appear below. Test the hypothesis that the benefits expenditures of the three companies have the same distribution.

Solution: The original data appears in the left three columns and the rankings appear in the next three.

Original Data (first three columns) / Ranks of Data (last three columns)
Company A / Company B / Company C / Company A / Company B / Company C
10 / 25 / 16 / 1 / 16 / 7
26 / 12 / 24 / 17 / 3 / 15
29 / 20 / 13 / 20 / 11 / 4
21 / 11 / 22 / 12 / 2 / 13
17 / 27 / 15 / 8 / 18 / 6
23 / 19 / 28 / 14 / 10 / 19
30 / 14 / 18 / 21 / 5 / 9
31 / 38 / 35 / 22 / 29 / 26
39 / 32 / 37 / 30 / 23 / 28
33 / 36 / 34 / 24 / 27 / 25
Rank sum / 169 / 144 / 152
Sample size / 10 / 10 / 10

The null hypothesis is $H_0$: the three columns come from the same distribution or, if the parent distributions are assumed non-normal, $H_0: M_1 = M_2 = M_3$ (equal medians). We use a Kruskal-Wallis test instead of a Friedman test because the data appear to be three independent random samples.

To check the ranking, note that the sum of the three rank sums is 169 + 144 + 152 = 465, and that the sum of the first $n = 30$ numbers is $\frac{n(n+1)}{2} = \frac{30(31)}{2} = 465$.

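Now, compute the Kruskal-Wallis statistic $H = \frac{12}{n(n+1)} \sum_i \frac{SR_i^2}{n_i} - 3(n+1) = \frac{12}{30(31)} \left[ \frac{169^2}{10} + \frac{144^2}{10} + \frac{152^2}{10} \right] - 3(31) = 93.4206 - 93 = 0.4206$, where $SR_i$ is the rank sum of column $i$ and $n_i$ is its size. If we look up this result in the Kruskal-Wallis table (Table 9), we find that the size of the data set is too large for the table. If the size of the problem is larger than those shown in Table 9, use the $\chi^2$ distribution, with $df = c - 1$, where $c$ is the number of columns. Since there are $c = 3$ columns, we have two degrees of freedom. If we try to locate $H = 0.4206$ on the chi-squared table, we find that $\chi^2_{.10}(2) = 4.6052$ and $\chi^2_{.90}(2) = 0.2107$, so the p-value is between .10 and .90. In particular, if our significance level is 5%, compare $H = 0.4206$ with $\chi^2_{.05}(2) = 5.9915$. Since $H$ is smaller than $\chi^2_{.05}(2)$, do not reject the null hypothesis.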



Downing and Clark, Chapter 17, Application 13: Four experimental precision scales (A, B, C, D) are tested on a fixed weight with the results below. Test the null hypothesis that the distributions of values given by the four scales are the same. (The text uses $\alpha = .10$ for this problem.)

Solution: The null hypothesis is $H_0$: the four columns come from the same distribution or, if the parent distributions are assumed non-normal, $H_0: M_1 = M_2 = M_3 = M_4$ (equal medians). We use a Kruskal-Wallis test instead of a Friedman test because the data appear to be four independent random samples. In this case, there is no way the data could be cross-classified, since the column lengths are unequal.

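Original Data (first four columns) / Ranks of Data (last four columns; the ranks are listed in ascending order within each column, not beside the observation they belong to; "-" marks an empty cell)
Scale A / Scale B / Scale C / Scale D / Scale A / Scale B / Scale C / Scale D
103 / 112 / 131 / 129 / 1 / 3 / 2 / 5
121 / 105 / 104 / 111 / 4 / 7 / 6 / 8
106 / 132 / 130 / 122 / 12 / 10 / 11 / 9
120 / 136 / 108 / 137 / 14 / 15 / 13 / 16
114 / 109 / 123 / 107 / 18 / 22 / 17 / 20
128 / 138 / 119 / 110 / 19 / 24 / 21 / 27
116 / 135 / 113 / 139 / 26 / 30 / 23 / 35
- / 126 / 133 / 118 / - / 33 / 25 / 37
- / 124 / 127 / - / - / 34 / 28 / -
- / 117 / 134 / - / - / 36 / 29 / -
- / - / 125 / - / - / - / 31 / -
- / - / 115 / - / - / - / 32 / -
Rank sum / 94 / 214 / 238 / 157
Sample size / 7 / 10 / 12 / 8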

To check the ranking, note that the sum of the four rank sums is 94 + 214 + 238 + 157 = 703, and that the sum of the first $n = 37$ numbers is $\frac{n(n+1)}{2} = \frac{37(38)}{2} = 703$.

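Now, compute the Kruskal-Wallis statistic $H = \frac{12}{n(n+1)} \sum_i \frac{SR_i^2}{n_i} - 3(n+1) = \frac{12}{37(38)} \left[ \frac{94^2}{7} + \frac{214^2}{10} + \frac{238^2}{12} + \frac{157^2}{8} \right] - 3(38) = 116.4439 - 114 = 2.4439$, where $SR_i$ is the rank sum of column $i$ and $n_i$ is its size. If we look up this result in the Kruskal-Wallis table (Table 9), we find that the size of the data set is too large for the table. If the size of the problem is larger than those shown in Table 9, use the $\chi^2$ distribution, with $df = c - 1$, where $c$ is the number of columns. Since there are $c = 4$ columns, we have three degrees of freedom. Since our significance level is 10%, compare $H = 2.4439$ with $\chi^2_{.10}(3) = 6.2514$. Since $H$ is smaller than $\chi^2_{.10}(3)$, do not reject the null hypothesis.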
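With unequal column lengths it is easy to slip when ranking by hand, so the rank sums can be rebuilt from the raw scale readings. The following is a minimal sketch; `pooled_ranks` is an illustrative helper, not a library function, and it gives tied values their average rank even though this data set has no ties.

```python
def pooled_ranks(groups):
    """Rank all observations together; tied values share their average rank."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2    # mean of ranks i+1 .. j
        i = j
    return [[rank_of[v] for v in g] for g in groups]


# Readings for scales A, B, C and D, copied from the table.
scales = [
    [103, 121, 106, 120, 114, 128, 116],
    [112, 105, 132, 136, 109, 138, 135, 126, 124, 117],
    [131, 104, 130, 108, 123, 119, 113, 133, 127, 134, 125, 115],
    [129, 111, 122, 137, 107, 110, 139, 118],
]

rank_sums = [sum(r) for r in pooled_ranks(scales)]
sizes = [len(g) for g in scales]
n = sum(sizes)

# Kruskal-Wallis statistic from the rank sums and column sizes.
h_scales = (12 / (n * (n + 1))
            * sum(sr ** 2 / ni for sr, ni in zip(rank_sums, sizes))
            - 3 * (n + 1))

print(rank_sums, sizes, round(h_scales, 4))
```

The rank sums should come out as 94, 214, 238 and 157, matching the bottom rows of the table.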



Friedman Test Problems

Exercise 12.93[11.46 in 9th] (11.65 on CD in 8th edition): Solutions are repeated, edited, from the Instructor’s Solution Manual

11.46 d.f. = 5, $\alpha = 0.1$, $\chi^2_{.10}(5) = 9.2363$

Exercise 12.94 [11.47 in 9th edition] (11.66 on CD in 8th edition):

11.47 (a) H0: $M_1 = M_2 = \cdots = M_6$ (all six medians are equal). H1: At least one of the medians differs.

If the appropriate values cannot be found on the Friedman table, use $\chi^2_{.10}(5)$ and reject H0 if the Friedman statistic $\chi^2_F > 9.2363$.

(b) Since the Friedman statistic $\chi^2_F = 11.56 > 9.2363$, reject H0. There is enough evidence that the medians are different.

Exercise 12.95 [11.48 in 9th] (11.67 on CD in 8th edition):

11.48

(a) H0: $M_1 = M_2 = M_3 = M_4$, where 1 is A, 2 is B, 3 is C and 4 is D.

H1: At least one of the medians differs.

First we rank the data within rows. The data appear below in columns marked $x_1$ to $x_4$ and the ranks are in columns marked $r_1$ to $r_4$.

Row        x1 (Brand A)   r1     x2 (Brand B)   r2     x3 (Brand C)   r3     x4 (Brand D)   r4
1              24          2         26          4         25          3         22          1
2              27          3.5       27          3.5       26          2         24          1
3              19          2         22          4         20          3         16          1
4              24          2         27          4         25          3         23          1
5              22          2.5       25          4         22          2.5       21          1
6              26          3         27          4         24          1.5       24          1.5
7              27          4         26          3         22          1         23          2
8              25          3         27          4         24          2         21          1
9              22          3         23          4         20          2         19          1
Rank sum                  25                    34.5                  20                    10.5

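To check the ranking, note that the sum of the four rank sums is 25 + 34.5 + 20 + 10.5 = 90, and that the sum of the rank sums should be $\frac{rc(c+1)}{2} = \frac{9(4)(5)}{2} = 90$, where $r = 9$ is the number of rows and $c = 4$ is the number of columns.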

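Now compute the Friedman statistic

$\chi^2_F = \frac{12}{rc(c+1)} \sum_i SR_i^2 - 3r(c+1) = \frac{12}{9(4)(5)} \left[ 25^2 + 34.5^2 + 20^2 + 10.5^2 \right] - 3(9)(5) = \frac{12}{180}(2325.5) - 135 = 20.0333$, where $SR_i$ is the rank sum of column $i$, $r = 9$ is the number of rows and $c = 4$ is the number of columns.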

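Since the size of the problem is larger than those shown in Table 8, use the $\chi^2$ distribution, with $df = c - 1$, where $c$ is the number of columns. Since $c = 4$, we have 3 degrees of freedom. If $\alpha = .05$, compare the Friedman statistic $\chi^2_F = 20.0333$ with $\chi^2_{.05}(3) = 7.8147$. Since $\chi^2_F$ is larger than $\chi^2_{.05}(3)$, reject the null hypothesis.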
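The within-row ranking and the Friedman statistic can be reproduced with the sketch below. `row_ranks` is an illustrative helper that gives tied ratings their average rank, as was done in the table; the statistic itself is the uncorrected version, with no adjustment for ties.

```python
def row_ranks(row):
    """Rank one row from low to high; tied values share their average rank."""
    ordered = sorted(row)
    ranks = []
    for v in row:
        first = ordered.index(v) + 1      # lowest rank held by this value
        count = ordered.count(v)          # number of tied entries
        ranks.append(first + (count - 1) / 2)
    return ranks


# Ratings by expert (rows) for Brands A, B, C, D (columns), from the table.
ratings = [
    [24, 26, 25, 22],
    [27, 27, 26, 24],
    [19, 22, 20, 16],
    [24, 27, 25, 23],
    [22, 25, 22, 21],
    [26, 27, 24, 24],
    [27, 26, 22, 23],
    [25, 27, 24, 21],
    [22, 23, 20, 19],
]

ranks = [row_ranks(row) for row in ratings]
rank_sums = [sum(col) for col in zip(*ranks)]
r, c = len(ratings), len(ratings[0])

# Friedman statistic, uncorrected for ties.
friedman = 12 / (r * c * (c + 1)) * sum(sr ** 2 for sr in rank_sums) - 3 * r * (c + 1)

print(rank_sums, round(friedman, 4))
```

The rank sums should be 25, 34.5, 20 and 10.5, and the statistic should be about 20.03.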

This problem was run on Minitab with the following results.



————— 11/7/2003 8:32:16 PM ————————————————————

Welcome to Minitab, press F1 for help.

MTB > Retrieve "C:\Berenson\Data_Files-9th\Minitab\COFFEE.MTW".

Retrieving worksheet from file: C:\Berenson\Data_Files-9th\Minitab\COFFEE.MTW

# Worksheet was saved on Thu Nov 06 2003

Results for: COFFEE.MTW

MTB > print c1 c2 c3

Data Display

Row Expert Brand Rating

1 1 1 24

2 1 2 26

3 1 3 25

4 1 4 22

5 2 1 27

6 2 2 27

7 2 3 26

8 2 4 24

9 3 1 19

10 3 2 22

11 3 3 20

12 3 4 16

13 4 1 24

14 4 2 27

15 4 3 25

16 4 4 23

17 5 1 22

18 5 2 25

19 5 3 22

20 5 4 21

21 6 1 26

22 6 2 27

23 6 3 24

24 6 4 24

25 7 1 27

26 7 2 26

27 7 3 22

28 7 4 23

29 8 1 25

30 8 2 27

31 8 3 24

32 8 4 21

33 9 1 22

34 9 2 23

35 9 3 20

36 9 4 19



MTB > Friedman c3 c2 c1.

Friedman Test: Rating versus Brand, Expert

Friedman test for Rating by Brand blocked by Expert

S = 20.03 DF = 3 P = 0.000

S = 20.72 DF = 3 P = 0.000 (adjusted for ties)

Est Sum of

Brand N Median Ranks

1 9 25.000 25.0

2 9 26.750 34.5

3 9 24.000 20.0

4 9 22.250 10.5

Grand median = 24.500

Since the p-value is essentially zero, reject H0 at 0.05 level of significance. There is evidence of a difference in the median summated ratings of the four brands of Colombian coffee.
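The second line of the Minitab output, S = 20.72 (adjusted for ties), can be reproduced from the unadjusted statistic. The sketch below uses the usual tie correction for the Friedman test, dividing by $1 - \sum (t^3 - t) / (rc(c^2 - 1))$ where $t$ is the size of each group of tied ratings within a row. That this is exactly what Minitab does is an assumption; the function name is illustrative.

```python
def friedman_tie_adjusted(stat, tie_sizes, r, c):
    """Divide the Friedman statistic by the usual tie-correction factor."""
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (r * c * (c ** 2 - 1))
    return stat / correction


# Unadjusted statistic from the rank sums 25, 34.5, 20 and 10.5 (r = 9, c = 4).
unadjusted = 12 / (9 * 4 * 5) * (25 ** 2 + 34.5 ** 2 + 20 ** 2 + 10.5 ** 2) - 3 * 9 * 5

# Experts 2, 5 and 6 each gave one pair of tied ratings, so three ties of size 2.
adjusted = friedman_tie_adjusted(unadjusted, [2, 2, 2], r=9, c=4)

print(round(unadjusted, 2), round(adjusted, 2))
```

Both values are far above the 5% critical value for 3 degrees of freedom, so the adjustment does not change the decision.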

(b) In (a), we conclude that there is evidence of a difference in the median summated ratings of the four brands of Colombian coffee while in problem 11.23, we conclude that there is evidence of a difference in the mean summated ratings of the four brands of Colombian coffee.

