6/4/99 252z9943

2. Eight technicians are asked to take a test and are then rated by their supervisors. Scores and ratings follow, with the addition of productivity figures. (Use $\alpha = .01$.)

Technician / Test Score / Performance ranking / Productivity
Armstrong / 83 / 3 / 180
Brubecker / 68 / 7 / 170
Cooper / 60 / 6 / 164
Dollfuss / 81 / 4 / 182
Ezekiel / 74 / 5 / 174
Fassbinder / 95 / 1 / 191
Goodwrench / 90 / 2 / 195
Hingle / 66 / 8 / 160

If you rank test score or productivity, rank top to bottom.

a. Compute the correlation between test score and performance ranking and test it for significance. (5)

b. Compute the rank correlation between test score and performance ranking and test it for significance. Which of these two measures (rank or conventional correlation) is most appropriate here? Why? (5)

c. Compute Kendall’s W for these data and test it for significance. (6)

d. Test the hypothesis that the correlation between and is .8 . (5)

Solution: A worksheet for all parts of the problem is shown below.

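Here $x_1$ is the test score, $x_2$ the performance ranking and $x_3$ productivity; $r_1$, $r_2$ and $r_3$ are the corresponding ranks, $d = r_1 - r_2$, and $SR = r_1 + r_2 + r_3$.

Tech / $x_1$ / $x_2$ / $x_3$ / $r_1$ / $r_2$ / $d$ / $d^2$ / $r_3$ / $SR$ / $SR^2$ / $x_1^2$ / $x_2^2$ / $x_1x_2$
A / 83 / 3 / 180 / 3 / 3 / 0 / 0 / 4 / 10 / 100 / 6889 / 9 / 249
B / 68 / 7 / 170 / 6 / 7 / -1 / 1 / 6 / 19 / 361 / 4624 / 49 / 476
C / 60 / 6 / 164 / 8 / 6 / 2 / 4 / 7 / 21 / 441 / 3600 / 36 / 360
D / 81 / 4 / 182 / 4 / 4 / 0 / 0 / 3 / 11 / 121 / 6561 / 16 / 324
E / 74 / 5 / 174 / 5 / 5 / 0 / 0 / 5 / 15 / 225 / 5476 / 25 / 370
F / 95 / 1 / 191 / 1 / 1 / 0 / 0 / 2 / 4 / 16 / 9025 / 1 / 95
G / 90 / 2 / 195 / 2 / 2 / 0 / 0 / 1 / 5 / 25 / 8100 / 4 / 180
H / 66 / 8 / 160 / 7 / 8 / -1 / 1 / 8 / 23 / 529 / 4356 / 64 / 528
sum / 617 / 36 / 1416 / / / 0 / 6 / / 108 / 1818 / 48631 / 204 / 2582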

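a) Spare Parts Computation (with $x_1$ the test score and $x_2$ the performance ranking):

$\bar x_1 = 617/8 = 77.125$, $\bar x_2 = 36/8 = 4.5$

$S_{x_1x_1} = \sum x_1^2 - n\bar x_1^2 = 48631 - 8(77.125)^2 = 1044.875$

$S_{x_2x_2} = \sum x_2^2 - n\bar x_2^2 = 204 - 8(4.5)^2 = 42$

$S_{x_1x_2} = \sum x_1x_2 - n\bar x_1\bar x_2 = 2582 - 8(77.125)(4.5) = -194.5$

The simple sample correlation coefficient is the square root of $R^2$, so using $x_2$ in place of $y$, we get $r = \frac{S_{x_1x_2}}{\sqrt{S_{x_1x_1}S_{x_2x_2}}} = \frac{-194.5}{\sqrt{1044.875(42)}} = -.9285$. From the outline, if we want to test $H_0: \rho = 0$ against $H_1: \rho \ne 0$ and $x_1$ and $x_2$ are normally distributed, we use $t = \frac{r}{\sqrt{(1-r^2)/(n-2)}} = \frac{-.9285}{\sqrt{(1-.8620)/6}} = -6.12$. Since $|t|$ exceeds the table value of $t^{(6)}_{\alpha/2}$ (3.707 for $\alpha = .01$), we reject $H_0$.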
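The correlation and its t ratio can be checked from the raw data. The following is a minimal sketch, assuming the correlation in part a) is between test score and performance ranking (as the worksheet columns suggest); the function and variable names are illustrative only.

```python
import math

# Test scores and supervisor performance rankings for the eight technicians.
test_score = [83, 68, 60, 81, 74, 95, 90, 66]
perf_rank = [3, 7, 6, 4, 5, 1, 2, 8]


def pearson_r(x, y):
    """Sample correlation computed from the 'spare parts' sums."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum(a * b for a, b in zip(x, y)) - n * mean_x * mean_y
    s_xx = sum(a * a for a in x) - n * mean_x ** 2
    s_yy = sum(b * b for b in y) - n * mean_y ** 2
    return s_xy / math.sqrt(s_xx * s_yy)


n = len(test_score)
r = pearson_r(test_score, perf_rank)
# t ratio for H0: rho = 0, with n - 2 degrees of freedom.
t_ratio = r / math.sqrt((1 - r ** 2) / (n - 2))
print(round(r, 4), round(t_ratio, 2))
```

The sign is negative only because a ranking of 1 is the best performance; the size of the t ratio is what matters for the two-sided test.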

b) Remember that you were advised to rank test score and productivity top to bottom. This is not the usual way of doing things, but makes sense in this case since the performance ranking already has the best as 1. Remember that $d = r_1 - r_2$, and then $r_s = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(6)}{8(8^2-1)} = 1 - \frac{36}{504} = .9286$. If we want a 2-sided test at the 99% confidence level of $H_0: \rho_s = 0$, compare $r_s$ with the 0.5% value from the Spearman’s rank correlation coefficient table. Since $r_s$ is above the table value of .7450, reject the null hypothesis. We conclude that the rank correlation is significant.
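As a check on the rank correlation, here is a short sketch that ranks the test scores top to bottom and applies the usual no-ties shortcut formula $r_s = 1 - 6\sum d^2/(n(n^2-1))$. The helper name `ranks_desc` is hypothetical, and the code assumes there are no tied scores (true for these data).

```python
test_score = [83, 68, 60, 81, 74, 95, 90, 66]
perf_rank = [3, 7, 6, 4, 5, 1, 2, 8]


def ranks_desc(values):
    """Rank values top to bottom (largest gets rank 1); assumes no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


score_rank = ranks_desc(test_score)
n = len(test_score)
# Sum of squared rank differences, then the shortcut formula.
d_squared = sum((a - b) ** 2 for a, b in zip(score_rank, perf_rank))
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(score_rank, d_squared, round(r_s, 4))
```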

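c) Following the process in the outline, compute $S = \sum SR^2 - n\overline{SR}^2 = 1818 - 8(13.5)^2 = 360$, where $\overline{SR} = \frac{108}{8} = 13.5$. From this Kendall’s $W = \frac{S}{\frac{1}{12}k^2(n^3-n)} = \frac{360}{\frac{1}{12}(3)^2(8^3-8)} = \frac{360}{378} = .9524$. If $H_0$ is disagreement, $S$ can be checked against a table for this test. But $n$ is too large for the table, so use $\chi^2 = k(n-1)W = 3(7)(.9524) = 20.0$. This has the $\chi^2$ distribution with $n-1 = 7$ degrees of freedom. Since the table value ($\chi^{2(7)}_{.01} = 18.4753$) is below our $\chi^2$, reject $H_0$.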
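Kendall's W can be recomputed from the three rank columns of the worksheet. This is a sketch under the assumption that the three rankings are test score, performance and productivity, each ranked top to bottom; variable names are illustrative.

```python
# Ranks for technicians A..H, taken from the worksheet.
score_rank = [3, 6, 8, 4, 5, 1, 2, 7]
perf_rank = [3, 7, 6, 4, 5, 1, 2, 8]
prod_rank = [4, 6, 7, 3, 5, 2, 1, 8]

k = 3                      # number of rankings
n = len(score_rank)        # number of items ranked

rank_sums = [sum(t) for t in zip(score_rank, perf_rank, prod_rank)]
mean_sum = sum(rank_sums) / n
# S is the sum of squared deviations of the rank sums from their mean.
s = sum((sr - mean_sum) ** 2 for sr in rank_sums)
w = 12 * s / (k ** 2 * (n ** 3 - n))
# Large-sample approximation: chi-squared with n - 1 degrees of freedom.
chi_sq = k * (n - 1) * w
print(rank_sums, s, round(w, 4), round(chi_sq, 2))
```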

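d) I don’t believe that anyone did this section. The outline says “We need to use Fisher's z-transformation. Let $\tilde z = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)$. This has an approximate mean of $\mu_z = \frac{1}{2}\ln\left(\frac{1+\rho_0}{1-\rho_0}\right)$ and a standard deviation of $s_z = \sqrt{\frac{1}{n-3}}$, so that $t = \frac{\tilde z - \mu_z}{s_z}$.”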
6/10/99 252z9943

3. Samples of demand for four types of sailboat sold by your firm are as follows:

Model / West Coast / East Coast / Total
Pirates Revenge / 74 / 146 / 220
Jolly Roger / 54 / 110 / 164
Bluebeard’s Treasure / 46 / 100 / 146
Ahab’s Quest / 50 / 120 / 170
Total / 224 / 476 / 700

Do all tests at the 95% confidence level.

a. Management had initially assumed that the proportion of total sales of “Pirates Revenge” would be at most 30% of sales. Test this. (3)

b. Test the hypothesis that sales of the “Pirates Revenge” are the same proportion of sales on both the East and West Coast (4)

c. Test the hypothesis that sales on the West Coast follow a uniform distribution (i.e. that each model is the same proportion of West Coast sales) (5)

d. Test the hypothesis that the proportions of each boat sold are the same on both coasts. (5)

Solution:

a) Table 3 says the following:

Interval for / Confidence Interval / Hypotheses / Test Ratio / Critical Value
Proportion / / / /

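If we check the original data, total sales of “Pirates Revenge” were 220 out of 700, or $\bar p = \frac{220}{700} = .3143$. Our hypotheses are $H_0: p \le .30$ and $H_1: p > .30$. Thus, if we use the test ratio method, $z = \frac{\bar p - p_0}{\sqrt{p_0q_0/n}} = \frac{.3143 - .30}{\sqrt{.30(.70)/700}} = \frac{.0143}{.01732} = 0.825$. We reject $H_0$ if it is greater than $z_{.05} = 1.645$. It is not, so we do not reject $H_0$.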
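The test ratio for the one-sample proportion can be verified with a few lines. This sketch assumes the one-sided hypotheses $H_0: p \le .30$ against $H_1: p > .30$ implied by "at most 30%", and uses the standard normal critical value 1.645 for a 5% upper-tail test.

```python
import math

sales_pirates = 220
sales_total = 700
p_null = 0.30

p_bar = sales_pirates / sales_total
# Standard error under the null hypothesis.
sigma_p = math.sqrt(p_null * (1 - p_null) / sales_total)
z = (p_bar - p_null) / sigma_p
reject = z > 1.645  # upper-tail 5% critical value
print(round(p_bar, 4), round(z, 3), reject)
```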

b) From Table 3 again:

Interval for / Confidence Interval / Hypotheses / Test Ratio / Critical Value
Difference between proportions / / / Or use /

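Our hypotheses are $H_0: p_1 = p_2$ or $H_0: \Delta p = 0$, where $\Delta p = p_1 - p_2$ (taking the West Coast as sample 1 and the East Coast as sample 2). If we use the test ratio method, we need to find $\bar p_1 = \frac{74}{224} = .3304$, $\bar p_2 = \frac{146}{476} = .3067$, and the pooled proportion $\bar p_0 = \frac{220}{700} = .3143$. So $\Delta\bar p = .3304 - .3067 = .0236$ and

$\sigma_{\Delta p} = \sqrt{\bar p_0\bar q_0\left(\frac{1}{n_1}+\frac{1}{n_2}\right)} = \sqrt{.3143(.6857)\left(\frac{1}{224}+\frac{1}{476}\right)} = .0376$.

So $z = \frac{\Delta\bar p}{\sigma_{\Delta p}} = \frac{.0236}{.0376} = 0.628$. Since $-1.960 \le z \le 1.960$, do not reject $H_0$.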
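The two-sample comparison can be reproduced as follows. This is a sketch that assumes the pooled-proportion form of the standard error and treats the West Coast as sample 1; swapping the samples only changes the sign of z.

```python
import math

west_pirates, west_total = 74, 224
east_pirates, east_total = 146, 476

p1 = west_pirates / west_total
p2 = east_pirates / east_total
# Pooled proportion under H0: p1 = p2.
p0 = (west_pirates + east_pirates) / (west_total + east_total)
sigma = math.sqrt(p0 * (1 - p0) * (1 / west_total + 1 / east_total))
z = (p1 - p2) / sigma
reject = abs(z) > 1.960  # two-sided 5% test
print(round(p1, 4), round(p2, 4), round(z, 3), reject)
```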

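c) $H_0$: West Coast sales are uniform across models. Since $O$ sums to 224 and there are 4 models, divide 224 by 4 to get $E = 56$. The actual comparison can be done by either summing $\frac{(E-O)^2}{E}$ or by summing $\frac{O^2}{E}$ and subtracting $n$.

$O$ / $E$ / $E-O$ / $(E-O)^2$ / $(E-O)^2/E$ / $O^2/E$
74 / 56 / -18 / 324 / 5.78571 / 97.7857
54 / 56 / 2 / 4 / 0.07143 / 52.0714
46 / 56 / 10 / 100 / 1.78571 / 37.7857
50 / 56 / 6 / 36 / 0.64286 / 44.6429
224 / 224 / 0 / / 8.28571 / 232.2857

So $\chi^2 = \sum\frac{(E-O)^2}{E} = 8.2857$ or $\chi^2 = \sum\frac{O^2}{E} - n = 232.2857 - 224 = 8.2857$. Since there are 4 items in the comparison and we have used the data to estimate 1 parameter and the computed $\chi^2$ is above the table value, we reject $H_0$. The Kolmogorov-Smirnov method could also be used for this problem.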
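Both ways of summing the goodness-of-fit statistic can be checked in a few lines. This is a minimal sketch using the West Coast counts; the variable names are illustrative.

```python
observed = [74, 54, 46, 50]          # West Coast sales by model
n = sum(observed)
expected = [n / len(observed)] * len(observed)  # uniform: 56 each

# Method 1: sum of (E - O)^2 / E.
chi_sq_direct = sum((e - o) ** 2 / e for o, e in zip(observed, expected))
# Method 2: sum of O^2 / E, minus n.
chi_sq_shortcut = sum(o ** 2 / e for o, e in zip(observed, expected)) - n
print(round(chi_sq_direct, 4), round(chi_sq_shortcut, 4))
```

The two values agree because $\sum (E-O)^2/E = \sum O^2/E - 2\sum O + \sum E$ and both $O$ and $E$ sum to $n$.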

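d) $H_0$: the proportions of each boat sold are the same on both coasts. The proportions in rows, $p_r$ (.3143, .2343, .2086 and .2429), are used with column totals to get the items in $E$. Note that row sums in $E$ are the same as in $O$.

The actual comparison can be done by either summing $\frac{(E-O)^2}{E}$ or by summing $\frac{O^2}{E}$ and subtracting $n$.

$O$ / $E$ / $E-O$ / $(E-O)^2$ / $(E-O)^2/E$ / $O^2/E$
74 / 70.40 / -3.60000 / 12.9600 / 0.184091 / 77.784
54 / 52.48 / -1.52000 / 2.3104 / 0.044024 / 55.564
46 / 46.72 / 0.72000 / 0.5184 / 0.011096 / 45.291
50 / 54.40 / 4.40000 / 19.3600 / 0.355883 / 45.956
146 / 149.60 / 3.60000 / 12.9600 / 0.086631 / 142.487
110 / 111.52 / 1.52000 / 2.3104 / 0.020717 / 108.501
100 / 99.28 / -0.72000 / 0.5184 / 0.005222 / 100.725
120 / 115.60 / -4.40000 / 19.3600 / 0.167474 / 124.567
700 / 700.00 / 0.00000 / / 0.87513 / 700.875

So $\chi^2 = \sum\frac{(E-O)^2}{E} = 0.8751$ or $\chi^2 = \sum\frac{O^2}{E} - n = 700.875 - 700 = 0.875$. With $(4-1)(2-1) = 3$ degrees of freedom the table value is $\chi^{2(3)}_{.05} = 7.8147$. Since our $\chi^2$ is below this, we do not reject $H_0$.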
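The expected counts and the homogeneity statistic can be regenerated from the sales table. The sketch below assumes the usual rule $E = (\text{row total})(\text{column total})/n$; names are illustrative.

```python
# Rows: Pirates Revenge, Jolly Roger, Bluebeard's Treasure, Ahab's Quest.
# Columns: West Coast, East Coast.
observed = [[74, 146], [54, 110], [46, 100], [50, 120]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = row proportion times column total.
expected = [[r * c / n for c in col_totals] for r in row_totals]
chi_sq = sum(
    (e - o) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
deg_free = (len(observed) - 1) * (len(col_totals) - 1)
print([round(e, 2) for e in expected[0]], round(chi_sq, 4), deg_free)
```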


4. Data on passengers (in thousands), advertising (in $thousands) and national income (in $trillions) appear below. (Use )

pass / adv / inc / season
15 / 10 / 2.40 / 1
17 / 12 / 2.72 / 1
13 / 8 / 2.08 / 1
23 / 17 / 3.68 / 1
16 / 10 / 2.56 / 1
21 / 15 / 3.36 / 0
14 / 10 / 2.24 / 0
20 / 14 / 3.20 / 0
26 / 19 / 3.84 / 0
18 / 10 / 2.72 / 0
17 / 11 / 2.07 / 0
18 / 13 / 2.33 / 0
23 / 16 / 2.98 / 0
15 / 10 / 1.94 / 1
16 / 12 / 2.17 / 1

a) Compute (2) (This must be done correctly to get full credit for b.)

b) Compute a simple regression of passengers against National income. (6)

c) Compute (4)

d) Compute (3)

e) Compute ( the std deviation of the coefficient of National Income) and do a confidence interval for . (3)

f) Do a confidence interval for Passengers when income is $4.10 trillion. (3) At what income will this interval be smallest? (1)

. You do not need all of these.

Solution:

a)

b) Spare Parts Computation:


It seems reasonable to use the notation instead of .

becomes .
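The simple regression of passengers on national income can be reproduced from the data table. This is a sketch, assuming the standard least-squares formulas $b_1 = S_{xy}/S_{xx}$ and $b_0 = \bar y - b_1\bar x$; the resulting $R^2$ agrees with the .8104 quoted for this regression in problem 5. Names are illustrative.

```python
import math

passengers = [15, 17, 13, 23, 16, 21, 14, 20, 26, 18, 17, 18, 23, 15, 16]
income = [2.40, 2.72, 2.08, 3.68, 2.56, 3.36, 2.24, 3.20, 3.84, 2.72,
          2.07, 2.33, 2.98, 1.94, 2.17]

n = len(passengers)
x_bar = sum(income) / n
y_bar = sum(passengers) / n

# Spare parts.
s_xy = sum(x * y for x, y in zip(income, passengers)) - n * x_bar * y_bar
s_xx = sum(x * x for x in income) - n * x_bar ** 2
sst = sum(y * y for y in passengers) - n * y_bar ** 2

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
r_sq = b1 * s_xy / sst
# Standard error of the regression, n - 2 degrees of freedom.
s_e = math.sqrt((sst - b1 * s_xy) / (n - 2))
print(round(x_bar, 3), round(b0, 3), round(b1, 3), round(r_sq, 4))
```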

c) or

( always!)


d) or For other formulas for see previous exam. ( is always positive!)

e) so

f) . If and , then

From the regression formula handout , where

So .

So . This interval will be smallest when income is at its sample mean, $2.686 trillion.

6/10/99 252zz9943

5. Data from problem 4 is repeated below. (Use )

.

a. Do a multiple regression of passengers against advertising and National Income. (12)

b. Compute $R^2$ and $R^2$ adjusted for degrees of freedom for both this and the previous problem. Compare the values of adjusted $R^2$ between this and the previous problem. Use an F test to compare $R^2$ here with the $R^2$ from the previous problem. (5)

c. Compute the regression sum of squares and use it in an F test to test the usefulness of this regression. (5)

d. Use your regression to predict the number of passengers when we spend $13 (thousand) on advertising and National Income is $3.5 (trillion).(2)

e. The regression on the previous page was run with the command

MTB > regress C1 on 1 C3;

SUBC> dw.

As a result, the last line of the regression read

Durbin-Watson statistic = 0.71

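Solution: a) Let $Y$ be passengers, $X_1$ advertising and $X_2$ National Income. First, we compute $\bar Y = 18.1333$, $\bar X_1 = 12.4667$ and $\bar X_2 = 2.686$. Second, we compute $\sum X_1Y = 3549$, $\sum X_2Y = 759.09$, $\sum X_1^2 = 2469$, $\sum X_2^2 = 113.3387$, and $\sum X_1X_2 = 525.38$. Third, we compute our spare parts $S_{X_1Y} = 158.067$, $S_{X_2Y} = 28.498$, $S_{X_1X_1} = 137.733$, $S_{X_2X_2} = 5.11975$, and $S_{X_1X_2} = 23.0979$. (Note that some of these were computed for the last problem.) Fourth, we substitute these numbers into the Simplified Normal Equations:

$S_{X_1Y} = b_1S_{X_1X_1} + b_2S_{X_1X_2}$, $S_{X_2Y} = b_1S_{X_1X_2} + b_2S_{X_2X_2}$,

which are

$158.067 = 137.733b_1 + 23.0979b_2$ and $28.498 = 23.0979b_1 + 5.11975b_2$,

and solve them as two equations in two unknowns for $b_1$ and $b_2$. We do this by multiplying the second equation by 4.5115, which is 23.0979 divided by 5.11975, so that the two equations become $158.067 = 137.733b_1 + 23.0979b_2$ and $128.569 = 104.206b_1 + 23.0979b_2$. We then subtract the second equation from the first to get $29.498 = 33.527b_1$, so that $b_1 = .880$. The first of the two normal equations can now be rearranged to get $23.0979b_2 = 158.067 - 137.733b_1$, which gives us $b_2 = 1.597$. Finally we get $b_0$ by solving $b_0 = \bar Y - b_1\bar X_1 - b_2\bar X_2 = 18.1333 - .880(12.4667) - 1.597(2.686) = 2.876$. Thus our equation is $\hat Y = 2.876 + .880X_1 + 1.597X_2$.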
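The two simplified normal equations can also be solved directly. The sketch below assumes $X_1$ is advertising and $X_2$ is national income (this ordering reproduces the 23.0979 and 5.11975 quoted in the text) and solves the 2 by 2 system with Cramer's rule rather than by elimination; the helper `s_cross` is a hypothetical name.

```python
passengers = [15, 17, 13, 23, 16, 21, 14, 20, 26, 18, 17, 18, 23, 15, 16]
advertising = [10, 12, 8, 17, 10, 15, 10, 14, 19, 10, 11, 13, 16, 10, 12]
income = [2.40, 2.72, 2.08, 3.68, 2.56, 3.36, 2.24, 3.20, 3.84, 2.72,
          2.07, 2.33, 2.98, 1.94, 2.17]


def mean(v):
    return sum(v) / len(v)


def s_cross(u, v):
    """Spare part: sum(u*v) - n * mean(u) * mean(v)."""
    return sum(a * b for a, b in zip(u, v)) - len(u) * mean(u) * mean(v)


s_1y = s_cross(advertising, passengers)
s_2y = s_cross(income, passengers)
s_11 = s_cross(advertising, advertising)
s_22 = s_cross(income, income)
s_12 = s_cross(advertising, income)
sst = s_cross(passengers, passengers)

# Cramer's rule on  s_1y = b1*s_11 + b2*s_12,  s_2y = b1*s_12 + b2*s_22.
det = s_11 * s_22 - s_12 ** 2
b1 = (s_1y * s_22 - s_2y * s_12) / det
b2 = (s_11 * s_2y - s_12 * s_1y) / det
b0 = mean(passengers) - b1 * mean(advertising) - b2 * mean(income)
r_sq = (b1 * s_1y + b2 * s_2y) / sst
print(round(b0, 3), round(b1, 3), round(b2, 3), round(r_sq, 4))
```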

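b) The coefficient of determination is $R^2 = \frac{b_1S_{X_1Y} + b_2S_{X_2Y}}{SST} = \frac{184.576}{195.733} = .9430$. (The standard error is $s_e = \sqrt{\frac{SSE}{n-k-1}} = \sqrt{\frac{11.157}{12}} = \sqrt{.9299} = .9643$, but we don’t need it yet.) Our results can be summarized below as: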

$R^2$ / $n$ / $k$ / $\bar R^2$
.8104 / 15 / 1 / .7958
.9430 / 15 / 2 / .9335

$\bar R^2$, which is $R^2$ adjusted for degrees of freedom, has the formula $\bar R^2 = 1 - (1-R^2)\frac{n-1}{n-k-1}$, where $k$ is the number of independent variables. $R^2$ adjusted for degrees of freedom seems to show that our second regression is better.

The easiest way to do the F test and have it look right is to note that $SSR = R^2 \cdot SST$, with $SST = 195.733$. For the regression with one independent variable the regression sum of squares is $.8104(195.733) = 158.622$. For the regression with two independent variables the regression sum of squares is $.9430(195.733) = 184.576$. The difference between these is 25.954. The remaining unexplained variation is 195.733 - 184.576 = 11.157. The ANOVA table is

Source / SS / DF / MS / $F$ / Table $F$
Income / 158.622 / 1 / 158.622
Advertising (added) / 25.954 / 1 / 25.954 / 27.9105 /
Error / 11.157 / 12 / 0.9299
Total / 195.733 / 14

Since our computed $F$ is larger than the table $F$, we reject our null hypothesis that advertising has no effect.
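The adjusted $R^2$ values and the F ratio for the added variable follow directly from the two $R^2$ values and the total sum of squares quoted above. This is a sketch using those rounded figures, so the last digits may differ slightly from the table; the function name is illustrative.

```python
n = 15
sst = 195.733
r_sq_one = 0.8104   # one independent variable
r_sq_two = 0.9430   # two independent variables


def adjusted_r_sq(r_sq, n_obs, k):
    """R-squared adjusted for degrees of freedom; k = number of regressors."""
    return 1 - (1 - r_sq) * (n_obs - 1) / (n_obs - k - 1)


adj_one = adjusted_r_sq(r_sq_one, n, 1)
adj_two = adjusted_r_sq(r_sq_two, n, 2)

ssr_one = r_sq_one * sst
ssr_two = r_sq_two * sst
sse_two = sst - ssr_two
# F ratio for the extra explained variation, 1 and n - 3 degrees of freedom.
f_added = (ssr_two - ssr_one) / (sse_two / (n - 3))
print(round(adj_one, 4), round(adj_two, 4), round(f_added, 2))
```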

c) We computed the regression sum of squares in the previous section.

Source / SS / DF / MS / $F$ / Table $F$
Advertising, Income / 184.576 / 2 / 92.288 / 99.245 /
Error / 11.157 / 12 / 0.9299
Total / 195.733 / 14

Since our computed $F$ is larger than the table $F$, we reject our null hypothesis that advertising and income do not explain passengers.

d) $\hat Y = 2.876 + .880(13) + 1.597(3.5) = 19.90$.

e) A Durbin-Watson Test is a test for autocorrelation. For , and , the test table gives and . According to the text, the null hypothesis is ‘No Autocorrelation’ and our rejection region is or . We really should use the value for , but a check of the table leaves us sure that it is below .70. Thus the D-W statistic of 0.71 is not in the rejection region. Check the examples to see that it could be in the “possibly significant” region.

6/14/99 252zz9943

6.(Watch it!) Three methods are used to train candidates for the FAA pilots exam. Scores for trainees are shown below classified by method.

Method
Video Cassette / Audio Cassette / Classroom
72 / 73 / 68
86 / 75 / 83
80 / 60 / 50
91 / 52 / 91
46 / 84 / 84
68 / 76 / 77
75 / / 94
 / / 81
 / / 92
 / / 90

a. Assume that the data is normal and compare the means for the first two methods (Assume unequal variances) (5)

b. Do the same for all three methods (You may assume equal variances now) (7)

c. Test column 1 to see if it has the normal distribution (5)

Note: For the first column:

Note:

Note: In spite of the words “Watch it!, ” many people assumed that this was identical to a problem with similar data on an earlier exam. You have to read the question before answering it!

Solution: Note:

a) Assume unequal variances. From Table 3 of the Syllabus Supplement:

Interval for / Confidence Interval / Hypotheses / Test Ratio / Critical Value
Difference between Two Means ($\sigma$ unknown, variances assumed unequal) / / Same as / /

Note: unequal variances like part a) here were strictly extra credit in Spring 2000!

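With $s_1^2 = 215.667$, $n_1 = 7$, $s_2^2 = 138$ and $n_2 = 6$, we have $\frac{s_1^2}{n_1} = 30.810$ and $\frac{s_2^2}{n_2} = 23.000$. Then $DF = \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1}+\frac{(s_2^2/n_2)^2}{n_2-1}} = \frac{(53.810)^2}{\frac{(30.810)^2}{6}+\frac{(23.000)^2}{5}} = 10.97$, so use 10 degrees of freedom.

$s_{\Delta\bar x} = \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}} = \sqrt{53.810} = 7.336$, so, using a test ratio, $t = \frac{\bar x_1 - \bar x_2}{s_{\Delta\bar x}} = \frac{74 - 70}{7.336} = 0.545$. Since this is between $\pm t^{(10)}_{\alpha/2}$ ($\pm 2.228$ for $\alpha = .05$), do not reject $H_0$ or, using a critical value, $\Delta\bar x_{cv} = 0 \pm t^{(10)}_{\alpha/2}(7.336)$. Since $\Delta\bar x = 4$ is between these values, do not reject $H_0$.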
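The degrees of freedom and test ratio for the unequal-variance comparison can be recomputed from the first two columns. This is a sketch assuming the usual Satterthwaite approximation for the degrees of freedom, rounded down as in the text; variable names are illustrative.

```python
import math
from statistics import mean, variance

video = [72, 86, 80, 91, 46, 68, 75]
audio = [73, 75, 60, 52, 84, 76]

# Sample variance divided by sample size, for each method.
v1 = variance(video) / len(video)
v2 = variance(audio) / len(audio)

# Satterthwaite approximation to the degrees of freedom.
df_exact = (v1 + v2) ** 2 / (v1 ** 2 / (len(video) - 1)
                             + v2 ** 2 / (len(audio) - 1))
df_used = math.floor(df_exact)

std_err = math.sqrt(v1 + v2)
t_ratio = (mean(video) - mean(audio)) / std_err
print(round(df_exact, 2), df_used, round(std_err, 3), round(t_ratio, 3))
```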

b) 1-way ANOVA

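Method
 / Video Cassette / Audio Cassette / Classroom / Total
 / 72 / 73 / 68
 / 86 / 75 / 83
 / 80 / 60 / 50
 / 91 / 52 / 91
 / 46 / 84 / 84
 / 68 / 76 / 77
 / 75 / / 94
 / / / 81
 / / / 92
 / / / 90
Sum / 518 / + 420 / + 810 / = 1748
$n_j$ / 7 / + 6 / + 10 / = 23
$\bar x_j$ / 74 / 70 / 81 / 76
SS / 39626 / + 30090 / + 67240 / = 136956
$\bar x_j^2$ / 5476 / 4900 / 6561

Note that $\bar x$ is not a sum, but is $\frac{\sum x}{n} = \frac{1748}{23} = 76$. $SST = \sum x^2 - n\bar x^2 = 136956 - 23(76)^2 = 4108$.

$SSB = \sum n_j\bar x_j^2 - n\bar x^2 = 7(5476) + 6(4900) + 10(6561) - 23(76)^2 = 494$. ($SSW = SST - SSB = 3614$)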

Source / SS / DF / MS / $F$ / Table $F$ / $H_0$
Between (Methods) / 494 / 2 / 247 / 1.365 / ns / Column means equal
Within (Error) / 3614 / 20 / 181
Total / 4108 / 22

Because our computed $F$ is smaller than the table $F$, we cannot reject $H_0$.
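The sums of squares in the one-way ANOVA can be rebuilt from the three columns. This is a minimal sketch using the shortcut formulas $SST = \sum x^2 - n\bar x^2$ and $SSB = \sum n_j\bar x_j^2 - n\bar x^2$; names are illustrative.

```python
methods = {
    "video": [72, 86, 80, 91, 46, 68, 75],
    "audio": [73, 75, 60, 52, 84, 76],
    "classroom": [68, 83, 50, 91, 84, 77, 94, 81, 92, 90],
}

all_scores = [x for scores in methods.values() for x in scores]
n = len(all_scores)
grand_mean = sum(all_scores) / n

sst = sum(x * x for x in all_scores) - n * grand_mean ** 2
ssb = sum(len(s) * (sum(s) / len(s)) ** 2 for s in methods.values()) \
    - n * grand_mean ** 2
ssw = sst - ssb

df_between = len(methods) - 1
df_within = n - len(methods)
f_ratio = (ssb / df_between) / (ssw / df_within)
print(sst, ssb, ssw, round(f_ratio, 3))
```

The F ratio comes out marginally above the 1.365 in the table because the table rounds the within mean square to 181.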


c) We use the Lilliefors method because we are testing for the Normal distribution, we have a small sample and the population mean and variance are unknown. The $F_e$ column is the cumulative distribution computed from the Normal table. $z$ is $\frac{x - \bar x}{s}$, where $\bar x = 74$ and $s = 14.686$ were computed for you. $D$ is $|F_e - F_o|$.

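$x$ / $O$ / Cumulative $O$ / $F_o$ / $z$ / $F_e$ / $D$
46 / 1 / 1 / .14286 / -1.91 / .0281 / .1148
68 / 1 / 2 / .28571 / -0.41 / .3409 / .0552
72 / 1 / 3 / .42857 / -0.14 / .4443 / .0157
75 / 1 / 4 / .57142 / 0.07 / .5279 / .0435
80 / 1 / 5 / .71428 / 0.41 / .6591 / .0552
86 / 1 / 6 / .85714 / 0.82 / .7939 / .0632
91 / 1 / 7 / 1.00000 / 1.16 / .8770 / .1230
 / 7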

From the Lilliefors Table, the critical value for a 95% confidence level is .300. Since the largest number in $D$ is not above this value, we do not reject $H_0$.
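The Lilliefors comparison can be sketched in code. The version below follows the worksheet in comparing the Normal cumulative probability only with the observed cumulative proportion at each point, and it uses exact Normal probabilities from the standard library rather than a two-decimal table, so individual entries differ slightly from the worksheet.

```python
from statistics import NormalDist, mean, stdev

video = sorted([72, 86, 80, 91, 46, 68, 75])
n = len(video)
x_bar = mean(video)
s = stdev(video)  # sample standard deviation (n - 1 divisor)

d_values = []
for i, x in enumerate(video, start=1):
    f_obs = i / n                              # observed cumulative proportion
    f_exp = NormalDist().cdf((x - x_bar) / s)  # Normal cumulative probability
    d_values.append(abs(f_exp - f_obs))

d_max = max(d_values)
critical = 0.300  # value quoted in the text for this sample size
print(round(s, 3), round(d_max, 4), d_max > critical)
```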


7. (Watch it!) Three methods are used to train candidates for the FAA pilots exam. Scores for trainees are shown below classified by method.

Method
Video Cassette / Audio Cassette / Classroom
72 / 73 / 68
86 / 75 / 83
80 / 60 / 50
91 / 52 / 91
46 / 84 / 84
68 / 76 / 77
75 / / 94
 / / 81
 / / 92
 / / 90

a. Using a sign test, check the Classroom column to see if it has a median of 85. (4)

b. Repeat the test on the Classroom column using a more powerful method. (5)

c. Apply the Runs Test as follows: Write down the numbers in the Video Cassette and Classroom columns together in order. Underneath the numbers write down A if the number comes from the Video Cassette column and C if it comes from the Classroom column. You will have a sequence like AACACC….. . In case of a tie remove both tying numbers from your test. Do a runs test on the resulting sequence to see if the A’s and C’s appear randomly. Congratulations! You have just done a Wald-Wolfowitz Test for the equality of means in two (nonnormal) samples. If the sequence is random, the means are equal. (6)

Solution:


(corrected)

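$x$ / $d = x - 85$ / rank of $|d|$ / signed rank
68 / -17 / 9 / 9-
83 / -2 / 2 / 2-
50 / -35 / 10 / 10-
91 / 6 / 5 / 5
84 / -1 / 1 / 1-
77 / -8 / 7 / 7-
94 / 9 / 8 / 8
81 / -4 / 3 / 3-
92 / 7 / 6 / 6
90 / 5 / 4 / 4

$T^+ = 23$

$T^- = 32$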

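a) $H_0$: the median is 85. To do a sign test, note that there are 6 numbers below 85 and 4 above. Using the binomial table with $n = 10$ and $p = .5$, the p-value is $2P(x \ge 6) = 2(1 - .62305) = .7539$.

If $\alpha = .05$, this p-value is above the significance level and we do not reject $H_0$.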

b) To do a Wilcoxon Signed Rank Sum Test, rank the differences from 85 and put the sign of the difference next to these ranks. To check $T^+$ and $T^-$, note that $T^+ + T^- = 23 + 32 = 55 = \frac{n(n+1)}{2} = \frac{10(11)}{2}$. From the table the critical value is 8. Since both Ts are above this value, do not reject $H_0$.
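The sign counts and the Wilcoxon rank sums can be checked from the ten scores in the worksheet above. This is a sketch that assumes no score equals 85 and no two absolute differences tie (both hold for these data); names are illustrative.

```python
classroom = [68, 83, 50, 91, 84, 77, 94, 81, 92, 90]
median_null = 85

diffs = [x - median_null for x in classroom if x != median_null]
below = sum(1 for d in diffs if d < 0)
above = sum(1 for d in diffs if d > 0)

# Rank the absolute differences from smallest (rank 1) to largest.
ordered = sorted(abs(d) for d in diffs)
ranks = [ordered.index(abs(d)) + 1 for d in diffs]

t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
n = len(diffs)
print(below, above, t_plus, t_minus, n * (n + 1) // 2)
```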


c) The numbers written out in order are:

46 50 68 68 72 75 77 80 81 83 84 86 90 91 91 92 94
A C A C A A C A C C C A C A C C C

If we eliminate ties (68 and 91 each appear in both columns) we get:

46 50 72 75 77 80 81 83 84 86 90 92 94
A C A A C A C C C A C C C

The number of A's is $n_1 = 5$ and the number of C's is $n_2 = 8$ and there are $r = 8$ runs. If we look this up in the table entitled “Critical Values of r for the Runs Test,” we find that the upper critical value is 11 and the lower critical value is 3. Since 8 lies between these values we do not reject the null hypothesis of randomness. Our final conclusion is that the means of the populations from which the two samples come are equal.
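The run count can be reproduced mechanically. This sketch assumes that A marks the Video Cassette column and C the Classroom column, an assignment that yields the 8 runs reported above, and it drops every value that appears in both columns.

```python
video = [72, 86, 80, 91, 46, 68, 75]
classroom = [68, 83, 50, 91, 84, 77, 94, 81, 92, 90]

# Values that occur in both samples are ties and are removed from the test.
ties = set(video) & set(classroom)
labeled = sorted(
    [(v, "A") for v in video if v not in ties]
    + [(c, "C") for c in classroom if c not in ties]
)
sequence = "".join(label for _, label in labeled)

n_a = sequence.count("A")
n_c = sequence.count("C")
# A new run starts at every change of letter.
runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
print(sequence, n_a, n_c, runs)
```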
