The feedback that I get is that there are people who are absolutely sure that the final exam contains material we never covered. I also remember one very indignant woman who was absolutely sure that you couldn't compare two means unless one was a population mean. This document may help.

M. REVIEW (Assume that $\alpha = .05$ in all the following.) In each case, be sure to clearly state the null hypothesis and the location of the 'do not reject' or 'reject' region. Note that other methods that we have studied, but which are not demonstrated in this review, are noted in boldface.

1. Means

The following data is used in the sections on means, medians, variances and distributions. It relates to different methods of training people to pass an exam.

Method 1   Method 2   Method 3
74         78         68
88         80         83
82         65         50
93         57         91
55         89         84
70                    77
                      94
                      81
                      92

Note:

$n_1 = 6$, $\sum x_1 = 462$, $\sum x_1^2 = 36518$, $\bar{x}_1 = 77$, $s_1^2 = 188.8$; $n_2 = 5$, $\sum x_2 = 369$, $\sum x_2^2 = 27879$, $\bar{x}_2 = 73.8$, $s_2^2 = 161.7$; $n_3 = 9$, $\sum x_3 = 720$, $\sum x_3^2 = 59140$, $\bar{x}_3 = 80$, $s_3^2 = 192.5$. These are only given for convenience; you should be able to compute every one of these sums.

a. Are the means for these three methods the same?

A multiple test of means is usually an Analysis of Variance. The null hypothesis is $H_0: \mu_1 = \mu_2 = \mu_3$; the alternative is that not all three means are equal. For this to be valid, we must assume that the samples come from populations with an underlying Normal distribution and a common variance. Note that the data are not cross-classified, so a 2-way ANOVA is not applicable.

Because there are many examples available, the computation of SSB and SST is not shown here. You should be able to do that computation. When you are finished, you should have an ANOVA table with the numbers below.

Source / SS / DF / MS / F / sig. / conclusion
Between / 126 / 2 / 63 / 0.34 / ns / Column means equal
Within / 3131 / 17 / 184
Total / 3257 / 19

Explanation: This is one-way ANOVA because the columns are considered three independent random samples with no cross classification. Since the Sum of Squares (SS) column must add up, 3131 is found by subtracting 126 from 3257. Since $n = 6 + 5 + 9 = 20$, the total degrees of freedom are $n - 1 = 19$. Since there are 3 random samples or columns, the degrees of freedom for Between is 3 - 1 = 2.

Since the Degrees of Freedom (DF) column must add up, 17 = 19 - 2. The Mean Square (MS) column is found by dividing the SS column by the DF column: 63 is $\frac{126}{2}$ and 184 is $\frac{3131}{17}$. $F = \frac{MSB}{MSW} = \frac{63}{184} = 0.34$, and is compared with $F_{.05}^{(2,17)} = 3.59$ from the F table. We accept the null hypothesis if our computed $F$ is less than or equal to 3.59. Because our computed $F = 0.34$ is less than the table $F$, do not reject $H_0: \mu_1 = \mu_2 = \mu_3$.
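
If you want to check this arithmetic with software, a minimal sketch in Python is below. It uses scipy, which is not part of the course materials, and the variable names are mine.

```python
# Sketch: verify the one-way ANOVA F statistic (assumes scipy is installed).
from scipy import stats

method1 = [74, 88, 82, 93, 55, 70]
method2 = [78, 80, 65, 57, 89]
method3 = [68, 83, 50, 91, 84, 77, 94, 81, 92]

f_stat, p_value = stats.f_oneway(method1, method2, method3)
f_crit = stats.f.ppf(0.95, 2, 17)   # F(.05; 2, 17), about 3.59
print(f"F = {f_stat:.2f}, p = {p_value:.3f}, critical F = {f_crit:.2f}")
# F is about 0.34, below 3.59, so do not reject the hypothesis of equal means.
```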

b. Is the mean for method 1 above the mean for method 2?

The data that we can use is repeated here: Method 1: 74, 88, 82, 93, 55, 70 ($\bar{x}_1 = 77$, $s_1^2 = 188.8$); Method 2: 78, 80, 65, 57, 89 ($\bar{x}_2 = 73.8$, $s_2^2 = 161.7$).

Because the question does not include an equality, it must be an alternative hypothesis. Neither 77 nor 73.8 is a population mean, so these numbers cannot be in the null hypothesis, since a null hypothesis cannot include sample statistics. We thus conclude that our alternative hypothesis is $H_1: \mu_1 > \mu_2$, so that our null hypothesis must be the opposite, $H_0: \mu_1 \le \mu_2$. The method taught in class assumes that the data come from a Normal distribution and that these are two samples from populations with equal variances. We could do this problem using the test ratio method, the critical value method or a one-sided confidence interval. Only the test ratio will be done here. For the general formulas used, see the formula table.

$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{5(188.8) + 4(161.7)}{9} = 176.76$ and $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{176.76\left(\frac{1}{6} + \frac{1}{5}\right)} = 8.05$

Our hypotheses are $H_0: \mu_1 \le \mu_2$ and $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 - \mu_2 \le 0$ and $H_1: \mu_1 - \mu_2 > 0$. Since this is a 1-sided alternate hypothesis, you cannot use 2-sided confidence intervals or tests. Note that 77 and 73.8 cannot appear in the hypotheses because they are sample means, not population means.

Test Ratio: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{77 - 73.8}{8.05} = 0.40$. The 'do not reject' region is $t \le t_{.05}^{(9)} = 1.833$. Since our $t = 0.40$ is below 1.833, do not reject the null hypothesis.
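
A similar sketch, again assuming Python and scipy are available, checks the pooled-variance t test; the one-sided option requires a reasonably recent version of scipy.

```python
# Sketch: one-sided pooled-variance t test of H1: mu1 > mu2.
from scipy import stats

method1 = [74, 88, 82, 93, 55, 70]
method2 = [78, 80, 65, 57, 89]

t_stat, p_value = stats.ttest_ind(method1, method2, equal_var=True, alternative='greater')
t_crit = stats.t.ppf(0.95, df=9)   # t(.05; 9), about 1.833
print(f"t = {t_stat:.2f}, p = {p_value:.3f}, critical t = {t_crit:.3f}")
# t is about 0.40, below 1.833, so do not reject H0: mu1 <= mu2.
```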

c. Is the mean for method 3 above 78?

The data that we can use is repeated here: Method 3: 68, 83, 50, 91, 84, 77, 94, 81, 92 ($n_3 = 9$, $\bar{x}_3 = 80$, $s_3^2 = 192.5$, $s_3 = 13.87$).

Note that because the question involves an inequality, it is an alternative hypothesis: $H_1: \mu_3 > 78$, so that $H_0: \mu_3 \le 78$.

We use only a Test Ratio here, but remember that all such problems can be done using a critical value or a one-sided confidence interval. $t = \frac{\bar{x}_3 - 78}{s_3/\sqrt{n_3}} = \frac{80 - 78}{13.87/\sqrt{9}} = 0.43$. The 'do not reject' region is $t \le t_{.05}^{(8)} = 1.860$. Since our $t = 0.43$ is below 1.860, do not reject the null hypothesis.
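
The same check can be run for the one-sample test; again this is only an illustrative sketch using scipy, not part of the course method.

```python
# Sketch: one-sided t test of H1: mu3 > 78.
from scipy import stats

method3 = [68, 83, 50, 91, 84, 77, 94, 81, 92]

t_stat, p_value = stats.ttest_1samp(method3, popmean=78, alternative='greater')
t_crit = stats.t.ppf(0.95, df=8)   # t(.05; 8), about 1.860
print(f"t = {t_stat:.2f}, p = {p_value:.3f}, critical t = {t_crit:.3f}")
# t is about 0.43, below 1.860, so do not reject H0: mu3 <= 78.
```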

Remember, for comparing means or medians of 2 samples: if the parent distribution is Normal, use the methods above for means. If the parent distribution is not Normal, use the Wilcoxon Signed Rank test or the Wilcoxon-Mann-Whitney test. If the samples are independent, use the Wilcoxon-Mann-Whitney test or a method that is appropriate for comparing means of independent samples. If the data is cross-classified, use a method for means of paired data or the Wilcoxon Signed Rank test.

2. Medians

a. Are the medians for these three methods the same?

Since this refers to medians instead of means, and if we assume that the underlying distribution is not Normal, we use the nonparametric (rank test) analogue to ANOVA, the Kruskal-Wallis Test. Note that the data are not cross-classified, so the Friedman Test is not applicable.

The null hypothesis is $H_0$: the columns come from the same distribution, or $H_0: \eta_1 = \eta_2 = \eta_3$ (the medians are equal).

The data are repeated in order. The second number in each column is the rank of the number among the 20 numbers in the three groups.

Method 1      Method 2      Method 3
55   2        57   3        50   1
70   6        65   4        68   5
74   7        78   9        77   8
82  12        80  10        81  11
88  15        89  16        83  13
93  19                      84  14
                            91  17
                            92  18
                            94  20
Rank sums:  61            42            107

Sums of ranks are given above. To check the ranking, note that the sum of the three rank sums is 61 + 42 + 107 = 210, that the total number of items is 6 + 5 + 9 = 20, and that the sum of the first 20 positive integers is $\frac{20(21)}{2} = 210$. Now compute the Kruskal-Wallis statistic

$H = \frac{12}{n(n+1)}\left[\sum \frac{SR_i^2}{n_i}\right] - 3(n+1) = \frac{12}{20(21)}\left[\frac{61^2}{6} + \frac{42^2}{5} + \frac{107^2}{9}\right] - 3(21) = \frac{12}{420}(2245.08) - 63 = 64.15 - 63 = 1.15$.

If we try to look up this result in the (6, 5, 9) section of the Kruskal-Wallis table (Table 9), we find that the problem is too large for the table. Thus we must use the chi-squared table with 2 degrees of freedom. The 'do not reject' region is $H \le \chi^2_{.05(2)} = 5.9915$. Since $H = 1.15$ is below 5.9915, do not reject $H_0$.
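
For a software check, the following sketch (scipy again, outside the course materials) reproduces $H$; scipy applies a tie correction, but there are no ties here, so it matches the hand computation.

```python
# Sketch: Kruskal-Wallis test comparing the three columns.
from scipy import stats

method1 = [74, 88, 82, 93, 55, 70]
method2 = [78, 80, 65, 57, 89]
method3 = [68, 83, 50, 91, 84, 77, 94, 81, 92]

h_stat, p_value = stats.kruskal(method1, method2, method3)
chi2_crit = stats.chi2.ppf(0.95, df=2)   # chi-squared(.05; 2), about 5.99
print(f"H = {h_stat:.2f}, p = {p_value:.3f}, critical value = {chi2_crit:.2f}")
# H is about 1.15, below 5.99, so do not reject the null hypothesis.
```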

As we said many times: if you are comparing means or medians of more than two samples, use ANOVA when the parent distribution is Normal, and Friedman or Kruskal-Wallis when it is not. If the samples are independent random samples, use 1-way ANOVA or Kruskal-Wallis. If they are cross-classified, use Friedman or 2-way ANOVA.

b. Are the medians for method 1 and method 2 the same?

The null hypothesis is $H_0$: the columns come from the same distribution, or $H_0: \eta_1 = \eta_2$ (the medians are equal).

The data are repeated in order. The second number in each column is the rank of the number among the 11 numbers in the two groups.

Method 1      Method 2
55   1        57   2
70   4        65   3
74   5        78   6
82   8        80   7
88   9        89  10
93  11
Rank sums:  38            28

Since this refers to medians instead of means and if we assume that the underlying distribution is not Normal, we use the nonparametric (rank test) analogue to comparison of two sample means of independent samples, the Wilcoxon-Mann-Whitney Test. Note that data is not cross-classified so that the Wilcoxon Signed Rank Test is not applicable.

We get $SR_1 = 38$ and $SR_2 = 28$. Check: since the total amount of data is 6 + 5 = 11, 38 + 28 must equal $\frac{11(12)}{2} = 66$. They do.

For a 5% two-tailed test with sample sizes of 5 and 6, Table 6 says that the critical values are 20 and 40. We accept the null hypothesis in a 2-sided test if the smaller of these two rank sums lies between the critical values. The lower of the two rank sums, 28, is between these values, so do not reject $H_0$.
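
A sketch of the same comparison in Python follows; note that scipy reports the Mann-Whitney U statistic rather than a rank sum, so the decision is read from the p-value.

```python
# Sketch: two-sided Wilcoxon-Mann-Whitney test on Methods 1 and 2.
from scipy import stats

method1 = [74, 88, 82, 93, 55, 70]
method2 = [78, 80, 65, 57, 89]

u_stat, p_value = stats.mannwhitneyu(method1, method2, alternative='two-sided')
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
# The p-value is well above .05, so do not reject the hypothesis that the
# two columns come from the same distribution.
```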

c. Is the median for method 3 above 78?

There are two possible methods for doing this problem, the Sign Test and the Wilcoxon Signed Rank Test for paired data. In either case the hypotheses are $H_0: \eta \le 78$ and $H_1: \eta > 78$, where $\eta$ is the median.

The Wilcoxon Signed Rank Test is the more powerful of the two methods and is shown here first. 'Difference' below is the value given less the alleged median of 78. The 9 numbers are ranked according to absolute size but are accompanied by the sign of the difference.

x   difference   rank   sign

68 -10 5 -

83 5 3 +

50 -28 9 -

91 13 6 +

84 6 4 +

77 -1 1 -

94 16 8 +

81 3 2 +

92 14 7 +

We get $T^+ = 30$ and $T^- = 15$. Check: 30 + 15 must equal $\frac{9(10)}{2} = 45$. They do.

The lower of these two values is 15. According to the Wilcoxon Signed Rank Test table, the critical value for a 1-sided 5% test with $n = 9$ is 6; that is, we reject the null hypothesis only if the smaller of the two rank sums is 6 or below. It isn't, so we do not reject the null hypothesis.

To use the sign test for this problem, let $p$ be the population proportion above 78 and note that if the median is above 78, according to the outline, $p$ must be above .5, so $H_0: p \le .5$ and $H_1: p > .5$. Since 6 items are above 78, $x = 6$, and according to the binomial table for $n = 9$ and $p = .5$, $P(x \ge 6) = .2539$ is above 5%, so we do not reject the null hypothesis.
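
Both of these tests can be checked with the sketch below (scipy, illustrative only). Note that scipy's reported Wilcoxon statistic may not equal the smaller rank sum used above; the p-value gives the decision.

```python
# Sketch: Wilcoxon Signed Rank Test and Sign Test of H1: the median of Method 3 is above 78.
from scipy import stats

method3 = [68, 83, 50, 91, 84, 77, 94, 81, 92]
differences = [x - 78 for x in method3]

# Wilcoxon signed rank test (one-sided).
w_stat, w_p = stats.wilcoxon(differences, alternative='greater')
print(f"Wilcoxon: statistic = {w_stat:.0f}, p = {w_p:.3f}")

# Sign test: 6 of the 9 values are above 78; P(x >= 6) for a binomial with n = 9, p = .5.
above = sum(d > 0 for d in differences)
p_sign = stats.binom.sf(above - 1, 9, 0.5)   # about .2539
print(f"Sign test: x = {above}, P(x >= {above}) = {p_sign:.4f}")
# Both p-values are above .05, so neither test rejects the null hypothesis.
```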

3. Variances

a. Are the variances for method 1 and method 2 the same?

The numbers following are relevant: $n_1 = 6$, $s_1^2 = 188.8$; $n_2 = 5$, $s_2^2 = 161.7$.

Our hypotheses are $H_0: \sigma_1^2 = \sigma_2^2$ and $H_1: \sigma_1^2 \ne \sigma_2^2$.

Since this is a 2-sided test, we should compute $\frac{s_1^2}{s_2^2}$ and compare it against $F_{.025}^{(5,4)} = 9.36$, and compute $\frac{s_2^2}{s_1^2}$ and compare it against $F_{.025}^{(4,5)}$. We do not reject the null hypothesis if both ratios are below their table $F$ values. Since $\frac{s_1^2}{s_2^2} = \frac{188.8}{161.7} = 1.17$ is below 9.36, and $\frac{s_2^2}{s_1^2} = 0.86$ is below 1, while a table $F$ value cannot be below 1, we cannot reject $H_0$.
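
There is no single built-in scipy routine for this F test, but the ratio and table value can be checked directly, as in this sketch (numpy and scipy assumed available).

```python
# Sketch: two-sided F test for equality of two variances, computed directly.
import numpy as np
from scipy import stats

method1 = [74, 88, 82, 93, 55, 70]
method2 = [78, 80, 65, 57, 89]

s1_sq = np.var(method1, ddof=1)     # about 188.8
s2_sq = np.var(method2, ddof=1)     # about 161.7
f_ratio = s1_sq / s2_sq             # about 1.17
f_crit = stats.f.ppf(0.975, 5, 4)   # F(.025; 5, 4), about 9.36
print(f"s1^2/s2^2 = {f_ratio:.2f}, critical F = {f_crit:.2f}")
# 1.17 is below 9.36 (and the reversed ratio is below 1), so do not reject H0.
```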

b. Is the variance for method 3 above 169?

The numbers following are relevant: $n_3 = 9$, $s_3^2 = 192.5$.

Our hypotheses are $H_0: \sigma_3^2 \le 169$ and $H_1: \sigma_3^2 > 169$. Since a null hypothesis must contain an equality, the question must be an alternative hypothesis. If we use a test ratio, the outline says to use $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{8(192.5)}{169} = 9.11$. We do not reject the null hypothesis if this ratio is below $\chi^2_{.05(8)} = 15.5073$.

Since 9.11 is below 15.5073, we do not reject the null hypothesis.
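
The chi-squared ratio and its table value can also be checked directly, as in this illustrative sketch.

```python
# Sketch: chi-squared test of H1: variance of Method 3 > 169.
import numpy as np
from scipy import stats

method3 = [68, 83, 50, 91, 84, 77, 94, 81, 92]

n = len(method3)
s_sq = np.var(method3, ddof=1)               # about 192.5
chi2_stat = (n - 1) * s_sq / 169             # about 9.11
chi2_crit = stats.chi2.ppf(0.95, df=n - 1)   # chi-squared(.05; 8), about 15.51
print(f"chi2 = {chi2_stat:.2f}, critical value = {chi2_crit:.2f}")
# 9.11 is below 15.51, so do not reject H0: the variance is at most 169.
```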

4. Distributions - Goodness of Fit

The most commonly used method for checking for a particular distribution is the Chi-square method. However, if the sample is small and the parameters are known, use a Kolmogorov-Smirnov Method. If you wish to test for the Normal Distribution and the parameters are unknown, use the Lilliefors Method.

a. Does method 3 have a Normal distribution?

In this case, the population mean and variance are unknown, so we must use the Lilliefors method. We have previously found that the sample mean and variance are $\bar{x} = 80$ and $s^2 = 192.5$, so that $s = 13.87$. Our hypotheses are thus $H_0$: the distribution is Normal and $H_1$: the distribution is not Normal.

The values of $x$ must be in order. The $F_e$ column is the cumulative distribution computed from the Normal table, using $z = \frac{x - \bar{x}}{s}$. $F_O$ is the cumulative $O$ divided by $n$, and $D = |F_e - F_O|$. For example, for the first line, $z = \frac{50 - 80}{13.87} = -2.16$, $F_e = .0154$ and $D = |.1111 - .0154| = .0957$.

x     O   cum O   F_O      z       F_e     D
50    1   1       .1111   -2.16    .0154   .0957
68    1   2       .2222   -0.87    .1922   .0300
77    1   3       .3333   -0.22    .4129   .0796
81    1   4       .4444    0.07    .5279   .0835
83    1   5       .5556    0.22    .5871   .0315
84    1   6       .6667    0.28    .6103   .0564
91    1   7       .7778    0.79    .7852   .0076
92    1   8       .8889    0.87    .8078   .0811
94    1   9      1.0000    1.01    .8438   .1562
Total 9

From the Lilliefors Table, the critical value for a 95% confidence level with $n = 9$ is .271. Since the largest number in the $D$ column, .1562, is below this value, we do not reject $H_0$.
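
The statistic $D$ can be recomputed with a few lines of Python, as in the sketch below. This simplified $D$ uses only the upper step of the empirical cumulative distribution, matching the hand method above; other software may also check the lower step.

```python
# Sketch: recompute D = max|F_e - F_O| for the Lilliefors test.
import numpy as np
from scipy import stats

method3 = np.sort([68, 83, 50, 91, 84, 77, 94, 81, 92])
n = len(method3)

xbar = method3.mean()                        # 80
s = method3.std(ddof=1)                      # about 13.87
f_o = np.arange(1, n + 1) / n                # observed cumulative proportions
f_e = stats.norm.cdf((method3 - xbar) / s)   # expected cumulative proportions
d = np.max(np.abs(f_e - f_o))
print(f"D = {d:.4f}")
# D is about .156, below the Lilliefors critical value of .271 for n = 9,
# so do not reject the hypothesis of normality.
```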

b. Does method 3 have a Normal distribution with a mean of 85 and a standard deviation of 15?

Because the mean and variance are known and the sample is small, the only test that is practical is the Kolmogorov-Smirnov Test.

The values of $x$ must be in order. The $F_e$ column is the cumulative distribution computed from the Normal table, using $z = \frac{x - 85}{15}$. $F_O$ is the cumulative $O$ divided by $n$, and $D = |F_e - F_O|$. For example, for the first line, $z = \frac{50 - 85}{15} = -2.33$, $F_e = .0099$ and $D = |.1111 - .0099| = .1012$.

x     O   cum O   F_O      z       F_e     D
50    1   1       .1111   -2.33    .0099   .1012
68    1   2       .2222   -1.13    .1292   .0930
77    1   3       .3333   -0.53    .2981   .0796
81    1   4       .4444   -0.27    .3936   .0508
83    1   5       .5556   -0.13    .4483   .1073
84    1   6       .6667   -0.07    .4721   .1946
91    1   7       .7778    0.40    .6554   .1224
92    1   8       .8889    0.46    .6772   .2117
94    1   9      1.0000    0.60    .7257   .2743
Total 9

From the Kolmogorov-Smirnov Table, the critical value for a 95% confidence level with $n = 9$ is .430. Since the largest number in the $D$ column, .2743, is not above this value, we do not reject $H_0$.
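
scipy has a direct Kolmogorov-Smirnov routine, used in the sketch below; it also checks the lower step of the empirical cumulative distribution, so its $D$ can differ slightly from a hand table, though not here.

```python
# Sketch: Kolmogorov-Smirnov test against a Normal distribution with mean 85 and sd 15.
from scipy import stats

method3 = [68, 83, 50, 91, 84, 77, 94, 81, 92]

d_stat, p_value = stats.kstest(method3, 'norm', args=(85, 15))
print(f"D = {d_stat:.4f}, p = {p_value:.3f}")
# D is about .274, below the critical value of roughly .43 for n = 9, so do not reject H0.
```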

5. Proportions

The following data has to do with proportions of a product that need repairs in the first three years of ownership:

Manufacturer 1: 45 items out of a sample of 1000 needed repairs; Manufacturer 2: 27 out of 1000; Manufacturer 3: 18 out of 500.

1. Are the proportions that need repairs the same for all three groups?

A test of equality of more than two proportions is a Chi-Square Test. $H_0: p_1 = p_2 = p_3$ (homogeneity) and $H_1$: not all three proportions are equal.

O / Manuf. 1 / Manuf. 2 / Manuf. 3 / Total / p
Repaired / 45 / 27 / 18 / 90 / .036
Not repaired / 955 / 973 / 482 / 2410 / .964
Total / 1000 / 1000 / 500 / 2500 / 1.000

We get the items in $E$ by multiplying the column totals by the proportions in the last column. For example, in the upper left-hand corner, 36.00 = .036(1000). The last column was gotten from the total column by asking what proportion of the whole group was repaired or not repaired. The answer for the repaired group was $\frac{90}{2500} = .036$.

E / Manuf. 1 / Manuf. 2 / Manuf. 3 / Total / p
Repaired / 36.00 / 36.00 / 18.00 / 90 / .036
Not repaired / 964.00 / 964.00 / 482.00 / 2410 / .964
Total / 1000.00 / 1000.00 / 500.00 / 2500 / 1.000

O / E / O²/E
45 / 36 / 56.250
27 / 36 / 20.250
18 / 18 / 18.000
955 / 964 / 946.084
973 / 964 / 982.084
482 / 482 / 482.000
2500 / 2500 / 2504.668

We now compute $\chi^2 = \sum \frac{O^2}{E} - n = 2504.668 - 2500 = 4.668$. Since our degrees of freedom are $(r-1)(c-1) = (2-1)(3-1) = 2$ ($r$ is the number of rows and $c$ is the number of columns), we test this against $\chi^2_{.05(2)} = 5.9915$. Since our computed $\chi^2$ is less than the $\chi^2$ from the table, we accept $H_0$.
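
A sketch of the same test with scipy follows; scipy computes $\sum\frac{(O-E)^2}{E}$, which is algebraically the same as $\sum\frac{O^2}{E} - n$.

```python
# Sketch: chi-squared test of homogeneity on the observed repair counts.
from scipy import stats

observed = [[45, 27, 18],      # repaired
            [955, 973, 482]]   # not repaired

chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed)
chi2_crit = stats.chi2.ppf(0.95, df=dof)   # chi-squared(.05; 2), about 5.99
print(f"chi2 = {chi2_stat:.3f}, df = {dof}, p = {p_value:.3f}, critical value = {chi2_crit:.2f}")
# chi2 is about 4.67, below 5.99, so do not reject H0: equal proportions.
```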

2. Is the proportion that needs repairs greater for manufacturer 1 than manufacturer 2?

This is a 2-sample test of proportions. Use the test ratio $z = \frac{(p_1 - p_2) - 0}{\sqrt{p_0 q_0\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, or a critical value for $p_1 - p_2$, or a one-sided confidence interval for $p_1 - p_2$, where $p_1 = \frac{45}{1000} = .045$, $p_2 = \frac{27}{1000} = .027$, $p_0 = \frac{45 + 27}{1000 + 1000} = .036$, $q_0 = 1 - p_0 = .964$, $n_1 = 1000$ and $n_2 = 1000$.

$H_0: p_1 \le p_2$ and $H_1: p_1 > p_2$, or $H_0: p_1 - p_2 \le 0$ and $H_1: p_1 - p_2 > 0$.

$z = \frac{.045 - .027}{\sqrt{.036(.964)\left(\frac{1}{1000} + \frac{1}{1000}\right)}} = \frac{.018}{.00833} = 2.16$. Make a diagram. This is a one-sided test, so shade the 'reject' region, which is the area above 1.645. Since 2.16 is above 1.645, reject $H_0$.
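
The z value can be checked by coding the formula above directly, as in this sketch.

```python
# Sketch: one-sided two-proportion z test of H1: p1 > p2, computed from the formula.
from math import sqrt
from scipy import stats

x1, n1 = 45, 1000
x2, n2 = 27, 1000

p1, p2 = x1 / n1, x2 / n2
p0 = (x1 + x2) / (n1 + n2)                                # pooled proportion, .036
z = (p1 - p2) / sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
z_crit = stats.norm.ppf(0.95)                             # 1.645
print(f"z = {z:.2f}, critical z = {z_crit:.3f}")
# z is about 2.16, above 1.645, so reject H0: p1 <= p2.
```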

3. Is the proportion that need repairs for manufacturer 1 greater than 1%?

This is a one-sample test of a proportion.

Use $z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}}$, where $p_0 = .01$, $q_0 = .99$, $n = 1000$ and $\hat{p} = \frac{45}{1000} = .045$. Or use a critical value or confidence interval. The hypotheses are $H_0: p \le .01$ and $H_1: p > .01$.

$z = \frac{.045 - .01}{\sqrt{\frac{.01(.99)}{1000}}} = \frac{.035}{.00315} = 11.1$. Make a diagram. This is a one-sided test, so shade the 'reject' region, which is the area above 1.645. Since 11.1 is above 1.645, reject $H_0$.
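
The same kind of direct check works for the one-sample proportion test.

```python
# Sketch: one-sided one-sample z test of H1: p > .01 for Manufacturer 1.
from math import sqrt
from scipy import stats

x, n = 45, 1000
p_hat = x / n                                   # .045
p0 = 0.01
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z_crit = stats.norm.ppf(0.95)                   # 1.645
print(f"z = {z:.2f}, critical z = {z_crit:.3f}")
# z is about 11.1, far above 1.645, so reject H0: p <= .01.
```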

In the case of two samples, the following table might help.

Comparing 2 Samples / Paired Samples / Independent Samples
Location - Normal distribution. Compare means. / Method D4 / Methods D1-D3
D1 Large samples only.
D2 ‘Small’ samples, equal variances assumed.
D3 ‘Small’ samples, equal variances not assumed.
Location - Distribution not Normal. Compare medians. / Method D5b (Wilcoxon Signed Rank Test) / Method D5a (Wilcoxon-Mann-Whitney)
Proportions / Method D6b (McNemar Test) / Method D6a
Variability - Normal distribution. Compare variances. / Method D7

We could add the following.

Comparing many Samples / Cross-classified Samples / Independent Samples
Location - Normal distribution, similar variances. Compare means. / 2-way ANOVA / One-way ANOVA
Location - Distribution not Normal with similar variances. Compare medians. / Friedman Test / Kruskal-Wallis Test
Proportions / Chi-squared Test / Chi-squared Test
Variability. Compare variances. / Bartlett Test (Normal Distribution) or Levene Test

A flowchart relating all the methods discussed in this section is available at 252method.

6. Regression and Correlation

Since these were covered in the last unit, they are not reviewed here. You should be especially aware that coefficients are computed differently in simple and multiple regression and that you should be able to do t-tests of significance on regression coefficients.

7. Significance

This is a topic that was covered under hypothesis tests. Probably the first reference I made to this was even earlier when I said that a parameter is significant if it is not zero. I later said that a null hypothesis often says that a parameter or a difference between parameters is insignificant. If a result is significant we reject the null hypothesis.

To put this more generally, a result is (statistically) significant if it is larger or smaller than would be expected by chance alone. Thus in the case of a regression coefficient the measure of significance could be the p-value, which tells us the probability of getting our actual result or something more extreme if we assume that the population value of the coefficient is zero. If the p-value is small (below our significance level), then it is unlikely that our assumption about the coefficient is correct and we say that the coefficient is significant (or significantly different from zero). Of course, the various hypothesis tests that we have discussed here are also often ways of proving significance.

© 2004 Roger Even Bove
