252chisq 2/29/08
E. CHI-SQUARED AND RELATED TESTS.
These tests are generalizations of the one-sample and two-sample tests of proportions. A test of Goodness of Fit is necessary when a single sample is to be divided into more than two categories. A Test of Homogeneity is needed when one wants to compare more than one sample. A test of Independence is used to see if two variables or categorizations are related, but is formally identical to a test of homogeneity.
1. Tests of Homogeneity and Independence
Two possible null hypotheses apply here. For a test of homogeneity, $H_0$ says that the proportions in each category are the same for every population sampled; for a test of independence, $H_0$ says that the two classifications are independent. The observed data are indicated by $O$, the expected data by $E$.
The numbers are obviously identical in these two cases. In each case the expected values are computed the same way. There are $r$ rows, $c$ columns and $rc$ cells, and $n = \sum O$. Each cell gets $E_{ij} = \frac{(\text{row } i \text{ total})(\text{column } j \text{ total})}{n}$. For example, for the upper left corner the expected value is $E_{11} = \frac{(\text{row 1 total})(\text{column 1 total})}{n}$.
The formula for the chi-squared statistic is $\chi^2 = \sum \frac{(O-E)^2}{E}$ or $\chi^2 = \sum \frac{O^2}{E} - n$. The first of these two formulas is shown below. For an explanation of the equivalence of these two formulas, the reason why the degrees of freedom are as given below, and to relate the chi-squared test to a test of proportions, see 252chisqnote.
    E       O     E−O      (E−O)²   (E−O)²/E
 10.0000   10    0.0000    0.0000   0.00000
  8.0000    5    3.0000    9.0000   1.12500
 12.0000   15   -3.0000    9.0000   0.75000
 13.3333   15   -1.6667    2.7778   0.20833
 10.6667   10    0.6667    0.4445   0.04167
 16.0000   15    1.0000    1.0000   0.06250
 13.3333   15   -1.6667    2.7779   0.20834
 10.6667   15   -4.3333   18.7775   1.76038
 16.0000   10    6.0000   36.0000   2.25000
 13.3333   10    3.3333   11.1109   0.83332
 10.6667   10    0.6667    0.4445   0.04167
 16.0000   20   -4.0000   16.0000   1.00000
150.0000  150    0.0000             8.28121
The degrees of freedom for this application are $(r-1)(c-1) = (4-1)(3-1) = 6$.
The most common test is a one-tailed test, on the grounds that the larger the discrepancy that occurs between $O$ and $E$, the larger $\chi^2$ will be. If our significance level is 5%, compare the computed value to $\chi^2_{.05(6)} = 12.5916$. Since our value of this sum (8.28121) is less than
the table chi-squared, do not reject the null hypothesis.
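As a check, the same statistic can be computed by machine. Below is a minimal sketch in Python, assuming the twelve observed counts above are arranged in the 4-row, 3-column layout implied by the row and column totals; scipy's chi2_contingency reproduces the expected values, the statistic of about 8.28 and the 6 degrees of freedom.

import numpy as np
from scipy.stats import chi2_contingency

# Observed counts, rows and columns as in the worked example above.
O = np.array([[10,  5, 15],
              [15, 10, 15],
              [15, 15, 10],
              [10, 10, 20]])
chi2_stat, pval, df, E = chi2_contingency(O, correction=False)
print(chi2_stat, df, pval)   # about 8.2812 with df = 6; p > .05, do not reject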
Note: Rule of thumb for $E$.
All values of $E$ should be above 5, and we generally combine cells to make this so. However, a number below 5 is acceptable in $E$ if (i) our computed $\chi^2$ turns out to be less than the table value, or (ii) the particular value of $E$ makes a very small contribution to $\chi^2$, relative to the value of the total.
Note: Marascuilo Procedure.
The Marascuilo procedure says that, for $2 \times c$ tests, if (i) equality is rejected and
(ii) $|p_a - p_b| > \sqrt{\chi^2_{\alpha(c-1)}}\, s_{p_a - p_b}$, where $a$ and $b$ represent 2 groups, the chi-squared has $c-1$ degrees of freedom and the standard deviation is $s_{p_a - p_b} = \sqrt{\frac{p_a(1-p_a)}{n_a} + \frac{p_b(1-p_b)}{n_b}}$, you can say that you have a significant difference between $p_a$ and $p_b$. This is equivalent to using a confidence interval of $p_a - p_b \pm \sqrt{\chi^2_{\alpha(c-1)}} \sqrt{\frac{p_a(1-p_a)}{n_a} + \frac{p_b(1-p_b)}{n_b}}$.
Example: Pelosi and Sandifer give the data below for satisfaction with local phone service classified by type of company providing the service. 1) Is the proportion of people who rate the phone service as excellent independent of the type of company? 2) If it is not, test for a difference in the proportion who rate their service as excellent against the best-rated provider. Remember that this is a test of equality of proportions.
Service 1 is a long distance company, Service 2 is a local phone company, Service 3 is a power company, Service 4 is CATV (cable) and Service 5 is Cellular. $p_1$ is thus the proportion of long distance company customers that rate their service as excellent.
$H_0: p_1 = p_2 = p_3 = p_4 = p_5$   $H_1$: Not all $p$s are equal.
Solution: Set up the $O$ table. To get the number that rate service as excellent for long distance, note that $1658(.1592) = 263.95$. But this must be a whole number, so round it to 264. The number that do not rate it as excellent is $1658 - 264 = 1394$. This gives us our first column. $\frac{p(1-p)}{n}$ is also computed for each service, for use later.
             Long Dist   Local Ph    Power      CATV     Cellular   Total   Proportion
Excellent       264         444       131        215       198       1252     .2296
Not            1394        1318       485        431       572       4200     .7704
Sum            1658        1762       616        646       770       5452    1.0000

Proportion
excellent      .1592       .2520     .2127      .3328     .2571
p(1-p)/n     .0000807    .0001070  .0002718   .0003437  .0002481
Note that in addition to computing the overall proportion of excellent and not excellent service (.2296 and .7704), the 'proportion excellent' has been computed for each type of service, as well as the variance $\frac{p(1-p)}{n}$ used in the confidence interval formula. If we apply the proportions in each row to the column sums we get the following expected values.
             Long Dist   Local Ph    Power      CATV     Cellular   Total   Proportion
Excellent     380.68      404.56    141.43     148.32     176.79     1252     .2296
Not          1277.32     1357.44    474.57     497.68     593.21     4200     .7704
Sum            1658        1762       616        646       770       5452    1.0000
The chi-squared test follows.
Row      E         O       E−O       (E−O)²   (E−O)²/E
 1     380.68     264    116.677    13613.5   35.7612
 2    1277.32    1394   -116.677    13613.5   10.6578
 3     404.56     444    -39.445     1555.9    3.8459
 4    1357.44    1318     39.445     1555.9    1.1462
 5     141.43     131     10.434      108.9    0.7697
 6     474.57     485    -10.434      108.9    0.2294
 7     148.32     215    -66.678     4446.0   29.9755
 8     497.68     431     66.678     4446.0    8.9335
 9     176.79     198    -21.208      449.8    2.5441
10     593.21     572     21.208      449.8    0.7582
      5452.00    5452      0.000              94.622
The degrees of freedom are $(r-1)(c-1) = (2-1)(5-1) = 4$ and $\chi^2_{.05(4)} = 9.4877 < 94.622$, so we reject the null hypothesis and say that there is a difference between the proportions that rate their service as excellent. Since the highest proportion satisfied was with CATV, we compare the other proportions with the proportion calculated for CATV ($p_4 = .3328$) using the confidence interval formula above, with multiplier $\sqrt{\chi^2_{.05(4)}} = \sqrt{9.4877} = 3.08$.
Long distance: $p_4 - p_1 = .3328 - .1592 = .1736 \pm 3.08\sqrt{.0003437 + .0000807} = .1736 \pm .0635$
Local Phone: $p_4 - p_2 = .3328 - .2520 = .0808 \pm 3.08\sqrt{.0003437 + .0001070} = .0808 \pm .0654$
Power: $p_4 - p_3 = .3328 - .2127 = .1201 \pm 3.08\sqrt{.0003437 + .0002718} = .1201 \pm .0764$
Cellular: $p_4 - p_5 = .3328 - .2571 = .0757 \pm 3.08\sqrt{.0003437 + .0002481} = .0757 \pm .0749$
Notice that the absolute size of the error term is always smaller than the absolute size of the difference in proportions, so we can say that all of these differences are significant. Though I have not checked it, I doubt that, if we compared all the other proportions with the proportion saying cellular service is excellent, we would get such strong results.
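The whole example can be verified by machine. The sketch below assumes the counts in the O table are exact; scipy supplies both the chi-squared test and the critical value used in the Marascuilo comparisons against CATV.

import numpy as np
from scipy.stats import chi2, chi2_contingency

excellent = np.array([264, 444, 131, 215, 198])    # Long Dist ... Cellular
not_exc   = np.array([1394, 1318, 485, 431, 572])
n = excellent + not_exc
p = excellent / n

chi2_stat, pval, df, E = chi2_contingency(np.vstack([excellent, not_exc]), correction=False)
print(chi2_stat, df)                 # about 94.6 with 4 degrees of freedom

crit = np.sqrt(chi2.ppf(0.95, df))   # sqrt(9.4877) = 3.08
for i in (0, 1, 2, 4):               # compare each service with CATV (index 3)
    s = np.sqrt(p[3]*(1 - p[3])/n[3] + p[i]*(1 - p[i])/n[i])
    print(i + 1, round(p[3] - p[i], 4), round(crit*s, 4))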
2. Tests of Goodness of Fit
a. Uniform Distribution
Let us pool the data above, that is, treat it all as if it were one sample, and ask if it is uniformly distributed.
  E     O    E−O   (E−O)²   (E−O)²/E
 50    50     0       0        0
 50    40    10     100        2
 50    60   -10     100        2
150   150     0                4
Since there are 3 numbers here, there are 2 degrees of freedom. Since 4 is less than $\chi^2_{.05(2)} = 5.9915$, we cannot reject the null hypothesis. An easier way to do this is to compute $\chi^2 = \sum \frac{O^2}{E} - n$. Remember $n = \sum O = \sum E$.
  E     O    O²/E
 50    50     50
 50    40     32
 50    60     72
150   150    154
            -150
               4
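A quick sketch in Python confirms that the two formulas agree on the pooled data:

import numpy as np

O = np.array([50, 40, 60])
E = np.array([50.0, 50.0, 50.0])
print(((O - E)**2 / E).sum())        # sum of (O-E)^2/E = 4.0
print((O**2 / E).sum() - O.sum())    # sum of O^2/E - n = 154 - 150 = 4.0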
For a combined chi-squared test of both uniformity and homogeneity, see 252chisqx1.
b. Poisson Distribution
Example:
I believe that there is almost a daily accident on my corner. To make this into a testable hypothesis, let us say that I believe that the distribution is Poisson with a parameter of 0.8 and that I observe the numbers of accidents shown below over 200 days. For example, there are 100 days with no accidents, 60 days with 1, etc.
To get $E$, I look up frequencies $f(x)$ on the Poisson table and multiply by $n = 200$, using the formula $E = n f(x)$. Unfortunately, I cannot work with $E$ as it appears here; I must have each $E$ at least 5. To fix the problem, I add the smallest cells together to increase $E$ to 5 or more.

 x      O     f(x)     E = nf(x)
 0     100   .4493      89.86
 1      60   .3595      71.90
 2      30   .1438      28.76
 3       6   .0383       7.66
 4       0   .0077       1.54 (<5)
 5       4   .0012       0.24 (<5)
 6       0   .0002       0.04 (<5)
 7+      0   .0000       0.00 (<5)
       200  1.0000     200.00
Since I did not estimate the mean of 0.8 from the data, I have $4 - 1 = 3$ degrees of freedom.

 x      O      E       O²/E
 0     100   89.86    111.28
 1      60   71.90     50.07
 2      30   28.76     31.29
 3+     10    9.48     10.56
       200  200.00    203.20

$\chi^2 = \sum \frac{O^2}{E} - n = 203.20 - 200.00 = 3.20 < \chi^2_{.05(3)} = 7.8147$, so I do not reject the null hypothesis.
But what if my hypotheses are simply $H_0$: Poisson and $H_1$: not Poisson? Then I would have to estimate the mean from the data. Looking at the $x$ and $O$ columns I calculate $\bar{x} = \frac{\sum xO}{n} = \frac{0(100) + 1(60) + 2(30) + 3(6) + 5(4)}{200} = \frac{158}{200} = 0.79$. I would still use the Poisson distribution with a parameter of 0.8 unless I had a computer handy to compute it with a parameter of 0.79, but my degrees of freedom are now $3 - 1 = 2$, because I used the data to estimate a parameter.
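The Poisson fit above is easy to reproduce. This is a sketch assuming the cells have already been combined into 0, 1, 2 and 3-or-more accidents; scipy supplies the Poisson probabilities and the critical value.

import numpy as np
from scipy.stats import poisson, chi2

n = 200
O = np.array([100, 60, 30, 10])    # counts for x = 0, 1, 2, 3+
f = poisson.pmf([0, 1, 2], 0.8)
f = np.append(f, 1 - f.sum())      # lump x >= 3 into one cell
E = n * f
chi2_stat = ((O - E)**2 / E).sum()
print(chi2_stat)                   # about 3.20
print(chi2.ppf(0.95, 3))           # about 7.81, so do not reject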
c. Normal Distribution
A common way to set up a test of normality is to group data starting at the mean and ending each group one-half of a standard deviation from the mean. One can proceed outward from the mean until four or five groups have been sectioned off in each direction. For example, if our null hypothesis is that $x \sim N(\mu = 100, \sigma = 10)$, we can start at 100 and let the width of each group be one-half of $\sigma$, or 5. The groups would be 100-105, 105-110, etc. going up, and 95-100, 90-95, etc. going down. Then for the highest number in each interval, compute $z = \frac{x - \mu}{\sigma}$. For example, for the interval 90-95 compute $z = \frac{95 - 100}{10} = -0.5$. Then use the normal distribution to compute the cumulative probability $F(z) = P(Z \le z)$.
For example $F(-0.5) = P(Z \le -0.5) = .3085$. Then, to find the frequency $f$ of the interval, subtract the $F$ for the previous interval from this one. An example of calculating $E = nf$ this way (with $n = 1000$) is shown below.
interval      z      F(z)      f      E = nf
below 80    -2.0    .0228    .0228     22.8
80-85       -1.5    .0668    .0440     44.0
85-90       -1.0    .1587    .0919     91.9
90-95       -0.5    .3085    .1498    149.8
95-100       0.0    .5000    .1915    191.5
100-105      0.5    .6915    .1915    191.5
105-110      1.0    .8413    .1498    149.8
110-115      1.5    .9332    .0919     91.9
115-120      2.0    .9772    .0440     44.0
120 & up            1.0000   .0228     22.8
For smaller values of $n$ we may find that some numbers in $E$ are less than 5, so that we have to combine some intervals. In the above example the degrees of freedom for $\chi^2$ are $10 - 1 = 9$ if the mean and variance are known. If they both had to be computed from the data, the degrees of freedom would be reduced by 2, to 7.
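The E column above can be generated directly from the normal cdf. A minimal sketch, assuming $N(\mu = 100, \sigma = 10)$ and $n = 1000$ as in the table:

import numpy as np
from scipy.stats import norm

mu, sigma, n = 100.0, 10.0, 1000
edges = np.arange(80, 125, 5)      # boundaries 80, 85, ..., 120
F = np.concatenate(([0.0], norm.cdf(edges, mu, sigma), [1.0]))
f = np.diff(F)                     # probability of each interval
print(np.round(n * f, 1))          # 22.8, 44.0, 91.9, 149.8, ..., 22.8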
3. Kolmogorov-Smirnov Test
a. Kolmogorov-Smirnov One-Sample Test
This is a more powerful test of goodness of fit than the Chi-Squared test. Unfortunately, it can only be used when the distribution in the null hypothesis is totally specified. For example, if we wanted to do the test for Poisson(0.8) above, we would look up the cumulative distribution for Poisson(0.8) and proceed as below. Note that this would not work if our hypothesis was that the distribution was Poisson without the mean specified.
 x      O     O/n    F_O      F_E      D = |F_O − F_E|
 0     100    .50    .50     .4493    .0507
 1      60    .30    .80     .8088    .0088
 2      30    .15    .95     .9526    .0026
 3       6    .03    .98     .9909    .0109
 4       0    .00    .98     .9986    .0186
 5       4    .02   1.00     .9998    .0002
 6       0    .00   1.00    1.0000    .0000
 7+      0    .00   1.00    1.0000    .0000
       200   1.00
The maximum difference is $D = .0507$, which must be checked against the Kolmogorov-Smirnov table for $n = 200$. According to the table, for $\alpha = .05$ and large $n$, the critical value is $\frac{1.36}{\sqrt{n}} = \frac{1.36}{\sqrt{200}} = .0962$. Since $.0507$ is less than .0962, accept the null hypothesis.
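Because the K-S statistic here is just the largest gap between the two cumulative distributions, it can be computed in a few lines. A sketch, with the large-sample 5% critical value $1.36/\sqrt{n}$:

import numpy as np
from scipy.stats import poisson

n = 200
O = np.array([100, 60, 30, 6, 0, 4, 0, 0])   # counts for x = 0..7
F_obs = O.cumsum() / n
F_exp = poisson.cdf(np.arange(8), 0.8)
print(np.abs(F_obs - F_exp).max())           # about .0507
print(1.36 / np.sqrt(n))                     # about .0962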
b. Lilliefors Test.
Because the Kolmogorov-Smirnov Test is so limited in application, it proved advantageous to develop a special version of that test for use in testing for a normal distribution when the mean and variance are unknown. Once a sample mean and variance are found, this test is identical to the K-S Test except for the use of a special table.
Problem E9: Is the following data normal?
420, 440, 445, 450, 460, 475, 480, 500, 520, 530
Solution: Assume $\alpha = .05$. The only practical method is the Lilliefors method. Question: Why is chi-squared impractical and Kolmogorov-Smirnov impossible?
The numbers must be in order before we begin computing cumulative probabilities! Checking the data we find that $\bar{x} = 472$ and $s = 35.92$. We compute $z = \frac{x - \bar{x}}{s}$. (Since $s$ is estimated from the data, this is really a $t$.) $F_E(x)$ is the cumulative normal distribution, gotten from the Normal table by adding or subtracting 0.5. $F_O(x)$ comes from the fact that there are 10 numbers, so that each number is one-tenth of the distribution.
For $\alpha = .05$ and $n = 10$ the critical value from the Lilliefors table is 0.2616. Since the largest deviation here is $D = .1293$, we do not reject $H_0$.
Remember that the Lilliefors method is a specialized version of the KS method used only in situations where you are testing for a Normal distribution and using a sample mean and standard deviation estimated from the data. The KS method can only be used in situations where the null hypothesis including parameters is specified in advance. A Chi-squared test of goodness of fit is usually considered a large sample test, but can be adjusted for estimation of parameters.
 x       z      F_O(x)   F_E(x)     D
420    -1.45    0.1000    .0735    .0265
440    -0.89    0.2000    .1867    .0133
445    -0.75    0.3000    .2266    .0734
450    -0.61    0.4000    .2709    .1291
460    -0.33    0.5000    .3707    .1293
475     0.08    0.6000    .5319    .0681
480     0.22    0.7000    .5871    .1129
500     0.78    0.8000    .7823    .0177
520     1.34    0.9000    .9099    .0099
530     1.61    1.0000    .9463    .0537
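For reference, the statsmodels package implements the Lilliefors test directly; a sketch applied to Problem E9 (the reported statistic should match the largest D above, about .129):

import numpy as np
from statsmodels.stats.diagnostic import lilliefors

x = np.array([420, 440, 445, 450, 460, 475, 480, 500, 520, 530])
stat, pval = lilliefors(x, dist='norm')
print(stat, pval)   # stat about 0.129; p-value above .05, so do not reject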
©2002 Roger Even Bove