Chi-Square Test

The Chi-Square Test procedure tabulates a variable into categories and tests the hypothesis that the observed frequencies do not differ from their expected values.

Chi-Square Test allows you to:

  • Include all categories of the test variable, or limit the test to a specific range.
  • Use standard or customized expected values.
  • Obtain descriptive statistics and/or quartiles on the test variable.

Example :

A large hospital schedules discharge support staff assuming that patients leave the hospital at a fairly constant rate throughout the week. However, because of increasing complaints of staff shortages, the hospital administration wants to determine whether the number of discharges varies by the day of the week.

This example uses the file dischargedata.sav[1]. Use Chi-Square Test to test the assumption that patients leave the hospital at a constant rate.

Each case is a day of the week, and to perform the chi-square test, you must first weight the cases by frequency of patient discharge.

To weight the cases, from the Data Editor menus choose:

Data
Weight Cases...

Select Weight cases by.

Select Average Daily Discharges as the frequency variable.

Click OK.

The cases are now weighted by frequency of patient discharge.

To begin the analysis, from the menus choose:

Analyze
Nonparametric Tests
Chi-Square...

Select Day of the Week as the test variable.

Click OK.

  • Here, the observed frequency for each row is simply the average number of patients discharged per day across the year. Last year, for example, the hospital discharged an average of 589 patients per week--44 on Sunday, 78 on Monday, etc.

The expected value for each row is equal to the sum of the observed frequencies divided by the number of rows in the table. In this example, there were 589 observed discharges per week, resulting in about 84 discharges per day.

Finally, the residual is equal to the observed frequency minus the expected value. The table shows that Sunday has many fewer, and Friday, many more, patient discharges than an "every day is equal" assumption would expect.

  • The obtained chi-square statistic equals 29.389. This is computed by squaring the residual for each day, dividing by its expected value, and summing across all days.
  • The term df represents degrees of freedom. In a chi-square test, df is the number of expected values that can vary before the rest are completely determined. For a one-sample chi-square test, df is equal to the number of rows minus 1.

Asymp. Sig. is the estimated probability of obtaining a chi-square value greater than or equal to 29.389 if patients are discharged evenly across the week. The low significance value suggests that the average rate of patient discharges really does differ by day of the week.
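The arithmetic above can be sketched in a few lines of Python. The daily counts below are illustrative, not the actual dischargedata.sav values (the text gives only Sunday = 44 and Monday = 78); the tail probability uses the closed form that holds when the degrees of freedom are even.

```python
import math

# Illustrative weekly discharge counts summing to 589 (only Sunday = 44
# and Monday = 78 come from the text; the other days are hypothetical).
observed = {"Sun": 44, "Mon": 78, "Tue": 96, "Wed": 93,
            "Thu": 89, "Fri": 110, "Sat": 79}

total = sum(observed.values())
expected = total / len(observed)          # 589 / 7, about 84 per day

# Chi-square statistic: squared residuals divided by the expected value,
# summed across all days.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())

# Degrees of freedom: number of rows minus 1.
df = len(observed) - 1

def chi2_sf(x, df):
    """Upper-tail chi-square probability, valid for even df only:
    P(X >= x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)**i / i!"""
    t = x / 2.0
    return math.exp(-t) * sum(t ** i / math.factorial(i)
                              for i in range(df // 2))

p_value = chi2_sf(chi_square, df)
print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.5f}")
```

Applied to the statistic of 29.389 reported in the text with df = 6, `chi2_sf` gives a probability of about 0.00005, matching the near-zero Asymp. Sig. in the output.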

Remark :

By default, the Chi-Square Test procedure builds frequencies and calculates an expected value based on all valid values of the test variable. However, you might want to restrict the range of the test to a contiguous subset of the available values. As the next example shows, the procedure easily allows for this.

Example :

The hospital requests a follow-up analysis: Can staff be scheduled assuming that patients discharged on weekdays only (Monday through Friday) leave at a constant daily rate?

To rerun the analysis, recall the Chi-Square Test dialog box.

Select Use specified range.

Type 2 as the lower value and 6 as the upper value.

Click OK.

The test range is now restricted to Monday through Friday.

  • On average, about 92 patients were discharged from the hospital on weekdays.

The Residual column shows that Mondays were lighter and Fridays were heavier than this, but, of course, we can't tell just by looking at them whether these differences are significant.

On four degrees of freedom, the obtained chi-square statistic of 5.822 does not approach any conventional level of significance. We cannot reject the hypothesis that patients were discharged at a rate of about 92 per weekday.

Remark :

The Chi-Square Test procedure is typically used to test observed frequencies against a single expected value that is equal for all rows. However, the distribution of values may not follow that pattern. In genetics, for example, you might expect to see a trait dominant in 75% of the population and recessive in the other 25%. The Chi-Square Test procedure allows you to specify a customized set of expected values, thereby permitting a wide variety of models to be tested.

Example :

A clothing manufacturer tries first-class postage for direct mailings, hoping for faster responses than with bulk mail. Order-takers record how many weeks after the mailing each order is taken.

This information is collected in the file mailresponse.sav[2]. Use Chi-Square Test to determine whether the weekly percentage of orders differs between the two methods.

The data are tabulated by week and must first be weighted by the frequency of first-class mail response.

To weight the data, from the Data Editor menus choose:

Data
Weight Cases...

Select Weight cases by.

Select First Class Mail as the frequency variable.

Click OK.

Now the data are weighted and ready to be analyzed.

To begin the chi-square analysis, from the menus choose:

Analyze
Nonparametric Tests
Chi-Square...

Select Week of Response as the test variable.

The expected frequencies are the response percentages that the firm has historically obtained with bulk mail.

Select Values in the Expected Values group.

Type 6 as the first expected value, then click Add.

Repeat this process, consecutively adding the values 15.1, 18, 12, 11.5, 9.8, 7, 6.1, 5.5, 3.9, 2.1, and 2.

After the 12 values have been entered, click OK.

  • The observed response percentages appear in the Observed N column. Because First Class Mail is used as the weighting variable, these are its actual values.
  • The expected response percentages appear in the Expected N column. Note that the values in this column are simply the historical percentages for bulk mail; you entered these in the Chi-Square Test dialog box.
  • The differences between first class and bulk mail response percentages appear in the Residual column.
  • The firm hoped that first-class mail would result in quicker customer response. The first two weeks do show a difference in that direction, of four and seven percentage points, respectively.

The question is whether the differences between the two distributions overall are large enough to be statistically significant.

On 11 degrees of freedom, the obtained chi-square statistic (12.249) is not statistically significant. This particular promotion did not result in response times that were significantly different from standard bulk mail.
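A sketch of how the customized expected values enter the computation: the procedure rescales the entered values so that the expected frequencies sum to the observed total before forming residuals. The observed percentages below are hypothetical; only the expected values come from the dialog-box steps above.

```python
# Expected values as entered in the Chi-Square Test dialog box.
expected_input = [6, 15.1, 18, 12, 11.5, 9.8, 7, 6.1, 5.5, 3.9, 2.1, 2]

# Hypothetical observed first-class response percentages, one per week
# (the text reports only that weeks 1 and 2 ran about 4 and 7 points high).
observed = [10.0, 22.1, 15.0, 10.5, 9.0, 8.0, 6.5, 5.4, 4.5, 3.5, 1.5, 4.0]

# Rescale the entered values so the expected frequencies sum to the
# observed total.
scale = sum(observed) / sum(expected_input)
expected = [e * scale for e in expected_input]

residuals = [o - e for o, e in zip(observed, expected)]
chi_square = sum(r ** 2 / e for r, e in zip(residuals, expected))
df = len(observed) - 1                    # 11 degrees of freedom

# From standard chi-square tables, the 5% critical value on 11 df is
# 19.675; the statistic of 12.249 reported in the text falls well short.
print(f"chi-square = {chi_square:.3f}, df = {df}")
```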

Remark :

The Chi-Square Test procedure is useful when you want to compare a single sample from a polychotomous variable to an expected set of values. The procedure tabulates this variable into a set of frequencies and tests this observed set against either a common expected value or a customized set of expected values. The entire range of the test variable is used by default; however, its range may be restricted to any set of contiguous values. Additionally, descriptive statistics and/or quartiles can be requested.

  • If your variable has only two outcomes, you can alternatively use the Binomial Test procedure.

Crosstabs

The crosstabulation table is the basic technique for examining the relationship between two categorical (nominal or ordinal) variables, possibly controlling for additional layering variables.

The Crosstabs procedure offers tests of independence and measures of association and agreement for nominal and ordinal data. Additionally, you can obtain estimates of the relative risk of an event given the presence or absence of a particular characteristic.

Example :

In order to determine customer satisfaction rates, a retail company conducted surveys of 582 customers at 4 store locations. From the survey results, you found that the quality of customer service was the most important factor to a customer's overall satisfaction. Given this information, you want to test whether each of the store locations provides a similar and adequate level of customer service.

The results of the survey are stored in satisf.sav[3]. Use the Crosstabs procedure to test the hypothesis that the levels of service satisfaction are constant across stores.

To run a Crosstabs analysis, from the menus choose:

Analyze
Descriptive Statistics
Crosstabs...

Select Store as the row variable.

Select Service satisfaction as the column variable.

Click Statistics.

Select Chi-square, Contingency Coefficient, Phi and Cramer's V, Lambda, and Uncertainty coefficient.

Click Continue.

Click OK in the Crosstabs dialog box.

The crosstabulation shows the frequency of each response at each store location. If each store location provides a similar level of service, the pattern of responses should be similar across stores.

At each store, the majority of responses occur in the middle categories.

Store 2 appears to have fewer satisfied customers. Store 3 appears to have fewer dissatisfied customers. From the crosstabulation alone, it's impossible to tell whether these differences are real or due to chance variation. Check the chi-square test to be sure.

The chi-square test measures the discrepancy between the observed cell counts and what you would expect if the rows and columns were unrelated.

The two-sided asymptotic significance of the chi-square statistic is greater than 0.10, so the differences could well be due to chance variation; there is no evidence that the stores offer different levels of customer service.
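The discrepancy measure behind this test can be sketched directly. The small store-by-satisfaction table below is hypothetical, not the satisf.sav data; under independence, each expected count is the product of the row and column totals divided by the grand total.

```python
# Hypothetical counts: rows = stores, columns = dissatisfied / neutral /
# satisfied (not the actual satisf.sav frequencies).
table = [[20, 50, 30],
         [35, 45, 20],
         [15, 55, 30]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Sum the squared discrepancies between observed and expected counts,
# each divided by the expected count.
chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (observed - expected) ** 2 / expected

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
df = (len(table) - 1) * (len(table[0]) - 1)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```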

However, not all customers surveyed actually had contact with a service representative. The ratings from these customers will not reflect the actual quality of service at a store, so you further cross-classify by whether they had contact with a service representative

Recall the Crosstabs dialog box.

Select Contact with employee as a layer variable.

Click OK in the Crosstabs dialog box.

The crosstabulation now splits the previous crosstabulation into two parts.

Now that the customers who didn't have contact are sorted out, there appears to be a significant association between store 2 and low service satisfaction. Check the chi-square test to be sure.

The chi-square test is performed separately for customers who did and did not have contact with a store representative.

The significance value of the test for customers who did not have contact is 0.052. This is suggestive, but not conclusive, evidence of a relationship between Store and Service satisfaction for these customers.

While not directly related to the quality of service given by your employees, you might consider a separate analysis of these customers to determine if there is some other factor that accounts for this relationship.

The significance value of the test for customers who had contact with an employee is 0.012. Since this value is less than 0.05, you can conclude that the relationship observed in the crosstabulation is real and not due to chance. While the chi-square test is useful for determining whether there is a relationship, it doesn't tell you the strength of the relationship. Symmetric measures attempt to quantify this.

Symmetric measures are reported separately for customers who did and did not have contact with a store representative. These measures are based on the chi-square statistic.

Phi is the square root of the ratio of the chi-square statistic to the weighted total number of observations. It is the most "optimistic" of the symmetric measures, and unlike most association measures, it does not have a theoretical upper bound when either of the variables has more than two categories.

Cramer's V is a rescaling of phi so that its maximum possible value is always 1. As the number of rows and columns increases, Cramer's V becomes more conservative with respect to phi.

The contingency coefficient takes values between 0 and SQRT[(k-1)/k], where k = the number of rows or columns, whichever is smaller. It becomes more conservative with respect to phi as the associations between the variables become stronger. The significance values of all three measures are 0.012, indicating a statistically significant relationship. However, the values of all three measures are under 0.3, so although the relationship is not due to chance, it is also not very strong. While these measures give some sense of the strength of the association, they do not, in general, have an intuitive interpretation. To develop a clearer sense of this, look at the directional measures.
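All three symmetric measures can be computed directly from the chi-square statistic, the sample size, and the table dimensions. The inputs below are illustrative, not the values from the example.

```python
import math

def symmetric_measures(chi_square, n, rows, cols):
    """Chi-square-based association measures as described above."""
    # Phi: square root of chi-square over n; unbounded beyond 2x2 tables.
    phi = math.sqrt(chi_square / n)
    # Cramer's V: phi rescaled so its maximum possible value is 1.
    k = min(rows, cols)
    cramers_v = math.sqrt(chi_square / (n * (k - 1)))
    # Contingency coefficient: bounded above by sqrt((k-1)/k).
    contingency = math.sqrt(chi_square / (chi_square + n))
    return phi, cramers_v, contingency

# Illustrative values in the spirit of the example: a modest chi-square
# on a 3 x 3 table of a few hundred respondents (hypothetical numbers).
phi, v, c = symmetric_measures(25.0, 500, 3, 3)
print(f"phi = {phi:.3f}, Cramer's V = {v:.3f}, contingency = {c:.3f}")
```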

Directional measures quantify the reduction in the error of predicting the row variable value when you know the column variable value, or vice versa. Each measure simply has a different definition of "error."

  • Lambda defines error as the misclassification of cases; cases are classified according to the modal (most frequent) category.
  • Tau defines error as the misclassification of a case, where cases are classified into category j with probability equal to the observed frequency of category j.
  • The uncertainty coefficient defines error as the entropy, or -P(category j) * ln(P(category j)) summed over the categories of the variable. The uncertainty coefficient is also known as Theil's U.

For customers who had contact, the Goodman and Kruskal's tau value of 0.031 with Store dependent means that there is a 3.1% reduction in misclassification. The other measures report equally small values, indicating that the association between Store and Service satisfaction is almost solely due to the poor service at store 2.
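As a sketch, lambda with the row variable dependent can be computed by comparing modal-category prediction errors with and without the column information. The tiny table below is hypothetical.

```python
def goodman_kruskal_lambda(table):
    """Lambda with the row variable dependent: the proportional reduction
    in misclassification when the column category is known."""
    n = sum(sum(row) for row in table)
    # Errors without column information: everyone is assigned the modal
    # row category.
    row_totals = [sum(row) for row in table]
    errors_without = n - max(row_totals)
    # Errors with column information: within each column, assign the
    # modal row category for that column.
    errors_with = n - sum(max(col) for col in zip(*table))
    return (errors_without - errors_with) / errors_without

# Tiny hypothetical table: rows = outcome, columns = predictor.
table = [[10, 5],
         [3, 12]]
print(goodman_kruskal_lambda(table))   # (15 - 8) / 15, about 0.467
```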

Using Crosstabs' nominal-by-nominal measures, you initially found that each store offers a similar quality of customer service. However, once customers who did not have contact with a service representative were separated out, a relationship between Store and Service satisfaction emerged. Fortunately, that relationship is rather weak, indicating that while there is a measurable difference in the service quality between stores, it is likely due to the poor service of a single store rather than a more serious company-wide variation in store image.

As a result, employees at store 2 received customer service training to bring their quality of service in line with the other stores.

Example :

A retail company that conducted a customer satisfaction survey is interested in learning how shopping frequency is related to overall satisfaction levels. Since the categories of both of these variables are ordered, you can make use of measures that quantify the strength and determine the sign (positive or negative) of the association.

The results of the survey are collected in satisf.sav[4]. Use Crosstabs to obtain ordinal measures of the association between Shopping frequency and Overall satisfaction.

To run a Crosstabs analysis, from the menus choose:

Analyze
Descriptive Statistics
Crosstabs...

Click Reset to restore the default settings.

Select Shopping frequency as the row variable.

Select Overall satisfaction as the column variable.

Click Statistics.

Select Gamma, Somers' d, Kendall's tau-b, and Kendall's tau-c.

Click Continue.

Click OK in the Crosstabs dialog box.

These selections produce a crosstabulation table and measures of ordinal-by-ordinal association for Shopping frequency by Overall satisfaction.

The crosstabulation shows no clear pattern. If any pattern exists, it may be that people who shop more often are more satisfied.

Symmetric and directional measures of ordinal association are based on the idea of accounting for concordance versus discordance. Each pairwise comparison of cases is classified as one of the following.

  • A pairwise comparison is considered concordant if the case with the larger value in the row variable also has the larger value in the column variable. Concordance implies a positive association between the row and column variables.
  • A pairwise comparison is considered discordant if the case with the larger value in the row variable has the smaller value in the column variable. Discordance implies a negative association between the row and column variables.
  • A pairwise comparison is considered tied on one variable if the two cases take the same value on the row variable, but different values on the column variable (or vice versa). Being tied on one variable implies a weakened association between the row and column variables.
  • A pairwise comparison is considered tied if the two cases take the same value on both the row and column variables. Being tied implies nothing about the association between the variables.

The measures differ in how they treat each type of comparison.
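The pair classification above can be sketched directly. Goodman and Kruskal's gamma, for example, uses only the concordant and discordant counts and ignores ties; the cases below are hypothetical (row, column) codes, not the satisf.sav data.

```python
from itertools import combinations

def gamma(pairs):
    """Goodman and Kruskal's gamma from (row, column) observations:
    (C - D) / (C + D), ignoring pairs tied on either variable."""
    concordant = discordant = 0
    for (r1, c1), (r2, c2) in combinations(pairs, 2):
        direction = (r1 - r2) * (c1 - c2)
        if direction > 0:
            concordant += 1    # larger row value with larger column value
        elif direction < 0:
            discordant += 1    # larger row value with smaller column value
        # direction == 0: tied on one or both variables; gamma skips these
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical (frequency, satisfaction) codes for a handful of respondents.
cases = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 1), (3, 3)]
print(f"gamma = {gamma(cases):.3f}")
```

Tau-b, tau-c, and Somers' d start from the same concordant and discordant counts but penalize ties in different ways, which is what "The measures differ in how they treat each type of comparison" refers to.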