The Chi-Square Test of Independence Is Appropriate Here

1.) Sixty-four students in an introductory college economics class were asked how many credits they had earned in college, and how certain they were about their choice of major. Research question: At α = .01, is the degree of certainty independent of credits earned?

Credits Earned / Very Uncertain / Somewhat Certain / Very Certain / Row Total
0–9 / 12 / 8 / 3 / 23
10–59 / 8 / 4 / 10 / 22
60 or more / 1 / 7 / 11 / 19
Col Total / 21 / 19 / 24 / 64

Observed (O) / 0-9 / 10-59 / 60 or more / Total
Very uncertain / 12 / 8 / 1 / 21
Somewhat Certain / 8 / 4 / 7 / 19
Very Certain / 3 / 10 / 11 / 24
Total / 23 / 22 / 19 / 64

The chi-square test of independence is appropriate here.

Null hypothesis: H0: degree of certainty and credits earned are independent.

Alternative hypothesis: H1: degree of certainty and credits earned are not independent.

Number of rows r = 3

Number of columns c = 3

Degrees of freedom = (r – 1)(c – 1) = (3-1)(3-1)=(2)(2) =4

Level of significance = α = 0.01

Critical value of chi-square = 13.277
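The critical value can be checked without a table. The following is a minimal sketch, not the method used in the solution above: for even degrees of freedom the chi-square upper-tail probability has a closed form, and a simple bisection finds the point where it equals α. The function names chi2_sf_even_df and chi2_critical_even_df are hypothetical helpers introduced only for this illustration.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def chi2_critical_even_df(alpha, df, lo=0.0, hi=200.0):
    """Find x with P(X > x) = alpha by bisection (the tail decreases in x)."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Problem 1: alpha = 0.01 with 4 degrees of freedom.
critical = chi2_critical_even_df(0.01, 4)
print(round(critical, 2))
```

The result is roughly 13.28, in line with the tabulated critical value for 4 degrees of freedom.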

Expected (E) / 0-9 / 10-59 / 60 or more / Total
Very uncertain / 7.546875 / 7.21875 / 6.234375 / 21
Somewhat Certain / 6.828125 / 6.53125 / 5.640625 / 19
Very Certain / 8.625 / 8.25 / 7.125 / 24
Total / 23 / 22 / 19 / 64
(O-E)^2/E / 0-9 / 10-59 / 60 or more / Total
Very uncertain / 2.627620342 / 0.084550866 / 4.394776003 / 7.106947211
Somewhat Certain / 0.201122712 / 0.981010766 / 0.327605609 / 1.509739087
Very Certain / 3.668478261 / 0.371212121 / 2.10745614 / 6.147146522
Total / 6.497221315 / 1.436773753 / 6.829837752 / 14.76383282

Critical region is Chi-square > 13.277

Chi-square = 14.76383282

Chi-square falls in the critical region.

Null hypothesis is rejected.

Conclusion: degree of certainty and credits earned are not independent.
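The arithmetic for problem 1 can be reproduced with a short script. This is a sketch under the usual definitions (expected count = row total × column total / grand total, statistic = sum of (O-E)^2/E over all cells); the function names expected_counts and chi_square_statistic are illustrative, not part of the original solution.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows: Very uncertain, Somewhat certain, Very certain
# Columns: 0-9, 10-59, 60 or more credits
observed = [
    [12, 8, 1],
    [8, 4, 7],
    [3, 10, 11],
]

expected = expected_counts(observed)
statistic = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(df, round(statistic, 4))
```

The statistic agrees with the value of about 14.7638 reported above, with 4 degrees of freedom.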

2.) A student team examined parked cars in four different suburban shopping malls. One hundred vehicles were examined in each location. Research question: At α = .05, does vehicle type vary by mall location? (Data are from a project by MBA students Steve Bennett, Alicia Morais, Steve Olson, and Greg Corda.)

The chi-square test of independence is appropriate here.

Observed Frequency (O)
Vehicle Type / Somerset / Oakland / Great Lakes / Jamestown / Row Total
Car / 44 / 49 / 36 / 64 / 193
Minivan / 21 / 15 / 18 / 13 / 67
Full-sized Van / 2 / 3 / 3 / 2 / 10
SUV / 19 / 27 / 26 / 12 / 84
Truck / 14 / 6 / 17 / 9 / 46
Col Total / 100 / 100 / 100 / 100 / 400

Expected Frequency (E)

Vehicle Type / Somerset / Oakland / Great Lakes / Jamestown / Row Total
Car / 48.25 / 48.25 / 48.25 / 48.25 / 193
Minivan / 16.75 / 16.75 / 16.75 / 16.75 / 67
Full-sized Van / 2.5 / 2.5 / 2.5 / 2.5 / 10
SUV / 21 / 21 / 21 / 21 / 84
Truck / 11.5 / 11.5 / 11.5 / 11.5 / 46
Col Total / 100 / 100 / 100 / 100 / 400

The chi-square test requires every expected cell frequency to be at least 5. The Full-sized Van row has expected frequencies of only 2.5, so merge the Minivan and Full-sized Van rows into a single Van row.

Observed (O)
Vehicle Type / Somerset / Oakland / Great Lakes / Jamestown / Row Total
Car / 44 / 49 / 36 / 64 / 193
Van / 23 / 18 / 21 / 15 / 77
SUV / 19 / 27 / 26 / 12 / 84
Truck / 14 / 6 / 17 / 9 / 46
Col Total / 100 / 100 / 100 / 100 / 400
Expected (E)
Vehicle Type / Somerset / Oakland / Great Lakes / Jamestown / Row Total
Car / 48.25 / 48.25 / 48.25 / 48.25 / 193
Van / 19.25 / 19.25 / 19.25 / 19.25 / 77
SUV / 21 / 21 / 21 / 21 / 84
Truck / 11.5 / 11.5 / 11.5 / 11.5 / 46
Col Total / 100 / 100 / 100 / 100 / 400
(O-E)^2/E
Vehicle Type / Somerset / Oakland / Great Lakes / Jamestown / Row Total
Car / 0.374352 / 0.011658 / 3.11010363 / 5.14119171 / 8.6373057
Van / 0.730519 / 0.081169 / 0.15909091 / 0.93831169 / 1.9090909
SUV / 0.190476 / 1.714286 / 1.19047619 / 3.85714286 / 6.952381
Truck / 0.543478 / 2.630435 / 2.63043478 / 0.54347826 / 6.3478261
Col Total / 1.838826 / 4.437547 / 7.09010551 / 10.4801245 / 23.846604
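The merge step and the resulting statistic can be sketched as follows. The code assumes the same definitions as the tables above (expected count = row total × column total / grand total); the helper names are hypothetical and chosen only for this illustration.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows: Car, Minivan, Full-sized Van, SUV, Truck
# Columns: Somerset, Oakland, Great Lakes, Jamestown
observed = [
    [44, 49, 36, 64],
    [21, 15, 18, 13],
    [2, 3, 3, 2],
    [19, 27, 26, 12],
    [14, 6, 17, 9],
]

# Smallest expected count before merging (the Full-sized Van row).
min_expected_before = min(min(row) for row in expected_counts(observed))

# Merge Minivan (index 1) and Full-sized Van (index 2) into one Van row.
van = [a + b for a, b in zip(observed[1], observed[2])]
merged = [observed[0], van, observed[3], observed[4]]

min_expected_after = min(min(row) for row in expected_counts(merged))
statistic = chi_square_statistic(merged)
df = (len(merged) - 1) * (len(merged[0]) - 1)
print(min_expected_before, min_expected_after, df, round(statistic, 4))
```

Merging rows costs degrees of freedom (a 5 × 4 table would have 12, the merged 4 × 4 table has 9), which is the price of meeting the expected-frequency requirement.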

Null hypothesis: H0: Vehicle type does not vary by mall location (the two attributes are independent).

Alternative hypothesis: H1: Vehicle type varies by mall location (the two attributes are not independent).

Number of rows r = 4

Number of columns c = 4

Degrees of freedom = (r – 1)(c – 1) = (4-1)(4-1)=(3)(3) =9

Level of significance = α = 0.05

Critical value of chi-square =16.91897762

Chi-square = 23.84660365

Chi-square falls in the critical region.

Null hypothesis is rejected.

Conclusion: Vehicle type varies by mall location (the two attributes are not independent).

3.) High levels of cockpit noise in an aircraft can damage the hearing of pilots who are exposed to this hazard for many hours. A Boeing 727 co-pilot collected 61 noise observations using a handheld sound meter. Noise level is defined as “Low” (under 88 decibels), “Medium” (88 to 91 decibels), or “High” (92 decibels or more). There are three flight phases (Climb, Cruise, Descent). Research question: At α = .05, is the cockpit noise level independent of flight phase?

Noise Level / Climb / Cruise / Descent / Row Total
Low / 6 / 2 / 6 / 14
Medium / 18 / 3 / 8 / 29
High / 1 / 3 / 14 / 18
Col Total / 25 / 8 / 28 / 61

The chi-square test of independence is appropriate here.

Observed (O)
Noise level / Climb / Cruise / Descent / Total
Low / 6 / 2 / 6 / 14
Medium / 18 / 3 / 8 / 29
High / 1 / 3 / 14 / 18
Total / 25 / 8 / 28 / 61
Expected (E)
Noise level / Climb / Cruise / Descent / Total
Low / 5.737704918 / 1.836065574 / 6.426229508 / 14
Medium / 11.8852459 / 3.803278689 / 13.31147541 / 29
High / 7.37704918 / 2.360655738 / 8.262295082 / 18
Total / 25 / 8 / 28 / 61
Every expected frequency in the Cruise column is below 5. To make all expected frequencies at least 5, combine the Climb and Cruise columns.
Observed (O)
Noise level / Climb/Cruise / Descent / Total
Low / 8 / 6 / 14
Medium / 21 / 8 / 29
High / 4 / 14 / 18
Total / 33 / 28 / 61
Expected (E)
Noise level / Climb/Cruise / Descent / Total
Low / 7.573770492 / 6.426229508 / 14
Medium / 15.68852459 / 13.31147541 / 29
High / 9.737704918 / 8.262295082 / 18
Total / 33 / 28 / 61
(O-E)^2/E
Noise level / Climb/Cruise / Descent / Total
Low / 0.023986942 / 0.028270325 / 0.052257267
Medium / 1.798242459 / 2.119357183 / 3.917599642
High / 3.380802561 / 3.984517304 / 7.365319865
Total / 5.203031962 / 6.132144812 / 11.33517677

Null hypothesis: H0: Noise level and phase are independent.

Alternative hypothesis: H1: Noise level and phase are not independent.

Number of rows r = 3

Number of columns c = 2

Degrees of freedom = (r – 1)(c – 1) = (3-1)(2-1)=(2)(1) =2

Level of significance = α = 0.05

Critical value of chi-square =5.991464547

Chi-square = 11.33517677

Chi-square falls in the critical region.

Null hypothesis is rejected.

Conclusion: Noise level and phase are not independent.
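Problem 3 can be checked with a short script that combines the Climb and Cruise columns and recomputes the statistic. With 2 degrees of freedom the chi-square upper tail is exactly exp(-x/2), so the critical value is -2 ln(α) and the p-value needs no table. This is a sketch under those assumptions; the variable and function names are illustrative.

```python
import math

def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows: Low, Medium, High noise; columns: Climb, Cruise, Descent
observed = [
    [6, 2, 6],
    [18, 3, 8],
    [1, 3, 14],
]

# Combine the Climb and Cruise columns into one Climb/Cruise column.
merged = [[row[0] + row[1], row[2]] for row in observed]

statistic = chi_square_statistic(merged)
df = (len(merged) - 1) * (len(merged[0]) - 1)

# Closed forms that hold only for df = 2.
alpha = 0.05
critical = -2.0 * math.log(alpha)
p_value = math.exp(-statistic / 2.0)

print(df, round(statistic, 4), round(critical, 4), statistic > critical)
```

The p-value computed this way is well below 0.05, consistent with rejecting the null hypothesis.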

4.) Can people really identify their favorite brand of cola? Volunteers tasted Coca-Cola Classic, Pepsi, Diet Coke, and Diet Pepsi, with the results shown below. Research question: At α = .05, is the correctness of the prediction different for the two types of cola drinkers? Could you identify your favorite brand in this kind of test? Since it is a 2 × 2 table, try also a two-tailed two-sample z test for p1 = p2 (see Chapter 10) and verify that z^2 is the same as your chi-square statistic. Which test do you prefer? Why?

Correct? / Regular Cola / Diet Cola / Row Total
Yes, got it right / 7 / 7 / 14
No, got it wrong / 12 / 20 / 32
Col Total / 19 / 27 / 46

Chi-square test

Observed (O)
Correct? / Regular Cola / Diet Cola / Row Total
Yes, got it right / 7 / 7 / 14
No, got it wrong / 12 / 20 / 32
Col Total / 19 / 27 / 46
Expected (E)
Correct? / Regular Cola / Diet Cola / Row Total
Yes, got it right / 5.782608696 / 8.217391304 / 14
No, got it wrong / 13.2173913 / 18.7826087 / 32
Col Total / 19 / 27 / 46
(O-E)^2/E
Correct? / Regular Cola / Diet Cola / Row Total
Yes, got it right / 0.256292906 / 0.180354267 / 0.436647173
No, got it wrong / 0.112128146 / 0.078904992 / 0.191033138
Col Total / 0.368421053 / 0.259259259 / 0.627680312

Null hypothesis: H0: Correctness of prediction and type of cola drinker (regular or diet) are independent attributes. The chance of identifying the favourite brand is the same for both types of drinker.

Alternative hypothesis: H1: Correctness of prediction and type of cola drinker are not independent attributes. The chance of identifying the favourite brand differs between the two types of drinker.

Number of rows r = 2

Number of columns c = 2

Degrees of freedom = (r – 1)(c – 1) = (2-1)(2-1)=(1)(1) =1

Level of significance = α = 0.05

Critical value of chi-square =3.841459149

Chi-square = 0.627680312

Chi-square does not fall in the critical region.

Null hypothesis cannot be rejected.

Conclusion: the sample does not provide sufficient evidence to conclude that the correctness of prediction differs between regular and diet cola drinkers.

Testing of proportion

Let p1 and p2 respectively denote the proportion of people correctly identifying their favourite brand in the Regular cola group and the Diet cola group.

Null hypothesis: H0: p1 = p2 (the proportion of people identifying their brand is the same in both groups.)

Alternative hypothesis: H1: p1 ≠ p2 (the proportion of people identifying their brand is not the same in both groups.)

P1 = sample proportion in the first group = 7/19 = 0.368421053

P2 = sample proportion in the second group = 7/27 = 0.259259259

P = sample proportion in the combined group = 14/46 = 0.304347826

N1 = sample size of the first group= 19

N2 = sample size of the second group= 27

N = sample size of the combined group= 46

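Z = (P1 - P2) / sqrt(P(1 - P)(1/N1 + 1/N2)) = 0.79226278

Z^2 = 0.627680312, which is the same as the chi-square statistic.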

Critical value of Z = 1.959963985

Critical region is |Z| > 1.959963985

Z does not fall in the critical region.

Null hypothesis cannot be rejected.

Conclusion: the sample does not provide sufficient evidence to conclude that the proportion of correct identifications differs between the two groups.

The proportion test is preferable to the chi-square test because it can be run as either a two-tailed or a one-tailed test. For a two-tailed test the two are equivalent, since Z^2 equals the chi-square statistic.

To answer the question “Can people really identify their favorite brand of cola?”, the suitable test is H0: p = 0.5 against H1: p > 0.5, where p is the proportion of people correctly identifying their brand in the combined group.

Then

P = 14/46 = 0.304347826

Z = (P - 0.5) / sqrt(0.5(1 - 0.5)/N) = -2.653955211

The critical region is Z >1.644853627

The test statistic does not fall in the critical region.

So we cannot reject the null hypothesis.

Conclusion: There is not enough evidence to believe that people really can identify their favorite brand of cola.
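The one-sample test can be sketched in the same way. The null value 0.5 used below is an assumption: it is the value that reproduces the reported statistic of -2.653955211 from 14 correct answers out of 46. The function name one_proportion_z is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, using the standard error under H0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# 14 of the 46 volunteers identified their favorite brand correctly.
# p0 = 0.5 is assumed here because it matches the reported statistic.
z = one_proportion_z(14, 46, 0.5)

alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value

print(round(z, 6), round(critical, 6), z > critical)
```

Because the sample proportion is below 0.5, the statistic is negative and cannot fall in a right-tailed critical region.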