Hypothesis Testing for Proportions

An example of a one-sample test for proportions:

We could run a test of hypothesis to see if our data for Red M&M’s actually agrees with the advertised amount.

In our experiment, we had an average of 10.7 Red per bag, or about 19% Red.

Variable |  Obs       Mean   Std. Dev.   Min   Max
---------+-----------------------------------------
     Red |   83   10.73494   3.679489      3    22
    pRed |   83   .1916867   .0651044    .05   .38
   Total |   83   56.06024    2.96051     44    63

Using the Steps in Hypothesis Testing:

1. State H0 (it ALWAYS has the = ) and HA (its sign depends on the question asked).

The null hypothesis is the ‘status quo’, so here it would be M&M’s advertised percent. According to their web site, this is 20%. Since we just want to check this, we can test:

H0: pRed = 0.20 vs. HA: pRed ≠ 0.20

2. Determine the appropriate α-level (depending on the consequence of Type I and II errors).

We’ll discuss Type I and II errors later. For now, let’s use the usual 5%. This means that if a sample proportion at least as far from 20% as ours would happen less than 5% of the time when the true proportion is 20%, then we’re going to claim the advertised amount is NOT 20%.

3. Determine the appropriate test and calculate a p-value (use Labs=>Calculating Tests of Hypotheses and the flowchart to determine which Case).

So far we’ve only talked about z-tests (we converted the sample proportion, p, to a z-score and then found the probability using the Z Table). There are many other types of tests that we will discuss soon.

For this particular type of data, we will be using Case 6, the normal approximation for proportions.

NOTE: Before running this test, you must verify that the conditions for the normal approximation hold. We did this on HW#4.5.

Case 6 says we need to give n and p. So, we put 0.05 in the box labelled alpha, 56 in the n box, 0.19 in the p box, 0.20 in the hyp value (hypothesized value, the number in H0) box, and finally click on Two-sided in the Test box. You just ignore the rest of the boxes because they aren’t used. The output and graph are:

Two-Sided Test for 0-1 proportion pi (approximate):

alpha = .05

Hypothesized value = .2

n = 56, p = .19

Z_calc = -.18708287

Critical values: -1.959964 , 1.959964

Fail to reject H_0

p-value = .85159566
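
The Case 6 calculation can also be checked by hand. Below is a minimal Python sketch of the normal-approximation test, assuming the standard error is computed from the hypothesized value (which reproduces the Z_calc shown above). The function names `normal_cdf` and `one_sample_prop_ztest` are made up for this illustration and are not part of the lab software.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sample_prop_ztest(p_hat, n, p0, alternative="two-sided"):
    """Return (z, p_value) for the normal-approximation test of one proportion."""
    # The standard error uses the hypothesized value p0, since we assume H0 is true.
    se = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / se
    if alternative == "two-sided":
        p_value = 2.0 * normal_cdf(-abs(z))   # both tails
    elif alternative == "less":
        p_value = normal_cdf(z)               # left tail only
    elif alternative == "greater":
        p_value = 1.0 - normal_cdf(z)         # right tail only
    else:
        raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")
    return z, p_value

# Same inputs as the boxes: n = 56, p = 0.19, hypothesized value = 0.20
z, p_value = one_sample_prop_ztest(p_hat=0.19, n=56, p0=0.20)
print(f"Z_calc = {z:.4f}, p-value = {p_value:.4f}")
```

The printed values should agree with the Z_calc and p-value lines of the output above, up to rounding.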

4. State the conclusion (if p-value ≤ α, reject H0; otherwise, fail to reject) in terms of the hypothesis (answer the question asked).

The p-value is given in the output (see the last line) and in the last line of the title of the graph.

Since the p-value = 0.852, which is NOT < α = 0.05, we cannot reject our null hypothesis. In other words, our data is quite consistent with the advertised percent of Red M&M’s. We cannot refute their claim.

Another example of a one-sample test for proportions:

What would happen if we only had one bag of M&M’s, and it was the bag of 60 with only 5 Red? Is this too few Reds? Are there really fewer Red M&M’s than the advertised amount?

Still using the Steps in Hypothesis Testing:

1. State H0 (it ALWAYS has the = ) and HA (its sign depends on the question asked).

Again, the null hypothesis is the advertised percent, 20%. Now, however, we want to know if the true proportion is really less than the stated amount, so we should test:

H0: pRed = 0.20 vs. HA: pRed < 0.20

2. Determine the appropriate α-level (depending on the consequence of Type I and II errors).

Let’s stay with the standard 5% α-level.

3. Determine the appropriate test and calculate a p-value (use Labs=>Calculating Tests of Hypotheses and the flowchart to determine which Case).

For this data, we will again use Case 6, the normal approximation for proportions.

NOTE: With only 5 Red out of 60, this sample barely has the count needed for the normal approximation.

Case 6 says we need to give n and p. So, we put 0.05 in the box labelled alpha, 60 in the n box, 0.083 in the p box, 0.20 in the hyp value (hypothesized value, the number in H0) box, and finally click on Left-sided in the Test box. The output and graph are:

Left-Sided Test for 0-1 proportion pi (approximate):

alpha = .05

Hypothesized value = .2

n = 60, p = .083

Z_calc = -2.2656953

Critical value: -1.6448536

Reject H_0

p-value = .01173502
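
The critical value and p-value in this output can be reproduced from the standard normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the variable names are chosen for this illustration only, and Z_calc is simply copied from the output above.

```python
from statistics import NormalDist

alpha = 0.05
z_calc = -2.2656953          # Z_calc copied from the output above

std_normal = NormalDist()    # mean 0, standard deviation 1

# For a left-sided test, the rejection region is everything below the
# point that cuts off an area of alpha in the left tail.
critical_value = std_normal.inv_cdf(alpha)

# The p-value is the area to the left of Z_calc.
p_value = std_normal.cdf(z_calc)

reject = z_calc < critical_value
print(f"critical value = {critical_value:.4f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Comparing Z_calc to the critical value and comparing the p-value to alpha are two ways of making the same decision, so they always agree.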

4. State the conclusion (if p-value ≤ α, reject H0; otherwise, fail to reject) in terms of the hypothesis (answer the question asked).

Since the p-value = 0.012, which IS < α = 0.05, we reject our null hypothesis and state that there is sufficient evidence to conclude that the true proportion of Red M&M’s, pRed, is actually LESS than 20%. We can say that pRed is statistically significantly less than 20%. In other words, our data disagrees with the advertised amount enough to dispute M&M’s claim.

Why is there a difference?

First, you need to think of what the significance level, α, means and what a hypothesis test actually does. Remember, we said there was a distribution of Red M&M’s, or the proportion of Red M&M’s in a regular size bag. The significance level, α = 5%, means that we will be ‘throwing out’ 5% of this distribution and will therefore be WRONG 5% of the time when the null hypothesis is actually true. In hypothesis testing, we assume that the center of our normal curve is the hypothesized value (here it’s 20%) and calculate where our data falls on this curve. We just happened to get a bag out there on the tail! Look at the interpretation of the p-value:

If the true proportion of Red M&M’s, pRed, is actually 20%, we would see 8.3% or fewer Red M&M’s only 1.2% of the time.

This sample wouldn’t happen very often, but it is still possible.
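
As a rough cross-check of this interpretation, we can count directly instead of using the normal curve. The sketch below is NOT the Case 6 calculation: it assumes each M&M in a bag of 60 is independently Red with probability 0.20 (a binomial model) and adds up the chance of getting 5 or fewer Red. The helper name `binom_cdf` is made up for this illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))

# Chance of 5 or fewer Red in a bag of 60 if the true proportion is 20%.
prob = binom_cdf(5, 60, 0.20)
print(f"P(5 or fewer Red out of 60) = {prob:.4f}")
```

Under these assumptions the exact probability comes out at roughly 1.2% as well, so the normal approximation and the direct count tell the same story here.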

An example of a two-sample test for proportions:

We could also test whether the proportions of Blue and Orange are the same as M&M claims. Our sample data says that there are only 4.9 Blue M&M’s and 6.6 Orange on average. Does this mean the true proportions are different?

Variable |  Obs       Mean   Std. Dev.   Min   Max
---------+-----------------------------------------
    Blue |   83   4.903614   2.588033      0    14
  Orange |   83   6.566265   3.151433      0    14
   pBlue |   83   .0872289   .0438488      0   .24
 pOrange |   83   .1177108    .055619      0   .26

Using the Steps in Hypothesis Testing:

1. State H0 (it ALWAYS has the = ) and HA (its sign depends on the question asked).

We don’t really care what the true proportions are; we just want to see whether they are the same or whether there really ARE fewer Blue M&M’s than Orange ones. So we test:

H0: pBlue = pOrange vs. HA: pBlue < pOrange

2. Determine the appropriate α-level (depending on the consequence of Type I and II errors).

Again, we’ll use the usual 5%.

3. Determine the appropriate test and calculate a p-value (use Labs=>Calculating Tests of Hypotheses and the flowchart to determine which Case).

Case 11 is the only case for testing two proportions, but we must again check the assumptions. (Remember, there was a question as to whether or not the Blue was normal.)

Case 11 says we need to give n1, n2, p1 and p2. Using 0.05 for alpha, we put 56 in the n1 and n2 boxes, 0.087 in the p1 box, 0.118 in the p2 box, 0 in the hyp value (if the two p’s are the same, their difference would be ZERO) box, and finally click Left-sided in the Test box. I actually entered 4.9/56 and 6.6/56 for p1 and p2, which is why the output shows more decimal places, but it works the same way.

Left-Sided Test for Difference of proportions pi1-pi2:

alpha = .05

Hypothesized value = 0

n1 = 56, n2 = 56, p1 = .0875, p2 = .11785714

Z_calc = -.52920749

Critical value: -1.6448536

Fail to reject H_0

p-value = .29833076
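
For anyone who wants to check this output by hand, here is a minimal Python sketch of a left-sided two-proportion z-test. It assumes the standard error is built from a pooled proportion (both samples combined under H0); that assumption is not stated in the lab software, but it reproduces the Z_calc shown above. The function names are made up for this illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sample_prop_ztest_less(p1, n1, p2, n2):
    """Left-sided z-test of H0: p1 = p2 vs. HA: p1 < p2 (pooled standard error)."""
    # Under H0 the two proportions are equal, so combine both samples
    # to estimate that single common proportion.
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    return z, normal_cdf(z)   # left-tail area is the p-value

# Blue is sample 1, Orange is sample 2, entered as 4.9/56 and 6.6/56.
z, p_value = two_sample_prop_ztest_less(p1=4.9 / 56, n1=56, p2=6.6 / 56, n2=56)
print(f"Z_calc = {z:.4f}, p-value = {p_value:.4f}")
```

The printed values should agree with the Z_calc and p-value lines of the output above, up to rounding.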

4. State the conclusion (if p-value ≤ α, reject H0; otherwise, fail to reject) in terms of the hypothesis (answer the question asked).

Since the p-value = 0.298, which is NOT < α = 0.05, we cannot reject our null hypothesis. In other words, our data is consistent with the statement that the proportions of Blue and Orange M&M’s are the same.

The interpretation of the p-value here is: If the true proportions of Blue and Orange are the same, then about 30% of the time we will see the proportion of Blue M&M’s at least this much smaller than the proportion of Orange.

So what if we wanted to check all of the colors and see if they agree with what M&M says it should be?

To test multiple proportions, we must run a χ² Test. There is an Excel file, ‘wstataq\sp01\Chi2oneway.xls’, to help. You just enter the hypothesized proportions and the actual (observed) counts, and it does all the calculations.

Example of One-Way Chi-Square Test
Null Hypothesis: pBrown = 0.30, pYellow = 0.20, pRed = 0.20, pGreen = 0.10, pBlue = 0.10, pOrange = 0.10

Color    | Ho: pi | Obs Cnt | Exp Cnt | (E-O)^2/E
---------+--------+---------+---------+----------
Brown    |  0.3   |   18    |  16.8   | 0.085714
Yellow   |  0.2   |   10    |  11.2   | 0.128571
Red      |  0.2   |   11    |  11.2   | 0.003571
Green    |  0.1   |    5    |   5.6   | 0.064286
Blue     |  0.1   |    6    |   5.6   | 0.028571
Orange   |  0.1   |    6    |   5.6   | 0.028571
Total    |  1     |   56    |  56     | 0.339286

chi-square = 0.339286
df         = 5
p-value    = 0.996838

Since the p-value = 0.997, it seems that our data agrees with M&M.

We use the χ² for this type of test because we’re calculating a different test statistic. Here, we’re looking at the difference between what happened (the observed counts) and what we would expect if the null hypothesis is true (the expected counts).

NOTE: Remember, since the proportion is just the count/total (p = x/n), the expected count is just the true proportion × total (x_exp = p × n).

If the difference is small (0 would be the minimum in absolute difference), then the null is plausible and the p-value is large. A large difference would only happen rarely, so if it is so large that the chance of it happening (the p-value) is small (< α), we reject the null.
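
The one-way chi-square calculation from the table above can be sketched in a few lines of Python. This is only an illustration, not a copy of the Excel file: the helper `chi2_sf` is a made-up name, and it computes the upper-tail chi-square probability for whole-number df with a standard recurrence, so nothing beyond the `math` module is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # exact tail for df = 1
    # Each pass raises the degrees of freedom by 2 until we reach df.
    while a < df / 2.0 - 1e-9:
        q += z**a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

hypothesized = {"Brown": 0.30, "Yellow": 0.20, "Red": 0.20,
                "Green": 0.10, "Blue": 0.10, "Orange": 0.10}
observed = {"Brown": 18, "Yellow": 10, "Red": 11,
            "Green": 5, "Blue": 6, "Orange": 6}

total = sum(observed.values())
chi_square = 0.0
for color, p in hypothesized.items():
    expected = p * total                       # expected count = proportion * total
    chi_square += (expected - observed[color]) ** 2 / expected

df = len(observed) - 1                         # number of colors minus 1
p_value = chi2_sf(chi_square, df)
print(f"chi-square = {chi_square:.6f}, df = {df}, p-value = {p_value:.6f}")
```

The printed chi-square, df, and p-value should agree with the last three lines of the table, up to rounding.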

There is also an Excel file, ‘wstataq\sp01\Chi2twoway.xls’ which will run the two way table in Chapter 6 of Mind on Statistics.