Type I and Type II Errors and Power

Statistics 215

Case Study: After receiving numerous requests from students, the Dining Hall is considering installing a new brick wood-fired pizza oven for baking genuine Neapolitan pizza. (Evidently, the only other location in Minnesota where one can get genuine Neapolitan-style pizza is Punch Pizza in St. Paul.) The oven is expensive and the Dining Hall intends to poll the student body to determine if there is at least 40% support for such a costly purchase. They intend to take a random sample of 25 students in order to determine student support.

1. Set up the appropriate hypotheses for a statistical test, clearly identifying the population and parameter.

2. What would be the consequences of making a Type I error? Of making a Type II error?

3. Draw a picture of the sampling distribution of the sample statistic (p̂) under the null hypothesis (if Ho is true). Find the standard deviation of the sampling distribution. Make sure to label the x-axis.
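The standard deviation asked for in question 3 can be checked with a short computation. (Python is not part of the handout; this is just a sketch using the case-study values p = 0.40 and n = 25.)

```python
import math

# Standard deviation of the sampling distribution of p-hat under H0,
# using the case-study values: p = 0.40, n = 25.
p0 = 0.40
n = 25
se = math.sqrt(p0 * (1 - p0) / n)
print(round(se, 3))  # prints 0.098, i.e. roughly 0.10
```

Note that 0.40 + 2(0.098) ≈ 0.60, which is where the 2.5% cutoff in question 4 comes from.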

4. The hypothesis test will be conducted at the α = 2.5% level. By using the 68-95-99.7 rule we see that this test will reject the null hypothesis if the observed value of the statistic (p-hat) is greater than 0.60. (Show this on your picture.) In other words, the decision rule for this test is:

Reject Ho if p̂ > 0.6;

Do not reject Ho if p̂ ≤ 0.6.

Since at this stage we are assuming that p = 0.40 and the null hypothesis is true, the rejection region to the right of 0.60 represents the region where we incorrectly reject the null hypothesis. That is, we commit a Type I error. Thus the area to the right of 0.60 is Prob(Type I error). Notice that we could decrease the probability of Type I error by moving the vertical line to the right, that is, by making a more stringent decision rule for when to reject Ho.
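The area to the right of 0.60 can also be found exactly rather than from the picture. A minimal sketch (not part of the handout): p̂ > 0.60 with n = 25 means more than 15 supporters, i.e. X ≥ 16, where X counts supporters and X ~ Binomial(25, 0.40) under Ho.

```python
import math

# Exact P(Type I error) for the rule "reject H0 if p-hat > 0.60",
# assuming H0 is true: p = 0.40, n = 25.
# p-hat > 0.60 means X > 15 supporters, i.e. X >= 16, X ~ Binomial(25, 0.40).
def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p0 = 25, 0.40
type1 = sum(binom_pmf(k, n, p0) for k in range(16, n + 1))
print(type1)  # a small probability, a bit under the nominal 2.5% level
```

The exact binomial tail comes out somewhat below 2.5%, since the 68-95-99.7 rule is only a normal approximation.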

5. Suppose that the Dining Hall staff (before they conduct their poll) suspects that in fact 50% of the student body would support the change. That is, they suspect that p = 0.50. We will look at the consequences of this. First of all, if that were in fact the case, then we should expect that when we conduct the hypothesis test, we will reject the null hypothesis (since it’s in fact false).

Is that, however, what will actually happen?

6. Here you will simulate the random survey and the hypothesis test under the assumption that there is 50% support for the change. Start your simulation at the line number given at the top of this handout. Take a sample of size 25, where each student has a 50% chance of SUPPORT and a 50% chance of NON-SUPPORT. Compute p̂ from your sample. Now use the decision rule given above to determine whether you will reject or not reject the null hypothesis.

For my survey, p̂ = ______, therefore I will [ reject / not reject ] the null hypothesis.
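The handout's simulation uses a random-number table; the same experiment can be sketched in Python, with `random.random()` standing in for the table (the seed value is arbitrary, chosen only so the run is reproducible):

```python
import random

# One simulated survey of n = 25 students, assuming the true support
# is p = 0.50 (question 6).  random.random() stands in for the
# random-number table the handout uses.
random.seed(215)  # arbitrary seed, so the run is reproducible
n = 25
sample = [random.random() < 0.50 for _ in range(n)]  # True = SUPPORT
p_hat = sum(sample) / n
decision = "reject" if p_hat > 0.6 else "do not reject"
print(p_hat, "->", decision, "H0")
```

Running this many times shows that most simulated surveys fail to reject Ho, even though Ho is false, which is the point question 7 develops.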

7. Let’s take a closer look at what’s going on. If the null hypothesis is false and p = 0.50, draw the sampling distribution of p̂. Draw a vertical line through 0.60. By our decision rule, if the observed value is to the left of the line, we do not reject Ho, and if it is to the right of the line, we do reject Ho. Mark your observed value on the horizontal axis.

The observed values to the left of the line represent outcomes where we do not reject the null hypothesis. But the null hypothesis is false, so these outcomes represent Type II errors, and P(Type II error) is the area to the left of the line (shade this region on your graph). Notice the inverse relationship between P(Type I error) and P(Type II error): if we decrease one, we increase the other.

8. The power of the test is defined as 1 – β = 1 – P(Type II error) = P(rejecting Ho when Ho is false), where β denotes P(Type II error). We would certainly like to reject Ho when it is false, so we want the power to be as large as possible. On your graph, the power is the area to the right of the vertical line. Many statistical studies, such as those done for the government, insist on a minimum power of 80%.
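The power in this problem can be computed exactly (a sketch, not part of the handout): with the true p = 0.50, the test rejects when p̂ > 0.60, i.e. when X ≥ 16 supporters, X ~ Binomial(25, 0.50).

```python
import math

# Power of the test when the true p is 0.50 (n = 25): the chance that
# p-hat lands above 0.60, i.e. X >= 16 with X ~ Binomial(25, 0.50).
def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p_true = 25, 0.50
power = sum(binom_pmf(k, n, p_true) for k in range(16, n + 1))
print(round(power, 3))  # prints 0.115 -- far below the 80% benchmark
```

So even when Ho is false, this test rejects it only about 11.5% of the time, which matches what the class simulation shows.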

9. The problem with the Dining Hall study is that there is not enough power to detect an actual p value of 50%. As we saw in the simulation, most of our tests failed to reject. How can we increase power? The most direct way is to increase the sample size. In many statistical studies, a big question is determining the appropriate sample size that will give sufficient power. That is a more advanced topic that we won’t cover in this class. But let’s look at what happens when we increase the sample size to 100. . . .
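The n = 100 comparison can be sketched as follows (again not part of the handout). For n = 25 we use the handout's cutoff of 0.60; for n = 100 the same α = 2.5% construction (Ho mean plus two standard errors, via the 68-95-99.7 rule) gives a cutoff of about 0.40 + 2√(0.40 · 0.60/100) ≈ 0.498.

```python
import math

# Exact power against p = 0.50 for a rule "reject H0 if p-hat > cutoff":
# sum the Binomial(n, 0.50) probabilities of every rejecting count.
def exact_power(n, cutoff, p_true=0.50):
    pmf = lambda k, p: math.comb(n, k) * p**k * (1 - p)**(n - k)
    k_min = math.floor(cutoff * n) + 1  # smallest count with p-hat > cutoff
    return sum(pmf(k, p_true) for k in range(k_min, n + 1))

print(round(exact_power(25, 0.60), 3))       # the n = 25 power from above
cutoff_100 = 0.40 + 2 * math.sqrt(0.40 * 0.60 / 100)
print(round(exact_power(100, cutoff_100), 3))  # noticeably larger power
```

Quadrupling the sample size raises the power substantially, though under this simple two-standard-error construction it still falls short of the 80% benchmark, illustrating why sample-size planning is its own topic.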