Project 1: A Comparison of Actual Probability Coverage of Classic Confidence Intervals to Bootstrap Confidence Intervals

Todd A. Brahler

The general form of a large-sample confidence interval for a population parameter, θ, is $\hat{\theta} \pm z_{\alpha/2}\,\sigma_{\hat{\theta}}$ (Devore, 2004). For estimating p (the proportion of successes in a population), the usual (i.e., classic) confidence interval is $\hat{p} \pm z_{\alpha/2}\sqrt{pq/n}$, in which q = 1 − p and the unbiased estimator of p is $\hat{p} = X/n$, where X is the number of successes in a random sample of n observations (Devore, 2004). However, the use of this confidence interval is problematic because p and q are unknown population parameters. To remedy this situation, $\hat{p}$ and $\hat{q} = 1 - \hat{p}$ are used inside the radical, which is justified under the Central Limit Theorem. According to Devore (2004), X has approximately a normal distribution for sufficiently large sample sizes. Since $\hat{p}$ is a constant (1/n) times X, $\hat{p}$ follows an approximate normal distribution as well. Thus, the use of the classic confidence interval for p has been readily accepted, provided that the normal approximation of $\hat{p}$ is justified by a large sample size.
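The classic interval described above can be computed in a few lines. The following is a minimal sketch, assuming the sample proportions are used inside the radical as the text describes; the function name `classic_ci` and the example counts (12 successes in 100 trials) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def classic_ci(successes, n, alpha=0.05):
    """Classic large-sample interval: p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n                        # sample proportion X / n
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 standard normal quantile
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical data: 12 successes in 100 observations, 95% interval.
low, high = classic_ci(successes=12, n=100)
print(f"95% classic interval: ({low:.4f}, {high:.4f})")
```

Nothing in this sketch checks whether n is large enough for the normal approximation; that judgment is left to the caller.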

A major problem with the aforementioned justification for using the classic interval involves the phrase “sufficiently large” with respect to sample size. Although it might not seem to be a big issue, determining whether a sample size is sufficiently large is not straightforward. For example, Devore (2004) stated that the sample size is large if $np \ge 10$ and $nq \ge 10$; McClave and Sincich (2009) recommended both $n\hat{p} \ge 15$ and $n\hat{q} \ge 15$; but many elementary textbooks declare that sample sizes satisfying $np \ge 5$ (i.e., Rule 1) or $npq \ge 9$ (i.e., Rule 2) are sufficiently large. This issue cannot be taken lightly, since a sufficiently large sample size is a critical condition for using the classic confidence interval (McClave & Sincich, 2009). In addition, the sample size could affect the actual probability coverage of the confidence interval (Wilcox, 2003). In theory, the coverage should be 1 − α, but Wilcox (2003) stated that the actual coverage will be “reasonably close to 1 − α if n is not too small and p is not too close to zero or 1” (p. 135). Since the experimenter has no control over the value of p, care must be taken to ensure that the proper sample size is selected. The question is how large a sample must be to be considered “sufficiently large.”
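To make the two textbook rules concrete, the sketch below computes the smallest sample size that satisfies each rule for a given target proportion. It rests on an assumption: that Rule 1 is np ≥ 5 and Rule 2 is npq ≥ 9, thresholds inferred from the sample sizes reported in Tables 1 and 2. The function names and the small floating-point tolerance are hypothetical.

```python
from math import ceil

TOL = 1e-9  # guards against floating-point noise such as 9 / 0.09 = 99.999...


def n_rule1(p, threshold=5.0):
    """Smallest n with n * p >= threshold (assumed form of Rule 1)."""
    return ceil(threshold / p - TOL)


def n_rule2(p, threshold=9.0):
    """Smallest n with n * p * (1 - p) >= threshold (assumed form of Rule 2)."""
    return ceil(threshold / (p * (1.0 - p)) - TOL)


for p in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(f"p = {p:.2f}: Rule 1 n = {n_rule1(p)}, Rule 2 n = {n_rule2(p)}")
```

The tabulated sample sizes are close to, but not always identical with, these minimums (for example, 33 rather than 34 at p = 0.15 under Rule 1), which suggests that some values in the tables were rounded to the nearest integer rather than rounded up.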

The issue surrounding sufficiently large sample sizes could be remedied by using a different method of estimation. One method that has become very popular is the percentile bootstrap method, which Wilcox (2003) described thoroughly. First, a random sample of size n is taken from the population of interest. A so-called “bootstrap sample” is then obtained by randomly sampling, with replacement, n observations from the random sample. The mean of the bootstrap sample, denoted by $\bar{X}^*$, is recorded. This process is repeated B times (typically, B = 1000), resulting in B bootstrap sample means. The means are sorted in ascending order (i.e., from lowest to highest), giving $\bar{X}^*_{(1)} \le \cdots \le \bar{X}^*_{(B)}$. The lower and upper bounds of the confidence interval are:

$(\bar{X}^*_{(L)},\ \bar{X}^*_{(U)})$

where $L = \alpha B/2$ and $U = B - L$. This interval, in theory, should contain the middle 100(1 − α) percent of the B bootstrap sample means. For example, if α = .05 and B = 1000, then L = 25 and U = 975, so the 25th and 975th sorted bootstrap sample means are the lower and upper bounds, respectively. Thus, 95% of the bootstrap sample means are contained between these two quantiles (Wilcox, 2003).
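The resampling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the project: the bounds are taken to be the L-th and U-th sorted bootstrap means (the 25th and 975th when B = 1000 and α = .05, as in the example above), and the function name, the seed, and the 0/1 example data are hypothetical.

```python
import random


def percentile_bootstrap_ci(sample, alpha=0.05, B=1000, rng=None):
    """Percentile bootstrap interval for the mean of `sample`."""
    rng = rng or random.Random()
    n = len(sample)
    # Draw B bootstrap samples of size n with replacement; keep each mean, sorted.
    boot_means = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(B))
    L = max(1, round(alpha * B / 2))  # assumed lower position (1-based)
    U = B - L                         # assumed upper position (1-based)
    return boot_means[L - 1], boot_means[U - 1]


# Hypothetical 0/1 data: 12 successes in 100 observations, so the mean is p_hat.
sample = [1] * 12 + [0] * 88
low, high = percentile_bootstrap_ci(sample, rng=random.Random(1))
print(f"95% percentile bootstrap interval: ({low:.3f}, {high:.3f})")
```

Because the data are coded 0 and 1, the bootstrap sample mean is simply the bootstrap sample proportion, so the same routine serves for estimating p.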

In theory, the probability coverage of the classic and bootstrap confidence intervals should be the same. However, what happens in practice does not always follow the theory. Thus, the purpose of the project was to compare the efficacy of these two techniques.
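A comparison of this kind can be sketched as a small simulation: generate many random samples from a population with known p, build both intervals from each sample, and count how often each interval contains p. The code below is a minimal sketch under assumptions (Bernoulli samples, bounds counted as inclusive, hypothetical function names and seed); it is not the program that produced the tables, and its output will not match them exactly.

```python
import random
from math import sqrt
from statistics import NormalDist


def classic_ci(sample, alpha):
    """Classic interval: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    n = len(sample)
    p_hat = sum(sample) / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def bootstrap_ci(sample, alpha, B, rng):
    """Percentile bootstrap interval from B resampled means."""
    n = len(sample)
    means = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(B))
    L = max(1, round(alpha * B / 2))  # assumed 1-based lower position
    U = B - L                         # assumed 1-based upper position
    return means[L - 1], means[U - 1]


def coverage(p, n, alpha=0.05, reps=1000, B=1000, seed=0):
    """Fraction of `reps` simulated samples whose interval contains p."""
    rng = random.Random(seed)
    hits = {"classic": 0, "bootstrap": 0}
    for _ in range(reps):
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        intervals = {
            "classic": classic_ci(sample, alpha),
            "bootstrap": bootstrap_ci(sample, alpha, B, rng),
        }
        for name, (low, high) in intervals.items():
            if low <= p <= high:  # a "hit": the interval contains the target p
                hits[name] += 1
    return {name: count / reps for name, count in hits.items()}


# Reduced reps and B keep the demonstration fast; the defaults are 1000 each.
result = coverage(p=0.25, n=20, reps=200, B=200)
print(result)
```

Each returned value is the number of hits divided by the number of replications, which corresponds to the "coverage" rows reported in the tables.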

Results

A summary of the probability coverage of the classic and bootstrap 95% confidence intervals is contained in Table 1 (see page six). The rows are labeled as follows: (a) “p” contains the target (i.e., population) proportions; (b) “n” contains the sample sizes calculated in accordance with Rules 1 and 2 cited above; (c) “bootstrap hits” contains the number of bootstrap confidence intervals that contain the target proportion; (d) “bootstrap coverage” is the percentage of bootstrap confidence intervals that contain the target proportion; (e) “classic hits” contains the number of classic confidence intervals that contain the target proportion; and (f) “classic coverage” is the percentage of classic confidence intervals that contain the target proportion. Using α = .05, we would expect 950 of the 1000 confidence intervals of each type to contain the target proportion. Except for two instances (both 94.2%), the actual coverage of the bootstrap confidence intervals was at or slightly above the theoretical 95%. The classic confidence intervals, on the other hand, fell short of the theoretical coverage in every case, ranging from 84.0% to 92.8%. However, coverage of the classic intervals did improve with the larger sample sizes for every target proportion except p = 0.1.

Table 2 on page six contains the actual coverage of the bootstrap and classic 99% confidence intervals. According to theory, 990 of the 1000 generated intervals should contain the target proportion. In the generated data, neither type of interval reached this theoretical coverage for any combination of p and n. However, the bootstrap confidence intervals had higher coverage overall.

p / 0.05 / 0.1 / 0.15 / 0.2 / 0.25
n (Rule 1) / 100 / 50 / 33 / 25 / 20
bootstrap hits / 957 / 962 / 954 / 960 / 953
bootstrap coverage / 95.70% / 96.20% / 95.40% / 96.00% / 95.30%
classic hits / 872 / 887 / 881 / 884 / 883
classic coverage / 87.20% / 88.70% / 88.10% / 88.40% / 88.30%
n (Rule 2) / 190 / 100 / 71 / 56 / 48
bootstrap hits / 957 / 953 / 942 / 942 / 951
bootstrap coverage / 95.70% / 95.30% / 94.20% / 94.20% / 95.10%
classic hits / 907 / 840 / 903 / 928 / 905
classic coverage / 90.70% / 84.00% / 90.30% / 92.80% / 90.50%

Table 1: Summary of 95% Confidence Intervals

p / 0.05 / 0.1 / 0.15 / 0.2 / 0.25
n (Rule 1) / 100 / 50 / 33 / 25 / 20
bootstrap hits / 965 / 965 / 971 / 971 / 971
bootstrap coverage / 96.50% / 96.50% / 97.10% / 97.10% / 97.10%
classic hits / 959 / 959 / 972 / 947 / 961
classic coverage / 95.90% / 95.90% / 97.20% / 94.70% / 96.10%
n (Rule 2) / 190 / 100 / 71 / 56 / 48
bootstrap hits / 986 / 970 / 982 / 978 / 982
bootstrap coverage / 98.60% / 97.00% / 98.20% / 97.80% / 98.20%
classic hits / 955 / 967 / 903 / 978 / 964
classic coverage / 95.50% / 96.70% / 90.30% / 97.80% / 96.40%

Table 2: Summary of 99% Confidence Intervals

References

Devore, J. L. (2004). Probability and statistics for engineering and the sciences (6th ed.). Belmont, CA: Brooks/Cole.

McClave, J. T., & Sincich, T. (2009). Statistics (11th ed.). Upper Saddle River, NJ: Pearson Education, Inc.

Wilcox, R. R. (2003). Applying contemporary statistical techniques. San Diego, CA: Academic Press.