Slide 1

Statistical Sampling Part III – Confidence Intervals

Slide 2

This video is designed to accompany pages 41-76 of the workbook “Making Sense of Uncertainty: Activities for Teaching Statistical Reasoning,” a publication of the Van-Griner Publishing Company.

Slide 3

Let’s take a look at a CBS News/New York Times poll concerning gun laws. Here is what the article says:

As the president outlined sweeping new proposals aimed to reduce gun violence, a new CBS News/New York Times poll found that Americans back the central components of the president's proposals, including background checks, a national gun sale database, limits on high capacity magazines and a ban on semi-automatic weapons. Asked if they generally back stricter gun laws, more than half of respondents - 54 percent - support stricter gun laws …. That is a jump from April - before the Newtown and Aurora shootings - when only 39 percent backed stricter gun laws but about the same as ten years ago.

(Emphasize the underlined part)

This poll was conducted by telephone from January 11-15, 2013 among 1,110 adults nationwide. Phone numbers were dialed from samples of both standard land-line and cell phones. The error due to sampling for results based on the entire sample could be plus or minus three percentage points.

In what sense does this statement about sampling error make us better informed?

Slide 4

Our goal is to develop a mature understanding of what sampling error is. To do this, we will need to become familiar with some of the mathematical theory on which statistical science is built.

“Sampling error” or “error due to sampling” is also commonly referred to as “margin of error.” You may well have heard about margin of error a long time ago, but even if you did, there are likely things here for you to learn. Indeed, the well-known Harris Poll organization claimed, in a November 2007 article on BusinessWire, that “‘Margin of Error,’ When Used by Pollsters, Is Widely Misunderstood and Confuses Most People.”

Those are pretty strong words coming from a polling organization. The fact is, however, a solid understanding of margin of error requires us to embark on a journey that is, in many ways, much more subtle than most people suspect.

Slide 5

As the King said to the White Rabbit: “Begin at the beginning and go on till you come to the end: then stop.”

For our beginning, we need to acknowledge a fundamental problem. Another, equally well-chosen sample of size 1,110 adult Americans, asked the same question, would almost surely not yield the same sample percentage (54%) who favor stricter gun laws.

Nor do we know if that would even be a good thing since we have no way of knowing for sure if the true proportion of all adult Americans who favor stricter gun laws – the parameter of interest – is even close to 0.54.

Slide 6

The variability seen in a statistic from sample to sample is called “sampling variability.”

We might ask: if a sample statistic is going to vary from sample to sample, then which one is the correct one? As long as every one of the samples was taken correctly – as a probabilistic sample – then the answer is that all of the statistics are correct.

A better question would be “how can we do a decent job of even estimating the parameter of interest in the face of this kind of variability?”

The answer to this last question is the essence of elementary statistical inference.

Slide 7

Meaningful, useful parameter estimation depends largely on the following fact: with SRS-like samples, the variability in a statistic from sample to sample is predictable, in some senses “known in advance” for all such samples. Think about this some. It is a mathematical statement in disguise, but a critical statement nonetheless.

With samples that are not SRS-like – really, not probabilistic samples – such as convenience samples, then sampling variability is not predictable and, as such, even elementary parameter estimation is hard to do with any integrity.

Slide 8

In what sense is the sampling variability that emerges from SRS-like samples predictable? Understanding the answer depends on you remembering what a histogram is. If you don’t recall, you may want to pause this video and do a quick web search to remind yourself.

Here is what we know. Suppose we were to take many different samples of size n, and for each sample record the statistic of interest, say the proportion in the sample who said “yes” when asked “Are you in favor of same-sex marriages?” If we were to plot all of those sample proportions in a simple histogram then we would be guaranteed the following. The histogram would be bell-shaped and the peak of the histogram would be above the parameter from the population.

Think about this. It is much more subtle than it often appears the first time it is encountered. The upshot is not a recommendation, in any sense, to take many different samples of size n. However, it is a powerful piece of mathematical insight. It says: if we were (emphasized) to take many different samples of size n and look at the sampling variability by way of a simple histogram, then the variability would not be erratic. It would follow the shape of a bell, and, moreover, the peak of the bell would be above the population parameter.

This histogram is called a “sampling distribution.”

Slide 9

Let’s have another look at sampling distributions. Suppose you took an SRS of 80 people, asked “Are you in favor of same-sex marriages?”, and recorded the proportion who said “yes.” Suppose, also, that you took 24 more samples of size 80 and did the same thing. In the end you would have 25 sample proportions.

A histogram of those 25 proportions would look something like this. It would be bell-shaped and peak above the parameter in the population. In this case, the parameter appears to be about 2/3.
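If you would like to see this behavior for yourself, the short Python sketch below is an illustration added to this script, not something from the workbook. It assumes, purely for the simulation, that the true parameter p is 2/3, draws 25 SRS-like samples of size 80, and plots a histogram of the 25 sample proportions.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=1)
p = 2 / 3          # true proportion, assumed known only so we can simulate
n = 80             # size of each sample
num_samples = 25   # number of repeated samples

# Each sample proportion is the count of "yes" answers divided by n.
sample_proportions = rng.binomial(n, p, size=num_samples) / n

plt.hist(sample_proportions, bins=10, edgecolor="black")
plt.axvline(p, linestyle="--", label="true proportion p = 2/3")
plt.xlabel("sample proportion")
plt.ylabel("count")
plt.legend()
plt.show()

With only 25 samples the bell is a bit ragged; raising num_samples to a few thousand makes the shape, and the peak near 2/3, unmistakable.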

A word of caution is in order. It is true that constructing a sampling distribution would be a way of getting a handle on what the parameter really is. But this is never how the parameter is actually estimated. Rather, what statistical inference does is to take advantage of the knowledge that the sampling distribution has to look this way, if one were to take the time and undergo the expense to construct it, and uses that knowledge to make meaningful statements about the parameter just based on the one sample of size n that was taken. That’s what is awesome about elementary inference.

Slide 10

What kind of mathematical knowledge comes from knowing that a sampling distribution has a particular bell shape that peaks above the parameter? Let’s answer that by speaking specifically to the sampling distribution of a sample proportion. We will use the letter “p” to denote the true, unknown population proportion of interest, and “phat” (a p with a small hat over it) to denote a prototypical sample proportion.

Mathematics guarantees us that p will be right under the peak of the bell curve. Moreover, one can comment on how likely it is that a particular phat will be in any particular range of possible values on the horizontal axis, under the bell. A few such ranges are easy to describe. For example: 95% of all phat values based on a simple random sample of size n will fall between p − 1/√n and p + 1/√n (that is, within 1 over the square root of n of p, on either side). 68% will fall between p − 1/(2√n) and p + 1/(2√n). And 99.7% of all phat values will fall between p − 3/(2√n) and p + 3/(2√n).

This may be a little hard to appreciate at first glance because it may be unclear how to operationalize the knowledge. Notice, however, that these very specific probability statements can be made regardless of what p is. You only needed to know n and that your sample was SRS-like.

For example, if n is 100 then 1/√n is 0.10. So regardless of what p is in the application at hand, you know in advance that about 95% of all possible sample proportions you could observe in samples of size n have to be within 0.10 of p, on either side of p.
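To see that arithmetic in one place, here is a minimal Python sketch (an illustration, not part of the workbook) that evaluates those approximate widths for a few sample sizes. Notice that p appears nowhere in the calculation, which is exactly the point.

from math import sqrt

for n in (100, 400, 1110):
    half_width = 1 / (2 * sqrt(n))   # approximate standard deviation of a sample proportion
    print(f"n = {n:5d}: 68% of phats within {half_width:.3f} of p, "
          f"95% within {2 * half_width:.3f}, 99.7% within {3 * half_width:.3f}")

For n = 100 the 95% width is 0.10, matching the example above, and for n = 1,110 it is about 0.03, which previews the gun-law poll discussed below.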

This is profound but still has to be operationalized into the idea of a confidence interval.

Slide 11

The well-known margin of error can now be more carefully defined. If you are estimating a proportion p, using the sample proportion from an SRS-like sample, then a 95% margin of error is roughly 1/√n. The “95%” is often suppressed in the media and just the words “margin of error” are used. A 99.7% margin of error would be 3/(2√n).

What is the role of this percentage? The MOE is operationalized in what is called a “confidence interval,” and the percentage being used is the “confidence.” For example, a 95% confidence interval is a range in which you have 95% confidence of having captured p. The range is formed by constructing phat plus or minus the 95% margin of error, or phat ± 1/√n.

The correct interpretation of such an interval is very important to us and we will pay close attention to that below. We will also make sure we understand better how this interval comes from the sampling distribution. But first let’s practice constructing one.

Slide 12

Recall the survey on gun laws. The survey was of 1,110 adults and reported a sampling error of +/- 3 percentage points. Where did this come from?

(Hit Return)

Since the sample size is 1,110 and the (unstated) confidence is 95%, we should be able to reproduce the 3% by forming 1 divided by the square root of 1,110. Notice that this works: the result is about 0.03. In fact, a 95% confidence interval would be 0.54 plus or minus 0.03.
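Here is that calculation spelled out in a few lines of Python, purely as a check on the poll’s reported numbers:

from math import sqrt

n = 1110      # sample size from the CBS News/New York Times poll
phat = 0.54   # reported proportion favoring stricter gun laws

moe_95 = 1 / sqrt(n)                  # approximate 95% margin of error, about 0.030
lower, upper = phat - moe_95, phat + moe_95
print(f"95% margin of error: {moe_95:.3f}")
print(f"95% confidence interval: {lower:.3f} to {upper:.3f}")   # about 0.510 to 0.570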

Slide 13

A confidence interval for a population proportion based on a sample proportion from an SRS-like sample can be computed for any level of confidence. Here is the formula for eight common levels. To use this, just identify the confidence level (say 70%) and then take the corresponding z* value (say 1.04), divide it by 2 and multiply it by 1 over the square root of n. This will be the margin of error corresponding to that confidence level. From there the confidence interval is easy to form as phat plus or minus the margin of error.
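The table on the slide is not reproduced in this script, so the sketch below uses the usual standard normal z* values for eight common confidence levels; only the 70% entry (1.04) is confirmed by the slide itself. It is a minimal Python illustration of the recipe just described: the margin of error is z* divided by 2, times 1 over the square root of n.

from math import sqrt

# z* values assumed here to be the usual standard normal quantiles; the slide
# confirms only the 70% entry (1.04).
Z_STAR = {0.50: 0.67, 0.60: 0.84, 0.70: 1.04, 0.80: 1.28,
          0.90: 1.64, 0.95: 1.96, 0.98: 2.33, 0.99: 2.58}

def confidence_interval(phat, n, level=0.95):
    """Confidence interval for a proportion from an SRS-like sample of size n."""
    moe = (Z_STAR[level] / 2) * (1 / sqrt(n))   # margin of error for the chosen level
    return phat - moe, phat + moe

# Example: a 70% confidence interval for the gun-law poll figures.
print(confidence_interval(0.54, 1110, level=0.70))   # roughly (0.524, 0.556)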

Slide 14

Remember, if the sample is not an SRS-like sample, then the formulas we’ve seen for the margin of error and corresponding confidence intervals don’t apply. If they are used anyway, then you have created an assessment of where p might be that is not only meaningless, but deceptively so.

Slide 15

Finally, it is important to understand the correct interpretation of a confidence interval. It is common to say that a 95% confidence interval is a range in which there is a 95% chance that the true, unknown parameter p will be. That’s not quite the case, and we don’t want to settle for an interpretation that loose.

To access the correct interpretation we need to revisit our bell curve. This sampling distribution describes where the sample proportion phat is likely to fall. Now think of phat as being the center of an interval whose width is two times a specified margin of error. This is precisely what a confidence interval is (for example, phat plus or minus 1/√n in the 95% case). But if you think of this center, the phat, as being the random part of the interval, then the sampling distribution tells you how likely you are to have an interval like this one cover the parameter p. That is, as long as phat is not more than 1 over the square root of n on either side of p, then the interval will be wide enough to cover p. But the sampling distribution tells you that 95% of all phats will be that close to p. Similarly for all other levels of confidence.

Slide 16

So what is the correct interpretation of a confidence interval? Have a look at the animated graphic shown here.

(Hit return twice)

Let’s suppose we are taking SRS-like samples from a population where, unbeknownst to us, 79% of the population would say “Yes” to the question “Are you in favor of gay marriage?” The parameter is indicated by the dashed vertical line segment. The horizontal line segment is just a part of the number line showing the values of possible sample proportions.

(Return)

Let’s suppose you take a sample of size n, ask the question above, and find that about 90% in that sample say “Yes.”

(Return)

Now suppose you compute the corresponding margin of error and create a simple 95% confidence interval around that sample value. Notice that the interval does not contain the parameter.

(Return)

What about other samples of size n? They might all yield different phat values.

(Return)

And as such they might all yield distinct confidence intervals, though all the intervals will be the same width of course. Notice that some of those intervals contain the parameter and some don’t.

(Return)

The fact is, if the intervals were 95% confidence intervals, then about 95% of the samples of size n would produce intervals that would contain the parameter. Think about this. This is true even when you (obviously) don’t know the parameter. What’s tricky is that you don’t know if the interval you have computed or are reading from the news clip contains the parameter or not. But in the careful sense mentioned above you can quantify the chance that it does.

So the correct interpretation of a K% confidence interval based on a sample of size n is as follows: about K% of all samples of size n will produce confidence intervals that contain the parameter.
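That interpretation can also be checked with a small simulation. The Python sketch below is an illustration only: it assumes, just for the simulation, that the true proportion is 0.54 and the sample size is 1,110 (the gun-law poll figures), draws many SRS-like samples, forms the rough 95% interval phat plus or minus 1 over the square root of n from each, and counts how often the interval contains p.

import numpy as np

rng = np.random.default_rng(seed=2)
p = 0.54              # the parameter, unknown in practice, assumed here so we can simulate
n = 1110              # sample size
num_samples = 10_000  # number of repeated samples
moe_95 = 1 / np.sqrt(n)   # approximate 95% margin of error

phats = rng.binomial(n, p, size=num_samples) / n
# The interval phat +/- moe_95 contains p exactly when phat lands within moe_95 of p.
covered = np.abs(phats - p) <= moe_95
print(f"Fraction of intervals containing p: {covered.mean():.3f}")   # close to 0.95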

Slide 17

This concludes our video on confidence intervals. Remember, simple formulas are available for the margin of error and associated confidence intervals, provided the data were collected by a simple random sample, or in a similarly statistically sound fashion.