In Class Quiz: An Excel – Based Interpretation of Confidence
By now, we’ve spent a great deal of time dealing with the ins and outs of confidence intervals (CI). Hopefully, you’ve begun to see just how important they can be (and, amazingly, how the news media, time and time again, ignores them).
Perhaps, though, the notion of “confidence” has eluded you a tad. For example, when we say we’re “95% confident”...what exactly does that mean? That’s what this quiz aims to do: shed light on the idea of confidence in a very concrete way, using Excel.
In this, we will approximate the average birth weight of an American baby (ABWAB), using the ideas of CIs. Please start by opening the spreadsheet “weights.xlxs” (the link next to the link from which you opened this page). You’re viewing the ABWABs of 20 randomly selected American newborns[1], along with a histogram of that sample. Let’s start by getting that mean and standard deviation filled in! =AVERAGE(B4:U4) in cell D8 and =STDEV(B4:U4) in cell D9.
Now – press key a few times. Each time, you get a new “sample of 20 babies”. You see how their weights change each time? The histogram, and s all change, as well.
From these data, I’d like you to construct some 95% CIs for the ABWAB, using Excel. We’ll use the “confidence” command to get the MOE of our sample of size 20. In cell D11, type this in:
- The first value, 0.05 is called the significance. It’s the decimal complement of the confidence, which is 95%.
- The second value is a cell reference to the sample standard deviation, which you placed in cell D9.
- The third number is the sample size.
Excel only needs these three values for a MOE calculation; as you might recall from class (or you might not. ), the formula is t .
Now that you have a mean and a MOE, you can construct the UCL and LCL’s! =D8-D11 in cell D12, and =D8+D11in cell D13. Cells D12 and D13 are now your 95% CI endpoints! Cool, huh?
Go ahead and press a few more times and see how everything changes. When you’re bored of doing that, I want you to write down some 95% CI’s!
- (8 points) Give metwenty 95% CI’s for the ABWAB. Press at least once after constructing each CI so that you get new baby data each time. Round values to the nearest hundredth’s place. Grab a piece of scrap paper to do this on.
OK...so you have 20 CI’s. Great! Now, we need to drive home what the term “95% confident” really means...
…it is “known” thatthe ABWAB is 7 pounds, 11 ounces. Take a look at your CI’s:
- (2 points) What percent of the CI’s that you constructed contain the value = 7.67?
(wait’ll you see what happens when we collate our data. )
It gets better...please click on the tab “Randomized /Results” (it’s down near the bottom of the workbook). Here, I’ve created 20 CI’s (identical, in construction, to the 20 you just created). However, instead of giving you the values of the endpoints, I’ve done the CI’s graphically.
You’ll also see a vertical line through 7.67; this line represents the value of . Do you see how many of
These CI’s contain ? Go ahead and press a few more times...when I did that, I got the frequency distribution at right. Check this out:- The mode of these values? 95%.
- The median of these values? 95%.
- The mean of these values? 95.5%.
80 / 1
90 / 3
95 / 8
100 / 8
Wowsers! This means that, if we repeatedly construct CI’s around randomly selected data sets, we can expect around 95% of them to contain the true population mean.
And that, my friends, is the definition of confidence!
One last point that you can ignore if you like: the calculation we did assumes a normal distribution. Now, I kind of forced that by how I constructed the data, but, if you were looking at the histograms, you may have noticed that, in a few, the data didn’t look normal. What amazes me is how well, even without the normal assumption, the t – interval captures the mean[2].
[1] Actually, what you’re looking at is a sample of 20 numbers I generated using the “known” values of and for the weights of newborns. By “known”, I mean that, after hundreds of millions of births, doctors have pretty much figured that they have the parameter values of and . I used Excel’s random number generator (and something called the Box – Muller normal approximation) to allow for natural fluctuations from baby to baby; every time you press ENTER, the randomizer will kick in, and you’ll get 20 new baby weights. If you’re interested, click in any of the weight cells to see the function.
[2] Of course, you have to use the entire CI...where the news media often screw up is simply reporting the center. Click on “Why Point Estimates Don’t work” to see why...when I refresh the data, I get 0%, 0%, 0%, 0%, 0%, 5%, 0%, 0%, 0%, 0%, 0%, 0%, 0%, 0%, 0%, 0%, 0%, 0%, 0% and 0% containing the mean. Way to go, CNN.