Stat 100 Project 5

Purpose: To explore confidence intervals for small samples, both when the standard deviation is known and when it must be estimated from the data, and to produce a confidence interval and perform a hypothesis test for comparing two population means.

Reading: In the text, section 8.4 for large-sample hypothesis tests, section 9.3 for small-sample hypothesis tests, and section 10.2 for confidence intervals and hypothesis tests for comparing two population means.

Turn in: Printouts of plots of the confidence intervals and the results of the hypothesis test, as well as the answers to the questions at the end of this assignment.

Instructions: Open the spreadsheet associated with this project and select the “Confidence Intervals” tab. We have set up two table lookups in the upper left-hand corner: one for the t-distribution and one for the z-distribution. Both are two-sided. Below these we have set up ten columns of random data. The data are drawn from a standard normal population and are computed by “=NORMSINV(0.9999*RAND()+0.00005)”, where the 0.9999 and the 0.00005 are included to work around a bug in Calc.
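The formula above is an instance of inverse-CDF sampling: a uniform random number is passed through the inverse of the standard normal cumulative distribution function. The following is a minimal Python sketch of the same idea, using only the standard library. The function name clamped_standard_normal is hypothetical and is not part of the spreadsheet.

```python
import random
from statistics import NormalDist

def clamped_standard_normal(rng):
    """Mimic =NORMSINV(0.9999*RAND()+0.00005) for a single cell."""
    u = rng.random()                 # uniform on [0, 1), like RAND()
    p = 0.9999 * u + 0.00005         # squeezed into [0.00005, 0.99995)
    return NormalDist().inv_cdf(p)   # inverse standard normal CDF, like NORMSINV

rng = random.Random(1)  # fixed seed so the sketch is repeatable
row = [clamped_standard_normal(rng) for _ in range(10)]  # one trial of ten values
```

Because the probability never reaches 0 or 1, the inverse CDF is never asked for an infinite value; the price is that every generated value lies within roughly 3.9 of zero.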

To the right, we have set up the sample mean and standard deviation of each row – each row is a trial. To the right of that, we have computed the error margins: the first column assumes we have calculated the standard deviation from the samples and comes from the t-distribution. The second column assumes that we know the standard deviation of the population (it is one) and uses the z-distribution. The population mean is given in the next column.
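As a rough illustration of the two error-margin columns, the Python sketch below computes both margins for one row of ten values at 95% confidence. It assumes the usual formulas (table value times standard deviation divided by the square root of the sample size), which may differ in detail from the spreadsheet's cells. The sample values are made up for illustration, and the constants are the standard two-sided 95% table values for 9 degrees of freedom and for the z-distribution.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_95_DF9 = 2.262  # two-sided 95% t-table value for 9 degrees of freedom (n = 10 only)
Z_CRIT_95 = 1.960      # two-sided 95% z-table value

def error_margins(sample, sigma=1.0):
    """Return (sample mean, t-based margin, z-based margin) for one trial of 10 values."""
    n = len(sample)
    t_margin = T_CRIT_95_DF9 * stdev(sample) / sqrt(n)  # sigma estimated from the data
    z_margin = Z_CRIT_95 * sigma / sqrt(n)              # sigma known (1 in this project)
    return mean(sample), t_margin, z_margin

# Illustrative values only; these are not taken from the spreadsheet.
sample = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4, -0.9, 0.2, 0.6, -0.4]
x_bar, t_margin, z_margin = error_margins(sample)
```

The z-based margin depends only on the sample size and the known standard deviation, while the t-based margin also depends on the sample standard deviation of the particular row.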

The two plots depict the confidence intervals for 40 trials under the two assumptions. Notice the differences between the charts as you set the confidence level (in the green cell, a2) to 0.1, 0.4, 0.8, 0.9, 0.95, 0.99, and 0.999 in turn. Notice two things. First, the t-distribution confidence intervals are generally longer, especially at the higher confidence levels. Second, the lengths of the t-distribution confidence intervals vary from trial to trial, while the z-distribution intervals all have the same length.

We have set up 1,000 trials for this experiment and counted the number of times that the confidence intervals cover the population mean. The average success rate is provided in the yellow table. Notice how the success rate is quite close to the specified confidence.
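The coverage experiment can be imitated outside the spreadsheet. The Python sketch below is an assumption-laden stand-in, not the spreadsheet's own method: it draws 1,000 trials of ten standard normal values, builds a 95% interval around each sample mean both ways, and counts how often the interval covers the true mean of zero. The function name coverage and the seed are hypothetical choices, and the default table values are valid only for 95% confidence with ten values per trial.

```python
import random
from math import sqrt
from statistics import mean, stdev

def coverage(trials=1000, n=10, t_crit=2.262, z_crit=1.960, seed=0):
    """Fraction of trials whose interval covers the true mean (0), for t and z intervals."""
    rng = random.Random(seed)
    t_hits = 0
    z_hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_bar = mean(sample)
        t_margin = t_crit * stdev(sample) / sqrt(n)  # standard deviation estimated
        z_margin = z_crit * 1.0 / sqrt(n)            # standard deviation known to be 1
        t_hits += abs(x_bar) <= t_margin             # interval covers 0?
        z_hits += abs(x_bar) <= z_margin
    return t_hits / trials, z_hits / trials

t_rate, z_rate = coverage()
```

Both rates should land near 0.95, with some trial-to-trial scatter, which mirrors what the yellow table reports.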

Now click on cell F7 (the “6”) and, holding down the shift and ctrl keys, press the down arrow and then the right arrow. This should highlight columns 6 through 10. Delete these columns. Now experiment with different confidence levels as before. What do you notice about the charts?

Switch to the “Output” tab and enter your name on the first chart.

Next, we will compute a confidence interval and perform a hypothesis test for comparing two population means. Refer to problem 17 of chapter 10 (page 405). We will work directly in the Output worksheet. (Calc users should remember to type a semicolon instead of a comma in the commands below.)

Type the title “Female” in cell a17 and enter the 8 corresponding observations in the A column below (cells a18 to a25). Type the title “Male” in cell b17 and enter the 11 corresponding observations in the B column (cells b18 to b28).

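Type “=average(a18:a28)” in cell a30. Copy this cell and paste it into cell b30. Type “Mean” in cell c30. (The range runs to row 28 so that the same formula works for both columns; blank cells are ignored.)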

Type “=stdev(a18:a28)” in cell a31. Copy this cell and paste it into cell b31. Type “Standard Deviation” in cell c31.

Type “=count(a18:a28)” in cell a32. Copy this cell and paste it into cell b32. Type “N” in cell c32.

Type “=b30-a30” in cell d18 and “x_bar – y_bar” in cell e18.

Type “=SQRT(((A32-1)*A31^2+(B32-1)*B31^2)/(A32+B32-2))” in cell d19 and “Pooled Standard Deviation” in cell e19. (You can copy and paste these if you wish.)

Type “=A32+B32-2” into cell d20 and “Degrees of Freedom” into cell e20.

Type “=TINV(0.05,D20)” into cell d22 and “Value from t-table, 95% two-sided” in e22.

Type “=D22*D19*SQRT(1/A32+1/B32)” into cell d23 and “error margin” in e23.

Type “=D18-D23” in cell d24, “=D23+D18” in cell e24 and “95% CI” in f24.
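The cells d18 through e24 can be checked against a short Python sketch of the same pooled two-sample calculation. The function name pooled_interval is hypothetical, the table value is passed in by hand because the Python standard library has no equivalent of TINV, and the two small samples are placeholders, not the observations from the textbook problem.

```python
from math import sqrt
from statistics import mean, stdev

def pooled_interval(x, y, t_crit):
    """Return (lower, upper, pooled_sd) for the difference mean(y) - mean(x)."""
    n_x, n_y = len(x), len(y)
    diff = mean(y) - mean(x)                       # mirrors =b30-a30
    pooled_sd = sqrt(((n_x - 1) * stdev(x) ** 2 + (n_y - 1) * stdev(y) ** 2)
                     / (n_x + n_y - 2))            # mirrors cell d19
    margin = t_crit * pooled_sd * sqrt(1 / n_x + 1 / n_y)  # mirrors cell d23
    return diff - margin, diff + margin, pooled_sd

# Placeholder data: 3 + 3 values give 4 degrees of freedom,
# and the two-sided 95% t-table value for 4 degrees of freedom is 2.776.
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 8.0]
lower, upper, pooled_sd = pooled_interval(x, y, t_crit=2.776)
```

For the assignment data, the table value would instead be the one the spreadsheet reports in cell d22 for the degrees of freedom in cell d20.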

Type “=D18/(D19*SQRT(1/A32+1/B32))” in cell d26 and “computed t-value” in e26.

You should recognize the formulas from the text in the three commands above.

Both the confidence interval and the computed t-value tell us the conclusion for this test, but Excel and Calc also have a simple way to compute the p-value. Type “=TTEST(A18:A25,B18:B28,2,2)” in cell d28 and “p-value” in e28.
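To see what a two-sided p-value represents, the Python sketch below takes a computed t-value and its degrees of freedom and returns the area in both tails of the t-distribution. It integrates the Student t density numerically with Simpson's rule; this is an illustrative approach chosen here, not a description of how Excel or Calc implement TTEST. The function names are hypothetical.

```python
from math import gamma, pi, sqrt

def t_density(x, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_value, df, steps=2000):
    """Two-sided p-value: 1 minus the area under the density between -|t| and |t|."""
    upper = abs(t_value)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    half_central_area = total * h / 3   # area between 0 and |t|
    return 1 - 2 * half_central_area    # the density is symmetric about 0

# A t-value equal to the 95% two-sided table value should give a p-value near 0.05.
p = two_sided_p_value(2.110, 17)
```

A p-value below 0.05 corresponds to a computed t-value beyond the 95% two-sided table value, and equivalently to a 95% confidence interval that excludes zero.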

Questions: Answer the following precisely and concisely.

1. Describe your observations regarding the differences between the z-distribution and t-distribution confidence intervals as seen from the graphs.

2. What was the cause of these differences?

3. Write the 95% confidence interval, the p-value, and state the conclusion from the hypothesis test concerning the weights of wolves.