SOLUTIONS LAB ACTIVITY 9
Activity 9.1 The term sampling frame refers to the group that actually had a chance to get into the sample. Ideally, this is the same as the population of interest, but sometimes it isn’t. In the following situation, describe the population, the sampling frame, the sample, the parameter of interest, and the statistic.
A Gallup Poll is done using random digit dialing to reach individuals in households with land-line telephones. The purpose is to estimate the proportion of U.S. adults who favor stronger gun control laws. One-thousand persons are sampled, and 63% favor stronger gun control.
a. Population = all U.S. adults
b. Sampling frame = adults in the households with land-line telephones
c. Parameter = proportion of U.S. adults favoring stronger gun control
d. Sample = the 1,000 surveyed
e. Statistic = 63%, the sample percentage favoring stronger gun control
Activity 9.2 Access the Class Survey dataset in the Datasets folder of the course web site. Clicking the link will open Minitab.
a. Use Minitab to tally the numbers and percents of students who do and do not smoke cigarettes regularly. (Stat>Tables>Tally) The variable Smoke Cigarettes (C10) contains student responses to whether they smoke cigarettes.
Smoke Cigarettes   Count   Percent
No                   209     92.48
Yes                   17      7.52
N = 226
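The percents in the tally can be reproduced from the counts alone. The following is a minimal sketch in Python rather than Minitab; the function name `tally_percents` is an illustrative assumption, and the counts are the ones reported above.

```python
# Counts taken from the Minitab tally above.
counts = {"No": 209, "Yes": 17}


def tally_percents(counts):
    """Return each category's share of the total as a percent (2 decimals)."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 2) for category, n in counts.items()}


percents = tally_percents(counts)
for category, n in counts.items():
    # Print one row per category: label, count, percent.
    print(f"{category:<5}{n:>6}{percents[category]:>9.2f}")
print(f"N = {sum(counts.values())}")
```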
b. Use Minitab to determine a 95% confidence interval for the proportion of all PSU students who regularly smoke. (Stat>Basic Statistics>1-Proportion, enter Smoke Cigarettes as the variable AND click Options and then check the box to use methods based on the normal curve.)
Write the confidence interval and then write a sentence that interprets the interval.
Variable           X     N   Sample p   95% CI                 Z-Value   P-Value
Smoke Cigarettes   17   226  0.075221   (0.040835, 0.109607)   -12.77    0.000
The 95% confidence interval is 4.08% to 10.96%. We are 95% confident that the true proportion of PSU students who regularly smoke is between 4.08% and 10.96%.
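For readers without Minitab, the normal-curve interval can be sketched in Python. This is a rough illustration, assuming the usual normal-approximation (Wald) formula p-hat ± z*sqrt[p-hat*(1 – p-hat)/n]; the function name `wald_interval` is hypothetical and not part of Minitab.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, level=0.95):
    """Normal-approximation confidence interval for a proportion.

    x is the number of "successes", n the sample size, and level the
    confidence level expressed as a fraction (0.95 for 95%).
    """
    p_hat = x / n
    # Two-sided multiplier, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = wald_interval(17, 226)
print(f"{low:.6f} to {high:.6f}")
```

With 17 smokers out of 226, the result should agree with the Minitab limits (0.040835, 0.109607) to about four decimal places.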
c. In December 2005, the U.S. Centers for Disease Control and Prevention estimated that 23.6% of Americans between the ages of 18 and 24 smoke cigarettes. Based on the interval computed in part (b), do you think 24% of PSU students smoke? Explain.
No, we do not believe that 24% of PSU students smoke. The entire 95% confidence interval (4.08% to 10.96%) lies well below 24%, so 24% is not a plausible value for the proportion of PSU students who smoke.
d. Use Minitab to determine a 90% confidence interval for the proportion of PSU students who smoke cigarettes. (Use the same Minitab method as for part (b) AND click the Options button to access a box where you can change the confidence level to 90.) Write the interval.
Variable           X     N   Sample p   90% CI                 Z-Value   P-Value
Smoke Cigarettes   17   226  0.075221   (0.046364, 0.104079)   -12.77    0.000
The 90% confidence interval is from 4.6% to 10.4%.
e. In general what is the relationship between the confidence level and the width of an interval?
As the level of confidence increases the interval gets wider. This makes sense since a wider interval allows for the capture of more possible values, thus making us “more confident”.
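The relationship between confidence level and width can be checked numerically. The sketch below assumes the same normal-approximation formula used in parts (b) and (d) and computes the interval width for the smoking data (17 of 226) at three confidence levels; `interval_width` is an illustrative name, and the 99% level is included only for comparison.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(x, n, level):
    """Width (upper minus lower limit) of the normal-approximation interval."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * sqrt(p_hat * (1 - p_hat) / n)


widths = {level: interval_width(17, 226, level) for level in (0.90, 0.95, 0.99)}
for level, width in widths.items():
    print(f"{level:.0%} confidence: width = {width:.4f}")
```

Only the multiplier changes from one level to the next; the sample proportion and standard error stay the same, so a larger multiplier directly gives a wider interval.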
f. Show how to calculate the 90% confidence interval determined in part (d) “by hand.” Use either the lecture notes or Example 10.6 on page 340 of the book for guidance.
p-hat ± M*sqrt[p-hat*(1 – p-hat)/n] = 0.075 ± 1.645*sqrt[(0.075*0.925)/226] = 0.075 ± 0.029 =
0.046 to 0.104, or 4.6% to 10.4%.
Activity 9.3 Continue to use the Class Survey data set. The variable named Try Weed (C22) contains responses to a question about whether students have tried marijuana.
a. Use Stat>Tables>Cross Tabulation and Chi-square to help you fill in a two-way table for the relationship between the variables Smoke Cigarettes and Try Weed with counts.
                 Smokes?
Try Weed?     No    Yes   Total
No           107      1     108
Yes          102     16     118
Total        209     17     226
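As a rough cross-check of the two-way table, the margins and the proportion who tried marijuana within each smoking group can be recomputed from the four cell counts. This is a sketch only; the dictionary layout and the helper names `column_total` and `prop_tried` are assumptions made for illustration.

```python
# Rows: tried marijuana (No/Yes). Columns: smokes cigarettes (No/Yes).
table = {
    "No": {"No": 107, "Yes": 1},
    "Yes": {"No": 102, "Yes": 16},
}


def column_total(table, smokes):
    """Number of students in one smoking group (a column total)."""
    return sum(row[smokes] for row in table.values())


def prop_tried(table, smokes):
    """Proportion who tried marijuana within one smoking group."""
    return table["Yes"][smokes] / column_total(table, smokes)


for smokes in ("No", "Yes"):
    print(f"Smokes = {smokes}: n = {column_total(table, smokes)}, "
          f"proportion tried = {prop_tried(table, smokes):.3f}")
```

The column totals (209 non-smokers, 17 smokers) are the "Number of Trials" values used in parts (b) and (c), and the "Yes" row supplies the "Number of events".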
b. Use Minitab to determine a 95% confidence interval for the proportion that tried marijuana in a population of students who smoke. TO DO THIS, again use Stat>Basic Statistics>1-Proportion BUT NOW click on Summarized Data and enter the sample size for number who smoke as “Number of Trials” and enter the number tried marijuana as “Number of events.” You might have to use the Options button to change the confidence back to 95.
What is the sample proportion that tried marijuana for the Smokes group? 16/17 = 94%
What is the 95% confidence interval? Enter 17 as the number of trials and 16 as the number of events. The 95% confidence interval is (0.829327, 1.000000), or 82.9% to 100.0%.
c. Determine a separate 95% confidence interval for the proportion that tried marijuana in a population of students who do not smoke. You can use Minitab.
CI for Do Not Smoke: Enter 209 as the number of trials and 102 as the number of events. This produces a 95% confidence interval of (0.420271, 0.555806), or 42.03% to 55.58%.
d. Use the confidence intervals found for those that smoke (part b) and do not smoke (part c) to make a generalization about the student smoking population with regard to the proportion that tried marijuana.
Since the two intervals do not overlap and the interval for the smoking group (part b) lies entirely above the interval for the non-smoking group (part c), we can conclude that there is a relationship between smoking cigarettes and trying weed in the population of PSU students: a higher proportion of smokers than non-smokers have tried marijuana.
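The overlap comparison can be written as a small check. The sketch below uses the interval limits reported by Minitab in parts (b) and (c); `intervals_overlap` is a hypothetical helper name. Comparing two separate intervals this way is an informal check, not a formal test of the difference between two proportions.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a = (low, high) and b = (low, high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# 95% confidence limits reported by Minitab in parts (b) and (c).
smokers = (0.829327, 1.000000)
non_smokers = (0.420271, 0.555806)

if intervals_overlap(smokers, non_smokers):
    print("Intervals overlap: no clear difference from this check.")
else:
    print("Intervals do not overlap: the two proportions appear to differ.")
```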
e. Which of the two intervals is narrower (in terms of difference between lower and upper values)? Why do you think this particular interval is the narrower of the two?
The interval for the non-smoking group (part c) is narrower. This is primarily a result of the much larger sample size of 209 for non-smokers compared to 17 for smokers.
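f. Show how to calculate “by hand” the 95% confidence interval found in part (b) for the smokers.
p-hat ± M*sqrt[p-hat*(1 – p-hat)/n] = 0.94 ± 1.96*sqrt[(0.94*0.06)/17] = 0.94 ± 0.113 =
0.827 to 1.053, or 82.7% to 105.3%. Because a proportion cannot exceed 100%, Minitab reports the upper limit as 100%. The small difference from Minitab's lower limit of 82.9% comes from rounding p-hat to 0.94.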
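The by-hand calculation, including the cap at 100%, can be sketched as follows. This assumes the normal-approximation formula with a multiplier of 1.96 and simply clamps the limits to the range 0 to 1; the function name `wald_interval_clamped` is illustrative, and the clamping only mirrors the behavior seen in the Minitab output above rather than documenting Minitab's internals.

```python
from math import sqrt


def wald_interval_clamped(x, n, z=1.96):
    """Normal-approximation interval with limits kept inside [0, 1]."""
    p_hat = x / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    # A proportion cannot be below 0 or above 1, so clamp both limits.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


low, high = wald_interval_clamped(16, 17)
print(f"{low:.3f} to {high:.3f}")
```

Using the unrounded p-hat = 16/17 gives a lower limit close to Minitab's 0.829 rather than the 0.827 obtained with p-hat rounded to 0.94.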
Activity 9.4 To get a better understanding of the Central Limit Theorem as discussed in the lecture notes you can visit and review a simulation program at:
http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html