Possible answers to 2011 AP Statistics Exam Free Response
1.
a. If the distribution of 40-yd times is indeed normal, then we'd expect about 2.5% of all times, in the long run, to be faster than 4.3 seconds (two standard deviations below the mean). However, our fastest time wasn't even close to this: it was 4.4 seconds. I have serious doubts that a normal model is appropriate.
b. This player’s lift is 2.4 standard deviations heavier than the mean lift for all players.
c. Here are the z-scores for each player’s performance:
Measure / Player A / Player B
Speed (better = faster) / 1.2 SDs faster than average / 0.2 SDs faster than average
Strength (better = heavier weights) / 2.4 SDs heavier than average / 2.6 SDs heavier than average
Total SDs in the "good direction" / 1.2 + 2.4 = 3.6 SDs better than average / 0.2 + 2.6 = 2.8 SDs better than average
So I pick Player A. His total is higher.
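The comparison in part (c) can be sketched in a few lines of Python. This is only an illustration: the dictionary keys and the function name `combined_score` are made up here, and weighting speed and strength equally is the assumption already built into the sum above.

```python
# Standardized scores from part (c). Speed is recorded as "SDs faster than
# average", so a larger number is better for both measures.
players = {
    "A": {"speed_sds_faster": 1.2, "strength_sds_heavier": 2.4},
    "B": {"speed_sds_faster": 0.2, "strength_sds_heavier": 2.6},
}

def combined_score(player):
    """Total SDs in the 'good direction' (assumes equal weight on each measure)."""
    return player["speed_sds_faster"] + player["strength_sds_heavier"]

# Round to one decimal place to hide floating-point noise in the sums.
totals = {name: round(combined_score(p), 1) for name, p in players.items()}
best = max(totals, key=totals.get)
print(totals, "-> pick Player", best)
```

A different weighting of speed versus strength could change the choice; the equal-weight sum is just the criterion used in this answer.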
2.
a. 48/200 = 0.24
b. P(Party Y) = 168/500 = 0.336, which differs from the conditional probability of 0.24 found in part (a). Therefore the two events are not independent.
c. If gender and party are independent in Lawrence Township, then the distribution of party for males and females should be identical. Those proportions can be found in the table of expected counts below, using the sample size from Franklin Township as an example.
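The independence check in part (b) amounts to comparing a conditional probability with the matching marginal probability. A minimal sketch, using only the counts quoted in parts (a) and (b); the variable and function names are illustrative:

```python
from fractions import Fraction

# Counts quoted in the answers above.
p_y_given_group = Fraction(48, 200)   # conditional probability from part (a)
p_y = Fraction(168, 500)              # marginal P(Party Y) from part (b)

def is_independent(conditional, marginal):
    """Two events are independent exactly when P(B | A) equals P(B)."""
    return conditional == marginal

print(float(p_y_given_group), float(p_y), is_independent(p_y_given_group, p_y))
```

Using exact fractions avoids any rounding question when the two probabilities are compared.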
3.
a) There are nine floors. I will use a random number generator to select two different integers from 1 to 9. Those floors are selected, and every apartment on those two floors will get new carpeting installed. This gives a sample of 8 apartments.
b) By using the stratified sample shown, we guarantee that 25% of our sample consists of apartments with kids and 75% consists of apartments with no kids, just as in the population of apartments. Thus, statistics calculated from stratified samples like this one will tend to vary less from sample to sample than statistics calculated from cluster samples.
Furthermore, in this particular situation, the stratified sample may be preferable to the cluster sample because it guarantees that apartments with kids are included. If we choose the cluster sample, it's possible that we select two floors with no "kiddie" apartments (like the 3rd floor and 4th floor). Then any statistic about carpet wear we record might not give us an estimate of carpet wear for apartments with kids.
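The cluster selection described in part (a) could be carried out as follows. This is a sketch, not a required method: it assumes four apartments per floor (inferred from 8 apartments on two floors), and the (floor, apartment) labels and the seed are hypothetical.

```python
import random

FLOORS = range(1, 10)        # nine floors, numbered 1 to 9
APARTMENTS_PER_FLOOR = 4     # assumption: 8 apartments / 2 floors

def cluster_sample(rng, n_floors=2):
    """Pick n_floors distinct floors at random and take every apartment on them."""
    floors = sorted(rng.sample(FLOORS, n_floors))  # sampling without replacement
    apartments = [(floor, apt)
                  for floor in floors
                  for apt in range(1, APARTMENTS_PER_FLOOR + 1)]
    return floors, apartments

rng = random.Random(2011)    # fixed seed only so the example is repeatable
floors, apartments = cluster_sample(rng)
print(floors, len(apartments))
```

Because `random.sample` draws without replacement, the two floors are guaranteed to be different, which matches "two different integers from 1 to 9".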
4. Setup:
H0: μ_D − μ_P = 0 versus Ha: μ_D − μ_P > 0, tested at the 1% significance level, where
μ_P = mean reduction in cholesterol for all middle-aged males with high cholesterol who take a placebo
μ_D = mean reduction in cholesterol for all middle-aged males with high cholesterol who take the cholesterol drug
Conditions check for a two-sample t-test for independent means:
Do we have independent groups? Yes. We have random assignment of each subject into only one of the two treatment groups. So results in group A are not paired with group B.
Are samples randomly selected? Yes, stated in the problem.
Is the underlying sampling distribution of the difference in sample means normal? Sample sizes are low, so we must check dot plots or normal probability plots for severe deviations from a normal model.
I observe no severe skewness or outliers.
Also, the normal probability plots for each group look roughly linear: they show no obvious departures from linearity.
So I'm convinced that the sampling distribution of the difference in sample means is roughly normal, and that a two-sample t-test can be run.
Conclusion: Our p-value is greater than 0.01. By this criterion, we fail to reject H0. We fail to find convincing evidence at the 1% level that the drug+exercise+diet produces a reduction in mean cholesterol levels greater than the reduction in mean cholesterol levels produced by the placebo+exercise+diet.
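For reference, here is a rough sketch of how a two-sample t statistic and its degrees of freedom can be computed from summary statistics. It assumes the unpooled (Welch) version of the test, which this answer does not specify, and the numbers passed in at the bottom are placeholders, not the exam data.

```python
import math

def welch_t(mean_drug, sd_drug, n_drug, mean_placebo, sd_placebo, n_placebo):
    """Unpooled two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    var_d = sd_drug ** 2 / n_drug          # estimated variance of the drug-group mean
    var_p = sd_placebo ** 2 / n_placebo    # estimated variance of the placebo-group mean
    se = math.sqrt(var_d + var_p)          # standard error of the difference in means
    t = (mean_drug - mean_placebo) / se
    df = (var_d + var_p) ** 2 / (
        var_d ** 2 / (n_drug - 1) + var_p ** 2 / (n_placebo - 1)
    )
    return t, df

# Placeholder summary statistics for illustration only (NOT the exam data).
t, df = welch_t(10.0, 2.0, 4, 7.0, 2.0, 4)
print(round(t, 4), round(df, 1))
```

The one-sided p-value would then be the area to the right of t under a t distribution with df degrees of freedom, read from a table or a calculator, and compared with 0.01.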
5.
a.
b. I will use the slope, and multiply by 15: .
c. 87.3%. This is r-squared, the percentage of the variation in electricity production that is explained by the linear relationship with wind speed.
d. Yes. The test of H0: β = 0 against Ha: β ≠ 0 shows a p-value of 0.000. So we reject the null and conclude that the slope of the least-squares linear model for predicting electricity production from wind speed, for all days this windmill is in production, is not zero. This implies an association between wind velocity and production for all days this windmill is in production.
6.
a. Setup: 99% CI for p = the proportion of all US 12th graders who can answer the question correctly.
Conditions:
Do we have a random sample of all 12th graders in US? Yes, stated.
Is the sampling distribution of p̂ normal? Since np̂ and n(1 − p̂) are both greater than 10, the sampling distribution of p̂ is roughly normal.
CI: p̂ ± z* sqrt(p̂(1 − p̂)/n) = (0.2682, 0.2918). We estimate, with 99% confidence, that between 26.82% and 29.18% of all US 12th graders could answer the question correctly.
b. The tree diagram is shown below:
c. P(answering correctly) = k + (1 − k)(1/4) = 1/4 + (3/4)k, where k is the proportion who actually know the answer.
d. Observe that p from part (a) is the same as P(answering correctly) from part (c). Since p = 1/4 + (3/4)k, we solve for k. In other words, k = (4p − 1)/3. In this context, we estimated p with p̂ = 0.28, the midpoint of the interval in part (a). So our estimate for k would be (4(0.28) − 1)/3 = 0.04. This statistic has a standard error equal to 4/3 the standard error of p̂. As a result, the width of our CI for k will be 4/3 of the width of our CI for p. So my 99% confidence interval for k is 0.04 ± (4/3)(0.0118) = 0.04 ± 0.0157, or (0.0243, 0.0557). Conclusion: We're 99% confident that between 2.43% and 5.57% of all US 12th graders actually know the answer to the question.
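As a check on part (d), the interval for k can be obtained by pushing the endpoints of the interval for p through the relationship between p and k. The sketch below assumes p = 1/4 + (3/4)k, which is consistent with the 4/3 factor used above; the function name `k_from_p` is illustrative.

```python
def k_from_p(p):
    """Solve p = 1/4 + (3/4) * k for k (assumes non-knowers guess with chance 1/4)."""
    return (4 * p - 1) / 3

# 99% confidence interval for p from part (a).
p_low, p_high = 0.2682, 0.2918

# k is an increasing linear function of p, so endpoints map to endpoints.
k_low, k_high = k_from_p(p_low), k_from_p(p_high)
width_ratio = (k_high - k_low) / (p_high - p_low)

print(round(k_low, 4), round(k_high, 4), round(width_ratio, 4))
```

The width ratio comes out to 4/3, matching the standard-error argument in part (d).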