Revision of Preparing for the AP Statistics Exam

Revision of Preparing for the AP Statistics Exam, Chapter 2

Sec / Pg / Prob / Change / Answer Change
2 / 15 / MC1 / Answer (E)  It is bell-shaped.
change stem to “always” true?
I still wonder if one can argue for both B and C? / None
15 / MC2 / Answer (E)  is effective for displaying center and spread but not for displaying shape.
In (A), change “can be” to “are generally” / None
MC3 / Could D also be considered a correct answer?
15 / MC5 / Answer (E)  It is possible to have a box plot with whiskers but no box. / None
MC 6 / Why the word “correct” in the stem?
4 / 25 / MC1 / Answer (D)  mound-shaped
Answer (E)  bimodal
Change stem to “could best be described” / None
Maybe rearrange order of questions and make it more clear shape is known but believe the numbers are as such?
25 / MC4 / New problem – if too long, drop MC6. MC6 is one of my personal favorites, but I would change the distractors, e.g., 9, cannot be determined, …
A chemistry teacher announces that he always curves his tests according to the Empirical Rule with A’s and F’s outside 2 standard deviations of the mean, B’s and D’s between 1 and 2 standard deviations from the mean and C’s within 1 standard deviation of the mean. Which of the following is a valid statistical concern regarding this policy?
(A) Approximately 2.5% of students will merit A’s.
(B) Approximately 13.5% of students will merit B’s.
(C) Approximately 68% of students will merit C’s.
(D) This procedure only works when the mean of the raw scores is greater than the standard deviation of the raw scores.
(E) This procedure only works when the distribution of the raw grades is mound-shaped and symmetrical. / (E) The Empirical Rule only applies to mound-shaped and symmetrical distributions.
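To illustrate why (E) is the concern, here is a minimal Python sketch, not part of the book. It computes the share of students in each grade band when raw scores are exactly normal, then the share within 1 standard deviation of the mean for a strongly skewed distribution. The exponential distribution and the function names are my own assumptions, chosen only for illustration.

```python
import math
from statistics import NormalDist


def normal_grade_shares():
    """Share of a normal distribution in each band of the curving policy."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    tail = 1 - z.cdf(2)                # more than 2 SDs above the mean (A's); same below (F's)
    middle_band = z.cdf(2) - z.cdf(1)  # between 1 and 2 SDs above (B's); same below (D's)
    center = z.cdf(1) - z.cdf(-1)      # within 1 SD of the mean (C's)
    return {"A": tail, "B": middle_band, "C": center, "D": middle_band, "F": tail}


def exponential_share_within_one_sd():
    """Share of an exponential(rate=1) distribution within 1 SD of its mean.

    Its mean and SD are both 1, so the interval is (0, 2) and the share is 1 - e^(-2).
    """
    return 1 - math.exp(-2)


shares = normal_grade_shares()
for grade, share in shares.items():
    print(f"{grade}: {share:.1%}")
print(f"Skewed example, within 1 SD: {exponential_share_within_one_sd():.1%}")
```

For the normal case the shares come out near 2.3%, 13.6% and 68%, in line with choices (A) through (C). In the skewed example roughly 86% of scores fall within 1 standard deviation, so the stated percentages no longer hold.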
5 / 29 / MC3 / Change stem:
If a score of 75 is in the 30th percentile in a data set of 150 scores on a test, what is the percentile position of a score of 100? / None
29 / MC4 / Change question in the stem:
Which of the following is the best estimate of the number of points that he would have to improve his score on the second attempt in order to better his performance on the first test? / None
Assess / 33 / Add “Leaf = 1.0” to the graph in the stem.
33 / MC9 / New problem:
Which of the following is generally true about a highly skewed-right distribution?
(A) The median is less than the mean.
(B) The median is greater than the mean.
(C) The median is approximately equal to the mean.
(D) Whether the mean or median is larger depends on the degree of skewness in the distribution.
(E) Whether the mean or median is larger depends on the value of the standard deviation of the distribution. / (A) Since the median is resistant to extreme values, and the mean is not, the mean tends to move toward extreme values in the distribution.
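A quick check of the reasoning in this new problem, as a minimal Python sketch. The data values are hypothetical and chosen only to be skewed right.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, with a long right tail.
right_skewed = [1, 2, 2, 3, 3, 4, 5, 9, 20]

# Same data with the largest value made far more extreme.
more_extreme = right_skewed[:-1] + [200]

for label, data in [("original", right_skewed), ("more extreme", more_extreme)]:
    print(f"{label}: mean = {mean(data):.2f}, median = {median(data)}")
```

The mean is pulled toward the long right tail and exceeds the median, while the median does not move when the largest value becomes more extreme.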

p. 8, paragraph 2, do you want to add “exploration of univariate quantitative data”?

p. 8, paragraph 3, I wonder if the example could be fleshed out a bit to be clear that the “individuals” are your children and the variable is “age of the child” and then the possible values for that variable are {…} – do these values need to be updated?

p. 8, paragraph 4, does the “class of 2003” need to be updated?

p. 8, paragraph 6, I would drop “number of males or females” as a variable (as it is not a “per student” variable)

p. 8, end paragraph 7, want to give a “since,” e.g., since the results can take any value between about xx and xx inches or something?

p. 8, paragraph 8, how about “or categorical data are data whose outcomes describe a characteristic of the individuals”?

p. 9, just before exercises, want to say “are restricted to the formal study of only…”

p. 9, review exercises, is “ages” clearly discrete or continuous? I would have thought “number of years” and called it discrete…

I would drop or reword question 4 since “number of teachers” is not the variable. Similarly, in number 5 it’s not clear what you are treating as the variable… Students so often confuse the numerical summary of the data with the variable that I would like not to give them any encouragement in doing so…

p. 10, example 1, the context seems a little odd, why do they have so many books? Could say something like number of cell phone calls they received in the last 24 hours, but I have no idea what that distribution would look like.

p. 10, bottom, a little strange that the intervals are not all the same width? Do you need the “tally” column again?

p. 11, I would consider dropping the note about 1 dot = 5 occurrences. I’m now of the frame of mind that if your data set is that large do a histogram instead.

p. 11, example 3, a little strange that this grouping doesn’t match your earlier table and also that it doesn’t start at zero. I would include a note here that you choose the grouping to aim for 5-15 intervals.

p. 11, paragraph below example 3 box, I’m a little worried here that we don’t force them to make the interval widths equal before we tell them to set the height equal to the frequency (meaning if the interval widths are not equal then this is the wrong thing to do)…

Maybe say “consistency in applying the decision for the entire data set is the key.”

p. 12, example 4, how-to tip #2, do you mean “off” instead of “of” before “the appropriate stem”?

how-to tip #3, “If the original data are not rounded”?

stem with 2 subs how-to tip #2, maybe “first row” or “top row (integer)” instead of just “integer row”?

stem with 5 subs how to tip #3, if data are not ordered?

p. 14, Can a horizontal axis label be added to these graphs, even if it’s just data or variable?

p. 14, advantage number 2, wonder if the size of the data set would be a better distinction?

p. 16, characteristic 5, I wonder if it could be “Unusual observations and outliers”?

p. 16-17, all of these graphs need horizontal labels/titles.

Example 1, should clusters be one?

Example 3, do you want to say “possibly symmetric”? I found the materials a little inconsistent in whether or not they wanted me to say something about shape from a boxplot.

Example 4, should clusters be one?

I would add titles for the questions on p. 17-18 as well, again even if it is just data or variable

p. 18, para 1, change “will” to “can” before “answer the following four questions”?

bullet 3, position “of a particular data value”?

p. 19, a little strange that you don’t refer to the calculation of one mean before jumping into weighted average. I wonder if the weighted average discussion could come later?

p. 19, example 2, could change “median is the 4.5th value” to “median is between the 4th and 5th data values”
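To show what the suggested wording means in practice, here is a minimal Python sketch with eight hypothetical values (not the data from example 2). With n = 8, the position (n + 1)/2 = 4.5 falls between the 4th and 5th ordered values, and the median is their average.

```python
from statistics import median

# Hypothetical data set with an even number of values (n = 8).
values = sorted([12, 4, 9, 2, 15, 7, 10, 5])

n = len(values)
position = (n + 1) / 2                 # 4.5: between the 4th and 5th ordered values
fourth, fifth = values[3], values[4]   # zero-based indices 3 and 4
by_hand = (fourth + fifth) / 2         # average of the two middle values

print(f"ordered values: {values}")
print(f"position = {position}, 4th = {fourth}, 5th = {fifth}, median = {by_hand}")
```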

p. 20, I found this discussion a little unclear. I know your point but wouldn’t a student say look, here’s a skewed left distribution, where mean < median doesn’t hold? Maybe just say to remember it’s only a tendency and that one should be especially cautious about jumping to a conclusion without looking at the data. I also wouldn’t encourage the 1 dot = 5 values here.

p. 20, right before example 4, change “greatest” to “largest”?

p. 21, before example 7, how about “Any data value that is greater than 1.5 IQRs above Q3 or less than 1.5 IQRs below Q1 is considered an outlier.” (To make it more clear that there can be more than one?)
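The suggested wording can be sketched in Python as follows. This is a minimal illustration with hypothetical data, and it assumes quartiles are found as medians of the lower and upper halves of the ordered data (excluding the overall median when n is odd). Other quartile conventions can give slightly different fences.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q3) as medians of the lower and upper halves of the ordered data."""
    if len(data) < 2:
        raise ValueError("need at least two data values")
    ordered = sorted(data)
    half = len(ordered) // 2  # size of each half; the middle value is excluded when n is odd
    return median(ordered[:half]), median(ordered[-half:])


def outliers(data):
    """Return every value more than 1.5 IQRs below Q1 or above Q3."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in sorted(data) if x < low_fence or x > high_fence]


# Hypothetical data with two unusually large values.
sample = [1, 3, 4, 5, 5, 6, 7, 8, 9, 30, 40]
print(quartiles(sample))
print(outliers(sample))
```

Here the rule flags both 30 and 40, which is the point of the rewording: the rule can identify more than one outlier.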

p. 22, note 2, change to “of the mean of the squared deviations … estimator of the spread of the sample”

p. 23, include labels? E.g., 9.643 cars, 7.781 cars and also consider showing one more step in the variance calculation, 1/(14-1) (642 + 642 + … +2892)
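The kind of intermediate step I have in mind is sketched below in Python. The five data values are hypothetical, not the 14 values from the book; the point is only to show the sum of squared deviations divided by n - 1 before the square root is taken.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical data, in "cars"; not the data set from p. 23.
cars = [2, 4, 4, 5, 10]

n = len(cars)
x_bar = mean(cars)
squared_deviations = [(x - x_bar) ** 2 for x in cars]

# Intermediate step: 1/(n - 1) times the sum of squared deviations.
sample_variance = sum(squared_deviations) / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(f"mean = {x_bar} cars")
print(f"squared deviations = {squared_deviations}")
print(f"variance = 1/({n} - 1) * {sum(squared_deviations)} = {sample_variance}")
print(f"standard deviation = {sample_sd} cars")
```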

p. 23, are there supposed to be numbers on this graph?

p. 24, middle, should you say “standard deviation determines the spread as follows…”? More on spread, and less on shape.

bottom: a little strange to only list FNS and not say anything about IQR since you said you were giving information about spread?

p. 25, Q6, want “was severely skewed right”?

p. 26. I’m afraid I have a big question here. We get pretty sensitive to students saying “in the IQR” since the IQR is one number. So I wonder if we shouldn’t say “in the third quartile”. I’m not sure what the standard terminology for “locating between the 2nd and 3rd quartiles” would be and you do refer to it a lot. Often an easy fix is to say “above” or “below” instead. I might also encourage you not to say “in the 70th percentile” but to say “at” or “equal to the 70th percentile”? This issue comes up a lot and I won’t list them all here but have noted them on my copy.

p. 30, Are there problems with Q9? I would say A is false, B is false, C is false, D is true, and E is false.

I didn’t find any notes about this problem on our errata page.

p. 30, Q5, do we need to say anything about how the summary statistics won’t change as we change this one value? I also had some trouble with this one, getting the answers in the back.

p. 31, add horizontal labels to all of the graphs

p. 32, Q1, I wasn’t sure what you meant by “selections”

p. 33, Q2, consider 53 as a distractor?

Section 2.6 solutions

Q1, want to discuss shape at all since it was explicitly asked for? Also include units with the numerical summaries? (also for example in solution to p. 34 investigation tasks)