Review Problems for the TEST
I. Review of Basic Terminology
For each of the following problems, determine the type of variable(s) (discrete, continuous, nominal or ordinal):
1. The birth weight, date of birth, and the mother’s race were recorded for each of 65 babies.
2. Case numbers on files of manatees killed by boats.
3. Social security numbers used to identify subjects in a clinical trial.
4. In a study of birds on Buldir Island, Alaska, 312 breeding adult red-legged kittiwakes are banded.
5. A sample of broad-billed hummingbirds is observed and their mean length is 3.25 in.
7. Ratings of excellent, good, average, or poor for the racing ability of thoroughbred horses.
8. A physician measured the height and weight of each of 37 children.
9. Blood types A, B, AB, and O.
10. During a blood drive, a blood bank offered to check the cholesterol of anyone who donated blood. For each person, the blood type and cholesterol levels were recorded.
In the following exercises, determine whether the given value is a statistic or a parameter:
11. In studying the effects of geese near an airport, a random sample of Canada geese includes 12 males.
12. A sample of Canada geese who mate is observed and the average number of offspring is 3.8.
13. A study involves attaching altimeters to individual frigate birds, and their average altitude is found to be 226 m.
14. In a study of all cloned sheep it is found that their average age is 2.7 years.
In the following exercises, identify whether the statement refers to a sample, a population or both.
15. A marine biologist obtains the weights of rainbow trout that she catches in a net.
16. A Florida newspaper runs a health survey and obtains 750 responses.
17. In a Gallup poll of 1059 randomly selected adults, 39% answered “yes” when asked “Do you have a gun in your home?”
18. A graduate student conducts a research project about how adult Americans communicate. She begins with a survey mailed to 500 of the adults she knows. She asks them to mail back a response to this question: “Do you prefer to use e-mail or snail mail (the USPS)?” She gets back 65 responses, with 42 of them indicating a preference for snail mail.
II. Graphical Interpretation of Quantitative Data
1. For each of the following histograms, estimate the mean, the median and the standard deviation.
[Figure: histograms (a), (b), (c), and (d)]
2. In a study of schizophrenia, researchers measured the activity of the enzyme monoamine oxidase (MAO) in the blood platelets of 18 patients. The results were as follows (data listed in increasing order):
4.1 5.2 6.8 7.3 7.4 7.8
7.8 8.4 8.7 9.7 9.9 10.6
10.7 11.9 12.7 14.2 14.5 18.8
Construct a boxplot of the data.
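A boxplot is drawn from the five-number summary (minimum, first quartile, median, third quartile, maximum). As a minimal sketch for checking hand calculations, the Python below computes that summary for the MAO data. It assumes the quartiles are the medians of the lower and upper halves of the sorted data; other textbooks and calculators use slightly different quartile rules, so small differences are possible. The function name `five_number_summary` is illustrative only.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, with the middle value left out when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]


# MAO activity for the 18 patients, as listed in the problem.
mao = [4.1, 5.2, 6.8, 7.3, 7.4, 7.8,
       7.8, 8.4, 8.7, 9.7, 9.9, 10.6,
       10.7, 11.9, 12.7, 14.2, 14.5, 18.8]

labels = ("min", "Q1", "median", "Q3", "max")
for label, value in zip(labels, five_number_summary(mao)):
    print(f"{label}: {value:.2f}")
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.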
3. A veterinary anatomist investigated the spatial arrangement of the nerve cells in the intestine of a pony. He counted the number of nerve cells in each of 23 randomly selected sections. The counts were as follows (data listed in increasing order):
12 16 17 19 19 22
23 23 23 26 27 28
28 28 29 30 31 33
33 34 35 35 52
Construct a boxplot of the data.
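Many courses draw a modified boxplot, in which values beyond the fences Q1 - 1.5(IQR) and Q3 + 1.5(IQR) are plotted as separate points. The sketch below, with the hypothetical helper `outlier_fences`, applies that rule to the nerve cell counts. It again assumes quartiles are medians of the lower and upper halves; check which rule your course uses.

```python
from statistics import median


def outlier_fences(values):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule.

    Assumption: quartiles are the medians of the lower and upper halves
    of the sorted data (middle value excluded when n is odd).
    Expects at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Nerve cell counts for the 23 sections, as listed in the problem.
cells = [12, 16, 17, 19, 19, 22,
         23, 23, 23, 26, 27, 28,
         28, 28, 29, 30, 31, 33,
         33, 34, 35, 35, 52]

low, high = outlier_fences(cells)
flagged = [x for x in cells if x < low or x > high]
print(f"fences: {low} to {high}; values outside the fences: {flagged}")
```

In a modified boxplot, each whisker stops at the most extreme value that still lies inside the fences.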
III. Descriptive Statistics
1. The following set of data represents the ages of students in a small seminar: 20, 21, 22, 25, 26, 27, and 68. Select the most appropriate measure of central tendency for the data described.
a. Mean
b. Median
c. Mode
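When one value sits far from the rest, it helps to see how much each measure moves. The short sketch below compares the mean and median of the seminar ages with and without the oldest student; it is only an illustration of sensitivity to an extreme value, not a full answer.

```python
from statistics import mean, median

# Ages from the problem; the last value is far from the others.
ages = [20, 21, 22, 25, 26, 27, 68]
without_oldest = ages[:-1]

print(f"with 68:    mean = {mean(ages):.2f}, median = {median(ages)}")
print(f"without 68: mean = {mean(without_oldest):.2f}, "
      f"median = {median(without_oldest)}")
```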
2. The following set of data represents the heights of 10 girls, age 7 (given in cm): 45, 46, 47, 47, 48, 48, 48, 48, 49, and 50. Select the most appropriate measure of central tendency for the data described.
a. Mean
b. Median
c. Mode
3. The following set of data represents the high temperatures for seven consecutive days in February in Chicago: 14, 22, 26, 27, 35, 38, and 41. Select the most appropriate measure of central tendency for the data described.
a. Mean
b. Median
c. Mode
4. Which of the following is the weakest measure of dispersion?
a. Range
b. Standard deviation
c. Variance
5. The blood alcohol concentrations of a sample of drivers who were involved in fatal crashes and then given jail sentences are listed below (based on data from the U.S. Department of Justice). When a state wages a campaign to reduce drunk driving, is the campaign intended to lower the standard deviation?
0.27 0.17 0.17 0.16 0.13 0.24 0.29 0.24
0.14 0.16 0.12 0.16 0.21 0.17 0.18
6. As part of the National Health Examination, the body mass index (BMI) is measured for a random sample of women. Some of the values are listed below. Based on these sample data, would a BMI of 34.0 be considered “unusual”? Why or why not?
19.6 23.8 19.6 29.1 25.2 21.4 22.0 27.5
33.5 20.6 29.9 17.7 24.0 28.9 37.7
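One common convention calls a value "unusual" when it lies more than 2 standard deviations from the mean, that is, when its z-score satisfies |z| > 2. The sketch below computes the z-score of 34.0 from the listed BMI values. The cutoff of 2 and the helper name `z_score` are assumptions; use whatever convention your course adopts.

```python
from statistics import mean, stdev


def z_score(x, sample):
    """Number of sample standard deviations that x lies from the sample mean."""
    return (x - mean(sample)) / stdev(sample)


# BMI values as listed in the problem.
bmi = [19.6, 23.8, 19.6, 29.1, 25.2, 21.4, 22.0, 27.5,
       33.5, 20.6, 29.9, 17.7, 24.0, 28.9, 37.7]

z = z_score(34.0, bmi)
print(f"mean = {mean(bmi):.2f}, s = {stdev(bmi):.2f}, z = {z:.2f}")
print("unusual by the |z| > 2 convention:", abs(z) > 2)
```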
7. Estimate the standard deviation of ages of all students at your college.
8. A sample of 40 women has upper leg lengths with a mean of 38.86 cm and a standard deviation of 3.78 cm. Estimate the minimum and maximum “usual” upper leg lengths for women. Is a length of 47.0 cm considered unusual in this context?
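When only the mean and standard deviation are given, the range rule of thumb estimates the "usual" values as mean - 2s to mean + 2s. A minimal sketch, with the hypothetical helper `usual_range` and the multiplier 2 taken as an assumption:

```python
def usual_range(mean, sd, k=2):
    """Return (minimum usual value, maximum usual value) as mean -/+ k * sd."""
    return mean - k * sd, mean + k * sd


# Summary statistics from the problem (cm).
low, high = usual_range(38.86, 3.78)
length = 47.0

print(f"usual range: {low:.2f} cm to {high:.2f} cm")
print(f"{length} cm inside the usual range:", low <= length <= high)
```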
9. Heights of men have a bell-shaped distribution with a mean of 176 cm and a standard deviation of 7 cm. What is the approximate percentage of men:
a. between 169 cm and 183 cm?
b. between 155 cm and 197 cm?
c. between 169 cm and 190 cm?
d. above 162 cm?
e. below 183 cm?
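Each boundary above is a whole number of standard deviations from the mean, so the empirical (68-95-99.7) rule applies directly. The sketch below encodes that rule as a small lookup table and combines the pieces; the names `HALF_AREA`, `percent_below`, and `percent_between` are illustrative, and the results are the rule's approximations rather than exact normal probabilities.

```python
# Approximate percentage of a bell-shaped distribution lying between the
# mean and k standard deviations on ONE side (half of 68, 95, 99.7).
HALF_AREA = {0: 0.0, 1: 34.0, 2: 47.5, 3: 49.85}


def percent_below(x, mean, sd):
    """Approximate percentage of values below x (x must be mean + k*sd, |k| <= 3)."""
    z = (x - mean) / sd
    k = round(z)
    if abs(z - k) > 1e-9 or abs(k) > 3:
        raise ValueError("the empirical rule only covers 0, 1, 2 or 3 SDs")
    if k >= 0:
        return 50.0 + HALF_AREA[k]
    return 50.0 - HALF_AREA[-k]


def percent_between(lo, hi, mean, sd):
    """Approximate percentage of values between lo and hi."""
    return percent_below(hi, mean, sd) - percent_below(lo, mean, sd)


MEAN, SD = 176, 7  # heights of men (cm), from the problem
print(f"a. {percent_between(169, 183, MEAN, SD):.1f}%")
print(f"b. {percent_between(155, 197, MEAN, SD):.1f}%")
print(f"c. {percent_between(169, 190, MEAN, SD):.1f}%")
print(f"d. {100.0 - percent_below(162, MEAN, SD):.1f}%")
print(f"e. {percent_below(183, MEAN, SD):.1f}%")
```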
10. Use the sample data listed below to find the coefficient of variation for each of the two samples, then compare the results.
Heights (in) of men: 71 66 72 69 68 69
Lengths (mm) of cuckoo eggs: 19.7 21.7 21.9 22.1 22.1 22.3 22.7 22.9 23.9
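The coefficient of variation expresses the standard deviation as a percentage of the mean, CV = (s / mean) x 100%, which allows samples measured in different units to be compared. A minimal sketch using the sample standard deviation; the function name `coefficient_of_variation` is illustrative.

```python
from statistics import mean, stdev


def coefficient_of_variation(sample):
    """Sample standard deviation as a percentage of the sample mean."""
    return stdev(sample) / mean(sample) * 100


# Data as listed in the problem.
heights_in = [71, 66, 72, 69, 68, 69]
egg_lengths_mm = [19.7, 21.7, 21.9, 22.1, 22.1, 22.3, 22.7, 22.9, 23.9]

cv_heights = coefficient_of_variation(heights_in)
cv_eggs = coefficient_of_variation(egg_lengths_mm)
print(f"CV of heights:     {cv_heights:.1f}%")
print(f"CV of egg lengths: {cv_eggs:.1f}%")
```

Because the CV has no units, the two percentages can be compared directly even though one sample is in inches and the other in millimeters.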
11. You are told that the sample size of a data set is and the standard deviation is . Write down an example of such a data set.
12. If a data set consists of longevity times (in days) of fruit flies, what are the units used for standard deviation? What are the units for mean?
13. A data set consists of 20 values that are fairly close together. Another value is included, but this new value is an outlier. How is the standard deviation affected by the outlier? No effect? A small effect? A large effect?
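A quick experiment can make the effect concrete. The sketch below uses made-up numbers (not data from any problem on this sheet): 20 values clustered near 50, then the same values plus a single far-away value of 100.

```python
from statistics import stdev

# Hypothetical data: 20 values that are fairly close together.
clustered = [50, 51, 49, 50, 52, 48, 50, 51, 49, 50] * 2
# Same data with one hypothetical outlier added.
with_outlier = clustered + [100]

sd_before = stdev(clustered)
sd_after = stdev(with_outlier)
print(f"s without the outlier: {sd_before:.2f}")
print(f"s with the outlier:    {sd_after:.2f}")
print(f"ratio: {sd_after / sd_before:.1f}")
```

Try replacing 100 with a value closer to 50 to see how the change in s depends on how extreme the added value is.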
14. A doctor has to investigate the systolic blood pressure of people living in a small town. He randomly draws 3 samples from the population and his calculations show that all samples have the same mean value ( mm Hg), but different standard deviations: , , and . Match each of the 3 standard deviation values with one of the 3 graphs shown below. What could be the reason for such difference in SDs?
[Figure: graphs (a), (b), and (c)]
IV. Linear Regression and the Correlation Coefficient
1. Fourteen different second-year medical students took blood pressure measurements of the same patient and the results are listed below.
Systolic (mm Hg): 138 130 135 140 120 125 120 130 130 144 143 140 130 150
Diastolic (mm Hg): 82 91 100 100 80 90 80 80 80 98 105 85 70 100
a. Does there appear to be a linear relationship between diastolic and systolic values?
b. Using your calculator, find the equation of the regression line.
c. In the context of this problem, explain the meaning of the slope and the y-intercept. Do they make sense?
d. Using your calculator, find the correlation coefficient, r.
e. Interpret the correlation coefficient with respect to the data. Comment on the sign and the strength.
f. Find the expected diastolic blood pressure of a person who has a systolic blood pressure of 125 mm Hg.
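If no calculator is at hand, parts b, d, and f can be checked with the least-squares formulas slope = Sxy / Sxx, intercept = ȳ - slope · x̄, and r = Sxy / sqrt(Sxx · Syy). The sketch below treats systolic pressure as x and diastolic pressure as y, an assumption that matches part f; `fit_line` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# The 14 paired readings (mm Hg) from the problem.
systolic = [138, 130, 135, 140, 120, 125, 120, 130, 130, 144, 143, 140, 130, 150]
diastolic = [82, 91, 100, 100, 80, 90, 80, 80, 80, 98, 105, 85, 70, 100]

slope, intercept, r = fit_line(systolic, diastolic)
predicted = intercept + slope * 125  # part f: systolic = 125 mm Hg

print(f"regression line: y = {intercept:.1f} + {slope:.3f} x")
print(f"r = {r:.3f}")
print(f"predicted diastolic at 125 mm Hg systolic: {predicted:.1f} mm Hg")
```

The intercept is the predicted diastolic value at a systolic value of 0, far outside the observed range, which is worth keeping in mind for part c.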