Statistics – Final Exam Review 3 Name: ______
True/False [If false, explain why]
- Descriptive Statistics are used to extend or extrapolate upon known statistical conclusions and make decisions or predictions about the world around us or about the future.
- Having a random sample to represent the population of interest is not really an important aspect of statistics.
- When the population of college professors is divided into groups according to their rank [instructor, assistant professor, professor, adjunct, etc.] and then several are randomly selected from each group to make up a sample, the sample is called a systematic sample.
- The variable temperature [measured in degrees F] is an example of a quantitative variable.
- The amount of weight that a bridge can support is an example of a continuous variable.
Multiple Choice
- A researcher selected subjects from two groups according to geographic location. Once the groups were randomly selected, all subjects from each group were studied. What sampling method is used here?
- Random
- Cluster
- Systematic
- Stratified
- Data that describe average temperatures in the Northeast Region are measured on what scale?
- Nominal
- Ordinal
- Interval
- Ratio
- A study that involves no researcher intervention is called:
- An experimental study
- A quasi-experimental study
- A non-involvement study
- An observational study
- A variable that is outside of the controlled experiment and may interfere with other variables or the outcome is called:
- An explanatory variable
- An independent variable
- A dependent variable
- A confounding variable
- The following bar graph gives the percent of owners of three brands of trucks who are satisfied with their truck.
From this graph we may legitimately conclude that:
- Owners of other brands of trucks are less satisfied than the owners of these three brands
- Chevrolet owners are substantially more satisfied than Ford or Toyota owners
- Chevrolet probably sells more trucks than Ford or Toyota
- There is very little difference in the satisfaction of owners of the three brands
- A pie chart would have been a better choice for displaying these data sets
- Classify each variable as continuous [C] or discrete [D]:
- The average salary of financial advisors______
- Number of cups of coffee served at a restaurant______
- The time it takes a student to drive to school______
- Give the boundaries for each:
- 32 minutes
- 0.48 mm
- 6.2 Newtons
- According to a pilot study of 15 people conducted at Rutgers University, daily doses of “Compound X” over a period of 6 months resulted in a significant increase in energy levels. Why can’t it be concluded that the compound is beneficial for the majority of all people?
- A study of 2598 collegiate soccer players showed that of 46 anterior cruciate ligament (ACL) tears, 36 were in women. The researchers titled the article in which the study was published:
“Female Athletes Tear Knees More Often Than Male Athletes”
Is this a legitimate conclusion from the study? Explain your response as specifically as possible.
- In the previous scenario, identify the following:
a. What was the quantitative variable in the previous study?
b. Is this variable continuous or discrete? ______
c. The level of measurement for the variable was ______
d. Population of interest.
e. Type of study [experimental or observational]. How do you know?
- If Mr. Chandler’s statistics class displayed a LEFT SKEWED distribution on the first test, that would mean that many students scored exceptionally well and only a few students scored very poorly. In this case, would Mr. Chandler be better off reporting the class MEAN or MEDIAN if he wanted to make his students look good? Explain your reasoning.
- Imagine you are on an experimental team involving treatments for minor shoulder injuries pertaining to the rotator cuff. You are interested in the effectiveness (measured by the return of strength and stability to the joint) of three different treatments in subjects. The three treatments are: physical therapy (strengthening of the surrounding muscles), arthroscopic surgery (clean out/shave for a more stable ligament), and a combination of massage therapy & acupuncture for an increased blood flow to the area (promotes healing).
- Design a flowchart depicting an experimental design for this statistical study.
- What was the explanatory variable in this experiment?
- What was the response variable in the treatment?
- What is a possible confounding variable in this experiment?
The following questions relate to the graph below representing a data set that was collected using a SRS technique.
- Which type of vehicle has the best overall fuel efficiency? Explain.
- Is your choice for the previous question also the most consistent type of vehicle with respect to fuel efficiency? Explain.
- A biologist assumes that there is a predictable relationship between the amount of fertilizer supplied to tomato plants and the yield [production] of tomatoes obtained. Eight tomato plants of the same variety were selected at random and treated [weekly] with a solution in which x grams of fertilizer was dissolved in a fixed quantity of water. The yield y, in kilograms, of tomatoes was recorded.
PLANT / A / B / C / D / E / F / G / H
Fertilizer Quantity [g] / 1.0 / 1.5 / 2.0 / 2.5 / 3.0 / 3.5 / 4.0 / 4.5
Tomato Yield [kg] / 3.9 / 4.4 / 5.8 / 6.6 / 7.0 / 7.1 / 7.3 / 7.7
- You want to predict tomato yield from the fertilizer quantity. There is a clear “explanatory-response” relationship here. Which variable is the explanatory variable and which is the response variable?
Explanatory: ______
Response: ______
- Make a scatterplot [detailed and well-labeled] of the data and report the correlation r.
- Based on the scatterplot and correlation, make a brief statement about the direction, form, and strength of the relationship.
Direction: ______
Form: ______
Strength: ______
- Report a statistical model [LSRL equation] that could be used to predict yield of tomato from fertilizer quantity.
- Estimate the yield of a plant treated [weekly] with 3.2 grams of fertilizer.
- Report the coefficient of determination the correct way. Is this a good model?
- Calculate the residual for plant D [situation with 2.5 g of fertilizer a week].
- Why is it probably not appropriate to use your equation to predict the yield of a plant treated [weekly] with 20 grams of fertilizer?
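To check hand or calculator work on the tomato problem, the least-squares quantities can be computed directly from their definitions. The following is a minimal Python sketch, not a required method; the function and variable names (`least_squares`, `predict`, `residual_d`) are illustrative assumptions, and the data are copied from the table above.

```python
import math

# Data copied from the fertilizer/yield table (plants A through H).
fertilizer_g = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
yield_kg = [3.9, 4.4, 5.8, 6.6, 7.0, 7.1, 7.3, 7.7]


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = least_squares(fertilizer_g, yield_kg)


def predict(x):
    """Predicted yield (kg) for x grams of fertilizer, using the fitted line."""
    return intercept + slope * x


# Residual = observed - predicted, here for plant D (2.5 g per week).
residual_d = yield_kg[3] - predict(fertilizer_g[3])

print(f"LSRL: yield = {intercept:.3f} + {slope:.3f} * fertilizer")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"Predicted yield at 3.2 g: {predict(3.2):.2f} kg")
print(f"Residual for plant D: {residual_d:.3f} kg")
```

The sketch only reports numbers. Interpreting r^2 in context and judging whether 20 grams lies too far outside the observed 1.0 to 4.5 gram range are still left to the written answer.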
- Sarah’s parents are concerned that she seems short for her age. Their doctor has the following records of Sarah’s height:
Age [months] / 36 / 48 / 51 / 54 / 57 / 60
Height [cm] / 86 / 90 / 91 / 93 / 94 / 95
a. Find the correlation and describe the form, direction, and strength of this relationship in context.
b. Find the equation of the mathematical model of height vs. age.
c. Calculate Sarah’s height by using this equation at 40 months and at 60 months.
d. Calculate the residual for an age of 60 months.
e. What is Sarah’s rate of growth, in cm per month? [Hint: this is the slope.]
f. Normally growing girls gain about 6 cm in height between the ages of 4 and 5 [48 and 60 months]. What rate of growth is this in cm per month?
g. Is Sarah growing more slowly than normal?
h. What is the r^2 value of this model? How accurate is this model in predicting height from age?
i. Predict Sarah’s height when she is 8 years old. Will this result be as accurate as the other predictions? Explain.
- What is the difference between a statistic and a parameter?
- Fill in the blank: .
- Describe the meaning of the “50th percentile”
- Which measure of “central tendency” is the 50th percentile equal to? ______
- Explain the basic meaning of standard deviation.
- Fill in the blanks:
- As the standard deviation increases, the distribution becomes ______.
- As the standard deviation decreases, the distribution becomes ______.
- The only time standard deviation can equal zero is when ______
- Describe the relationship between mean and median for a symmetric distribution [Normal/Bell-Shaped], right skewed distribution, and left skewed distribution.
- Find the z-scores for the following tests and state which one the student did better on relative to his/her classmates.
Test A / Score = 59 / Mean = 48 / S = 5
Test B / Score = 80 / Mean = 75 / S = 10
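As a way to check the arithmetic, the z-score formula z = (score - mean) / S can be applied to both tests in a few lines. This is a minimal Python sketch; the helper name `z_score` is an illustrative assumption, and the numbers are taken from the two rows above.

```python
def z_score(value, mean, std_dev):
    """Standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


# Values copied from the Test A and Test B rows.
z_a = z_score(59, 48, 5)
z_b = z_score(80, 75, 10)

# The larger z-score marks the better performance relative to classmates.
better = "Test A" if z_a > z_b else "Test B"

print(f"Test A: z = {z_a:.2f}")
print(f"Test B: z = {z_b:.2f}")
print(f"Better relative performance: {better}")
```

The same helper could be reused for any problem that compares values drawn from distributions with different means and standard deviations.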
- The top senior high school 3-point shooters in the county are listed for the graduating classes of 2005, 2006, and 2007 [along with their percentage of made attempts]. List the shooters in order, starting with the best relative to his graduating year [use z-scores].
2005: SMITH (.605)
2006: PETRELLO (.595)
2007: WAGNER (.610)
YEAR / MEAN (% Made) / STANDARD DEVIATION
2005 / .470 / .0520
2006 / .465 / .0500
2007 / .475 / .0560
- A study of the size of jury-decided awards in civil cases (such as injury, product liability, and medical malpractice) in Chicago showed that the median award was about $9000. However, the mean award was about $68,000. Explain how a difference this big between these two measures of center can occur.
- The probability for any event E must be between ______ and ______.
- Which events represent classical probability [C] and which events represent empirical probability [E]?
* The probability of getting a royal flush when 5 cards are selected at random is .0002%____
* The probability of getting into an accident on “Highway A” is about 7%____
* The probability of it raining this week will be about .20____
- If a die is rolled one time, find these probabilities:
* Getting a 6
* Getting a 3 or an even number
* Getting an odd number greater than 5
- In a statistics class, there are 10 juniors and 12 seniors; 6 of the seniors are females, and 8 of the juniors are males. If a student is selected at random, find the probability of selecting the following:
* A junior or a female
* A senior or a female
* A junior or a senior
- Show on two different Venn diagrams a visual of mutually exclusive events and “non-mutually exclusive events”
- Is flipping a coin 3 times an independent or dependent event? ______
- Is picking a card without replacement an independent or dependent event? ______
- A junk box in your room contains a dozen old batteries, five of which are totally dead. You start picking batteries one at a time and testing them. Find the probability of each outcome:
(a) The first two you choose are both good.
(b) At least one of the first three batteries you choose works.
(c) The first 4 you pick all work.
(d) You don’t find one that works until the 5th battery.
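Because the batteries are drawn without replacement, each pick changes the counts for the next one, so the probabilities multiply with shrinking numerators and denominators. A minimal Python sketch of that idea follows, using exact fractions; the function name `sequence_probability` and the 'G'/'D' pattern strings are illustrative assumptions, not part of the original problem.

```python
from fractions import Fraction


def sequence_probability(good, dead, pattern):
    """Probability of drawing the exact sequence in `pattern` without replacement.

    `pattern` is a string of 'G' (working battery) and 'D' (dead battery).
    """
    prob = Fraction(1)
    for outcome in pattern:
        total = good + dead
        if outcome == "G":
            prob *= Fraction(good, total)
            good -= 1
        else:
            prob *= Fraction(dead, total)
            dead -= 1
    return prob


GOOD, DEAD = 7, 5  # a dozen batteries, five of them dead

p_a = sequence_probability(GOOD, DEAD, "GG")        # (a) first two both good
p_b = 1 - sequence_probability(GOOD, DEAD, "DDD")   # (b) complement of "first three all dead"
p_c = sequence_probability(GOOD, DEAD, "GGGG")      # (c) first four all work
p_d = sequence_probability(GOOD, DEAD, "DDDDG")     # (d) first working battery is the 5th

for label, p in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    print(f"({label}) {p} = {float(p):.4f}")
```

Part (b) uses the complement rule: "at least one works" is the opposite of "all three are dead," which is a single sequence and therefore easier to compute.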
- Given P(A) = .5 and P(B) = .3. Events A and B are independent and not mutually exclusive. Find the following:
- P(A ∩ B) =
- P(A | B) =
- P(A ∪ B) =
P(A) =
P(B) =
P(A ∪ B) =
P(A ∩ B) =
P(A and ) =
P(B and ) =
P( n ) =
- What is the total area under a normal distribution curve?______
- Z = 2.5 means that this location is 2.5 standard deviations above the ______.
- What is the area under the standard normal distribution curve to the RIGHT of z = 0?______
- Z-values that correspond to a number below the mean are always [positive or negative]. Circle One Answer
- Approximately what percentage of normally distributed data falls within 2 standard deviations above and below the mean?
- Draw a visual of a normal distribution and label the correct location for the mean, median, and mode.
- In the following distribution, tell what kind of skew there is [left or right] and then label the correct location for the mean, median, and mode.
- Find the area under the standard normal distribution for each:
- Between 0 and 1.25
- To the left of -0.65
- To the right of 1.50
- Using the standard normal distribution, find each:
- P(-1 < z < 2)
- The z-score that has 84% of data below that value [84th percentile]
- National SAT scores for the graduating class of 2013 were approximately normally distributed with a mean [Verbal and Math] of 1028 and standard deviation σ = 92.
- Princeton admits students who score in the 98th percentile [98% of scores at or below] each year. What score did a student have to achieve to be admitted into Princeton for the Fall 2013 semester?
- What is the probability that any given student scored above the mean, but no greater than a score of 1200?
- An automobile dealer finds that the average price of a previously owned vehicle is $8250. He decides to sell cars that will appeal to the middle 60% of the market in terms of price. Find the maximum and minimum prices of the cars the dealer will sell. The standard deviation is $1125, and the variable is normally distributed.
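Answers to the SAT and car-price problems found with a z-table can be checked against Python's standard-library `statistics.NormalDist` (available in Python 3.8 and later). This is a minimal sketch; the variable names are illustrative assumptions, and the means and standard deviations are taken from the two problems above.

```python
from statistics import NormalDist

# SAT scores: mean 1028, standard deviation 92 (from the problem statement).
sat = NormalDist(mu=1028, sigma=92)

# Score at the 98th percentile (98% of scores at or below).
cutoff_98 = sat.inv_cdf(0.98)

# Probability of scoring above the mean but no greater than 1200.
p_mean_to_1200 = sat.cdf(1200) - sat.cdf(1028)

# Used-car prices: mean $8250, standard deviation $1125.
prices = NormalDist(mu=8250, sigma=1125)

# The middle 60% leaves 20% in each tail.
low_price = prices.inv_cdf(0.20)
high_price = prices.inv_cdf(0.80)

print(f"98th percentile SAT score: {cutoff_98:.1f}")
print(f"P(1028 < score <= 1200): {p_mean_to_1200:.4f}")
print(f"Middle 60% of prices: ${low_price:.2f} to ${high_price:.2f}")
```

Results from a printed z-table may differ slightly in the last digits, since table lookups round z to two decimal places.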