Part 1. For each of the following questions, fill in the blanks. Each question is worth 3 points.

  1. When researchers look for a relationship between two categorical variables for individuals in the ______, they measure those categorical variables on individuals in the ______.
  1. Another name for the alternative hypothesis is the ______ hypothesis.
  1. The ______ hypothesis is usually written to express the fact that ‘nothing is happening.’
  1. A ______ test is a statistical procedure that is used to determine whether or not there is a relationship between two categorical variables.
  1. Using the relative frequency approach, we can define the probability of any specific outcome as the ______ of times it occurs over the long run.
  1. The ______ represents the average value of any measurement over the long run.
  1. For a 95% confidence interval, the value of 95% is called the ______.
  1. When a relationship or value from a sample is so strong that we can effectively rule out chance as an explanation, we say that the result is ______.

Part 2. For each of the following questions circle the correct response. Each question is worth 3 points.

  1. Which of the following statements is true about chi-square tests?
     a. A large chi-square test statistic results in a large p-value.
     b. A large p-value means that there is a good chance that the relationship is statistically significant.
     c. If the two variables are not related in the population, then less than 5% of the samples you could ever take would give you a test statistic of 3.84 or larger.
     d. All of the above.
  1. Which of the following is a true probability?
     a. -0.22
     b. 120%
     c. 1
     d. None of the above.
  1. Suppose the outcomes of births within a given family are independent of each other, and a couple has already had four boys. Which of the following best describes the probability that their next baby will be a girl?
     a. Approximately 50%
     b. Much less than 50%
     c. Much greater than 50%
     d. Not enough information to tell
  1. Which of the following statements is false?
     a. Sample results will always be very close to their respective population values.
     b. Sample results vary from one sample to the next.
     c. The key to interpreting statistical results is to understand what kind of dissimilarity we should expect to see in various samples from the same population.
     d. None of the above statements are false.
  1. Which of the following is a correct interpretation of a 90% confidence interval?
     a. 90% of the random samples you could select would result in intervals that contain the true population value.
     b. 90% of the population values should be close to our sample results.
     c. Once a specific sample has been selected, the probability that its resulting confidence interval contains the true population value is 90%.
     d. All of the above statements are true.
  1. What does it mean for a confidence interval for the difference of two means to contain zero?
     a. You are unable to say there is a difference in the population means.
     b. Different samples could give results in either direction: completely above zero, completely below zero, or containing zero.
     c. The confidence interval will contain some negative numbers and some positive numbers.
     d. All of the above.
  1. Suppose a confidence interval for the difference in mean weight loss for two different weight loss programs (Program 1 – Program 2) is entirely above zero. What does this mean?
     a. We can’t say with any confidence that there is a difference in mean weight loss for the populations of people on these two programs.
     b. We can say with confidence that there is a difference in mean weight loss for the populations of people on these two programs; further, we can say that the average weight loss on Program 1 is higher.
     c. We can say with confidence that there is a difference in mean weight loss for the populations of people on these two programs; further, we can say that the average weight loss on Program 2 is higher.
     d. None of the above.

Part 3. For each of the following questions give a short answer. Use complete sentences. Each question is worth 3 points.

  1. Suppose you want to investigate whether there is a relationship between the gender of college students and whether or not they wear hats in school. What would be your null hypothesis and your alternative hypothesis (in words)? Be sure to label clearly which hypothesis is which.


  1. The airlines routinely report their on-time flight percentages, which can be interpreted as probabilities. What method of finding probabilities was most likely used in determining this?
  1. Tell whether the following statement is correct; if it is not correct, explain the problem. “If the probability of a single birth resulting in a boy is .51, then the probability of it resulting in a girl is also .51.”
  1. Which would be wider, a 90% confidence interval or a 95% confidence interval? (Assume both of them were calculated using the same sample data.) Explain your answer.

Part 4. Make sure to show all work in the following questions.

  1. One of the major hospitals in the country conducted a study to check whether there is a relationship (association) between pet ownership and survival after major surgery. Ninety-two patients were followed after major surgery; they were classified as pet owners or not, and their survival status after one year was determined. The data obtained by the hospital are summarized in the table below:

Survival status / Pet (Yes) / Pet (No) / Total
Dead / 50 ( ) / 28 ( ) / 78
Alive / 3 ( ) / 11 ( ) / 14
Total / 53 / 39 / 92

Does the data provide evidence that there is a relationship between pet ownership and survival after major surgery?

a. (3 points) Formulate appropriate null and alternative hypotheses to be tested.

b. (3 points) Compute the expected counts for the table under the assumption of no association, and fill them in the ( ) in the table.

c. (3 points) Compute the value of the chi-square test statistic.

d. (3 points) Decide whether the null hypothesis should be rejected, explain your decision, and clearly answer the question posed in the problem.
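
For checking the arithmetic in this problem afterwards, the expected counts and the chi-square statistic can be sketched in Python as below. The variable names are illustrative, and the 3.84 cutoff is an assumption: it applies to a 2 x 2 table tested at the 5% level, the same value quoted in the chi-square question in Part 2.

```python
# Observed counts from the table: rows are Dead / Alive, columns are Pet (Yes) / Pet (No).
observed = [[50, 28],
            [3, 11]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under no association: (row total * column total) / overall total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

CRITICAL_VALUE = 3.84  # assumed cutoff: 5% level, 2 x 2 table (1 degree of freedom)

print([[round(e, 2) for e in row] for row in expected])
print(round(chi_square, 2), chi_square > CRITICAL_VALUE)
```

A statistic above the cutoff would lead to rejecting the null hypothesis of no relationship; the written answer still needs to state that conclusion in words.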

  1. (3 points) Suppose a class of 100 students took their statistics final and their grades are shown in the table below. Choose one student at random. What is the probability that he/she received a B or a C?

Grade / A / B / C / D / F
Number of students / 25 / 28 / 34 / 10 / 3

  1. (3 points) Suppose the chance of picking up a cold from someone by shaking hands with them is 0.02 (assuming you don’t know whether they have a cold or not), and that each encounter you have is independent of the others. Suppose you shake hands with 5 people in a given day. What is the probability that you don’t pick up a cold from any of these people?
  1. (3 points) Suppose an “Instant Lotto” ticket costs $3, and the chance of winning the $80 prize is 1/1000. There are no other prizes. What is your expected value for this game for each ticket you buy?

Give an interpretation of your answer.
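
The three short probability questions above (grades, handshakes, and Instant Lotto) each come down to one rule: addition for outcomes that cannot occur together, multiplication for independent events, and a probability-weighted average for expected value. The Python sketch below is one way to check the arithmetic. The variable names are illustrative, and the expected value assumes the $3 ticket cost is not refunded to winners, so a win nets $77.

```python
# Grade counts for the class of 100 students.
grades = {"A": 25, "B": 28, "C": 34, "D": 10, "F": 3}
total_students = sum(grades.values())

# A student cannot receive both a B and a C, so the two probabilities add.
p_b_or_c = (grades["B"] + grades["C"]) / total_students

# Independent encounters: multiply the chance of NOT catching a cold each time.
p_cold = 0.02
p_no_cold_in_5 = (1 - p_cold) ** 5

# Expected value per ticket: sum of (net amount * probability) over both outcomes.
ticket_cost = 3
prize = 80
p_win = 1 / 1000
expected_value = (prize - ticket_cost) * p_win + (-ticket_cost) * (1 - p_win)

print(p_b_or_c, round(p_no_cold_in_5, 4), round(expected_value, 2))
```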

  1. Suppose numerous random samples of size 2,500 are taken from a population made up of 20% cell phone owners.

a. (3 points) What is the approximate shape of the frequency curve made from the proportions of cell phone owners in the various samples of size 2,500 from this population? Give the mean and standard deviation of that curve.

b. (3 points) Suppose you took a random sample of size 2,500 from this population and found that 17.6% of them owned a cell phone. Is this considered to be a reasonable value given the size of this sample? Use the standardized score in your answer.
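
A minimal sketch for checking this problem, assuming the usual rule that sample proportions are approximately bell-shaped with mean p and standard deviation sqrt(p(1 - p)/n). The variable names are illustrative.

```python
import math

p = 0.20       # population proportion of cell phone owners
n = 2500       # sample size
p_hat = 0.176  # observed sample proportion

# Mean and standard deviation of the frequency curve of sample proportions.
mean = p
sd = math.sqrt(p * (1 - p) / n)

# Standardized score: how many standard deviations p_hat lies from the mean.
z = (p_hat - mean) / sd

print(mean, round(sd, 4), round(z, 2))
```

A standardized score far from zero (beyond about 2 or 3 in either direction) indicates a sample value that would be unusual for this population and sample size.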

  1. Suppose that test scores on a particular exam have a mean of 77 and standard deviation of 5, and that they have a bell-shaped curve. Suppose you take numerous random samples of size 100 from this population.

a. (3 points) Describe the shape and give the mean and standard deviation of the resulting frequency curve.

b. (3 points) Suppose you take a single random sample of size 100 from this population, and you get a mean test score of 78. What is the chance of observing a value like that or larger? Use a standardized score to justify your answer.
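
The same kind of check works for sample means, sketched here under the assumption that the means of samples of size n are bell-shaped with mean equal to the population mean and standard deviation sigma / sqrt(n). The tail probability uses `statistics.NormalDist` from the Python standard library; an answer based on the empirical rule (about 2.5% beyond 2 standard deviations) would be slightly different.

```python
import math
from statistics import NormalDist

mu = 77      # population mean test score
sigma = 5    # population standard deviation
n = 100      # sample size
x_bar = 78   # observed sample mean

# Standard deviation of the frequency curve of sample means.
sd_of_mean = sigma / math.sqrt(n)

# Standardized score of the observed sample mean.
z = (x_bar - mu) / sd_of_mean

# Chance of a sample mean this large or larger, from the standard normal curve.
p_at_least = 1 - NormalDist().cdf(z)

print(sd_of_mean, z, round(p_at_least, 4))
```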

  1. Suppose a shipment of oranges is advertised to weigh 5 pounds per bag. We know that not every bag can contain exactly 5 pounds of oranges. We decide to take a random sample of 100 bags of oranges and find out what they tell us about the population of all bags in this shipment. We are only interested in whether or not the bags are underweight, so each bag is weighed and counted as underweight if it weighs less than 5 pounds. Five bags in our sample of 100 were found to be underweight.

a. (3 points) Compute a 95% confidence interval for the proportion of bags in the shipment that are underweight.

b. (3 points) Suppose the grocery store that ordered the oranges will reject the shipment if they believe, based on these sample results, that more than 10% of the bags in the entire truckload are underweight. Based on our sample, will they have to return this shipment? Explain your answer.
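
One way to check the interval in this problem is sketched below. It assumes the common "sample proportion plus or minus 2 standard errors" form of a 95% confidence interval; using the more exact multiplier 1.96 narrows the interval slightly. The names `MULTIPLIER` and `THRESHOLD` are illustrative.

```python
import math

n = 100          # bags sampled
underweight = 5  # bags found to be underweight
p_hat = underweight / n

MULTIPLIER = 2   # assumed rule of thumb for 95% confidence; 1.96 is more exact

# Standard error of the sample proportion and the resulting interval.
se = math.sqrt(p_hat * (1 - p_hat) / n)
margin = MULTIPLIER * se
lower, upper = p_hat - margin, p_hat + margin

THRESHOLD = 0.10  # the store rejects if more than 10% are believed underweight

print(round(lower, 4), round(upper, 4), upper < THRESHOLD)
```

Comparing the upper end of the interval with the 10% threshold is what part b asks you to explain in words.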

  1. The introductory biology class at a large university is taught to hundreds of students each semester. For planning purposes, the instructor wants to find out the average amount of time that students would use to take the first quiz, if they could have as long as necessary to take it. She takes a random sample of 100 students from this population and finds that their average time for taking the quiz is 24 minutes, and the standard deviation is 16 minutes.

a. (3 points) Find a 95% confidence interval for the average time to take this quiz for the whole population of students who take the class.

b. (3 points) Suppose the professor expects that the average time to take the quiz is 23 minutes. Do you have enough evidence to say that the professor is wrong in her estimate of the average time to take this quiz? Base your answer on the confidence interval you obtained in part a.
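
A minimal sketch for checking this last problem, again assuming the "sample mean plus or minus 2 standard errors" form of a 95% confidence interval (1.96 would give a slightly narrower interval). The variable names are illustrative.

```python
import math

n = 100     # students sampled
x_bar = 24  # sample mean time in minutes
s = 16      # sample standard deviation in minutes

MULTIPLIER = 2  # assumed rule of thumb for 95% confidence; 1.96 is more exact

# Standard error of the sample mean and the resulting interval.
se = s / math.sqrt(n)
lower = x_bar - MULTIPLIER * se
upper = x_bar + MULTIPLIER * se

# Part b: check whether the professor's expected value falls inside the interval.
claimed_mean = 23
contains_claim = lower <= claimed_mean <= upper

print(round(lower, 1), round(upper, 1), contains_claim)
```

If the claimed value lies inside the interval, the sample does not give enough evidence to say the professor's estimate is wrong.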