Chapter 13 Collecting Statistical Data

13A All married people, or all married people who read Dear Abby. Though the second answer is correct, it is clear that she is trying to draw conclusions about all married people. B. 210,336 C. Self-selection D. 85%. It is a statistic because it is computed from a sample.

14A Dear Abby readers are far from a representative sample of all married people. Also, the sample is self-selected: only people who bothered to respond are in it. B. Only a very small percentage of Dear Abby readers bothered to respond.
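The effect of self-selection can be sketched with a little arithmetic. The numbers below are hypothetical and are not taken from the Dear Abby survey; the function name `observed_share` is likewise made up for illustration. The sketch assumes that people who would answer "yes" are more likely to write in than people who would answer "no".

```python
def observed_share(true_share, rate_if_yes, rate_if_no):
    """Expected share of 'yes' answers among self-selected respondents."""
    yes_responses = true_share * rate_if_yes
    no_responses = (1 - true_share) * rate_if_no
    return yes_responses / (yes_responses + no_responses)

# Hypothetical inputs: 30% of the population would say "yes";
# 10% of the "yes" people respond, but only 2% of the "no" people do.
share = observed_share(true_share=0.30, rate_if_yes=0.10, rate_if_no=0.02)
print(f"true share: 30.0%, share among respondents: {share:.1%}")
```

Under these assumed response rates the respondents overstate the "yes" share by a wide margin, even though nobody answered untruthfully.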

15A 74% B. 81.8% C. Accuracy is questionable because the sample is not representative of the target population. One reason is that the sampling frame is so different from the target population. Another reason is that even if the sampling frame were the same as the target population, the survey would still be subject to nonresponse bias.

16. Problem 1: Defining the population. Since the population consists of all married people, one would have to get a list of all married people. This could be compiled from state records. Problem 2: Getting a representative sample. Simple random sampling would be the best way to go about this. A sample size using this method could be 1500-2000 people. Problem 3: Getting truthful answers. How the question is asked is critical. Many respondents would be reluctant to answer unless complete confidentiality can be guaranteed. Mail-in questionnaires are likely to produce high nonresponse rates, and telephone interviews are almost guaranteed to produce a lot of untruthful responses. Therefore personal interviews are the only reasonable alternative. These interviews should be held somewhere other than the home. Couples should be interviewed separately. Males should be interviewed by males and females by females.

17A Citizens of Cleansburg B. The sampling frame is limited to that part of the population that passes by a city street corner between 4 and 6 p.m.

18A 475 B. There were 475 respondents and 1313 nonrespondents, so the response rate is 475/(1313+475), which is approximately 26.6%. Only 26.6% of the people asked to participate actually did.
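The arithmetic in 18B can be checked with a short sketch. The function name `response_rate` is an illustrative choice, not something from the text.

```python
def response_rate(respondents: int, nonrespondents: int) -> float:
    """Fraction of the people asked who actually responded."""
    asked = respondents + nonrespondents
    return respondents / asked

rate = response_rate(respondents=475, nonrespondents=1313)
print(f"{rate:.1%}")  # 26.6%
```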

19A The choice of street corner could make a big difference (e.g. business district, school area, shopping area). B. D. We make the assumption that people who work downtown are likely to answer yes. C. Yes, for two reasons: people on the street between 4 and 6 p.m. are not representative of all citizens of Cleansburg (office workers are more likely to be in the sample than homemakers, teachers or blue-collar workers), and the use of just five street corners may not produce a representative cross section of Cleansburg. D. No. No attempt was made to use quotas to get a representative cross section.

20. Sending people out to interview passersby at a fixed location is not a good idea. Fixing a particular time of day makes it even worse. A telephone poll based on a random sample would probably produce more accurate data. Before asking the question, the interviewer should introduce themselves as representing the city planning department and tell the interviewee that the survey consists of one question, which will be used by the city to make important planning decisions.

21A. The target population and the sampling frame are all undergraduates at Tasmania State. B. N = 15,000

22A. n = 150 B. 1/100 = 1%

23A. No. In simple random sampling, any two members of the population have the same chance of both being chosen as any other two. In this sample, two people with the same last name (unless it is really common) have no chance of both being in the sample. B. Sampling variability. The students sampled appear to be a cross section of all TSU students who would enroll in Math 101.
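For contrast with the sample described in 23A, here is a minimal sketch of how a simple random sample of n = 150 could be drawn from N = 15,000 students. It assumes a hypothetical sampling frame in which students are simply numbered 1 through N; the seed is fixed only so the sketch is repeatable.

```python
import random

N = 15_000  # population size, from 21B
n = 150     # sample size, from 22A

# Hypothetical sampling frame: students numbered 1..N.
population = range(1, N + 1)

rng = random.Random(2024)           # fixed seed, for repeatability only
sample = rng.sample(population, n)  # every set of n students is equally likely

sampling_proportion = n / N
print(len(sample), f"{sampling_proportion:.0%}")
```

Because `random.sample` draws without replacement and treats every subset of size n alike, any two students have the same chance of both being chosen as any other two, which is exactly the property the sample in 23A lacks.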

24. A. The proportion of students in the sample who said they were unable to enroll in Math 101 is … Assuming that those who responded are representative of the population, we estimate that … TSU undergrads tried but were unable to enroll in Math 101 that semester. B. The only possible flaw with this survey is that the sample is rather small. A larger sample would produce more accuracy. However, a larger sample would cost more money, and if the sample is chosen in a truly random way, a small sample can give quite accurate results. Overall, it is a good survey.

29. 2000;

35. So rounding to the nearest thousand gives an estimate of 160,000 sturgeon in the Lake of the Woods.

36. So rounding to the nearest hundred thousand gives an estimate of 1.2 million carp in Utah Lake in 2004.
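Population estimates like those in 35 and 36 are typical of the capture-recapture method. Assuming that is the method these exercises use, the calculation can be sketched as follows. The counts below are hypothetical and are not the exercise data, which does not appear in this key; the function name is also illustrative.

```python
def capture_recapture(tagged, recaptured, tagged_in_recapture):
    """Estimate population size N from N ~ tagged * recaptured / tagged_in_recapture."""
    if tagged_in_recapture == 0:
        raise ValueError("need at least one tagged animal in the recapture sample")
    return tagged * recaptured / tagged_in_recapture

# Hypothetical counts: 500 fish tagged, 400 caught later, 20 of those tagged.
estimate = capture_recapture(tagged=500, recaptured=400, tagged_in_recapture=20)
print(round(estimate, -3))  # rounded to the nearest thousand: 10000.0
```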

39A. Anyone who could get a cold and would buy the vitamin, so all adults. B. The sampling frame is only a small portion of the target population. It consists only of college students in the San Diego area who are suffering from colds. C. Yes. The sample would likely underrepresent older adults and those living in colder climates.

40. A. No. There was no control group. B. Many problems exist. Among them: 1. Using college students because this is hardly a group that is representative of the entire adult population. Most college students are between the ages of 18-22. 2. Using subjects only from San Diego. 3. Offering money as an incentive to participate. 4. Allowing self-reporting. This is an unreliable way to collect data. 5. No control group.

41. A. Yes. All subjects knew they were getting the treatment, so the placebo effect is likely. B. San Diego residents are not typical of the entire population. No one city would ever be representative of the whole country. San Diego’s climate makes it a particularly poor choice because there are very few parts of the country with a similar climate. Other problems are that the volunteers were paid, the subjects themselves determined the length of their colds, and there was no control group.

42. Some suggestions to make this a better study are: 1. Choose subjects randomly from the population. 2. Divide subjects into a treatment group and a control group. 3. Have a trained professional determine the length of a cold. 4. Do a double-blind study so that neither the patient nor the nurse knows who is receiving the treatment and who is receiving the placebo.
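Suggestion 2 can be sketched as a random assignment of subjects to the two groups. This is a minimal illustration, not a full study protocol: the subject IDs and the function name `assign_groups` are hypothetical, and blinding (suggestion 4) would additionally require that the assignment list be kept from both patients and nurses.

```python
import random

def assign_groups(subject_ids, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)            # random order, so group membership is left to chance
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical subjects numbered 1..20; the seed is fixed only for repeatability.
groups = assign_groups(range(1, 21), seed=1)
print(len(groups["treatment"]), len(groups["control"]))  # 10 10
```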

47. (iv) clinical study. Though it is not a very good clinical study this is the best choice because the professor has a hypothesis (500 milligrams of caffeine each day will improve students’ performances). He then tests his hypothesis. It cannot be a randomized controlled experiment because there is no treatment and control group. It is not double blind because the professor knows who is receiving the treatment. It cannot be a controlled placebo experiment because there is no control group and no one is receiving a placebo.

48A. The target population is all college students. The sample was the students who were invited to come to the professor’s office for “individual tutoring”. B. 13 C. Sample size is way too small. There are millions of college students in the country.

49A. The study was neither. The students knew they were getting caffeine even though they had no idea that they were subjects in a study. B. Confounding variables include the individual tutoring the students were receiving, the individual attention from the instructor, and the fact that the students were going to the instructor’s office three times a week, so they were most likely spending more time studying the subject.

50. If the professor wanted to test his hypothesis, he should have randomly chosen students to participate in his study. He should then have formed a control group and a treatment group, where all students were served coffee but some of the coffee was decaffeinated. He also should have made the study double-blind.

55. A. Spurlock’s study was a clinical trial since a treatment was imposed (eating three meals a day at McDonald’s every day for 30 days) on a sample of the population. B. The target population is the average American. C. The sample consisted of one person (Morgan Spurlock). D. Three problems with this study that indicate poor design (there are several) are 1) the use of a sample that is not representative of the population 2) a small sample (1 person) and 3) the lack of a control group in which a sample of average Americans curtailed their physical activity and ate the same number of calories as the treatment group.

56. A. Morgan’s treatment group consisted of Merab Morgan. B. Morgan may have exercised more than usual during the study. Meal plans of less than 1400 calories a day would likely cause the average person to lose weight regardless of what they ate. The food Morgan chose at McDonald’s may not have been what the average American would choose. C. Since the experiment is not legitimate, few legitimate conclusions can be drawn. The only possible conclusion might be that when making wise dietary choices it is possible to lose weight while eating at McDonald’s.

60. A. Association but not causation B. placebo effect C. sampling variability D. selection bias

64A. People are unlikely to tell the IRS that they cheated on their taxes. B. The critical issue is to get respondents to give an honest answer. To do this, the survey should be conducted by a neutral organization. A mailed questionnaire is the surest way to guarantee anonymity. This would result in some nonresponse bias, but this bias is preferable to the alternative, which would be untruthful answers. The nonresponse bias could be reduced by offering an inducement to those who show proof of response.

65A. Under method 1, people whose phone numbers are unlisted are excluded from the sample. However, method 1 is cheaper and easier to implement than method 2. B. Method 2 is likely to produce more reliable data because people with unlisted numbers may well be the kind of people who would consider buying a burglar alarm. Also, listing bias is more likely in places like New York City: larger cities have a higher percentage of unlisted phones than rural areas or small towns.

66A. 900 numbers are a huge source of selection bias. People who respond generally represent extreme viewpoints. Even though $0.50 is not a lot of money, many people are unwilling to spend it if they do not have a particularly strong opinion about an issue. B. How reliable can a survey about the conduct of the news media be when the survey is conducted by a news media organization? C. Both surveys would produce unreliable results. D. Survey 2 is probably a bit better because the 900 number was not used.