Chapter 24 Solutions and Mini-Project Notes

CHAPTER 24

SIGNIFICANCE, IMPORTANCE AND UNDETECTED DIFFERENCES

EXERCISE SOLUTIONS

24.1 a. Yes.

b. No, the magnitude wasn't much different in a practical sense, but the large sample sizes led to a statistically significant difference.

c. A statistically significant difference indicates that the difference in the population is not zero but does not indicate that it has any practical significance. The meaning should be clarified when the word is used.

24.2 a. If the researchers started out with the idea that Internet use would either increase depression or not have an effect on it, then it would be appropriate to use one-sided tests. If they originally wanted to see whether or not Internet use had an impact on depression by either increasing or decreasing it, two-sided tests would be appropriate.

b. A type 1 error would occur if Internet use was not associated with greater loneliness, but the researchers concluded that it was. Since they did indeed make that conclusion, this is the type of error that could have been committed in this study. A type 2 error would occur if Internet use was associated with greater loneliness but the researchers failed to find the association.

24.3 a. A type 1 error is that the vaccine is not effective but the testing goes forward, which would not be too serious. A type 2 error is more serious; the vaccine is effective but testing does not take place and it is shelved.

b. The tests on the small group may not have had enough power to detect the effectiveness of the vaccine, if it exists.

24.4 a. It would be 3(1.637) = 4.911.

b. Null: There is no relationship between gender and driving after drinking in the population. Alternative: There is a relationship between them. The alternative would now be chosen because 0.03 is less than 0.05.

c. The power would be higher with the larger sample size.
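The arithmetic in 24.4 can be checked numerically. The sketch below assumes that 1.637 is a chi-square statistic from a 2-by-2 table (one degree of freedom), which the solution does not state explicitly; the function name `chi2_pvalue_df1` is illustrative. With one degree of freedom the p-value can be written with the complementary error function, so only the standard library is needed.

```python
import math


def chi2_pvalue_df1(chi2_stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    The choice of df = 1 is an assumption, not stated in the solution.
    """
    return math.erfc(math.sqrt(chi2_stat / 2.0))


original_stat = 1.637             # value used in the solution to part a
tripled_stat = 3 * original_stat  # same proportions, three times the sample size

p_original = chi2_pvalue_df1(original_stat)
p_tripled = chi2_pvalue_df1(tripled_stat)

print(f"statistic {original_stat:.3f}: p-value about {p_original:.2f}")
print(f"statistic {tripled_stat:.3f}: p-value about {p_tripled:.2f}")
```

Under that assumption, the tripled statistic gives a p-value that rounds to the 0.03 quoted in part b, while the p-value for the original statistic stays above 0.05.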

24.5 a. It could be that the power of suggestion is responsible for the change in behavior. Being given a drug by a psychiatrist may lead them to expect results.

b. Null: The drug fluvoxamine has no impact on compulsive shopping. Alternative: The drug fluvoxamine reduces the urge to shop in compulsive shoppers.

c. No. The results are "clear and dramatic" but we aren't told if they are statistically significant.

24.6 a. It is useful to know the range of possible magnitudes for the difference.

b. It is useful to know how likely the sample difference would have been by chance.

c. The sample sizes would help us determine the likelihood of a type 2 error or the power of the test.

d. If multiple tests were conducted we would expect some to reach statistical significance even if nothing is going on.

24.7 If "no difference" is found and the sample sizes are small, it is quite possible that the test had very low power and a type 2 error has occurred.

24.8 a. A confidence interval for the relative risk of heart attack during heavy versus minimal physical exertion is from 2.0 to 6.0.

b. It means that a confidence interval for the relative risk covers 1.0, but without knowing the interval or the sample size that is not very helpful. It could be that there is indeed a difference but the sample was too small to detect it.

24.9 Yes. Holding the size of the effect constant, the larger the sample, the bigger the value of the test statistic, so the easier it is to reject the null hypothesis.
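A short sketch of the idea in 24.9, using a z statistic for a mean as a stand-in for whatever test the exercise has in mind. The effect size, standard deviation, and sample sizes are hypothetical values chosen only for illustration, and `z_statistic` is an illustrative name.

```python
import math


def z_statistic(effect, sd, n):
    """z statistic for an observed difference `effect` from the null value,
    given standard deviation `sd` and sample size `n`."""
    standard_error = sd / math.sqrt(n)
    return effect / standard_error


effect, sd = 0.2, 1.0  # hypothetical effect and standard deviation
for n in (25, 100, 400):
    # Same effect each time; only the sample size changes.
    print(n, round(z_statistic(effect, sd, n), 2))
```

Because the standard error shrinks in proportion to the square root of the sample size, quadrupling the sample doubles the statistic for the same observed effect.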

24.10 Even if ESP was present in the experiments, the sample sizes were so small that it would be difficult to find a statistically significant effect. The power was low and the probability of a type 2 error was high.

24.11 a. A relative risk of 1.0 indicates equal risk. It is contained in the interval.

b. Null hypothesis: There is no relationship between taking aspirin and suffering a stroke. Alternative hypothesis: There is a relationship between taking aspirin and suffering a stroke. Because the p-value is so high (0.41), we cannot reject the null hypothesis. There is no statistically significant relationship between taking aspirin and having a stroke.

c. If there really is no relationship, as indicated by part b, then the relative risk would be 1.0, which is a potential value based on the confidence interval in part a.

d. It probably did not get much coverage because the relationship was not statistically significant. It could also be that the physicians wanted to downplay the risks involved with taking aspirin, given the obvious benefit in reducing heart attacks. But in general, studies that do not find statistical significance do not get as much press coverage.

24.12 If all null hypotheses are true, one in 20 should be rejected by chance using a significance level of 0.05, so we would expect one.
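The count in 24.12 is a one-line calculation. The sketch below assumes 20 tests, as implied by "one in 20", each carried out at the 0.05 level. The second quantity additionally assumes the tests are independent, which the solution does not state.

```python
num_tests = 20  # assumed number of tests, implied by "one in 20"
alpha = 0.05    # cutoff used for each test

# Expected number of rejections when every null hypothesis is true.
expected_false_rejections = num_tests * alpha

# Chance that at least one test rejects, ASSUMING independent tests.
prob_at_least_one = 1 - (1 - alpha) ** num_tests

print(f"expected false rejections: {expected_false_rejections:.1f}")
print(f"P(at least one false rejection): {prob_at_least_one:.2f}")
```

Under the independence assumption, at least one false rejection is more likely than not, which is why an isolated significant result among many tests deserves caution.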

24.13 Food intake is not effectively ruled out just because there is no "significant difference." It would be helpful to know the magnitude of the difference.

24.14 You would want to know how many people were tested and a confidence interval for the mean loss, rather than just the sample value.

24.15 All we can determine is how unlikely the sample results would be when the null hypothesis is true. That tells us nothing about whether or not it is true.

24.16 The methods depend on knowledge of what kinds of sample values we expect to get if everyone has an equal chance of being in the sample. In convenience samples those with certain values may have a higher chance than those with other values, so the rules for sample means, proportions, etc., would not apply.

24.17 A little variability, because in that case observed differences would be less likely to be due to random variability.

24.18 a. It would not have been fair reporting because readers may have interpreted “significantly greater increase” to mean that the increase was very large. The second sentence makes it clear that the word “significant” is being used in the statistical sense, not the English sense.

b. For the population of people who would volunteer for meditation training, there would be no difference in immune response to a flu vaccine for those trained to meditate and those not trained, eight weeks after the training.

24.19 a. The alternative hypothesis was one-sided.

b. Yes, a one-sided alternative hypothesis is justified. Based on folklore and previous studies, the researchers were speculating that meditation would increase immune function response.

24.20 a. Because the sample size was small, even if differences exist in the population the study may not provide enough evidence to detect them. The “statistical power” is the probability of being able to detect a difference based on a sample, given that it exists in the population.

b. The researchers found that differences in the sample of those trained to meditate and those not trained to meditate were in the direction predicted by their one-sided alternative hypotheses. However, the sample differences were not large enough to rule out chance as an explanation. The test statistics were not large enough to result in p-values less than 0.05.
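The definition of power in 24.20 part a can be made concrete with a simplified sketch. It uses a one-sided, one-sample z-test with a known standard deviation, which is not the design of the meditation study; the effect size, standard deviation, and sample sizes are hypothetical, and `power_one_sided_z` is an illustrative name.

```python
from statistics import NormalDist


def power_one_sided_z(effect, sd, n, alpha=0.05):
    """Approximate power of a one-sided z-test for a mean.

    `effect` is the true amount by which the population mean exceeds the
    null value; `sd` is treated as known. This is a simplification used
    for illustration only.
    """
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)  # rejection cutoff under the null
    shift = effect * n ** 0.5 / sd              # true effect in standard-error units
    return 1 - std_normal.cdf(critical_z - shift)


for n in (10, 25, 50, 100):
    # Same true effect each time; only the sample size changes.
    print(n, round(power_one_sided_z(effect=0.5, sd=1.0, n=n), 2))
```

For a fixed true effect, the printed power rises with the sample size, so a small study can easily miss a difference that really exists.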

24.21 a. They don’t mean that the difference was exactly zero; they mean that it was not statistically significantly different from zero.

b. The difference in average ages for the two groups was not statistically significant.

24.22 The warnings are not likely to apply to this study because the sample size was so small.

24.23 a. 8200 participants.

b. Here is the relevant quote: “For boys of all ages, it was about one-half point on a 33-point scale. Girls were hit harder, with a 2-point difference for girls who’d been 12 at the first interview, and diminishing with age to about a half-point difference for girls who’d been 17.” (See the Appendix of the book, page 538.)

c. The first two warnings apply. The sample size was very large, so a statistically significant difference was found even though the actual difference in depression levels was very small.

d. The word “significantly” is being used in the statistical sense, and not in the English sense. This can be seen from the magnitude of the differences quoted in part b of this exercise.

24.24 The answer to this question should make a clear distinction between statistical significance and practical importance, and should quote the magnitude of the difference in depression scores found in the sample.

NOTES ABOUT THE MINI-PROJECTS FOR CHAPTER 24

Mini-Project 24.1

The most common mistakes in news reports are implying a cause-and-effect relationship based on an observational study, and concluding there is no difference or effect when there is no statistically significant one. Most likely the example will fit one of these two scenarios.

Mini-Project 24.2

If a statistically significant result is found with a small sample, then the real effect is probably quite strong. The larger the sample, the easier it is to find a significant result, and the project should discuss that idea in the context of the two examples.
