Healey's Statistics: A Tool for Social Research, Seventh Edition

Question and Answer Proofreading

The main discrepancy with the answers in this book is that the formulas used are not standard in statistics. Some are discussed in more detail below, but the formulas for standard deviation (and therefore variance) are different, which leads to discrepancies in nearly every other formula used.

The notation used is not standard either. The most notable (and potentially confusing) example is the use of mean ± standard deviation written as bare numbers, with no indication of what they represent, for example 10 ± 1.5. Without any explanation, this means nothing to someone with no knowledge of statistics, and to someone with some knowledge it reads as mean ± standard error.

Hypotheses are often not correctly given (if given at all), which leads to incorrect conclusions: “accepting the null hypothesis” is frequent, not in those words but in the substance of the conclusion.

The z test for means is used for large samples, when in reality the z test for means should be used when the population standard deviation is known (which is not often) and the t test should be used when only the sample standard deviation is known. The confusion often arises because the two tests give similar results when the sample size is large.
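
As a rough illustration of why the two tests are confused for large samples, the sketch below compares the two-sided 5% critical value of the normal distribution with Student's t critical values for several degrees of freedom. The t values are typed in from a standard printed t table (an assumption of this sketch, not taken from the guide), and the variable names are illustrative only.

```python
from statistics import NormalDist

# Two-sided 5% critical value of the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.975)

# Two-sided 5% critical values of Student's t, copied from a standard
# printed t table and keyed by degrees of freedom (n - 1).
t_crit = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

for df, t in t_crit.items():
    # The excess of t over z shrinks as the degrees of freedom grow.
    print(f"df={df:>3}  t={t:.3f}  z={z_crit:.3f}  excess={t - z_crit:.3f}")
```

For small samples the t critical value is noticeably larger than z, so using z when only the sample standard deviation is known makes rejection too easy.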

Note: Not having the data sets for the computer questions, I was unable to complete these.

Chapter 1

Multiple Choice:

Answers all correct. Note that for #11, (d) is only correct because the question asks for the “best” answer. Otherwise, (a) would be an OK answer.

Work Problems:

All answers are correct

Chapter 2

Multiple Choice:

Answers all correct. However, the correct answer to #2 is poorly worded: “Level of measurement” would be better expressed as “Type of measurement”.

Work Problems:

#2. The bin width for the frequency table needs to be given in the question. The bin width used in the answer (2) is too small; 4 or 5 would be better. The same applies to question #3.

#10 Misprint in calculation in answer (says 1850 instead of 1875) but final answer is correct.

Chapter 3

Multiple Choice:

Poor wording in many of the questions leads to confusion.

#5 states the mean is the best measure. However if there are outliers, the median is the better measure.

#6 “the middle-most score”? Why not “the middle score of the ordered data”?

#8 the mode is a type of average (along with the median and mean) so this question has 2 answers.

#12 Need to add “For the variable Number of years married previously used in the text…” so that it refers back to a known data set.

#13 How is satisfaction of marriage measured?

#15 Answer is wrong – the distribution is not symmetric so the median should be used as the measure of central tendency.

Work Problems:

#3 should use median because data set is skewed.

Poor notation: s denotes the sample standard deviation, σ denotes the population standard deviation. See the note at the beginning about formulas and notation.

Chapter 4

Summary

The equations for the sample standard deviation are incorrect. The denominator should be n-1 (where n is the sample size). Incidentally, the guide uses N for the sample size; in statistics, N is more commonly used for the population size. If the sample size is large, the difference between dividing by n and by n-1 is negligible, but it may account for slightly different values in any later calculations in the guide that involve standard deviations.
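
To show the size of the effect, here is a minimal sketch comparing the two denominators on a small made-up sample (the data are hypothetical, not taken from the guide). In Python's standard library, `pstdev` divides by n and `stdev` divides by n-1.

```python
from math import sqrt
from statistics import pstdev, stdev

scores = [4, 8, 6, 5, 7]  # hypothetical sample, not from the guide

sd_divide_by_n = pstdev(scores)         # denominator n
sd_divide_by_n_minus_1 = stdev(scores)  # denominator n - 1


def inflation(n):
    """Ratio of the n-1 standard deviation to the n standard deviation."""
    return sqrt(n / (n - 1))


print(round(sd_divide_by_n, 3), round(sd_divide_by_n_minus_1, 3))
print(round(inflation(5), 4), round(inflation(1000), 4))
```

With five observations the two results differ by about 12%; with a thousand observations the difference is about 0.05%.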

Multiple Choice:

#2 The second paragraph in the summary clearly states that dispersion indicates how different the rest of the sample is from the mean (answer (c)), and the first paragraph states that these measures show us how much variation there is in the data (answer (a)), so either answer can be defended from the summary.

#14 and 15. This is a major problem! 76 ± 7 is not standard notation for mean and standard deviation in statistics (unless it is given along with the measurement). For someone with no knowledge of statistics, 76 ± 7 is meaningless, and for someone who is used to statistical notation, ± 7 reads as a standard error (± s/√n). This crops up over and over again in this review and could lead to major confusion.
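
The gap between the two readings can be large. A minimal sketch, taking the 76 and 7 from the question and assuming a sample size of 49 purely for illustration (the sample size is not stated here):

```python
from math import sqrt

mean = 76.0
sd = 7.0  # the figure printed after the ± sign in the guide
n = 49    # hypothetical sample size, assumed for this sketch only

# Standard error of the mean: s / sqrt(n).
standard_error = sd / sqrt(n)

print(f"read as mean ± standard deviation: {mean} ± {sd}")
print(f"read as mean ± standard error:     {mean} ± {standard_error}")
```

Under that assumed sample size the standard error is 1, not 7, so a reader who takes ± 7 as a standard error would badly overestimate the uncertainty in the mean.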

Work Problems:

Apart from the problems discussed above, the answers are correct.

Chapter 5

Multiple Choice:

#14 The question needs to say “What percent of the sample lies within two standard deviations from the mean?”

#15 I don’t have Appendix A but 50% of the data lies above the mean, so I think the answer is wrong and the answer should be (a)

Work Problems:

#1(c) should be -.045

#3 should be 61.8

#4 “how far” is not defined. Is it in years? in number of standard deviations? % of the other executives? Question needs rewording.

#5 “how much younger” is not defined. in years? in number of standard deviations?

#5 - #15 Either in the question or the answer it must be stated that the exam scores follow an approximately normal distribution. Otherwise the answers don’t follow.

#10 and 11 “Number of children” is discrete data and follows a binomial distribution. The normal approximation to the binomial has not been discussed in the review and is not mentioned in the question so I am assuming this has not been covered.

#12 “How far away” is not defined. in miles? in # standard deviations? expressed as a z-score? relative to what? Needs rewording.

#14 Does she go to the same university as mentioned in #13? If so, it needs to be stated.

#15. “How far apart” is ambiguous. “What percentage of the other applicants” would be better.

Chapter 6

Multiple Choice:

All answers are correct.

Work Problems:

All computer applications

Chapter 7

Multiple Choice:

All answers are correct

Work Problems:

#1a to 1e: no answers are given for the 99% C.I., although the question asks for them.

Strange formulas used throughout.

Used for means: [formula missing] for proportions: [formula missing]

More accepted formula:

for means: [formula missing] for proportions: [formula missing]

#9 Answer is wrong.

Correct answer: The interval gets narrower. It becomes 8 ± .17
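
The question itself is not reproduced here, so the following is only a generic sketch of what makes a confidence interval narrower, using the large-sample margin z·s/√n. The standard deviation, sample sizes, and the function name `margin` are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin(sd, n, confidence=0.95):
    """Half-width z * sd / sqrt(n) of a large-sample interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / sqrt(n)


m_small_sample = margin(sd=2.0, n=100)
m_large_sample = margin(sd=2.0, n=400)  # four times the sample size
m_lower_confidence = margin(sd=2.0, n=100, confidence=0.90)

print(round(m_small_sample, 3), round(m_large_sample, 3), round(m_lower_confidence, 3))
```

Quadrupling the sample size halves the margin, and lowering the confidence level also shrinks it; either change produces a narrower interval.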

Chapter 8

Summary

In the last paragraph you refer to alpha and beta errors as “sticky little buggers”. I don’t know if you are planning on selling this guide in Europe, but you need to know that this is a terrible profanity in England and will affect your sales (at best) or be censored (or worse). I suggest you reword this phrase!

See notes at beginning as to when to use z test and when to use t test.

Multiple Choice:

All answers are correct

Work Problems:

Again, non-typical formulas are being used.

Used for the t-test: [formula missing]

More accepted formula: [formula missing]

For large values of n, this won’t make much difference, but for small values, it could.

For all questions/answers:

No hypotheses were given.

No assumptions of a test are given.

No significance level was given in the question. As the critical region was used, not the p-value, a significance level is needed (.05 was used, but not specified)

#2. Calculation (even using the formula given) = 23.92 (not 24.12)

#3. Calculation (using numbers given) = 10.86 (not 13)

#5 Calculation (using numbers given) = 4.1 (not 4.0)

#11 concludes “There is no significant difference between the AIDS patients and the rest of the population on the variable quality of life.” It is not possible to conclude this. All we can conclude is: “There is not enough evidence to show that there is a significant difference between the AIDS patients and the rest of the population on the variable quality of life.”

#15 Don’t use the term “on average” for a question on proportions. Unfair to student!

No such thing as “too much time” in the library! Need to say “more time than the other students”.

Chapter 9

Multiple Choice:

All answers are correct

Work Problems:

See notes at beginning about conclusions to hypothesis tests.

#1 The question should be: Is there a difference in the population mean income between men and women?

#3 The question asks if one mean is higher (implying a one-tailed test) and then asks if the difference is significant (implying a two-tailed test). Confusing!

#8 The z-value given of 0.50 is incorrect (it should be 1.10), but there is no formula given in the summary to check against. The conclusion of no difference is also incorrect; again, the correct conclusion is that there is no evidence of a difference between the Bolivian and Peruvian women.

#11 The critical value (either z or t) should be -2.9 not -1.34.

#13 The critical value (either z or t) should be -5.6 not -2.59.

#15 Don’t use the term “on average” for a question on proportions. Unfair to student!

Chapter 10

Multiple Choice:

All answers are correct but:

In #11 the answer (d) is “one of the populations means is different”. The answer should say “At least one of the population means is different”.

# 12 doesn’t make sense!

Work Problems:

Lots of problems and too many to list – mainly due to the fact that the formula for standard deviation is incorrect. Also – no-one does ANOVA by hand with the availability of computers and calculators. I could check the arithmetic, but this doesn’t seem a good use of your resources!

Chapter 11

Multiple Choice:

#11 should be answer (d): chi-squared is affected by all of the above.

#14 (c) “beyond” is the correct answer, but “to the right of” or “greater than” might be a better expression.

Work Problems:

No hypotheses were stated at the beginning of any of the answers

In all questions, no significance level is given, although α = 0.05 appears to be used.

In all questions where the calculated value lies outside the rejection region, the conclusion is given as “A and B are independent”. This cannot be shown and is incorrect. The correct conclusion should be: “There is insufficient evidence to conclude that A and B are dependent”.

#6 Large sample sizes shouldn’t be a problem! Small sample sizes (Expected value of a cell less than 5) can make the test invalid, but large sample sizes do not cause a problem.
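
The expected-count check can be sketched as follows. Both tables are made up for illustration, and the function name `expected_counts` is hypothetical; the rule of thumb applied is the one above (every expected cell count should be at least 5).

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


small_table = [[3, 7], [2, 8]]      # hypothetical, n = 20
large_table = [[30, 70], [20, 80]]  # same proportions, n = 200

for name, table in (("small", small_table), ("large", large_table)):
    smallest = min(min(row) for row in expected_counts(table))
    verdict = "OK" if smallest >= 5 else "too small for the test"
    print(f"{name}: smallest expected count = {smallest} ({verdict})")
```

The small table has expected counts of 2.5, which breaks the rule of thumb; the large table, with the same proportions, does not.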

Chapter 12

Multiple Choice:

All answers are correct

Work Problems:

#7 Although the association is weak, perhaps it should be mentioned that it is a negative association?

#10 states “Farm residents appear more likely to approve the water plant than town residents (61% vs. 53%).” It should be 61% vs. 47%.

Chapter 13

Multiple Choice:

All answers are correct

Work Problems:

I have to admit I have never seen these formulas before and as not all of them were given in the summary, I was unable to check them all. However, of those I was able to check, there appeared to be no problem.

Chapter 14

Multiple Choice:

All answers are correct

Work Problems:

No problems detected

Chapter 15

Multiple Choice:

#6 add: in the form y = a + bx (sometimes y = ax + b is used, in which case the value “a” would have a completely different meaning)

#12 There are significance tests for things other than a linear relationship in the population, so the answer could be (a) as well.

#14 Y’ is not the commonly used symbol for the predicted value of Y. Ŷ (Y-hat) is more commonly used.

Work Problems:

For all questions:

The question being asked should be “Is there a linear association?” The value of the correlation coefficient is only appropriate for a linear association.

A distinction should be made between a positive linear association and a negative linear association (as described in the summary)

When describing the meaning of the slope, (for example as in question #1: For every one hour that a person studies, we predict an increase in his/her grade of approximately 1.85 points.) the prediction should be for an average increase of his/her grade of approximately 1.85 points

The symbol for X-bar (X̄) is correctly given, but the symbol for Y-bar is given as plain Y in all answers.

#1: ΣX² = 808, ΣXY = 7883, a = 65.6, b = 1.97, r = .61, r² = .37

#2: r² = .40 (square the value of r before rounding, not after)

#4: r² = .31 (square the value of r before rounding, not after)

#5: r² = .90 (square the value of r before rounding, not after)

#6: r² = .33 (square the value of r before rounding, not after)

#7: r² = .17 (square the value of r before rounding, not after)

#8: r² = .74 (square the value of r before rounding, not after)

#10: r² = .33 (square the value of r before rounding, not after)

#11: r² = .50 (square the value of r before rounding, not after)
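
A small sketch of why the order of rounding matters. The value of r below is hypothetical and was chosen only because the two orders disagree for it; it is not taken from any of the questions.

```python
r = 0.6149  # hypothetical unrounded correlation coefficient

# Square the unrounded r, then round the result to two places.
square_then_round = round(r ** 2, 2)

# Round r to two places first, then square and round again.
round_then_square = round(round(r, 2) ** 2, 2)

print(square_then_round, round_then_square)
```

Squaring first gives .38, while rounding first gives .37, the same kind of last-digit discrepancy noted in the answers above.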

Chapters 16 and 17

Multiple Choice:

All answers are correct

Work Problems:

The main questions in these chapters are completed using computers or calculators. Other than calculations using different formulas for standard deviation, the answers given are correct.