Percentages, Graphs and Measures of Central Tendency

CHAPTER 2

Percentages, Graphs and Measures of Central Tendency

A. Suggestions for Class Activities

Activity: The Mean is still the Mean

Although in the text the decision was made to use the symbol M for the

arithmetic mean, point out to the class that this is not the only way the mean

can be expressed. Although most Education journals and all Psychology

journals now being published use M for the arithmetic mean, a small number

of statistics texts still use X with a bar across the top (X̄). Students may

at first be confused by this inconsistency, but once it is pointed out that

both symbols mean exactly the same thing (the mean is the mean is the

mean), the level of possible frustration should be reduced. In

fact, you may wish to accept either symbol.

Activity: Assessing Central Tendency: What do both the Median and Mean mean?

Perhaps the most glaring trap awaiting students who misunderstand central

tendency is the confusion that arises between the use of the mean and the

median. Both measures, to be sure, provide information regarding how the

average or typical subject performed, but in certain situations the use of one

of these measures rather than the other can create an extremely inaccurate

portrayal of centrality.

Activity: Your Students as Statistical Consultants

Ask your students to assume that they have been selected as statistical

consultants and have been given the following scores on a standardized test of

reading ability (where a score of 100 indicates normal progress):

110

109

108

107

ΣX = 772

For this distribution, then, ΣX = 772, and the mean of X = 772/8 = 96.50.

On the basis of this mean value of only 96.50 it would seem that on average the

group was not performing up to the standards of normal progress, even though

every student in the group except one was scoring well above the normal-progress

score of 100. In this case, as with skewed distributions generally, the median, which for

this distribution is 107.50, is a far more accurate indicator of true

centrality than is the mean. Point out that when the distribution is skewed

to the left, as in the above scores, the mean is going to severely

underestimate the true centrality.

Have the students graph the above distribution to again reinforce what the

shape of a skewed distribution looks like. If equal intervals are chosen for

the base line (abscissa), the graph will have to be extremely wide to fit all

the values in. Also, point out that the median remains at 107.50 whether the

low score is 15 or 105. However, if the low score were changed to 105, the

mean would jump to 107.75. Changing that one score causes the mean to

gyrate, but the median remains rock steady at 107.50.
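If you would like to demonstrate this on screen, the following is a minimal Python sketch. Four of the scores (110, 109, 108, 107), the low score of 15, the N of 8 and the total of 772 come from the activity; the remaining three values (110, 107, 106) are assumptions, chosen only so that the total and the median agree with the figures quoted above. They should not be read as the original data.

```python
from statistics import mean, median

# 110, 109, 108, 107 and the low score of 15 come from the activity.
# The other three values (110, 107, 106) are ASSUMED so that the
# total is 772 and the median is 107.50, as stated in the text.
scores = [110, 110, 109, 108, 107, 107, 106, 15]


def summarize(values):
    """Return (mean, median) for a list of scores."""
    return mean(values), median(values)


original = summarize(scores)

# Replace the single low score of 15 with 105, as the activity suggests.
adjusted_scores = [105 if s == 15 else s for s in scores]
adjusted = summarize(adjusted_scores)

print("low score  15 -> mean, median:", original)
print("low score 105 -> mean, median:", adjusted)
```

Students can change the low score to any other value and watch the mean move while the median stays put.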

Activity: Averaging Averages?

Sometimes students will intuitively assume that to get the mean of two sets of

scores, all they have to do is average the two means. This of course is only

true if both sets of scores have equal numbers of cases. But show them that

with unequal numbers of cases, averaging the means can be a big mistake. For

example, the distribution 16, 12, 12, 11, 10, 9, 9, 7, 4 adds up to 90, with

a mean of 10. A second distribution, 10, 8, 8, 8, 8, 6, adds up to 48, with a mean

of 8. The mean of the two distributions combined is 9.20, not the average of

the two means, which would have been 9. You can, however, teach students to do

this correctly without going back and adding all the scores. Show them that

since the mean, M, is equal to ΣX/N, then ΣX = M times N, or (M)(N). ΣX for the

first distribution is then equal to (M)(N) or (10)(9) = 90. Similarly, for the

second distribution, ΣX = (M)(N) = (8)(6) = 48. They will quickly see that the mean of

both distributions combined can easily be found by adding the two ΣX values (90 + 48 =

138) and dividing by the total N of 15, to get 138/15 = 9.20. If they think this

is complicated and would rather just put all the scores together and add them

up, explain that with a large database, the technique you're showing them is

far more efficient.
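The (M)(N) shortcut can be sketched in a few lines of Python. The function name combined_mean and the (mean, N) pair layout are illustrative choices, not something taken from the text; the numbers are the two distributions used above.

```python
def combined_mean(groups):
    """Pooled mean of several groups, given (mean, n) pairs.

    Each group's sum of scores is recovered as M * N, so the raw
    scores never have to be added up again.
    """
    total_sum = sum(m * n for m, n in groups)
    total_n = sum(n for _, n in groups)
    return total_sum / total_n


# (mean, N) for the two distributions in the activity
groups = [(10, 9), (8, 6)]

pooled = combined_mean(groups)                    # 138 / 15
naive = sum(m for m, _ in groups) / len(groups)   # (10 + 8) / 2

print("combined mean:", pooled)
print("average of the two means:", naive)
```

The two results agree only when every group has the same N, which is the point of the activity.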

Activity: Evaluating Percentages

The same concerns also tend to show up when evaluating percentages. Too

often students want to average percentages, even when the totals in the various

percentage categories are not the same. Even faculty members have been known to

have difficulty accepting the fact that means and percentages cannot always

simply be averaged. On a Master's Comprehensive exam at a small eastern

college, the passing grade on the objective section of the test was determined

to be 80% correct. This section was composed of 300 multiple choice items,

covering seven different content areas, but the seven content areas were not

all composed of the same number of items. For example, the exam could have had

100 items devoted to Learning, 100 to Systems and Theories, and then 20 items

each in Cognitive Psychology, Psychological Assessment, WISC-Assessment, Statistical

Analysis and, finally, Learning Disabilities. A student could then have

received scores of 95% in Learning, 85% in Systems, 60% in Cognitive, 60% in

Assessment, 60% in WISC, 60% in Statistics and 60% in LD. The faculty group

disputed the claim that this student's overall score could amount to a

passing grade of 80%. The scoring breakdown was as follows: 95 out of 100

in Learning, 85 out of 100 in Systems, and then 12 out of 20 in each of

the other five sections. This resulted in a total of 240 correct responses out

of 300 items, or 80% correct. Thus, the student could have passed the exam,

even though failing in 5 of the 7 sections.
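The same comparison can be shown as a short Python sketch. The section names, item counts and percentages are the ones in the example; the variable names are illustrative only.

```python
# (section, number of items, percent correct) as given in the example
sections = [
    ("Learning", 100, 95),
    ("Systems and Theories", 100, 85),
    ("Cognitive Psychology", 20, 60),
    ("Psychological Assessment", 20, 60),
    ("WISC-Assessment", 20, 60),
    ("Statistical Analysis", 20, 60),
    ("Learning Disabilities", 20, 60),
]

total_items = sum(items for _, items, _ in sections)

# Convert each percentage back to a count of correct items before pooling.
total_correct = sum(items * pct // 100 for _, items, pct in sections)

overall_pct = 100 * total_correct / total_items

# Simply averaging the seven percentages ignores the unequal section sizes.
unweighted_pct = sum(pct for _, _, pct in sections) / len(sections)

print(f"overall: {total_correct}/{total_items} = {overall_pct:.2f}%")
print(f"average of section percentages: {unweighted_pct:.2f}%")
```

The pooled figure is the passing 80%, while the simple average of the seven section percentages falls well short of it, which is presumably what troubled the faculty group.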

Or cite the example of the baseball player who hit .300 in day games and

only .200 in night games. This player wondered why he was being sent down to

the minors when his day-night average was a seemingly adequate .250. The

problem was that the team had played only 5 day games but had already played

over 100 night games, and for all 105 games his average was a mere .215. By

the way, this situation was seriously argued on a sports call-in radio show.

Activity: The Mean and Adding a Constant

Explain that the effect on the mean of adding a constant to every value is

to simply change the mean by the amount of that constant. Thus, the new mean =

the old mean + the constant.

In the following set of scores:

Mean = 63/8 = 7.875

Now we will add the constant 10 to each of the previous values:

X+10

Σ(X+10) = 143

Mean = 143/8 = 17.875 (or 7.875 plus the constant 10).
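For students who want to see why this must always hold, a brief derivation may help. The symbol c for the constant is introduced here for convenience and is not used in the text; the numerical line simply restates the totals quoted above.

```latex
\[
M_{X+c} \;=\; \frac{\sum (X + c)}{N}
        \;=\; \frac{\sum X + Nc}{N}
        \;=\; \frac{\sum X}{N} + c
        \;=\; M + c
\]
\[
\frac{63 + (8)(10)}{8} \;=\; \frac{143}{8} \;=\; 17.875 \;=\; 7.875 + 10
\]
```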

Activity: The Mean and Multiplying by a Constant

Show your students that multiplying by a constant has the effect of

changing the mean by a function of that constant, such that the new mean equals

the old mean times the constant. Using that first distribution shown above,

each value will be multiplied by the constant 10:

(X)(10)

140

120

100

Σ(X)(10) = 630

Mean = 630/8 = 78.750 (or 7.875 times the constant 10).
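A quick check of this rule can be run in Python. The scores below are hypothetical: the first three (14, 12, 10) correspond to the products 140, 120 and 100 shown above, and the remaining five are assumptions chosen only so that N = 8 and the total is 63, matching the mean of 7.875 quoted earlier.

```python
from statistics import mean

# HYPOTHETICAL scores: only 14, 12 and 10 can be inferred from the
# products listed in the activity; the other five values are assumed
# so that N = 8 and the sum is 63.
scores = [14, 12, 10, 9, 7, 5, 4, 2]
constant = 10

old_mean = mean(scores)                       # 63 / 8
scaled = [x * constant for x in scores]       # every value times the constant
new_mean = mean(scaled)                       # 630 / 8

print("old mean:", old_mean)
print("new mean:", new_mean)
print("old mean times constant:", old_mean * constant)
```

Any other set of scores and any other constant can be substituted; the new mean should always equal the old mean times the constant.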

Activity: The Mean and Independent Measures

Let the students see what happens to the mean when two independent

measures are summed. Suppose, for example, that you have two independent

measures on each subject and that these measures are added for each subject,

as in the following scores:

X1 + X2 = X1+X2

15 + 11 = 26

14 + 10 = 24

12 + 9 = 21

11 + 5 = 16

10 + 5 = 15

9 + 4 = 13

7 + 3 = 10

2 + 1 = 3

ΣX1 = 80, ΣX2 = 48, Σ(X1+X2) = 128

M1 = 80/8 = 10.00 and M2 = 48/8 = 6.00; the mean of the sums is 128/8 = 16.00.

Thus, the mean of the sums (16.00) is equal to the sum of the two means

(10.00+6.00).
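The table above can be reproduced with a short Python sketch so that students can alter the scores and confirm that the relationship still holds. The list names x1 and x2 are illustrative; the values are the two columns shown in the activity.

```python
from statistics import mean

# The two measures on each of the eight subjects, as listed above
x1 = [15, 14, 12, 11, 10, 9, 7, 2]
x2 = [11, 10, 9, 5, 5, 4, 3, 1]

# Add the two measures subject by subject
sums = [a + b for a, b in zip(x1, x2)]

mean_of_sums = mean(sums)             # 128 / 8
sum_of_means = mean(x1) + mean(x2)    # 10 + 6

print("summed scores:", sums)
print("mean of the sums:", mean_of_sums)
print("sum of the means:", sum_of_means)
```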

B. Multiple Choice Items

2-1. When scores are arranged in order of magnitude, the researcher has formed a

a. histogram

b. measure of centrality

c. measure of dispersion

d. distribution

2-2. Traditionally, the researcher indicates frequency of occurrence on the

graph's

a. ordinate

b. abscissa

c. line of ascent

d. horizontal axis

2-3. When single points are used to designate the frequency of each score, the

points being connected by a series of straight lines, this is called a

a. frequency polygon

b. frequency rectangle

c. scatter plot

d. histogram

2-4. The mean, median, and mode are all measures of

a. dispersion

b. variability

c. central tendency

d. all of these

2-5. When a graph is constructed using a series of rectangles indicating the

frequency of occurrence for each score, it is called a

a. frequency polygon

b. frequency rectangle

c. scatter plot

d. histogram

2-6. The measurement which occurs most often in a distribution is called the

a. median

b. percentile

c. mean

d. mode

2-7. When a distribution is skewed, the researcher who is interested in central

tendency should use the

a. mean

b. median

c. mode

d. all of these are appropriate

2-8. When a distribution shows a large majority of very low scores and a few

very high scores, the distribution is said to be

a. skewed to the right

b. skewed to the left

c. skewed to the middle

d. bimodal

2-9. The influence of a few extreme scores in one direction is most pronounced

on the value of the

a. mean

b. median

c. mode

d. percentile

2-10. Using the mean to indicate centrality on a distribution of income scores

usually results in

a. a false image of poverty

b. an accurate portrayal of income

c. a false image of prosperity

d. income scores never lend themselves to centrality

2-11. When each score is listed in order of magnitude, together with the number

of individuals receiving each score, the researcher has set up

a. a unimodal distribution

b. a bimodal distribution

c. a skewed distribution

d. a frequency distribution

2-12. The abscissa is

a. the horizontal axis

b. the vertical axis

c. the connected points on a polygon

d. a measure of central tendency

2-13. On a frequency distribution, raw scores are plotted on the

a. abscissa

b. ordinate

c. vertical axis

d. all of these, depending on the size of the group being measured

2-14. When graphing data, it is traditional to make the length of the ordinate

equal to

a. the length of the abscissa

b. twice the length of the abscissa

c. three-quarters of the length of the abscissa

d. one-half of the length of the abscissa

2-15. With a frequency polygon, scores are always presented on

a. the X axis

b. the Y axis

c. the Z axis

d. the frequency polygon may never be used to represent scores

2-16. The more separate scores there are in a given distribution, the higher will

be the value of the

a. mean

b. median

c. mode

d. none of these

2-17. The ordinate is identical to the

a. X axis

b. Y axis

c. mean

d. none of these

2-18. The so-called "wow" graph is always possible whenever

a. scores are presented on the X axis

b. the abscissa does not begin with zero

c. the base of the ordinate is not set at zero

d. two distributions are being presented simultaneously

2-19. Perhaps the most serious flaw in graphing data is due to

a. not placing frequencies on the abscissa

b. not placing raw scores on the ordinate

c. not placing the ordinate on the X axis

d. not setting the base of the ordinate at zero

2-20. The following are all measures of central tendency, except

a. the mean

b. the median

c. the range

d. the mode

2-21. The arithmetic average defines the

a. mean

b. median

c. sigma

d. mode

2-22. The point above which half the scores fall and below which half the scores

fall defines the

a. mean

b. median

c. sigma

d. mode

2-23. The most frequently occurring score in the distribution defines the

a. mean

b. median

c. sigma

d. mode

2-24. The mean is not overly affected by extreme scores, unless

a. the extreme scores are all in one direction

b. the extreme scores are in both directions

c. the number of extreme scores is fewer than 5

d. all of these

2-25. The fact that the mean IQ of college seniors is higher than that of

freshmen is probably due to

a. the fact that going to college increases the IQ

b. the fact that there is a big IQ gain between the junior and senior years

c. an incorrect interpretation of the data

d. the fact that the lower IQ freshmen tend to drop out of college and,

therefore, never become seniors

2-26. Adding just one or two extreme scores to the high end of a distribution

has a great effect on

a. the median, but not the mode

b. the mode, but not the mean

c. the mean, but not the median

d. none of these

2-27. Adding just one or two extreme scores to the low end of a distribution has

a great effect on

a. the median but not the mode

b. the mode, but not the median

c. the mean, but not the median

d. none of these

2-28. When the majority of scores are at the high end of the distribution, but

there are a few extremely low scores, the distribution is

a. bimodal

b. multimodal

c. skewed left

d. skewed right

2-29. When the mean lies to the right of the median, the distribution is probably

a. bimodal

b. multimodal

c. skewed left

d. skewed right

2-30. When the median lies to the right of the mean, the distribution is probably

a. bimodal

b. multimodal

c. skewed left

d. skewed right

2-31. When a distribution is skewed to the right,

a. the mode will be to the left of the median

b. the mode will be to the right of the median

c. the mode will be to the right of the mean

d. the mode will always be identical to the mean

2-32. Percentages are based on a standardized denominator of

a. 100

2-33. In order to read a percentage

a. only the numerator of the percentage needs to be shown

b. the percentage is always shown in fraction form

c. the percentages shown are always in the form of inferential statistics

d. to establish a percentage for a specific event, the total number of events

need not be known

2-34. When comparing percentage rate increases with decreases, the same absolute difference yields

a. the same percentage difference

b. the percentage increase calculates out as larger than the decrease

c. the percentage decrease calculates out as larger than the increase

d. comparing percentage increases with decreases cannot be done

2-35. The FBI’s Uniform Crime Reports provide per capita data based on a rate per

a. 100,000

b. 50,000

c. 25,000

d. one million

2-36. Bar charts are used instead of histograms when the data are

a. continuous

b. non-continuous

c. in the form of values that may fall at any point along an unseparated scale of points

d. none of these, since bar charts and histograms are synonymous

Questions 37 through 42 are based on the following:

In a certain community, the median per-family annual income is $80,000. The

mean per-family income is $100,000, whereas the mode is $71,000.

2-37. The distribution of income scores is

a. skewed right

b. skewed left

c. skewed to the middle

d. not skewed

2-38. The most appropriate measure of central tendency in this distribution

would yield a value of

a. $80,000

b. $100,000

c. $71,000

d. none of these values could yield a measure of central tendency

2-39. If a new family were to move into the community with an annual income of

$295,000, this would most affect

a. the mean

b. the median

c. the mode

d. all of these

2-40. The annual income achieved by most of the families is

a. $71,000

b. $80,000

c. $100,000

d. halfway between the mean and the mode

2-41. The annual income which is surpassed by 50% of the families is

a. $80,000

b. $71,000

c. $100,000

d. cannot tell from these data

2-42. The annual income which is surpassed by 90% of the families is

a. $100,000

b. $71,000

c. $80,000

d. cannot tell from these data

2-43. Whenever a distribution is skewed left, the measure yielding the highest

numerical value is always the

a. mean

b. median

c. mode

d. percentile

2-44. When a skewed distribution tails off to the right, the distribution is

a. skewed right

b. skewed left

c. skewed to the center

d. not skewed at all

2-45. In a histogram, the mode is always located

a. under the shortest bar

b. under the tallest bar

c. under the last bar to the right

d. under the last bar to the left

2-46. A bimodal distribution often indicates

a. that there will be two means

b. that there will be two medians

c. that the mean, median and mode have the same value

d. that two separate sub-groups have probably been measured

2-47. The most appropriate measure of central tendency in a bimodal distribution

is (are) the

a. mean

b. median

c. modes

d. ordinate

2-48. When a distribution has two separate and distinct medians, then

a. it is skewed right

b. it is skewed left

c. it is probably bimodal

d. a distribution can never have more than one median

2-49. With a fairly balanced distribution (neither skewed nor

bimodal), the most appropriate measure of central tendency is the

a. mean

b. median

c. mode

d. none of these

C. True or False: For the following, indicate T (True) or F (False)

2-50. A skewed right distribution has the mean lower than the mode.

2-51. The median is always exactly half-way numerically between the highest

and lowest scores.

2-52. The most appropriate measure of central tendency in a skewed right

distribution is the median.

2-53. A positively skewed distribution is identical to a skewed right

distribution.

2-54. Other things being equal, the mean is the most stable measure when

the data form is skewed.

2-55. With a skewed left distribution, the median is always to the right

of the mean.

2-56. With a skewed left distribution, the mode is never to the left of

the mean.

2-57. All three measures of central tendency can be calculated when the

data are in interval form.

2-58. On a frequency distribution curve, frequency of occurrence is always

plotted on the abscissa.

2-59. One should expect a distribution of personal income measures to be

skewed to the right.

2-60. When the median is being calculated, it makes no difference whether

one starts counting from the bottom or the top of the distribution.

2-61. If a positively skewed and negatively skewed distribution were

combined, the resulting distribution would probably be bimodal.

D. For the following questions, calculate the values.

2-62. For the following set of scores, calculate the mean, median and mode:

11, 2, 3, 3, 7, 6.

2-63. For the following set of scores, calculate the mean, median and mode:

20, 8, 18, 10, 15, 10, 13, 11.
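For checking student answers to items 2-62 and 2-63, a small Python sketch such as the following may be convenient. The helper name describe is illustrative, and multimode (available in Python 3.8 and later) is used so that any tie for the mode would be reported rather than hidden; the mean is rounded to three decimals purely for display.

```python
from statistics import mean, median, multimode


def describe(scores):
    """Return the mean, median and mode(s) of a list of scores."""
    return {
        "mean": round(mean(scores), 3),
        "median": median(scores),
        "mode": multimode(scores),   # list of all most-frequent values
    }


q62 = describe([11, 2, 3, 3, 7, 6])
q63 = describe([20, 8, 18, 10, 15, 10, 13, 11])

print("2-62:", q62)
print("2-63:", q63)
```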