Measures Of Dispersion

While measures of average (mean, median, mode) tell us the values around which the data cluster, measures of dispersion tell us how spread out the data are.

Number of burgers eaten in one week by Year 12 boys

3 8 0 4 4 0 3 0 9 0 1 5 0

RANGE

Range = highest score – lowest score →

·  Not always very useful, as it depends on only the 2 most extreme scores and so may not be a good representation of all the data.

·  For grouped data we can only estimate range – use either class interval values or class boundary values.
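As a quick check, the range of the burger data can be worked out in a couple of lines of Python. This is only a sketch; the variable names `burgers` and `data_range` are chosen here for illustration.

```python
# Burger data copied from the table at the top of these notes.
burgers = [3, 8, 0, 4, 4, 0, 3, 0, 9, 0, 1, 5, 0]

# Range = highest score - lowest score.
data_range = max(burgers) - min(burgers)
print(data_range)  # prints 9
```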

INTERQUARTILE RANGE

1)  Arrange data in ascending order

2)  Find median (middle)

3)  Find the middle of the lower half (Q1) and the middle of the top half (Q3)

4)  Calculate IQR = Q3 – Q1

(If there are 2 scores in the “middle”, add them and divide by 2)
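The four steps above can be sketched in Python for the burger data. The helper name `quartiles` is made up for this example, and it assumes that when there is an odd number of scores the median itself is left out of both halves; other textbooks and calculators use slightly different conventions and may give different quartiles.

```python
from statistics import median

def quartiles(scores):
    """Return (Q1, Q3) using the 'middle of each half' method.

    Assumption: with an odd number of scores, the median is excluded
    from both the lower and the upper half.
    """
    ordered = sorted(scores)       # step 1: ascending order
    half = len(ordered) // 2       # size of each half
    lower = ordered[:half]         # scores below the median position
    upper = ordered[-half:]        # scores above the median position
    # median() averages the two middle scores when a half has an even count
    return median(lower), median(upper)

burgers = [3, 8, 0, 4, 4, 0, 3, 0, 9, 0, 1, 5, 0]
q1, q3 = quartiles(burgers)        # step 3
iqr = q3 - q1                      # step 4
print(q1, q3, iqr)
```

With this convention the lower half is 0 0 0 0 0 1 and the upper half is 3 4 4 5 8 9, so Q1 = 0, Q3 = 4.5 and the IQR is 4.5.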

VARIANCE and STANDARD DEVIATION

Standard deviation is more useful because it uses all data in the set. First the mean is calculated. Then the distance from each point to the mean is calculated. These values are squared, summed and then divided by how many numbers there are. The answer is the variance. The square root of the variance is the standard deviation. Fortunately you can use your calculator!

σ = standard deviation = √variance

In summary: standard deviation is the square root of the average of the squares of the distances from the mean.
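The description above can be followed step by step in Python. This is a minimal sketch that divides by the number of scores, as the notes describe; many calculators also offer a second version that divides by one less than the number of scores, which gives a slightly larger answer.

```python
from math import sqrt

burgers = [3, 8, 0, 4, 4, 0, 3, 0, 9, 0, 1, 5, 0]

n = len(burgers)
mean = sum(burgers) / n                                # first, the mean
squared_distances = [(x - mean) ** 2 for x in burgers]  # distance to mean, squared
variance = sum(squared_distances) / n                  # average of the squares
std_dev = sqrt(variance)                               # square root of the variance

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

For this data the mean is about 2.85, the variance about 8.90 and the standard deviation about 2.98.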

Determine the standard deviation for the number of burgers eaten using your calculator.

Number of burgers eaten in one week by Year 12 boys

3 8 0 4 4 0 3 0 9 0 1 5 0

Write a few short notes in the space below to remind yourself how to do this on your calculator.

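Chebyshev’s Theorem and the Empirical Rule

Once you have calculated the standard deviation, you can describe how many data points lie within one, two or three standard deviations of the mean.

For data that are approximately normally distributed (bell-shaped), the empirical rule applies:

About 68% of the scores lie within x̄ ± 1s

About 95% of the scores lie within x̄ ± 2s

About 99.7% of the scores lie within x̄ ± 3s

Chebyshev’s theorem applies to any data set but gives weaker guarantees: at least 75% of the scores lie within x̄ ± 2s and at least 89% lie within x̄ ± 3s.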
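To see how real data compare with these percentages, the sketch below counts the proportion of the burger scores within one, two and three standard deviations of the mean. The function name `share_within` is made up for this example, and the standard deviation used divides by the number of scores. A small, lopsided data set like this one should not be expected to match the quoted percentages exactly.

```python
from statistics import mean, pstdev

def share_within(scores, k):
    """Proportion of scores lying within k standard deviations of the mean."""
    centre = mean(scores)
    spread = pstdev(scores)        # standard deviation, dividing by n
    low, high = centre - k * spread, centre + k * spread
    inside = [x for x in scores if low <= x <= high]
    return len(inside) / len(scores)

burgers = [3, 8, 0, 4, 4, 0, 3, 0, 9, 0, 1, 5, 0]
for k in (1, 2, 3):
    print(k, round(100 * share_within(burgers, k), 1))
```

Here 11 of the 13 scores (about 85%) fall within one standard deviation, 12 of 13 (about 92%) within two, and all 13 within three.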

OUTLIERS

·  Values that are much higher or lower than the rest

·  Often due to recording error

·  Need further investigation: if error → delete; if no error → retain, delete or explain depending on purpose of data

Spot the outlier:

3 5 1 9 3 0 58 3 0 1 4
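These notes do not give a formal test for outliers, so the sketch below borrows one common rule of thumb as an assumption: flag any score more than 1.5 × IQR below Q1 or above Q3. The function name `find_outliers` is made up for this example, and the quartiles are found by the "middle of each half" method, leaving the median out when the number of scores is odd.

```python
from statistics import median

def find_outliers(scores):
    """Flag scores beyond 1.5 x IQR from the quartiles (a rule of thumb,
    assumed here; it is not stated in the notes)."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # middle of the lower half
    q3 = median(ordered[-half:])   # middle of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in scores if x < low_fence or x > high_fence]

data = [3, 5, 1, 9, 3, 0, 58, 3, 0, 1, 4]
outliers = find_outliers(data)
print(outliers)  # prints [58]
```

Here Q1 = 1 and Q3 = 5, so the fences are at -5 and 11, and only 58 lies outside them.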

Cropping Data

Choosing to leave out particular score(s)

Often done because of a recording error, but can also be applied routinely (e.g. the highest and lowest scores in a diving competition are always discarded).
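Cropping the highest and lowest scores, as in the diving example, might be sketched as follows. The function name `crop_extremes` is made up, and the scores are simply reused from elsewhere in these notes for illustration; they are not real diving scores.

```python
from statistics import mean

def crop_extremes(scores):
    """Discard one lowest and one highest score, keeping the rest."""
    ordered = sorted(scores)
    return ordered[1:-1]           # drop the first and last after sorting

scores = [3, 3, 5, 6, 6, 48]       # illustrative data, not real diving scores
cropped = crop_extremes(scores)
print(cropped, mean(cropped))
```

Note that only one copy of the lowest score is removed, so the second 3 stays in the cropped data.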

Effects of Outliers On:

MEASURE / SCORES WITH OUTLIER / SCORES WITHOUT OUTLIER / EFFECT
(data) / 3 3 5 6 6 48 / 3 3 5 6 6 /
Mean
Median
Mode
Range
Inter-quartile Range
Variance
Standard Deviation
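Once the table has been filled in by hand, the answers can be checked with a sketch like the one below. The function name `summary` is made up for this example. It assumes variance and standard deviation divide by the number of scores, and that quartiles are found by the "middle of each half" method with the median left out when the number of scores is odd.

```python
from statistics import mean, median, multimode, pvariance, pstdev

def summary(scores):
    """Return each measure in the table for one list of scores."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # middle of the lower half
    q3 = median(ordered[-half:])   # middle of the upper half
    return {
        "mean": mean(ordered),
        "median": median(ordered),
        "mode": multimode(ordered),          # list, as there may be ties
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "variance": pvariance(ordered),      # divides by n
        "std dev": pstdev(ordered),
    }

with_outlier = summary([3, 3, 5, 6, 6, 48])
without_outlier = summary([3, 3, 5, 6, 6])
for measure in with_outlier:
    print(measure, with_outlier[measure], without_outlier[measure])
```

The median, mode and interquartile range barely change when 48 is removed, while the mean, range, variance and standard deviation change a great deal.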
