British Ability Scales 3: Score Classifications & Descriptions

Descriptive Classifications for IQ and Other Composite Cognitive Scores

Score classifications used in individually-administered tests have changed over time. Older classification systems tended to be value-laden, using such terms as “Mentally Deficient” and “Superior.” The Wechsler/Binet systems, which became widely used, can be traced to the early 1900s. Pintner (1923, p. 77) and the Stanford-Binet (Terman & Merrill, 1937; Merrill, 1938) used the following classification schemes:

Pintner Classification    Stanford-Binet Classification    Intelligence Quotient
Feebleminded              Mentally Defective               0-69
Borderline                Borderline Defective             70-79
Backward                  Low Average                      80-89
Normal                    Normal or Average                90-109
Bright                    High Average                     110-119
Very Bright               Superior                         120-129
Very Superior             Very Superior                    130 and above

Interestingly, although the descriptive terms have changed over time, every subsequently-published major cognitive test battery has used the same score boundaries!

Classification labels are not in themselves objective statements, but are descriptors of an individual’s level of general cognitive abilities recommended by an author as useful in communicating with lay people.

The Wechsler scales used some of the same descriptors as those of Pintner and Merrill, but these also changed over the years. For example, the WISC-R (1974) and WISC-IV (2003) categories are as follows:

WISC-R Classification     WISC-IV Classification    Intelligence Quotient
Mentally Deficient        Extremely Low             0-69
Borderline                Borderline                70-79
Below Average (Dull)      Low Average               80-89
Average                   Average                   90-109
Above Average (Bright)    High Average              110-119
Superior                  Superior                  120-129
Very Superior             Very Superior             130 and above

The BAS3 Descriptive Classifications

In recent years, there has been a tendency among authors and publishers to move to a classification system that is less qualitative or evaluative, and more quantitative in its descriptions. It is also desirable to have parallel descriptors above and below the mean. Thus the British Ability Scales, ever since their first publication in 1979, and the Differential Ability Scales (1990, 2006) have used the following classification system:

Classification    BAS3 GCA or other composite score    Percentiles
Very Low          69 and below                         1-2
Low               70-79                                3-8
Below Average     80-89                                9-24
Average           90-109                               25-74
Above Average     110-119                              75-90
High              120-129                              91-97
Very High         130 and above                        98-99

How Should Descriptors be Used?

In the BAS3 SRS software, the GCA and SNC composites and the cluster scores are reported with confidence limits. The author recommends that if the confidence limits of a score fall in two categories, both categories should be used in describing the child’s score. For example, if a child’s GCA score is 91, with 90% confidence limits of 86-97, it would be most appropriate to report the child’s score as being in the below average to average range.
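The table lookup and the confidence-limit recommendation can be sketched in a few lines of Python. This is an illustration only and not part of the BAS3 SRS software: the names BANDS, classify and describe are hypothetical. Reporting just the two end categories when the limits span more than two bands is also an assumption, since the recommendation above only covers limits that fall in two categories.

```python
# BAS3 descriptive bands as (lowest score in band, label), highest band first.
# Scores below 70 fall through to "Very Low".
BANDS = [
    (130, "Very High"),
    (120, "High"),
    (110, "Above Average"),
    (90, "Average"),
    (80, "Below Average"),
    (70, "Low"),
]


def classify(score):
    """Return the descriptive category for a single composite score."""
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "Very Low"


def describe(lower_limit, upper_limit):
    """Describe a score by the categories of its two confidence limits."""
    low_label = classify(lower_limit)
    high_label = classify(upper_limit)
    if low_label == high_label:
        return low_label
    # Assumption: name only the categories of the two limits.
    return f"{low_label} to {high_label}"


print(classify(91))
print(describe(86, 97))
```

For the example above (GCA of 91 with 90% confidence limits of 86-97), describe(86, 97) gives "Below Average to Average", whereas classifying the score of 91 alone gives "Average".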

A Note on Percentile Ranges

The percentiles covered by the various standard scores are also shown in the table above. The central category has a neat and interesting feature: the “Average” classification covers 50% of the population. It comprises all those individuals whose scores are in the range of 90 to 109, who lie between the 25th and 74th percentiles. Thus, according to these classification boundaries (which have been, and continue to be, adopted by all major cognitive test authors and publishers), an average score is defined as one that would be obtained by someone lying in the middle 50% of the population.
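The link between score boundaries and percentiles can be checked approximately with a short Python sketch. It assumes that composite scores follow a normal distribution with a mean of 100 and a standard deviation of 15; that assumption, and the names SCORE_DIST, percentile_rank and share_between, are illustrative and not taken from the test manual. Published percentile tables are derived from the test norms and are rounded, so small differences from these figures are to be expected.

```python
from statistics import NormalDist

# Assumption: composite scores are normally distributed, mean 100, SD 15.
SCORE_DIST = NormalDist(mu=100, sigma=15)


def percentile_rank(score):
    """Percentage of the population expected to score below `score`."""
    return 100 * SCORE_DIST.cdf(score)


def share_between(low, high):
    """Percentage of the population expected to score between two values."""
    return percentile_rank(high) - percentile_rank(low)


# Percentile rank at each category boundary.
for boundary in (70, 80, 90, 110, 120, 130):
    print(boundary, round(percentile_rank(boundary), 1))

# Share of the population between the edges of the Average band.
print(round(share_between(90, 110), 1))
```

Under these assumptions the boundaries at 90 and 110 fall close to the 25th and 75th percentiles, so the band between them holds roughly half of the population.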

A Final Note on the Definition of ‘Average’

Some educational psychology training courses teach that a region plus or minus one standard deviation from the mean should be categorised as ‘average’ (i.e. the middle 68%). It is important to bear in mind that this is merely a convention and that ‘average’ has only one real statistical meaning – a score at the mean itself. Even then, statisticians prefer to use ‘mean’ in order not to confuse it with the median or the mode.

There have been many different conventions used in psychometrics regarding the definition of ‘average’. For example, British Army and Navy recruitment policies used to use the middle 40% as their ‘average’ selection grade, a scale adopted by Alice Heim in her well-known AH series. If stanines 4 to 6 are used as ‘average’, this is the middle 54%. Increasingly, schools are using a three-point grading which splits performance into the bottom 25%, the middle 50% and the top 25%. While 68% may have a convenient statistical meaning, this does not make it an ideal convention to convey accurate meaning to non-statisticians – the middle 50% makes far more sense.

Whichever convention is used, it is critical that the meaning which the educational psychologist is attaching to ‘average’ is conveyed to anyone being presented with the results, or they may assume it to mean something else.
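The conventions discussed above can be compared by converting each one into score limits. The sketch below is a rough illustration that assumes a normal distribution with a mean of 100 and a standard deviation of 15; the function names central_share and central_band are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def central_share(z):
    """Percentage of a normal population within +/- z SDs of the mean."""
    return 100 * (STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z))


def central_band(share, mean=100, sd=15):
    """Score limits that enclose the central `share` percent of scores."""
    tail = (1 - share / 100) / 2           # proportion left in each tail
    z = STANDARD_NORMAL.inv_cdf(1 - tail)  # SD units from mean to the limit
    return mean - z * sd, mean + z * sd


# Share of the population within one standard deviation of the mean.
print(round(central_share(1.0), 1))

# Score limits implied by the middle 40%, 50% and 68% conventions.
for share in (40, 50, 68):
    low, high = central_band(share)
    print(share, round(low, 1), round(high, 1))
```

The wider the convention, the wider the band of scores that is labelled ‘average’, which is why the convention in use needs to be stated alongside the result.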

References

Elliott, C.D. (1990). Differential Ability Scales (DAS). San Antonio, TX: The Psychological Corporation.

Elliott, C.D. (2006). Differential Ability Scales, 2nd edition (DAS-II). San Antonio, TX: Harcourt Assessment.

Merrill, M.A. (1938). The significance of IQs on the Revised Stanford-Binet Scales. Journal of Educational Psychology, 29, 641-651.

Pintner, R. (1923). Intelligence testing. New York: Holt, Rinehart & Winston.

Terman, L.M. & Merrill, M.A. (1937). Measuring intelligence. London: Harrap.

Wechsler, D. (1974). Manual for the Wechsler Intelligence Scale for Children-Revised (WISC-R). New York: The Psychological Corporation.

Wechsler, D. (2003). Manual for the Wechsler Intelligence Scale for Children, 4th Edition (WISC-IV). San Antonio, TX: Harcourt Assessment.
