Box-and-whisker plots are the main graphical representation of student assessment data used by the VCAA’s NAPLAN Data Service and VCE Data Service. They summarise the main information about students’ results and provide a visual means of comparing the results of various cohorts (Class, School, State and National).
In this tutorial, you will learn how student data (here, from a fictitious class of thirty Year 7 students) is used to create the essential features of a box-and-whisker plot.
Lots of data reports contain box-and-whisker plots, but what can they tell us about our students?
In this module, we’ll build one from scratch.
Meet some students from Example College, Victoria. These students have just received their NAPLAN results.
First, we’ll look at John’s results. (John is the boy holding the basketball.)
Using just these results and your professional knowledge, what can you say about John’s ability in reading?
· Does the data indicate that John is a strong reader for Year 7?
· Does it suggest that John is a weak reader for Year 7?
· Does it show that he is better at Reading than Spelling?
· Does the data indicate that reading is John’s weakest area?
· Does it suggest that John is an OK reader for Year 7?
· Or is it that his data on its own doesn’t tell us very much?
While we can see John’s scores for the five NAPLAN tests, they don’t tell us much in isolation.
Although his Spelling result is the highest number, we cannot infer that he is better at Spelling than at, say, Numeracy.
The tests are all marked according to different rubrics and scales, so John’s five results cannot be compared directly with one another.
We need to see where John’s results fit into those of his class, and how they compare to state and national results, before any meaningful analysis can take place.
We begin by looking at the distribution of the reading results for the entire class.
This number line represents the scale of possible NAPLAN scores for Year 7.
Brendan is another member of this Year 7 class. His scaled score for Reading is 537.
Here we’ve placed Brendan on the scale based on his Reading score of 537.
Four other class members are Samuel, Michelle, Siti and Ava.
Their Reading results are 438, 489, 574 and 530, respectively.
Now these students are also placed on the scale.
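As a rough illustration of this step, the five Reading scores quoted above can be sorted into the order in which they sit along the scale. This is only a sketch: the variable names are made up for the example, and the printed list stands in for the number line shown in the tutorial.

```python
# Reading scores for the five students named in the tutorial.
reading_scores = {
    "Brendan": 537,
    "Samuel": 438,
    "Michelle": 489,
    "Siti": 574,
    "Ava": 530,
}

# Order the students from lowest to highest score, as they would
# appear from left to right on the number line.
ordered = sorted(reading_scores.items(), key=lambda item: item[1])

for name, score in ordered:
    print(f"{score}  {name}")
```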
Here’s the entire class of 30 students.
John is roughly in the middle of his class, whose results are spread from Band 5 (the National Minimum Standard for Year 7) to Band 9.
But how is his class going? To learn more, we need to compare his class to the state. The reports in the NAPLAN Data Service show this comparison using box-and-whisker plots.
We will look informally at how the box-and-whisker plot is constructed from this data.
Above the students we’ll use three short vertical lines to divide the class into quarters.
The first divider marks the median (the 50th percentile): half of the class is to its left, and half to its right.
Now the class is partitioned into quarters with these further dividers at the 25th and 75th percentiles.
Now we’ve placed a red box from the 25th to the 75th percentile. In this way, we have indicated the middle 50% of the class.
We can see that even though the median has divided the box into two unequal parts, each part represents the same share of the class: 25% of the 30 students, or about seven or eight results.
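To make the dividers concrete, here is a minimal Python sketch that finds the 25th, 50th and 75th percentiles for a class of 30. The score list is hypothetical (only five of its values, 438, 489, 530, 537 and 574, come from the tutorial), and the "inclusive" interpolation method is an assumption, since the tutorial does not say which percentile convention the NAPLAN Data Service uses.

```python
import statistics

# HYPOTHETICAL Reading scores for a class of 30 (illustration only).
class_scores = [
    438, 452, 461, 470, 478, 485, 489, 496, 503, 510,
    515, 521, 526, 530, 534, 537, 541, 546, 550, 555,
    560, 566, 571, 574, 580, 588, 597, 606, 618, 634,
]

# Three cut points that split the sorted scores into quarters.
# The interpolation method is an assumption, not a stated VCAA rule.
q1, median, q3 = statistics.quantiles(class_scores, n=4, method="inclusive")

# The box spans q1 to q3; the median splits it into two parts.
lower_part_of_box = [s for s in class_scores if q1 <= s < median]
upper_part_of_box = [s for s in class_scores if median < s <= q3]

print(f"25th percentile: {q1}")
print(f"50th percentile (median): {median}")
print(f"75th percentile: {q3}")
print(f"Results in lower part of box: {len(lower_part_of_box)}")
print(f"Results in upper part of box: {len(upper_part_of_box)}")
```

With these made-up scores, the two parts of the box hold the same number of results even though they differ in width.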
In a NAPLAN box-and-whisker plot, the top and bottom 10% of the class are not shown.
In this class, there are 30 students.
As 10% of this class of 30 students is 3, the 3 highest and 3 lowest scores will not be represented by the graph.
The whiskers in a NAPLAN box-and-whisker plot extend out from the sides of the box to the 10th and 90th percentiles.
Notice that these whiskers now each cover an additional 15% of the class (10th to 25th percentile and 75th to 90th percentile) but that the box-and-whisker plot does not tell you where the highest 10% and lowest 10% of the students’ results are located.
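The whisker rule can be sketched in the same way. The example below uses a hypothetical class of 30 scores and assumes the "inclusive" percentile convention, which the tutorial does not specify. It finds the 10th and 90th percentiles and lists the results that fall beyond the whiskers.

```python
import statistics

# HYPOTHETICAL Reading scores for a class of 30 (illustration only).
class_scores = [
    438, 452, 461, 470, 478, 485, 489, 496, 503, 510,
    515, 521, 526, 530, 534, 537, 541, 546, 550, 555,
    560, 566, 571, 574, 580, 588, 597, 606, 618, 634,
]

# 10% of the class at each end is left off the plot.
hidden_per_end = len(class_scores) // 10

# Nine cut points split the scores into tenths; the first and last
# are the 10th and 90th percentiles, where the whiskers end.
# The interpolation method is an assumption, not a stated VCAA rule.
deciles = statistics.quantiles(class_scores, n=10, method="inclusive")
p10, p90 = deciles[0], deciles[-1]

# Results beyond the whisker ends are not represented by the graph.
not_shown_low = [s for s in class_scores if s < p10]
not_shown_high = [s for s in class_scores if s > p90]

print(f"Whiskers run from {p10} to {p90}")
print(f"Not shown (lowest {hidden_per_end}): {not_shown_low}")
print(f"Not shown (highest {hidden_per_end}): {not_shown_high}")
```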
The images of the students now disappear, leaving only the box-and-whisker plot. The box-and-whisker plots in the NAPLAN Data Service reports also display no data values. However, it’s always good to remember the individual students’ data that the graph represents.
What does the box-and-whisker plot tell us?
What doesn’t it tell us?
1) The box-and-whisker plot does convey a sense of the diversity of reading skills for the bulk of the students in this class.
2) It also tells us the location of the middle value, or 50th percentile.
3) Furthermore, the plot tells us where the middle 50% of the values lie – within the box, between the 25th and 75th percentiles.
4) Finally, the box-and-whisker plot also tells us where the lowest graphed result (10th percentile) and the highest graphed result (90th percentile) are positioned – at the ends of the whiskers.
The box-and-whisker plot does not tell us:
· The location of individual results within any particular range
· About any subgroups of the student data which may be clustered or evenly spread within any particular range
· The average, or mean, result
· The precise locations of the most extreme results
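The limits listed above can be illustrated with a short Python sketch. It builds two hypothetical classes that produce exactly the same five plotted points (10th, 25th, 50th, 75th and 90th percentiles) yet differ in their mean, in their most extreme results and in how tightly some results cluster. All scores, the helper name plot_summary and the "inclusive" percentile method are assumptions made for the example.

```python
import statistics

def plot_summary(scores):
    """Return the five points a NAPLAN-style plot would draw.

    Hypothetical helper: uses the 'inclusive' percentile method,
    which is an assumption rather than a stated VCAA rule.
    """
    deciles = statistics.quantiles(scores, n=10, method="inclusive")
    quartiles = statistics.quantiles(scores, n=4, method="inclusive")
    return (deciles[0], quartiles[0], quartiles[1], quartiles[2], deciles[-1])

# HYPOTHETICAL class A: 30 Reading scores, sorted (illustration only).
class_a = [
    438, 452, 461, 470, 478, 485, 489, 496, 503, 510,
    515, 521, 526, 530, 534, 537, 541, 546, 550, 555,
    560, 566, 571, 574, 580, 588, 597, 606, 618, 634,
]

# HYPOTHETICAL class B: same as class A except that five results
# inside the box are tightly clustered and the top two are higher.
class_b = class_a[:9] + [505, 505, 506, 506, 507] + class_a[14:28] + [660, 700]

print("Plotted points, class A:", plot_summary(class_a))
print("Plotted points, class B:", plot_summary(class_b))
print("Mean, class A:", statistics.mean(class_a))
print("Mean, class B:", statistics.mean(class_b))
```

Both classes would be drawn as identical box-and-whisker plots, which is why it helps to keep the underlying student data in mind when reading the graph.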
NAPLAN box-and-whisker plots are usually presented vertically like this.
Note that the same information is conveyed.
Here we see side-by-side plots for National, State and School data. Notice the alternative scale on the left which indicates the bands within which the results lie.
NAPLAN graphs often include State and National box-and-whisker plots to allow comparisons to be made between cohorts. National results are shown in blue, State results in yellow, and School results in red.
This concludes Tutorial 2.1.