Dutch TIMSS Results and the RME Curriculum

F.P. Vos & W.A.J.M. Kuiper

University of Twente, Netherlands

Paper presented at ICME-9 (9th International Congress on Mathematical Education),

31 July – 6 August 2000, Tokyo, Japan

at the Topic Study Group “TIMSS and comparative studies in Mathematics Education”

Abstract

Discrepancies between TIMSS and the RME-based curriculum in the Netherlands raised the question whether Dutch students would be able to display their learning in this international comparative study. Would TIMSS do justice to nations with a different curriculum? Yet Dutch students performed well on TIMSS. Their achievement seems to advertise the qualities of the RME curriculum.

Introduction

As a regular participant in the IEA (International Association for the Evaluation of Educational Achievement) studies, the Netherlands joined the TIMSS-95 and TIMSS-99 projects (Third International Mathematics and Science Study). Apart from the international comparison at grade 8 level (approximately 14-year-old students), these evaluation studies were considered of special national interest because of the simultaneous curriculum reform at Dutch junior secondary schools. Besides national evaluation projects, TIMSS offered another useful instrument to investigate whether the new curriculum was yielding satisfactory results.

A dilemma was that the TIMSS achievement test did not fully match the new Dutch curriculum. The TIMSS test items seemed to reflect the old curriculum that had been abolished. Serious doubt arose whether Dutch students would be able to display their learning from the reformed curriculum through TIMSS. Yet Dutch students performed well in TIMSS. How can this be explained?

The RME core-curriculum in the Netherlands for junior secondary schools

Three decades ago Hans Freudenthal and his colleagues started to transform the mathematics curriculum with an approach that is generally known as “Realistic Mathematics Education” (RME). It is characterised by the understanding that mathematics is an integral part of real life. Another component is the importance of enabling students to form mental images (Freudenthal, 1973; de Lange, 1987; van den Heuvel-Panhuizen, 1996).

After RME-based reforms of the primary and senior secondary curricula in the 1970s and 1980s, in 1993 the “W12-16 project” was carried out to establish a core curriculum based on RME for all students at Dutch junior secondary schools. Contents like “sets” and “2-d transformation geometry” were abandoned. Formal word problems used as a disguise for algebraic equations were minimised. The new curriculum emphasised modelling and interpreting data (through tables, graphs and word formulas), visual 3-d geometry, approximation and rules of thumb, the use of calculators and computers, and other topics considered relevant to the daily life of the new generation of the 21st century (Kok, Meeder, Wijers & van Dormolen, 1992).

National assessment was adjusted to the new content approach. As for the question format, multiple-choice items do not match RME, because real life hardly ever offers four ready-made alternatives from which to choose. Generally, test items in the RME core curriculum describe an appealing daily-life situation (often with authentic photographs to enliven the imagination) followed by questions that integrate different mathematical content areas.

The example in Appendix 1 displays an exemplary RME test item for 14-year-old students on predicting students’ height from their parents’ heights (Kuiper, Bos & Plomp, 1997). It is an identifiable context, as most students at that age are still personally experiencing the biological growth process. The aspect of “horizontal mathematization” (Treffers, 1987) is clearly present, with the realistic content being modelled into a formula and reversibly interpreted. Several integrated mathematics topics can be discerned (substituting values into a formula with three variables, fractions, order of operations, conversion of cm to m, co-ordinates on a graph). Students’ concentration on this one topic of “height” is expected to be kept alive for approximately 15 minutes, which is short enough for students who are not interested in the topic (Dekker, 1993).
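The computational core of such an item can be sketched as follows. The formula, function name and all values below are purely hypothetical illustrations (the actual item is in Appendix 1, not reproduced here); the sketch only shows the kind of substitution and cm-to-m conversion involved:

```python
# Hypothetical stand-in for the Appendix 1 item: a linear formula with
# three variables predicting a child's height from the parents' heights.
# The formula and all numbers are invented for illustration only.
def predicted_height_cm(father_cm, mother_cm, correction_cm):
    return (father_cm + mother_cm) / 2 + correction_cm

h = predicted_height_cm(182, 168, 6)   # substituting values into the formula
print(h, "cm is", h / 100, "m")        # conversion of cm to m, as in the item
```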

The TIMSS items matching a heterogeneous set of curricula

The TIMSS achievement test is not primarily an instrument to evaluate the achievement of Dutch students under RME. TIMSS was developed at the TIMSS International Study Center to make an international comparison around the important question: what can we learn from other countries? Besides questionnaires, an achievement test was carefully constructed in a process that is well documented in the international TIMSS report (Garden & Orpwood, 1996). Practical aspects such as reliability of coding and automatic data collection had to be considered: with more than 3,000 students participating in each country, 70% of all items were of the multiple-choice format. To minimise coding discrepancies caused by national marking cultures, a strict coding scheme was developed and applied to the remaining open questions. As for the content, several experts in the field contributed in an attempt to cover the huge variety of national curricula. With 40 different nations participating, a cross-section of contents and levels had to be found that would be equally unfair to all. Of course the question was not whether the TIMSS items covered all national curricula, but some agreement was needed on whether the curricula covered the items.

Within TIMSS a Test Curriculum Matching Analysis was carried out as a special research component. In each participating country, curriculum experts were asked to review each item and assess whether its content was covered by the intended curriculum for the majority of the target population. The result of this analysis is presented in Table 1. For each nation participating in TIMSS-95 (testing grade 8) it shows what percentage of the TIMSS mathematics items the national experts considered appropriate to the intended curriculum. Apart from Greece, the intended curriculum of every country covers most or all of the test items.

Table 1.

National subsets of items addressing a nation's mathematics curriculum

(from CD-ROM TIMSS International Database, 1999)

Nation / % of TIMSS maths items addressing national curriculum (n=157) / Nation / % of TIMSS maths items addressing national curriculum (n=157)
Hungary / 100 / Singapore / 90
United States / 100 / Ireland / 89
Latvia (LSS) / 99 / Romania / 88
Israel / 98 / France / 86
Spain / 98 / Belgium (Fl) / 86
Germany / 96 / Kuwait / 86
Lithuania / 96 / Belgium (Fr) / 85
Australia / 95 / Denmark / 84
Japan / 94 / Switzerland / 83
Slovak Rep / 94 / Iceland / 83
Portugal / 94 / Colombia / 82
Slovenia / 93 / England / 81
Hong Kong / 92 / South Africa / 80
Norway / 92 / Sweden / 78
Korea / 92 / Russian Fed. / 78
Czech Rep / 92 / Scotland / 76
Iran, Isl. Rep / 92 / Cyprus / 76
Canada / 91 / Bulgaria / 74
Austria / 90 / Netherlands / 71
New Zealand / 90 / Greece / 46

The table would have shown a larger number of countries with the highest percentage if 7 items on “probability” (4.4% of the total) had been replaced by other topics. From the CD-ROM TIMSS International Database it can be deduced that more than 20 of the participating countries indicated that those items were not covered by their curriculum.

In Figure 1 the TIMSS mathematics items are displayed as a compact box which is fully contained within the intended curriculum of one nation (e.g. Hungary or the USA), and only partly within the intended curriculum of another (e.g. Bulgaria, the Netherlands). It is assumed that in all cases countries include other content in their national curriculum besides the TIMSS items.

Matching TIMSS with RME

In Table 1 the Netherlands lingers near the bottom, with 71% of the 157 TIMSS test items being more or less covered by the intended curriculum of the target population (grade 8). The complementary 29% of the items was considered too remote from the national curriculum. Kuiper, Bos and Plomp (1999) already analysed this problem; in their research (with a slightly different set of 150 TIMSS-95 mathematics items) they report that 69% of the items were reasonably well covered by the Dutch RME curriculum. Four years later, in 1999, when curriculum experts were again asked to assess the TIMSS mathematics items, the percentage was 71% (out of 155 items), showing that no real change had occurred in their judgement. The minimal discrepancy (69% versus 71%) can well be attributed to the differences between the assessed sets of items.

The ± 30% portion of items that was generally considered out of line with the national curriculum would have been more than twice as large if the experts had considered not only the content of the items but also the question format (multiple choice). The present discrepancy between the TIMSS items and the Dutch national intended curriculum can largely be ascribed to the RME reform, and also to the topic of “probability” not yet being included for this age group.

Figure 2 illustrates the dichotomy of the TIMSS items when considering the match with the RME curriculum.

Appendix 2 displays five exemplary TIMSS-95 items of which Dutch curriculum experts indicated that they were not in line with the RME curriculum. Their objections concerned items that were pseudo-realistic and clearly constructed, as in example N16, with a ludicrous girl giving away a portion of her marbles without knowing the initial amount. Two other examples have no obvious stimulus for the calculation other than the bare instruction (inserting a value into a formula in N13, or calculating the ratio of perimeter to length in P08). A division is presented as an operation to be carried out correctly without a pocket calculator and without interpretation of the answer (J14). An algebraic rule (R10) on real numbers lacks functional instigation. Also, in some cases the answer to an item can be reconstructed from the four given choices (the division in J14 can be reversed by a multiplication; four possible initial numbers of Jan's marbles are offered in N16).

Another aspect showing the difference from the RME ideology is that the items have no coherence. Each item asks for isolated knowledge, and students are expected to spend 2-3 minutes on each. Thus they hop with short concentration spans from one topic to another.

Dutch students’ performances on TIMSS

When TIMSS-95 was carried out, serious doubts arose whether TIMSS would do justice to the Dutch target population. A National Option Test for the grade 8 population was added, which included both TIMSS items and nationally constructed items based on RME (Kuiper, Bos & Plomp, 1997; Kuiper, Bos & Plomp, 2000). The analysis from this additional research established that the National Option Test and TIMSS constituted a single psychometric scale (students performing well on one test also tended to perform well on the other). Hence, although Dutch students were not fully prepared by their curriculum for the full set of TIMSS items, their abilities were well measured by TIMSS. In other words: the TIMSS items gave them enough room to display their abilities.

This analysis is supported by the fact that ± 70% of the TIMSS test items were content-wise covered by the RME curriculum, which should be a large enough portion. It was estimated that, when taught mathematics through real-life contexts and integrated topics, students would still be able to display their abilities on isolated questions, and that the multiple-choice format would not obstruct their performance.

Moreover, although it might not officially have been intended, Dutch students were somehow knowledgeable about the remote items and could attain reasonable scores on these items as well. It could be that teachers still followed the abandoned curriculum or mixed forthcoming content (e.g. “probability”) into their present teaching. Another reason could be that students acquired their knowledge outside the mathematics classroom, or that they simply attempted the unknown tasks with an open mind.

Overall, Dutch students performed well on the TIMSS mathematics achievement test in 1995. As for 1999, six years after the introduction of the new RME curriculum, the picture has not changed. To date, the international comparison is not yet available, but unweighted data from the national TIMSS centre show no significant change. With an average of 63% correct on all mathematics items in 1995, we are proud to present a slight, though not significant, increase to 65% in 1999. The new curriculum seems to have had a positive impact on performance on the TIMSS achievement test.

Do Dutch students show a better performance on the ± 70% portion of the TIMSS items that were considered to match their curriculum? Looking at the performances on the two complementary sets of items, there is no obvious difference from the overall performance (in 1995, on average 63% of Dutch students answered any given item correctly). The students performed just as proficiently on the RME-matching items as on the set of items NOT covered by the curriculum. Table 2 summarizes this for 1995 and 1999.

In 1995 there is no discontinuity at all between the scores, while in 1999 there is a small (though not significant) gap between achievements on items that match and do not match the curriculum. In 1999 more students (68%) perform well on the test items that match the RME curriculum and fewer students (57%) perform well on the items that are remote from the RME curriculum.

Table 2.

Average percentage of correct scores by Dutch students in TIMSS-95 and TIMSS-99

Item set / TIMSS 1995: % (S.D.) correct, n=1921 / TIMSS 1999: % (S.D.) correct, n=2878
All TIMSS items (n=150 in 1995, n=155 in 1999) / 63 (20) / 65 (19)
Non-RME items (± 30% of all) / 63 (20) / 57 (19)
RME-matching items (± 70% of all) / 63 (20) / 68 (18)
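As a rough consistency check on Table 2, the 1999 overall score should approximately equal the weighted average of the two complementary subsets, assuming the experts’ split of roughly 70% RME-matching and 30% non-RME items (the exact subset sizes are not stated in this paper):

```python
# Weighted average of the two complementary item subsets in 1999,
# assuming an approximate 70% / 30% split of the items.
rme_share, non_rme_share = 0.70, 0.30
rme_score, non_rme_score = 68, 57        # Table 2, TIMSS 1999 column
overall = rme_share * rme_score + non_rme_share * non_rme_score
print(round(overall, 1))                 # 64.7, close to the reported 65
```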

An explanation of the above figures has yet to be established. Why would students in 1999 perform slightly better on items that match their curriculum, while in 1995 they showed no different performance on this particular set of items?

Delay in implementation

A reason for the slight discrepancy in the columns of Table 2 could be that time was needed for the implementation of the new curriculum. 1993 was the year of introduction. Thus, in 1995, two years after the introduction of the RME curriculum, there was still a period of transition from the pre-RME curriculum to the RME curriculum. Four years later, in 1999, the implementers of the new curriculum could have shifted their focus further towards the reform side.

Teachers are important carriers of curriculum change. To investigate indicators of mathematics teachers’ familiarity with the new intended curriculum, in 1995 and 1999 the mathematics teachers of the TIMSS-tested grade 8 classes were asked to skim through the item set and assess whether they had taught the content of each item (“if you were to set a test for this class, would this item be suitable for inclusion?”).

In 1995 this was examined on only 16 items, selected by curriculum experts with the criterion of closely matching the RME curriculum (Kuiper, Bos & Plomp, 1999). These 16 items were a subset of the ± 70% portion of items that matched the curriculum less closely but were not too remote, indicated as “RME-matching items” in this paper. The three categories of test items, as judged by the curriculum experts, are illustrated in Figure 3, with “closely matching” being a subset of “matching”, while “matching” and “not matching” are complementary.

In 1999 not just the 16 closely matching items but all TIMSS mathematics items were submitted to the teachers for judgement. Table 3 shows the results. A large share of teachers (on average 83%) in 1999 indicated that they would include the items in a test, giving an indication of whether they had addressed the content of the TIMSS items in their lessons. Comparing the data over time, in 1999 a larger number of teachers recognised the 16 typical RME items as fitting their lessons (87% in 1995 and 92% in 1999). It must be remarked, however, that no statistically significant differences can be observed.

Table 3.

Average percentage of teachers indicating that an item was covered in their lessons

Item set / TIMSS 1995: % (S.D.) covered, n=91 / TIMSS 1999: % (S.D.) covered, n=112
All TIMSS items (n=157 in 1995, n=155 in 1999) / n.a. / 83 (15)
Non-RME items (± 30% of all) / n.a. / 72 (18)
RME-matching items (± 70% of all) / n.a. / 87 (12)
16 selected items well matching RME / 87 (20) / 92 (19)
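The same kind of consistency check applies to the 1999 column of Table 3: the overall coverage reported by teachers should approximately equal the weighted average of the two complementary subsets, again assuming the approximate 70% / 30% split (not an exact figure from this paper):

```python
# Teacher-reported coverage in 1999, weighted over the two complementary
# item subsets, assuming an approximate 70% / 30% split of the items.
covered_rme, covered_non_rme = 87, 72    # Table 3, TIMSS 1999 column
overall_covered = 0.70 * covered_rme + 0.30 * covered_non_rme
print(round(overall_covered, 1))         # 82.5, close to the reported 83
```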

Table 3 clearly reveals that the teachers show a preference for the set of RME-matching items (87% for the RME-matching items and 72% for the non-RME-matching items). The closer the items match the RME curriculum, the more teachers indicate that they would include them in a test. On average 92% of all teachers indicated that they would include each of the 16 specially selected, closely RME-matching items in a test. This inclination shows that RME has been successful in entering Dutch classrooms. The figures suggest that this inclination has grown between 1995 and 1999. This could explain the small differences that occurred in students’ performances on the different sets of items between 1995 and 1999 (cf. Table 2). The items that do not match the RME curriculum receive less attention from the teachers, and consequently their students’ performance is lower. On the other hand, with teachers giving more attention to the RME-matching items, the students show a remarkable improvement in their performance on these items (63% in 1995, 68% in 1999). It is especially on these items that progress was made between 1995 and 1999, which raised the overall achievement on TIMSS.

Conclusion

The new RME-based curriculum for junior secondary schools has found a firm foothold in Dutch schools. Six years after its introduction in 1993, almost all mathematics teachers indicate that test items matching the RME curriculum could be included in tests for their classes.

This new curriculum differs in its content approach from many curricula in other countries. While the TIMSS mathematics items are well covered by the intended curricula of many countries, curriculum experts in the Netherlands estimated that only 70% of the TIMSS items matched the RME curriculum (content-wise, not format-wise), while the entire test appeared unfamiliar with its short questions asking for isolated knowledge and skills. Thus there was obvious ground for doubt whether Dutch students would be able to display their abilities in an international comparative study like TIMSS, as they might fail on ± 30% of the items.