Curriculum Evaluation : A Case Study

Hugo Dufort, Esma Aïmeur, Claude Frasson, Michel Lalonde

Université de Montréal

Département d'informatique et de recherche opérationnelle

2920, chemin de la Tour, Montréal, Québec, Canada H3C 3J7

{dufort, aimeur, frasson, lalonde}@iro.umontreal.ca

Abstract. In the field of Intelligent Tutoring Systems (ITS), the organisation of the knowledge to be taught (the curriculum) plays an important role. Educational theories have been used to organise this information, and tools have been developed to support it. These tools are very useful but not sufficient for editing a large curriculum. We need rules to help prevent incoherence, and a guideline for determining such rules. In this paper, we report on two experiments. The first seeks to determine rules which we then use to improve an existing curriculum. The second uses these rules during the construction of a new curriculum in order to prevent initial mistakes.

1. Introduction

Surprisingly, it is difficult to find in the ITS (Intelligent Tutoring Systems) literature in-depth discussions about the success or failure of ITS. It is even more difficult to find the reasons (or the criteria) behind their outcomes [8],[5]. Although some studies have examined the use of ITS in a classroom [1] or in laboratories [7], the motto in the literature seems to be: develop first, evaluate later.

Nevertheless, as most software engineering books show, it has long been established that the cost of correcting an error committed during software development increases with time [15]. Keeping that in view, it would pay (or at least it would be logical) to integrate software quality control techniques into the development cycle of ITS. This problem has been raised before: in the knowledge transfer process, it is possible to use automated evaluation techniques to make sure that the computer scientist does not diverge from what the pedagogue stated [10]. But before going any further, we need to ask ourselves: what are we searching for, exactly? Is it possible to define what we mean by "quality in ITS"?

In this article, while focusing on the curriculum aspect of an ITS, we define three axes along which a curriculum can be evaluated. We then pinpoint particular properties and restrictions of our sample curriculum that, when defined formally, are useful for its evaluation. Using the results of the evaluation, we propose a construction methodology that permits better quality control, and we validate it by building a new curriculum from scratch.

2. A Generic Framework for Curriculum Quality Evaluation

In our search for a generic framework for curriculum quality evaluation, we faced a difficult question: are there aspects that are present, and important, in any curriculum model? There are almost as many curriculum models as there are ITS, and each uses a different instructional theory [11] for knowledge representation. We chose to deal with this issue indirectly: instead of imposing strict guidelines, we defined the quality of a curriculum as conformity to the developer's original intentions. Even with this broad definition, though, we still need to classify these intentions.

In linguistics, and in natural language processing [14], it is common to analyse a text along three axes: syntactic, semantic and pragmatic; in our opinion, the same axes can be used when analysing a curriculum. We classify the developer's initial intentions along three axes (figure 1a): the teaching goals, the respect of a pedagogical model, and the data structure recognised by the ITS. Quality is measured in terms of conformity to each of these axes.

Figure 1. Three aspects of curriculum quality

The three axes (or aspects) are defined as follows:

  • Data structure: this is the syntactic aspect of a curriculum, which concerns strictly the internal and external representations of the data. Failing to conform to the data structure definition (in terms of data definitions, for instance) can affect coupling with other modules of the ITS. The initial definition will have an important influence on the implementation of the pedagogical model.
  • Pedagogical model: this is the semantic aspect of a curriculum. Constraints on data elements, and on their relations are defined at this level, according to a model such as Bloom's taxonomy of objectives [2] and Gagné's learning levels [3].
  • Teaching goals: this is the pragmatic aspect of a curriculum. This aspect concerns mainly the contents of the curriculum, hence the material to be taught. We leave the evaluation of teaching goals to researchers in the knowledge acquisition domain [4].

Obviously, errors in the data structure will affect the implementation of the pedagogical model, and similarly errors in the implementation of the pedagogical model will influence the organisation of the material to be taught. The three axes can also be seen as three levels of a hierarchy (figure 1b). The curriculum we have developed had the following three axes of intentions:

  • Data structure: the model CREAM-O [13] (see section 3).
  • Pedagogical model: Bloom's taxonomy of objectives (see section 4).
  • Teaching goals: an introduction to the MS Excel spreadsheet (see section 5).

3. Curriculum Model

The curriculum in an ITS is a structured representation of the material to be taught. Today, the curriculum is often seen as a dynamic structure that should adapt itself to student needs, subject matter and pedagogical goals [9]. The CREAM model structures this material into three networks: the capabilities network, the objectives network and the didactic resources network.

For each of the networks, the curriculum editor leaves the choice of the pedagogical model to the developer. This flexible tool proposes Bloom's taxonomy as the default choice, but other taxonomies can be added. In this article, we pay particular attention to the objectives network. Each objective can belong to any of Bloom's six cognitive categories. More specifically, an objective describes a set of behaviours that a learner must be able to demonstrate after a learning session [13].

The CREAM model proposes several categories of links between objectives: compulsory prerequisite, desirable prerequisite, pretext (weak relation) and constituting. In order to facilitate the analysis, we decided to simplify the theory and keep a general prerequisite relation (O1 is-prerequisite-to O2 if O1 must be completed before one can hope to complete O2) and a constituting relation (O1 is-composed-of O2, O3, … if they are its sub-objectives), giving the latter a higher importance.
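As an illustration, the two retained relations can be stored as sets of directed edges over objective identifiers. The following is a minimal sketch in Python; the class, method and objective names are ours, not part of the CREAM tools:

```python
from collections import defaultdict

class ObjectiveNetwork:
    """Minimal sketch of an objectives network restricted to the two
    relations kept for the analysis (identifiers are illustrative)."""

    def __init__(self):
        # (o1, o2) pairs meaning: o1 is-prerequisite-to o2
        self.prerequisite = set()
        # parent -> set of sub-objectives (parent is-composed-of them)
        self.composed_of = defaultdict(set)

    def add_prerequisite(self, o1, o2):
        """o1 must be completed before one can hope to complete o2."""
        self.prerequisite.add((o1, o2))

    def add_components(self, parent, *subs):
        """parent is-composed-of subs (the stronger relation)."""
        self.composed_of[parent].update(subs)

net = ObjectiveNetwork()
net.add_components("chart", "axis", "legend")  # chart is-composed-of axis, legend
net.add_prerequisite("cell", "formula")        # cell is-prerequisite-to formula
```

Giving the composition relation its own mapping (rather than mixing both edge kinds in one set) reflects the higher importance the analysis grants it.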

CREAM does not propose a method for the construction nor any restrictions on the structure of the objective network. Such freedom is given to the developer in order to make the model flexible. The experiment presented in section 5 is an exploration of the effects of this freedom on the development of a curriculum and the consequences on the quality of the curriculum thus obtained.

4. Bloom’s Taxonomy of Objectives

Bloom [2] sought to construct a taxonomy of the cognitive domain. He had multiple pedagogical goals: identifying behaviours that enhance teaching, helping pedagogues determine objectives, and analysing situations in which cognitive categories are used. He separates the cognitive domain into six broad categories organised in a hierarchy (acquisition, comprehension, application, analysis, synthesis, evaluation). The objectives belonging to a category are based on, and can incorporate, the behaviours of the previous categories. We will show later in this article that this detail has a great impact when one constructs a network of objectives.
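Because each category builds on the previous ones, the six categories form an ordered scale. A minimal encoding in Python (the numeric ranks are our own convention, introduced only to make categories comparable; they are not part of Bloom's theory):

```python
# Bloom's six cognitive categories as an ordered scale, 1 = lowest.
# The numeric ranks are our own convention.
BLOOM = {
    "acquisition": 1,
    "comprehension": 2,
    "application": 3,
    "analysis": 4,
    "synthesis": 5,
    "evaluation": 6,
}

def builds_on(higher, lower):
    """True if an objective of category `higher` may incorporate
    behaviours of category `lower` (the same or a previous category)."""
    return BLOOM[higher] >= BLOOM[lower]
```

This ordering is what the semantic restrictions of section 5.3 exploit.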

For each category of objectives, Bloom’s theory also describes a list of sub-categories that permit an even more precise classification. Even though this part of the theory is defined in the curriculum model we used, we have omitted it from the analysis. In this article we will content ourselves with examining the base categories.

5. Evaluation of a Curriculum

In this section, we present a curriculum based on the CREAM model which has been developed between February and May 1997. Systematic methods are then presented in order to evaluate the quality of the objectives network and to correct the detected errors. A fragment of the curriculum thus obtained is also presented.

5.1. Development Framework

In order to discover the most likely errors during the development of a curriculum, we worked with two high-school teachers to construct a course for teaching the Microsoft Excel spreadsheet. We chose Excel because it is a well-known tool, it is often used by university students, and it is starting to be taught at the high-school level. For four months the teachers used the curriculum editor to construct a capability network, an objective network and a pedagogical network that relates capabilities to objectives. We used the objective network for our analysis since it was the most complete at the time of this study.

The curriculum produced by the teachers was not complete at the end of the given time period; it covered only certain aspects of the material. Some objectives were cut from the curriculum since they had not yet been linked to the other objectives (about 11% of the total objective network). This rate was higher in the other two networks, which impeded the analysis. We regrouped the objectives into six important aspects (base tools, graphics, edition tools, format/attribute, formula and database tools) for reasons that we will clarify in section 5.2. Except for the edition tools aspect, the completion level was judged acceptable.

Figure 2 shows a fragment of the curriculum before the first step of the corrections. This fragment contains eight objectives, some of the application category and others of the acquisition category. We noticed that there are no is-composed-of relations in this fragment: in its original form the curriculum contained mostly prerequisite links.

Figure 2. Fragment of the initial curriculum

This fragment contains numerous errors that are not easy to identify at first glance. It is easy to get confused by the high number of nodes and links, so the only way to find the errors is to partially rebuild the fragment. In doing so, it is possible to make as many errors as the first time, though not necessarily in the same places; this could cause even more incoherence. So, if we want to see things more clearly, we need new tools or new rules. In the following sections, we will show some principles which permit the systematic detection of errors, the evaluation of the quality of the work and, therefore, the validation of this curriculum.

5.2. Entry Points

The structure of the objective network seems to suggest that any material must be taught in a hierarchical manner, since some objectives are at the peak of the composition relationships. What are the characteristics of such a final objective? Intuitively we can describe a final objective as an objective which:

  • Is not a component of another objective (since in this case the objective which is its superior will be seen later in the learning session and therefore closer to the peak of the multi-graph)
  • Is not a prerequisite to another objective (since in this case it would be assimilated before the objective for which it is a prerequisite)

If we examine a curriculum from the data structure point of view, it appears to be a directed multigraph. In this graph, the final objectives are points where one can begin an exploration; we have therefore called them “entry points”. In order to analyse this characteristic more formally, we use first order logic to define the following rule:

∀x : (entryPoint(x) ↔ ¬∃y : (is-prerequisite-to(x,y) ∨ is-composed-of(y,x)))    (1)
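Rule (1) can be checked mechanically. The sketch below is our own reading of the rule, with hypothetical objective names: it keeps every objective that no relation makes a prerequisite of, or a component of, another objective.

```python
def entry_points(objectives, prerequisite, composed_of):
    """Apply rule (1).
    prerequisite: set of (x, y) pairs with is-prerequisite-to(x, y);
    composed_of:  set of (y, x) pairs with is-composed-of(y, x),
                  i.e. x is a sub-objective of y."""
    excluded = {x for (x, _) in prerequisite}   # x precedes some other objective
    excluded |= {x for (_, x) in composed_of}   # x is a component of some objective
    return set(objectives) - excluded

# Hypothetical fragment: masked and locked cannot be entry points here.
objs = {"attribute", "cell", "masked", "locked"}
prereq = {("masked", "cell")}        # masked is-prerequisite-to cell
parts = {("attribute", "locked")}    # attribute is-composed-of locked
print(sorted(entry_points(objs, prereq, parts)))  # ['attribute', 'cell']
```

Each candidate returned by such a check must then still be reviewed by the developer, since the rule identifies structural entry points, not pedagogically valid final objectives.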

The identification of the entry points in the initial curriculum allowed us to notice that the lack of formal development methods translated directly into a lack of coherence in the structure. We must ask of each entry point whether it is really a final objective. For example, in Figure 2, the objectives masked and locked are identified by rule (1) as entry points but should not be; conversely, the objective attribute [ac] should probably be an entry point, but a prerequisite link prevents it from being one. Of course this process is somewhat subjective, but it is a guide that makes the developers ask themselves questions. The developers must constantly check whether the structure being constructed corresponds to what they are trying to express.

The objectives retained as entry points in the curriculum are: Base tools, Graphics, Edition tools, Format/Attribute, Formula, Database tools. Table 1 shows the extent of the changes made to the curriculum after taking the entry points into account.

Table 1. Changes in the first phase.

Type of change               | Quantity | Ratio of change (on 90)
Relation changed destination |    2     |  2.2%
Relation changed type        |    9     | 10.0%
Relation was inverted        |    0     |  0.0%
Relation was added           |   17     | 18.9%
Relation was removed         |    2     |  2.2%
Total                        |   30     | 33.3%

The addition of a relationship was the most frequent change. This may be surprising at first glance, but we discovered that curriculum incoherence is usually caused by missing relationships. Of the 17 added relationships, 14 were of the is-composed-of type. This is because, when the structure of the curriculum lacks coherence, it is easier to trace the prerequisite links than the composition links (which shows a lack of global vision).

Often a relationship existed but was not of the right type. It is not always easy to determine whether a relationship is of the composition or the prerequisite type; a difficult introspection is often needed. The introduction of entry points makes this choice easier, but the difference remains quite subtle. The other changes (changed destination and removed) target more serious errors, often due to a lack of attention. For example, a prerequisite link may have been drawn in a way that short-circuits a composition link.

5.3. Semantic Restrictions

As we have seen previously, the objectives in Bloom’s taxonomy are organised in six levels, and the competence specific to each level can require competence from the previous levels. One may ask the following question: does this fact influence the structure of the curriculum? We answer affirmatively. Let us observe the two types of relationships present in the objective network:

  • is-composed-of: let O1, …, On be sub-objectives of O in a composition relationship. These objectives should belong to a category lower than or equal to O’s, since “the objectives belonging to a category are based on, and can incorporate, the behaviours of the previous categories” [2]. We believe it is not desirable that a sub-objective Oi belong to a category higher than O’s. For example, the relationship is-composed-of(cell [ap], width [ac]) is not desirable from the point of view of Bloom’s taxonomy.
  • is-prerequisite-to: here it is harder to determine the categories of the objectives in question. In general, an objective O1 prerequisite to O2 should be of a lower category than O2; however, this cannot be affirmed on the basis of Bloom’s definitions alone. For example, in specific cases we might find synthesis exercises that are necessary for the comprehension of another objective.

The following rules related to the semantic aspects of the curriculum were obtained:

∀(x,y) : (isComposedOf(x,y) → category(x) ≥ category(y))    (2)
∀(y,x) : (isPrerequisiteTo(y,x) → category(x) ≥ category(y))    (3)
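Both rules lend themselves to an automated check. The sketch below uses our own naming conventions and hypothetical data, with Bloom categories encoded as ranks (1 = acquisition … 6 = evaluation); note that a rule (3) hit is a warning to review, not always an error:

```python
def semantic_violations(category, composed_of, prerequisite):
    """Return the relations that break rules (2) and (3).
    category:     objective -> Bloom rank (1..6);
    composed_of:  (x, y) pairs with is-composed-of(x, y);
    prerequisite: (y, x) pairs with is-prerequisite-to(y, x)."""
    bad = []
    for (x, y) in composed_of:
        if category[x] < category[y]:   # rule (2): sub-objective above its parent
            bad.append(("is-composed-of", x, y))
    for (y, x) in prerequisite:
        if category[x] < category[y]:   # rule (3): to be reviewed by the developer
            bad.append(("is-prerequisite-to", y, x))
    return bad

# Hypothetical data: a synthesis objective placed under an acquisition one.
cats = {"chart": 1, "report": 5}
print(semantic_violations(cats, {("chart", "report")}, set()))
# [('is-composed-of', 'chart', 'report')]
```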

It is important to keep in mind that the second restriction (rule 3) need only generally be respected. Table 2 shows the modifications made to the curriculum by the application of these rules.

Table 2. Changes in the second phase
Type of change               | Quantity | Ratio of change (on 105)
Relation changed destination |    2     | 1.9%
Relation changed type        |    0     | 0.0%
Relation was inverted        |    4     | 3.8%
Relation was added           |    0     | 0.0%
Relation was removed         |    2     | 1.9%
Total                        |   10     | 9.5%

At this point in the validation process, it is important to be cautious so that the modifications made do not invalidate those made during the first phase. Although the changes are less extensive here, they still affect nearly 10% of the links in the curriculum. The most frequent modification was the inversion of a link. In every case, the change concerned an objective of the application category which had been placed as prerequisite to an objective of the acquisition category. Most of these errors were detected during the prior step.

Adding up the corrections made to the curriculum during both phases, we obtain 42.8%, which is much higher than we expected (we anticipated an error rate of approximately 20%). Figure 3 shows the fragment of the curriculum after the two phases.

Figure 3. Curriculum fragment after the corrections.

In this curriculum fragment the main changes were additions: four is-composed-of relationships were added. These changes were justified by respecting rule (1) concerning the entry points and by the desire to complete the attribute definition. The two attribute objectives had their relationships changed in order to respect rule (3), and in order to make attribute [ap] an entry point.

6. Lessons Learned

In order to understand why so many errors were found in the developed curriculum, we studied the method used by the high-school teachers. Several comments on the curriculum editor suggest that a method should be proposed, or even imposed, on the developer: "The use of these tools tends to segment the teaching material to the point where one loses sight of the global picture" [6].

6.1. The High-School Teachers' Method

It seems that most of the errors were due to the fact that the tool allows curriculum developers to work in an unstructured way. After analysing the protocol used [12], we determined that the teachers tended to use a bottom-up approach [4]. In order to build the curriculum, they enumerated the pertinent objectives, identified their types and regrouped them, traced the relationships, and repeated the process.

This method, which we call content-driven, favours the exhaustive identification of the objectives (it covers more of the subject material). It could be useful for building a general course on a specific domain without knowing the characteristics of the learners. One of the biggest drawbacks of this approach is that some elements are defined and then discarded because they end up connected to nothing.

With this approach, there is a loss of both global vision and coherence (and therefore of global quality). The introduction of entry points partially addresses this lack. The semantic restrictions permit the detection of errors that are less damaging but still important, since they constitute 25% of the total number.