Classifying questions for sampling in a task-given-type examination

Toshihiko Takeuchi* and Akiyuki Sakuma*

Department of Industrial and Systems Engineering, College of Science and Engineering, Aoyama Gakuin University

Toshihiko Takeuchi

Postal address: Daizawa 4-45-15, Setagaya-ku, Tokyo, Japan

E-mail:

Phone & fax number: tel 81-3-3419-1248, fax 81-3-3419-1248

Akiyuki Sakuma

Postal address: Sasazuka 1-62-3-1202, Shibuya-ku, Tokyo, Japan

E-mail:

Phone & fax number: tel 81-3-5371-7085

Abstract To promote efficient science and technology education, we propose a task-given-type examination (TGTE). In this system, questions are given to the learners before the examination and questions in the examination are chosen from them.

One issue related to the TGTE is the method of choosing questions. As evaluation criteria for sampling questions, we propose simplicity, predictability, and proportionality between effort and achievement. Classifying questions satisfies simplicity and predictability. For cases in which the amount of effort needed for each question differs, and cases in which the knowledge elements needed for each question are related, we propose a method of classifying questions that also satisfies proportionality.

In a case study in fluid mechanics, 182 questions could be divided into 16 classes such that the additional amount of effort required to answer all the questions in a class was nearly equal across classes, which indicates that the method is effective.

Keywords: task-given-type examination, TGTE, science and technology education, classifying questions, fluid mechanics

1. Introduction

1.1 Background

We propose a task-given-type examination (TGTE). This is an effective method to help improve achievement since it encourages learners to study by themselves.

A TGTE is carried out as follows: A group of questions related to the material being studied is given to the learners a sufficiently long time before the examination. Then, the questions in the examination are chosen from the announced group. For example, for a term examination in college, 100 questions, hereinafter called candidates (for questions of the examination), are given to the learners on paper or on a Web site. Then, 20 of the 100 questions are randomly chosen and included in the term examination.

Giving the questions in advance enables many people to check the questions in a TGTE. Therefore, we can expect an improvement in the quality and objectivity of the questions in an examination and the exclusion of tricky or excessively difficult questions. In addition, there is no need to worry about questions leaking, which helps to improve morals. Furthermore, achievement in a TGTE depends less on chance than achievement in a usual examination. Thus, earnest efforts are rewarded in a TGTE and this encourages learners to study by themselves.

Examples of carrying out a TGTE have not been reported, but Takeuchi et al.(1)(2) presented theoretical studies of a system for a TGTE.

1.2 Previous studies related to sampling questions

One issue in preparing a TGTE is how to sample questions for the examination adequately from the candidates, and this paper discusses methods for doing so. There are previous studies on sampling adequate questions from a large number of questions. For example, selecting questions using item response theory,(3) a trial-and-error individual study aid system using fuzzy theory,(4) sampling using fuzzy theory,(5) and a study using Bayesian estimation to select questions(6) have been reported. In (3) and (4), for learners who have not studied enough, the system estimates an adequate question that they should review and then gives them that question. In (5), the system sets the most adequate question according to the study record of the learners. In (6), the system uses a Bayesian estimation engine to select the question that maximizes mutual information. These studies address different settings: an examination after a lesson in (3) and (4), each learning session in daily individual lessons in (5), and individual questions in (6). All of these studies depend strongly on data on whether learners answered questions correctly or incorrectly. However, since the learners can know the answers before a TGTE, understanding and the rate of correct answers are not strongly related. Therefore, correctness data are inadequate as a basis for sampling in a TGTE.

A study on sampling questions for review(7) also exists. This study, however, assumes that data on each learner's mastery of each question are available. These data cannot be used in a TGTE either.

1.3 Purpose of this study

The purpose of this study is as follows. On sampling questions for a TGTE, we define three criteria of evaluation called simplicity, predictability, and proportionality between effort and achievement. As a method satisfying them, we propose the idea of "classifying questions" and apply it to real questions in order to verify the effectiveness of the method.

2. Sampling criteria and an outline of classifying questions

In this section, we will state criteria that should hold when questions for a TGTE are sampled from candidates and outline a method to divide questions into classes.

2.1 Conditions that should hold in sampling questions

When we sample questions for a TGTE, we need to sample them adequately so as not to discourage learners from studying by themselves. Thus, we propose the following three criteria in this study:

(Evaluation criterion 1) Simplicity: If the system for the examination is too complex, it will be difficult for learners to devise a study strategy and they will be discouraged from studying by themselves. In addition, confusion will occur when we carry out and mark the examination, so it will be difficult to manage smoothly.

(Evaluation criterion 2) Predictability: Predictability is the accuracy with which a learner can predict his/her achievement. Concretely, we measure it by the standard deviation of the learner's achievement over many hypothetical sittings of the examination. If this value is small, learners can predict their achievement with high accuracy, so we consider that they are encouraged to study.

(Evaluation criterion 3) Proportionality between effort and achievement: We consider that, if achievement and effort of study are proportional, then achievement will increase with the degree of study, so learners are encouraged to study by themselves.

Criteria other than the above may also be needed for sampling adequate questions. For example, from the structure among the pieces of knowledge to study, Akahori calculated subordination, basics, and relation. He then proposed a method to calculate the difficulty and usefulness of each element to study.(8)

In this study, however, we consider only the three criteria above, for the following reason. In the simplest TGTE we can conceive of, the method of setting questions that we regard as most adequate satisfies all three criteria. We therefore regard them as more essential than other conditions and as necessary to consider in more complex cases as well.

We will illustrate this reasoning with a concrete example: a test of 100 irregular verbs. Before this test, a teacher presents 100 questions on irregular verbs, and on the date of the examination all 100 questions are set. By regarding each irregular verb as a knowledge element, there are 100 knowledge elements in the test. We can consider that the knowledge needed to correctly answer each question is independent of that required to answer other questions and that the effort needed to study each question is approximately the same. Therefore, this is the simplest TGTE.

Now, in this test, if we need not consider the duration of the test or the load of answering and marking it, it is best to set all the questions. The reasons are as follows. [1] The method of selecting questions is simple and leaves no room for randomness. [2] Learners can perfectly predict their achievement before the test. [3] It is clear to learners that achievement and time spent on study are proportional. These features correspond to simplicity, predictability, and proportionality between effort and achievement.

Therefore, in this study, we consider only the above three criteria.

2.2 Definition of classifying questions and its advantage

Classifying questions is the following method. Of N announced candidates for questions of the examination, n (≦ N) are to be set. The candidates are divided into r (≦ n) disjoint classes, and learners are informed that n_i questions from the i-th class will be set, where

$$\sum_{i=1}^{r} n_i = n.$$

That is, the sum of questions selected from every class is n. An example of classifying questions is as follows. If there are 80 candidates and 10 of them are set, the candidates will be divided into 10 classes of eight questions, and one question from every class will be set.
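The procedure can be sketched in a few lines of Python. This is only an illustration of the definition, not an implementation from the study. The function names `classify` and `set_questions` are hypothetical, and classes are assumed to be formed from consecutive candidates.

```python
import random

def classify(candidates, class_sizes):
    """Split the candidate list into consecutive disjoint classes."""
    if sum(class_sizes) != len(candidates):
        raise ValueError("class sizes must add up to the number of candidates")
    classes, start = [], 0
    for size in class_sizes:
        classes.append(candidates[start:start + size])
        start += size
    return classes

def set_questions(classes, per_class, rng):
    """Draw per_class[i] questions without replacement from the i-th class."""
    chosen = []
    for members, n_i in zip(classes, per_class):
        chosen.extend(rng.sample(members, n_i))
    return chosen

candidates = [f"Q{k}" for k in range(1, 81)]   # N = 80 candidates
classes = classify(candidates, [8] * 10)       # r = 10 classes of eight questions
per_class = [1] * 10                           # n_i = 1 for every class, so n = 10
exam = set_questions(classes, per_class, random.Random(0))
```

Because each class contributes a fixed, announced number of questions, a learner knows in advance how many questions will come from each part of the candidate list.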

Learners can easily understand classifying questions because the method of setting questions is simple, so it satisfies simplicity. In addition, it is certain that one or several questions from each class will be set, so predictability is strengthened.

For example, among 100 candidates Q1 - Q100, suppose that a learner has studied the 50 questions Q1 - Q50, can answer every studied question without fail, and cannot answer any of the others. Let the perfect score be 100, with marks allotted equally. If 20 questions are set at random from the 100 candidates without classification, the mean (expectation) of his/her achievement is 50 and the standard deviation is 100/√99 ≒ 10.05; both follow from the formulae for sampling without replacement. Classifying questions, however, lowers the standard deviation without lowering the mean. For example, assume that the 100 candidates are divided into 20 classes of five consecutive questions and that the learner is informed that one question from every class will be set. Then his/her achievement is always 50, and the standard deviation is zero. In general, classifying questions tends to lower the standard deviation for those who study in order. In other words, it strengthens predictability.
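The two standard deviations quoted above can be checked numerically. The sketch below assumes the standard hypergeometric variance for sampling without replacement and treats each class as an independent draw. The function names are hypothetical.

```python
from math import sqrt

def sd_unclassified(N, K, n, full_score=100.0):
    """SD of the score when n of N candidates are drawn at random and the
    learner can answer exactly K of them (hypergeometric count of hits)."""
    p = K / N
    var_count = n * p * (1 - p) * (N - n) / (N - 1)
    return (full_score / n) * sqrt(var_count)

def sd_classified(known_per_class, class_size, full_score=100.0):
    """SD when one question is drawn from each class; each class is an
    independent Bernoulli trial with success rate known / class_size."""
    n = len(known_per_class)
    var_count = sum((k / class_size) * (1 - k / class_size)
                    for k in known_per_class)
    return (full_score / n) * sqrt(var_count)

# Learner who studied Q1 - Q50 only.
sd_random = sd_unclassified(N=100, K=50, n=20)                   # 100/sqrt(99)
sd_in_order = sd_classified([5] * 10 + [0] * 10, class_size=5)   # 0.0
```

For the learner who studied in order, every class is either fully known or fully unknown, so each draw is deterministic and the variance vanishes.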

Note that, for a learner who can answer the same proportion of questions in every class, classification weakens predictability, as the formula for sampling without replacement shows. However, this is not important for our argument.
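As a worked check of this remark, assume (hypothetically) a learner who can answer exactly two of the five questions in every class of the example above, so that p = 0.4, N = 100, n = 20, and each question carries 5 marks. The unclassified case uses the hypergeometric variance; the classified case is a sum of 20 independent Bernoulli variances.

```latex
\begin{align*}
\sigma_{\text{unclassified}} &= 5\sqrt{n\,p(1-p)\,\frac{N-n}{N-1}}
  = 5\sqrt{20 \cdot 0.4 \cdot 0.6 \cdot \tfrac{80}{99}} \approx 9.85,\\
\sigma_{\text{classified}} &= 5\sqrt{n\,p(1-p)}
  = 5\sqrt{20 \cdot 0.4 \cdot 0.6} \approx 10.95 .
\end{align*}
```

The mean is 40 in both cases. Classification removes the finite-population factor (N-n)/(N-1), which is why the spread grows for such a learner.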

2.3 Proportionality between effort and achievement

Classifying questions does not always satisfy proportionality between effort and achievement. For example, consider cases satisfying the conditions in 2.2 and the following two conditions:

(1)  Elements needed to correctly answer questions Q1 - Q100 are mutually independent with respect to each question.

(2)  The effort needed to study to correctly answer questions Q1 - Q100 can be measured as, for example, the unit time of study, and the effort for each is equal to the question number.

Then, to study all questions of Class 1 (Q1 - Q5) requires only 15 units of effort, while Class 20 (Q96 - Q100) requires 490 units of effort. Therefore, to those who study Classes 1 - 20 in order, effort and achievement are not proportional. So classifying questions does not always satisfy proportionality between effort and achievement.
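The figures in this example can be reproduced with a short calculation. The sketch assumes, as in condition (2), that question Qk costs k units of effort. The variable names are illustrative.

```python
# Assumption from condition (2): studying question Qk costs k units of effort.
effort = {k: k for k in range(1, 101)}

# 20 classes of five consecutive questions, as in Section 2.2.
classes = [range(start, start + 5) for start in range(1, 101, 5)]
class_effort = [sum(effort[k] for k in members) for members in classes]

total = sum(class_effort)

# Cumulative effort after finishing classes 1..c, for a learner studying in order.
cumulative = []
running = 0
for e in class_effort:
    running += e
    cumulative.append(running)

# Share of total effort spent when half of the marks (10 of 20 classes) are secured.
share_at_half_marks = cumulative[9] / total
```

Under these assumptions a learner who has secured half of the marks has spent only about a quarter of the total effort (1275 of 5050 units), which shows how far the scheme departs from proportionality.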

If the relation among the elements of study is simple, it is easy to classify questions so that proportionality between effort and achievement holds. For example, in the case above, the study needed for each candidate is assumed to be independent. Thus, to satisfy the condition approximately while neglecting differences within each class, it is sufficient to distribute the questions to be set among the classes in approximate proportion to (the amount of effort to study each class)/(the total amount of effort to study all questions), or to allot marks similarly. In systematic subjects such as mathematics and physics, however, it is difficult to classify questions so that the amount of study for each class is equal.
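The mark-allotment variant of this idea can be sketched as follows. It reuses the assumption that question Qk costs k units of effort and that one question is set from each of 20 classes of five consecutive questions. The helper names `score_after` and `effort_after` are hypothetical.

```python
FULL_SCORE = 100.0

# Effort per class for 20 classes of five consecutive questions,
# assuming that question Qk costs k units of effort.
class_effort = [sum(range(start, start + 5)) for start in range(1, 101, 5)]
total_effort = sum(class_effort)

# Allot marks to each class in proportion to its share of the total effort.
marks = [FULL_SCORE * e / total_effort for e in class_effort]

def score_after(c):
    """Score of a learner who has fully studied classes 1..c in order."""
    return sum(marks[:c])

def effort_after(c):
    """Effort spent by that learner."""
    return sum(class_effort[:c])
```

With this allotment the score of a learner who studies whole classes in order equals FULL_SCORE times the fraction of total effort spent, so effort and achievement are proportional at class boundaries.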

3. Proposition of a method of classifying questions

In this section, we will propose a method of classifying questions that satisfies proportionality between effort and achievement, even if the simple method of classifying questions is inadequate owing to systematic knowledge.

3.1 Conditions assumed in order to apply this method of classifying questions

We will state the conditions assumed for knowledge in fields of study in which this method of classifying questions can be applied.

Condition 1: The items to study can be divided into finite knowledge elements.

Condition 2: A study order can be specified among the knowledge elements in the following form: Before studying knowledge i, it is necessary to study knowledge j and knowledge k.

Condition 3: There is no loop of order in knowledge elements to study.

Condition 4: The set of knowledge elements needed to solve each question is known.

Fields that may satisfy the conditions above are those in which knowledge is explicitly structured, such as mathematics, physics, and programming languages. For example, S-PLUS, a language for statistical analysis, has many functions, such as the function seq, which makes an arithmetic sequence, and the function rev, which reverses the order of a sequence. Acquiring these functions is the main content of study, and each function can be considered a knowledge element. In addition, the order of studying functions is apparent: To study the function apply, which applies a function to the rows or columns of a matrix, we first need to study the function matrix, which makes a matrix. Furthermore, the knowledge elements for a given question become clear by checking which functions appear in a program that answers the question.
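Conditions 1 to 4 amount to a finite directed acyclic graph of knowledge elements plus a known element set for each question. A minimal Python sketch follows, using the S-PLUS functions named above as elements. Only the dependence of apply on matrix comes from the text; the questions QA and QB and their element sets are hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

# Conditions 1 and 2: finite elements, each mapped to its prerequisites.
prerequisites = {
    "seq": set(),
    "rev": set(),
    "matrix": set(),
    "apply": {"matrix"},   # from the text: study matrix before apply
}

def study_order(prereqs):
    """Return one valid study order; raise ValueError if Condition 3 fails."""
    try:
        return list(TopologicalSorter(prereqs).static_order())
    except CycleError as exc:
        raise ValueError("prerequisites contain a loop") from exc

def closure(elements, prereqs):
    """All elements needed, including prerequisites reached transitively."""
    needed, stack = set(), list(elements)
    while stack:
        element = stack.pop()
        if element not in needed:
            needed.add(element)
            stack.extend(prereqs[element])
    return needed

# Condition 4: the elements used by each question are known (hypothetical data).
question_elements = {"QA": {"seq", "rev"}, "QB": {"apply"}}

order = study_order(prerequisites)
needed_for_QB = closure(question_elements["QB"], prerequisites)
```

A topological sort succeeds exactly when there is no loop in the study order, so it doubles as a check of Condition 3.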

3.2 Conditions to satisfy proportionality between effort and achievement by classifying questions

In this paper, we consider the following case: One question is set from every class, and equal points F/n, where F is the perfect score and n is the number of questions that are set, are allotted to every question.

To satisfy proportionality between effort and achievement, we need to classify questions considering the following two conditions:

Condition 1: The additional amount of effort to study all questions in the class should be nearly equal with respect to every class.

As the questions in a class become difficult, the effort to study each class will become larger. However, when we study a difficult question, we have already acquired knowledge elements for easy questions, so we need only effort to get new knowledge (hereinafter called the additional amount of effort). Therefore, if the questions satisfy the requirement that we can study all questions in each class with nearly equal additional effort with respect to every class, then the achievement and time to study become proportional.
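The additional amount of effort can be computed by tracking which knowledge elements have already been acquired. The sketch below assumes that classes are studied in the listed order, that each question's element set already includes its prerequisites, and that effort is additive over elements. All data and the function name are hypothetical.

```python
def additional_efforts(classes, question_elements, element_effort):
    """Effort to acquire only the elements that are new in each class,
    assuming the classes are studied in the given order."""
    acquired = set()
    result = []
    for members in classes:
        needed = set().union(*(question_elements[q] for q in members))
        new = needed - acquired
        result.append(sum(element_effort[e] for e in new))
        acquired |= new
    return result

# Hypothetical knowledge elements and the effort to learn each one.
element_effort = {"a": 2, "b": 2, "c": 3, "d": 1, "e": 4}

# Hypothetical questions and the elements each one requires.
question_elements = {
    "Q1": {"a"}, "Q2": {"a", "b"},
    "Q3": {"b", "c"}, "Q4": {"c", "d"},
    "Q5": {"a", "e"}, "Q6": {"d", "e"},
}
classes = [["Q1", "Q2"], ["Q3", "Q4"], ["Q5", "Q6"]]

per_class = additional_efforts(classes, question_elements, element_effort)
```

In this toy data the third class needs elements worth 7 units in total, but only element e is new, so its additional effort equals that of the other two classes.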