Assessment for Learning

Assessment for Learning:

From Theory to Practice

Professor Michal Beller

Director-General RAMA

August, 2010

Foreword

Introduction

Three of the central goals of the education system are: realization of the potential of every student (knowledge, creativity, and values); the narrowing of educational gaps; and the assuring of a safe study environment.

Assuming that there is widespread agreement with regard to these goals, a number of questions arise. How are we to know whether they have been achieved? How are parents to know whether the education system has provided their children with the tools necessary to function successfully as active citizens in society? How are all partners in the educational process (teachers, principals, and functionaries at various levels of the education system) to know whether their roles have been performed satisfactorily, and whether students from different backgrounds have had their needs properly addressed? How are we to detect gaps and how are we to know whether they have been narrowed? How can the public be assured that the future generation of Israeli children has been properly prepared to face the challenges of the 21st century? How can the public be sure that the extensive resources made available to the education system are being used wisely? And how can it be proved to policymakers that the country should increase its investment in education, even at the expense of other important needs?

There is a need for professionally constructed, valid measurement and evaluation tools to monitor the system and the achievement of these goals. Assessment is a complex issue within the business sector, and even more so in the public sector. And again, in education, the implementation of measurement and assessment is that much more difficult: the learning process is incomparably complex, the diversity among students is immense, there are different pedagogical approaches to achieving educational goals, and often results are achieved only after years of investment.

In education there is no single answer for all needs (one size fits all) and there is no magic method of measurement and assessment. Different pedagogical approaches require, accordingly, different ways of measurement and assessment. Educators should thus make use of a wide range of assessment procedures that highlight possible positive and negative directions, and that indicate what is more suitable for their students.

The National Authority for Measurement and Evaluation in Education (known by its Hebrew acronym, RAMA) was founded in 2006 to address the need for professional measurement, evaluation and assessment in the education system. The ideology at the heart of RAMA’s activities rests on two principles: a. assessment for learning and b. the designing of a mix of professional solutions that integrate different components of measurement and evaluation (for additional information about RAMA, see

Large-scale tests in educational systems and their frequency

Over the last decades there have been national tests administered to a large number of students and educational institutions in various countries around the world, including in Israel. The significance and importance of these tests is growing, leaving a strong impact on all those who are involved in the learning and educational process.

Large-scale assessments and professional surveys are crucial instruments for monitoring and tracking student achievement in the education system, and for investigating the success of the system in imparting knowledge and values to all learners within the system. The analysis of assessment results may draw attention to the gaps that need to be addressed, and highlight forgotten areas in which it is desirable to invest even greater resources. The standardized surveys and assessments may also contribute to motivating learning, fostering accountability on the part of those who are in charge of instruction, and creating better alignment between instruction and educational policy, as expressed in ministry curricula and professional training.

However, along with the many benefits that may be reaped from these test systems, it has been acknowledged that over time there may also be negative consequences for the education system in general, and for the pedagogical procedures in schools in particular. These negative consequences intensify as the tests become more central and important in the eyes of the system at all levels, and as the tests are perceived as “high-stakes” or fateful by principals, teachers and students. The negative consequences have recently been documented in the research literature (for example, Campbell, 1979; Hamilton, 2008; Koretz, 2005; Koretz & Hamilton, 2006; Nichols & Berliner, 2007) and have also been previously published by RAMA.

The following are a number of negative consequences resulting from the improper implementation of wide-scale tests:

Diverting instructional resources from subjects that are not included in the tests, and from schools or grades that do not participate in them, to subjects included in the tests, with particular attention paid to the types of questions that appear on the tests. This process harms learning and detracts from the importance of other subjects, and may also harm the cohort of students who do not participate in the tests in a given year.
“Teaching to the test”, through test-oriented study. This type of study is often based on memorization and involves fewer of the higher-order thinking skills, which are critical for internalizing material and for long-term mastery. Furthermore, this type of study may bore students and detract from their joy of learning.
In extreme cases, as a result of the pressure felt by some schools and their desire to raise achievement at any price, some may take illegitimate actions that harm test integrity. Even worse: these actions (keeping weak students away on the day of the test, attempts to obtain the test topics and questions for classroom practice, helping students during the test, etc.) may convey undesirable messages to the students, in contradiction to the goals of education.

Besides harming the quality of the pedagogical process, turning the tests into “high-stakes” tests may also impair the validity of the test results and detract from the ability to draw conclusions from them that will serve the system and allow it to improve. Thus, improvement in test results achieved through massive and intensive test preparation in schools tested in a given year does not necessarily indicate improvement in the education system as a whole, since it does not represent a rise in the level of knowledge of all students. This improved achievement, even if it appears to enhance the public image of the education system or parts of it, is in many ways only cosmetic, and is worthless to policymakers and decision makers who aspire to create real and sustainable change in the system as a whole. Only test results collected under “true conditions” and without special preparation can indicate the state of the system. Only in this way can decision makers at the different levels of the hierarchy learn about the strengths and weaknesses of the system and act accordingly to improve it.

The education system, in cooperation with RAMA, should work to minimize these negative consequences, primarily through a cultural change of “assessment for learning”, whereby assessment is intended to serve the learning and not vice versa.

In order for the system to improve, it must ensure that its assessment instruments provide valid data to the extent possible, i.e., the test results should reflect a “true” snapshot of the system, rather than stemming from specific efforts designed only to raise test grades without reflecting real improvement in the system and among all students. One should act to eliminate negative consequences by sending the correct messages to the field, and by reducing the pressure and threat created by external test results being the sole evidence of the quality of pedagogical processes in schools.

Updating the format of measurement and evaluation: integrating external and internal assessment

An education system is perennially faced with a dilemma in the choice between independent, internal measurement by the school (partially free of the pressures described above, and more suited to students and the material studied in each educational institution) on the one hand, and external measurement which is standardized, professional and centralized, on the other. In other words, there is constant tension between the decentralization of measurement and evaluation and its centralization. There are those who maintain that in contrast to external assessment, internal assessment is less intrusive, more enlightened, and empowers the school principals and teaching staff. However, considerations of responsibility, accountability, transparency, professionalism, viability, and mainly the ability to make valid comparisons of schools (or of sectors, countries or other groups), including multi-year comparisons, require that part of the measurement be centralized and external, and carried out by a body responsible for educational measurement and assessment.

In order to integrate the two approaches (neither of which can alone meet all needs) while maintaining the benefits of both, the format of Israel’s national assessment was ratified in 2007. The new format has been designed by RAMA in collaboration with the Ministry of Education, and in consultation with many school principals and teachers. This format is intended to provide a professional and suitable answer for educational measurement and evaluation to all stakeholders in the education system, whether the schools themselves or related external bodies. This new format was derived from the existing assessment system, in use prior to the change, after improving on it and addressing its shortcomings.

The new format is based on the following principles:

Implementation of a culture of “assessment for learning” – assessment that is intended to support the continual improvement of learning, through the alignment of learning goals with the school vision, based on the understanding that tests are not a goal in and of themselves, but rather an instrument for learning.
Mindful integration of internal and external assessment, and of formative (assessing during learning) and summative assessment.
Decentralization of assessment which is based on professional tools provided to schools by RAMA.
Empowerment of school principals and teachers.
Reduction of the pressure from, and the frequency of, external tests.
Preferring external tests administered to a sample of students over external tests encompassing all students.

The new format integrates three elements: general and sample external assessment (international and national), independent school-based assessment based on standardized external tools, and internal, school-based assessment.

School-based external assessment

The new format of the Meitzav

The Meitzav (a Hebrew acronym for “Growth and Efficiency Measures of Schools”) includes student achievement tests and questionnaires regarding school climate and pedagogical setting (administered to principals, teachers and students). At the level of the individual school, the system is designed to provide school principals and teaching staff with a tool for the planning and use of resources, for realizing student potential, for improving the pedagogical climate, and for enhancing the school instructional system. At the level of the education system, the Meitzav is intended to provide a picture of the level of mastery of Israel’s students in the curricula of four core subjects, and to serve the professional bodies in the ministry and the decision makers in setting policy on various educational issues, including climate and pedagogical setting.

The Meitzav achievement tests focus on four core subjects: Mathematics, Native Language (Hebrew/Arabic), English, and Science and Technology. These assessments are administered to students at two grade levels: the fifth and the eighth. There is also a test on native language (Hebrew/Arabic) in second grade. The assessments are designed in alignment with the curricula and are aimed at examining the extent to which students in elementary and junior-high schools achieve the level expected of them according to these curricula. Examples of these tests can be found on the RAMA site, under the tab “Shelf Assessments”.

Each school belongs to one of four "Meitzav Clusters". The term "Meitzav Clusters" refers to the division of Israel's elementary and junior-high schools into four equal and representative groups (Clusters A, B, C, and D). Each cluster of schools is selected in such a way as to be representative of all the schools in the country.

Within a cycle of four years, schools are tested externally (External Meitzav) once every two years, each time in two subjects: Mathematics and Native Language (Hebrew/Arabic), or English and Science and Technology. When not tested externally, schools are tested internally (Internal Meitzav) by self-administering the same test forms as those administered that particular year in the External Meitzav (for more on the Internal Meitzav see page 10). Thus each school is tested in each of these subjects once every four years by external assessment, and by the Internal Meitzav in each of the other three years. Furthermore, the national norms derived from the external assessments of a particular year serve as a benchmark for the internal administration of the same assessments that year. The clusters rotate from year to year as shown in the diagram below:

Cluster / Subject / 2006/7 / 2007/8 / 2008/9 / 2009/10 / 2010/11 / 2011/12
A / Science & Tech / External / Internal / Internal / Internal / External / Internal
A / English / External / Internal / Internal / Internal / External / Internal
A / Math / Internal / Internal / External / Internal / Internal / Internal
A / Mother Language / Internal / Internal / External / Internal / Internal / Internal
B / Science & Tech / Internal / External / Internal / Internal / Internal / External
B / English / Internal / External / Internal / Internal / Internal / External
B / Math / Internal / Internal / Internal / External / Internal / Internal
B / Mother Language / Internal / Internal / Internal / External / Internal / Internal
C / Science & Tech / Internal / Internal / External / Internal / Internal / Internal
C / English / Internal / Internal / External / Internal / Internal / Internal
C / Math / External / Internal / Internal / Internal / External / Internal
C / Mother Language / External / Internal / Internal / Internal / External / Internal
D / Science & Tech / Internal / Internal / Internal / External / Internal / Internal
D / English / Internal / Internal / Internal / External / Internal / Internal
D / Math / Internal / External / Internal / Internal / Internal / External
D / Mother Language / Internal / External / Internal / Internal / Internal / External

Cycle 1 Cycle 2…
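The rotation shown in the table can also be expressed as a simple rule. The following Python sketch is illustrative only and is not part of RAMA's materials: the names `EXTERNAL_OFFSETS` and `assessment_mode` are hypothetical, the offsets are read directly from the table, and the assumption that the same four-year pattern keeps repeating after 2011/12 is an extrapolation.

```python
# Illustrative sketch only: names and structure are hypothetical,
# the offsets are read from the rotation table above.
CYCLE_LENGTH = 4   # years in one full rotation
FIRST_YEAR = 2006  # calendar year in which school year 2006/7 starts

PAIR_SCIENCE_ENGLISH = ("Science & Tech", "English")
PAIR_MATH_LANGUAGE = ("Math", "Mother Language")

# For each cluster: the position within the four-year cycle at which
# each pair of subjects is assessed externally.
EXTERNAL_OFFSETS = {
    "A": {PAIR_SCIENCE_ENGLISH: 0, PAIR_MATH_LANGUAGE: 2},
    "B": {PAIR_SCIENCE_ENGLISH: 1, PAIR_MATH_LANGUAGE: 3},
    "C": {PAIR_SCIENCE_ENGLISH: 2, PAIR_MATH_LANGUAGE: 0},
    "D": {PAIR_SCIENCE_ENGLISH: 3, PAIR_MATH_LANGUAGE: 1},
}


def assessment_mode(cluster, subject, start_year):
    """Return "External" or "Internal" for a cluster, subject and school year.

    start_year is the calendar year in which the school year begins,
    e.g. 2008 for the school year 2008/9.
    """
    if start_year < FIRST_YEAR:
        raise ValueError("the rotation starts in school year 2006/7")
    position = (start_year - FIRST_YEAR) % CYCLE_LENGTH
    for pair, offset in EXTERNAL_OFFSETS[cluster].items():
        if subject in pair:
            return "External" if offset == position else "Internal"
    raise ValueError(f"unknown subject: {subject}")


if __name__ == "__main__":
    # Print the schedule of cluster A in Math for the years in the table.
    for year in range(2006, 2012):
        print(year, assessment_mode("A", "Math", year))
```

Under this rule, each subject is assessed externally in exactly one cluster per year, and each cluster has one external pair of subjects every second year, which matches the description in the text.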

The surveys of school climate and pedagogical setting on the Meitzav are designed to provide a detailed picture of the school climate and the pedagogical processes that occur within the school, as depicted in the information gathered from questionnaires administered to students and from interviews held with teachers. The questionnaires provide comprehensive and relevant information on important dimensions in this area, including: the level of student motivation; the relationship between teachers and students; violent events and students’ feelings of safety; team work among faculty, and more. These dimensions are based on insights gathered from several sources: focus groups comprising teachers and principals, discussions with officials of the Ministry of Education, consultation with academic scholars and scholarly reviews of current literature. The questionnaires are administered to fifth and ninth graders, and to elementary and junior-high teachers. The development of a survey of school climate for high school is presently underway. For more information please see the RAMA site under the tab “Meitzav – school climate”.

2010 is the fourth year in which the Meitzav has been administered in its new format. According to this format, every school participates once every two years in the External Meitzav in two of the four subjects (Native Language and Mathematics, or English and Science and Technology, alternately); the Internal Meitzav is administered in each subject in the years when it is not assessed externally (two subjects in a year in which the school participates in the external tests, or four in the alternate year). The surveys of “school climate and pedagogical setting” are administered every year. The Internal Meitzav tests are identical in content to the External Meitzav, but are administered and graded by school staff. The internal tests are accompanied by pedagogical materials designed to serve the teachers. Schools may use the grades from the Internal Meitzav as they see fit, as part of their students’ annual assessment. The process of grading the Internal Meitzav contributes to teachers’ professional development, as it provides exposure to professional rubrics that define what is expected of students and allow teachers to learn from the answers about the extent of students’ knowledge and understanding. The grades of the Internal Meitzav serve the school staff only; schools are not required to report them to an external official.

What can be learned from the Meitzav?

The importance of the use of the Meitzav as a working instrument stems from the need to receive an updated, diagnostic picture of the situation (on the level of the student, the class, the school, and the whole education system) regarding the level of implementation of the different goals of the system and their attainment. This is crucial in order to maximize the potential of constant improvement of the school and the education system.

The level of the school – The process of gaining insights from the results that appear in the detailed school report allows the school staff to examine itself and to view the school as a holistic system encompassing achievement, climate and pedagogical settings. An example of a school report can be found on the RAMA site under the tab “Meitzav.”

The findings allow the school staff to identify strong and weak points in the instruction of the subject being tested, to identify topics or skills that require further emphasis, to raise hypotheses regarding the results, to learn what additional data should be collected to confirm or reject hypotheses, to examine the reasons for difficulties that arose (for example, why students have difficulty in writing), and to design long-term plans for the school based on evidence. The effective use of the Meitzav data can assist schools in formulating mechanisms for improving school-based processes and in planning long-term steps that sustain the change over time.