Proposal for the Evaluation of São Paulo School Employees Performance Pay Scheme*

Barbara Bruns, Claudio Ferraz, and Marcos Rangel [*]

October 2008

BACKGROUND

Context for the reform in Brazil. Despite spending significant resources on education and increasing school attendance at all levels during the late 1990s and early 2000s, Brazil’s education performance is significantly lower than that of countries with similar income per capita. In the 2006 Program for International Student Assessment (PISA) test of math, science and literacy among 15-year-old students, for example, Brazil ranked 54th among 57 countries in mathematics, scoring lower than Argentina, Indonesia, Mexico, Chile, Thailand and Uruguay. While reading performance was better, with Brazil ranking 49th out of 56 countries, Mexico, Thailand, Uruguay and Chile all scored significantly higher. These internationally benchmarked results, as well as evidence from national and state-level learning assessments showing very low average proficiency levels, have led to broad agreement among policy makers that Brazil’s education challenge lies in improving the quality of public schools.

The large decentralization process that has taken place since 1988 and the introduction of several government programs aimed at increasing the resources going to public schools have not produced commensurate improvements in student test scores. Moreover, the distribution of test scores, even within small regions, shows great disparity (controlling for student and family characteristics), suggesting that teacher quality and school management play an important role (Menezes-Filho, 2007).

Teachers face weak incentives in Brazil. Salaries are relatively low and there are no incentives linked to performance, as salaries are mostly determined by tenure (Holanda-Filho and Pessoa 2007). Low teacher motivation and high rates of absenteeism from the classroom directly affect students’ performance. About 30,000 teachers, or 12.8% of the total teaching force, are reported absent each day in the São Paulo state education system.

At the end of 2007, the São Paulo State Secretary of Education launched a program aimed at improving the quality of its 5,000 primary and secondary schools and 250,000 teachers. The program consists of several actions, among them the introduction of a new curriculum with clear guidelines on the material and competencies to be taught in each grade, a strong focus on universal literacy for young children, and the introduction of supervisors to help directors and teachers improve school management. A central feature of the educational reform in São Paulo is the introduction of an innovative “teacher bonus” to link pay more closely to performance for all state school employees.

Context for research interest in evaluating this program. Despite the central relevance of teacher contracting and pay policies for education system performance, the evidence base on “what works” is weak. In both developing and developed countries, teacher pay is overwhelmingly based on educational attainment, training and experience, rather than performance. Yet variations in teacher performance, even within a single grade in the same school, are substantial (Rivkin et al 2001). Hanushek (2004) has estimated that the “good teacher effect” on student learning outcomes is roughly equivalent to the effect of a 50% decrease in average class size in the US – a much costlier reform. Studies also indicate a weak correlation between teachers’ actual effectiveness and the most common proxies for teacher quality, namely education and experience. Most of the evaluated experience with bonus or merit pay has been in the US. The early experience was not effective (Cohen and Murnane 1986), but these experiments may have been too limited in the magnitude of the reward and the character of the performance evaluation (Hanushek 1994).

The most carefully evaluated programs outside of the US are a cash bonus program for secondary school teachers in Israel (Lavy 2004), a program awarding prizes to teachers in grades 4-8 in rural Kenya (Glewwe, Ilias and Kremer, 2008), and a study in Andhra Pradesh, India, that is currently in its second year. The Israeli results showed significant effects on student performance in the subject areas rewarded, which the researchers attributed to changes in teaching methods, after-school teaching, and increased responsiveness to students’ needs. The researchers concluded that the cash bonuses for individual teachers were more cost-effective than alternative programs which offered cash bonuses for schools as a group or added instructional time to all schools. The Kenya study found relatively modest effects on student learning in the treatment schools, but these gains disappeared after a year. There was little evidence of teacher effort aimed at increasing long-run learning: teacher attendance did not improve, homework assignments did not increase, and pedagogy did not change. The only observed change was that teachers conducted more test preparation. The Andhra Pradesh study (Muralidharan and Sundararaman, 2007) is a larger-scale, longer-duration study, which after one year found significant impacts on student learning (0.19 SD in math, 0.12 SD in language) from both group-based incentives (whole school rewarded for average learning gains) and individual incentives (teachers rewarded differentially based on the gains registered by their own class).

The proposed evaluation of São Paulo’s bonus program would be the first rigorous evaluation of such a reform in a middle-income developing country, in a program at scale. The study would have high marginal value as a complement to the existing research base in developing countries, which consists of evaluations of pilot programs in low-income settings that have produced somewhat inconsistent results. Our proposed study will evaluate how merit pay affects teachers’ effort, training uptake, skills and classroom practice, and student learning outcomes, and whether it promotes significant adverse behaviors (diverting curriculum time from non-tested subjects or manipulation of test results). Deeper understanding of these issues is needed for effective policy in this area.

THE INTERVENTION

The performance pay system designed for the São Paulo state schools is an annual bonus paid to schools based on how well they meet individual school-level targets. Thus, school progress is measured and rewarded on a value-added basis, and implicitly takes into account schools’ differing socioeconomic contexts and specific educational challenges. The incentive is a strong one: for schools that meet 100% of their target, all employees will receive a bonus equivalent to three monthly salaries. For schools that do not meet their targets, the bonus will be paid in proportion to the percentage of the target met (e.g. schools that meet 50% of their target will receive a bonus equivalent to one and a half monthly salaries).

The target is calculated for each school based on two sets of indicators. First, 70 percent of the target is calculated based on SARESP (São Paulo state annual achievement test) test scores and average school-level promotion rates.[1] The remaining 30 percent is based on teacher attendance and school management indicators.

The SARESP is a standardized test administered annually in all schools in the state of São Paulo. All students in the 1st, 2nd, 4th, 6th and 8th grades of primary school (ensino fundamental) and the 3rd (final) year of high school (ensino médio) are tested on their knowledge of Mathematics and Portuguese. Exam scores range from 0 to 500. Instead of defining school-level targets based on average scores, which could create incentives for schools to disregard students at the bottom of the learning distribution, the Secretary of Education decided to use information from the whole distribution of test scores. Four levels of proficiency were created in order to facilitate teachers' interpretation of the scores: Below Basic (Not Meeting Learning Standards), Basic (Partially Meeting Learning Standards), Proficient (Meeting Learning Standards), and Advanced (Meeting Learning Standards with Distinction).[2] The cut-offs for each category are different for each grade. For Mathematics, for example, the 4th grade cut-offs are 175, 225 and 275. For 8th grade, they are 225, 300, and 350.
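The mapping from a score to a proficiency level can be sketched as follows. This is a minimal illustration using only the two sets of Mathematics cut-offs quoted above; the function name and the treatment of a score that falls exactly on a cut-off (assigned here to the higher level) are assumptions, not part of the official SARESP rules.

```python
from bisect import bisect_right

# Mathematics cut-offs quoted in the text; other grades and subjects are not given.
MATH_CUTOFFS = {4: (175, 225, 275), 8: (225, 300, 350)}
LEVELS = ("Below Basic", "Basic", "Proficient", "Advanced")


def proficiency_level(score, grade):
    """Return the proficiency level for a SARESP Mathematics score.

    Assumption: a score exactly equal to a cut-off falls in the higher level.
    """
    if not 0 <= score <= 500:
        raise ValueError("SARESP scores range from 0 to 500")
    # bisect_right counts how many cut-offs are at or below the score.
    return LEVELS[bisect_right(MATH_CUTOFFS[grade], score)]


if __name__ == "__main__":
    # The same score maps to different levels in different grades.
    print(proficiency_level(260, 4))
    print(proficiency_level(260, 8))
```

Because the cut-offs rise with the grade, a given score places a 4th grader in a higher category than an 8th grader.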

In order to aggregate the percentage of students in each category into an index, the Secretary of Education assigned values that penalize schools linearly for students below the Advanced category. The index assigns penalties of 3, 2, 1 and 0 to the four categories: 3 for students Below Basic, 2 for students in Basic, 1 for students in Proficient, and 0 for students in Advanced. With $p_{BB}$, $p_{B}$, $p_{P}$ and $p_{A}$ denoting the shares of students in each category, the indicator of grade discrepancy is then calculated as:

$$D = 3\,p_{BB} + 2\,p_{B} + 1\,p_{P} + 0\,p_{A}$$

This indicator, which ranges from 0 (all students Advanced) to 3 (all students Below Basic), is then converted into an index that varies between 0 and 10 using the following formula:

$$I = 10\left(1 - \frac{D}{3}\right)$$
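As a worked illustration of this aggregation, the sketch below assumes that the discrepancy is the share-weighted sum of the penalties (3, 2, 1, 0) and that the 0-10 index is a linear rescaling in which 10 means every student is Advanced and 0 means every student is Below Basic. The function names and the example shares are hypothetical.

```python
# Penalties per proficiency level, as described in the text.
PENALTY = {"Below Basic": 3, "Basic": 2, "Proficient": 1, "Advanced": 0}


def discrepancy(shares):
    """Share-weighted penalty; `shares` maps level -> fraction of students."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(PENALTY[level] * share for level, share in shares.items())


def performance_index(shares):
    """Assumed linear rescaling of the discrepancy (0..3) to an index (0..10)."""
    return 10.0 * (1.0 - discrepancy(shares) / 3.0)


if __name__ == "__main__":
    # Hypothetical school: 40% Below Basic, 30% Basic, 20% Proficient, 10% Advanced.
    school = {"Below Basic": 0.4, "Basic": 0.3, "Proficient": 0.2, "Advanced": 0.1}
    print(round(discrepancy(school), 2), round(performance_index(school), 2))
```

In this hypothetical school the discrepancy is about 2.0, which gives an index of about 3.33. Moving a student up one category raises the index by the same amount wherever the student starts, which is what a linear penalty implies.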

In addition to this index, the indicator used by the Secretary of Education takes into account approval rates. For each school, the primary school years are divided into two groups: the first from 1st to 4th grade and the second from 5th to 8th grade. The average time it takes for students to complete a group of grades is then calculated using the fact that the inverse of the approval rate for a grade estimates the average time students take to complete that grade, so the sum of these inverses across the grades in a group estimates the average time to complete the group. The flow measure, F, is normalized to vary between zero and one by dividing the number of years that students should take to complete a group of grades by the time it actually takes.
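The flow calculation can be illustrated with a short sketch. It assumes approval rates are expressed as fractions in (0, 1] and that the intended number of years equals the number of grades in the group; the function name and the example rates are hypothetical.

```python
def flow_indicator(approval_rates):
    """Flow measure F for one group of grades.

    `approval_rates` holds one approval rate per grade, each in (0, 1].
    The expected time to complete the group is the sum of the inverse rates;
    F is the intended time (one year per grade) divided by that expected time.
    """
    if not approval_rates:
        raise ValueError("at least one grade is required")
    if any(not 0.0 < rate <= 1.0 for rate in approval_rates):
        raise ValueError("approval rates must lie in (0, 1]")
    expected_years = sum(1.0 / rate for rate in approval_rates)
    return len(approval_rates) / expected_years


if __name__ == "__main__":
    # Hypothetical school with an 80% approval rate in each of grades 1 to 4.
    print(round(flow_indicator([0.8, 0.8, 0.8, 0.8]), 3))
```

With an 80% approval rate in every grade, students take five years on average to complete four grades, so F is 0.8. F equals 1 only when every student is promoted every year.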

The performance indicator based on test scores, I, is then combined with the flow indicator, F, to create a measure of school quality for São Paulo, the IDESP[3]:

$$\mathrm{IDESP} = I \times F$$

Because F varies between 0 and 1, it penalizes the schools for taking longer than expected to complete a series of grades and thus creates incentives in the direction of automatic promotion. But because performance also depends on test scores, there is a countervailing incentive which penalizes schools if students do not learn adequately.
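The countervailing incentives can be seen in a small numerical sketch. It assumes the IDESP is the product of the test-score index I (0 to 10) and the flow indicator F (0 to 1), which is consistent with F acting as a penalty factor; the two scenarios are invented for illustration only.

```python
def idesp(performance_index, flow):
    """Assumed combination: test-score index (0..10) scaled by flow (0..1)."""
    if not 0.0 <= performance_index <= 10.0:
        raise ValueError("performance index must lie in [0, 10]")
    if not 0.0 <= flow <= 1.0:
        raise ValueError("flow indicator must lie in [0, 1]")
    return performance_index * flow


if __name__ == "__main__":
    # Hypothetical school that retains some students: I = 4.0, F = 0.8.
    with_retention = idesp(4.0, 0.8)
    # Same school under automatic promotion: F rises to 1.0, but suppose
    # weaker learning lowers the test-score index to 3.0.
    automatic_promotion = idesp(3.0, 1.0)
    print(round(with_retention, 2), round(automatic_promotion, 2))
```

In this invented case automatic promotion lowers the school's IDESP from about 3.2 to 3.0, because the gain in flow is more than offset by the loss in measured learning. Promotion without learning pays off only if test scores fall by less than the flow indicator rises, in proportional terms.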

Using the IDESP as an indicator of quality for each school, the secretary of education used the same methodology as the IDEB, implemented by the Ministry of Education. They assume that all schools will converge in 2030 to a maximum grade that equals 9. A logistic function is then estimated and the predicted value for each school and year from 2008 to 2030 provides the target that the school has to attain in that specific year.
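One way to picture such a target trajectory is sketched below. This is not the Secretary of Education's estimation procedure, whose details are not given here. It assumes a logistic curve on the 0-10 scale that passes exactly through a school's baseline IDESP and reaches 9 in 2030. The baseline year (2007), the function names and the example baseline value are all hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def target_path(baseline, base_year=2007, end_year=2030, end_value=9.0, scale=10.0):
    """Yearly targets on a logistic curve through the baseline and the end value.

    Assumption: the curve is linear in log-odds of (index / scale), fixed by two
    points: (base_year, baseline) and (end_year, end_value).
    """
    if not 0.0 < baseline < scale:
        raise ValueError("baseline must lie strictly between 0 and the scale maximum")
    start = logit(baseline / scale)
    end = logit(end_value / scale)
    slope = (end - start) / (end_year - base_year)
    return {
        year: scale / (1.0 + math.exp(-(start + slope * (year - base_year))))
        for year in range(base_year, end_year + 1)
    }


if __name__ == "__main__":
    path = target_path(3.0)  # hypothetical school starting at IDESP = 3.0
    for year in (2008, 2015, 2030):
        print(year, round(path[year], 2))
```

Under this sketch every school converges to the same end value, so schools that start lower face steeper yearly targets in absolute terms, while the logistic shape slows the required gains as a school approaches the ceiling.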

The bonus will be paid to all schools according to the percentage of the target achieved by the end of the year. All employees of schools that meet the target will receive a 100% bonus (equivalent to three monthly salaries), while employees of schools that do not meet the target will receive a bonus proportional to the percentage of the target attained (e.g. for a school that meets 80% of the target, all employees will receive a bonus of 0.8*3 monthly salaries).
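The payment rule amounts to a simple proportional calculation, sketched below. The cap at 100% for schools that exceed their target and the floor at zero for schools that regress are assumptions made for the illustration, as is the function name.

```python
def bonus_in_monthly_salaries(target_attainment, full_bonus=3.0):
    """Bonus, in monthly salaries, for a school meeting a fraction of its target.

    `target_attainment` is the fraction of the target met (0.8 means 80%).
    Assumptions: attainment above 100% earns no extra bonus, and negative
    attainment earns none.
    """
    share = min(max(target_attainment, 0.0), 1.0)
    return full_bonus * share


if __name__ == "__main__":
    for attainment in (1.0, 0.8, 0.5):
        print(attainment, round(bonus_in_monthly_salaries(attainment), 2))
```

A school meeting 80% of its target therefore earns each employee about 2.4 monthly salaries, and one meeting 50% earns 1.5.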

PRIMARY RESEARCH QUESTIONS AND OUTCOME INDICATORS

This evaluation aims to answer 10 research questions:

1) Does linking teachers’ pay to indicators of school performance via a bonus result in improved student learning?

2) Does linking teachers’ pay to indicators of school performance via a bonus reduce teacher absence?

3) Does linking teachers’ pay to student test scores via a bonus result in positive behaviors such as increased teacher effort (hours worked, quantity of homework assigned and graded), more effective teaching strategies, or reassignments in school personnel in favor of tested grades and subjects?

4) Does linking teachers’ pay to student test scores via a bonus result in undesirable behaviors such as manipulation of test results or reduced class time spent on non-tested subjects?[4]

5) Does giving schools information about the rules of the game for the bonus significantly improve their chances of earning it?

6) Are schools that are unsuccessful in the first year of the bonus program more or less likely to put effort into competing for the bonus in subsequent years?

7) What strategies do schools use to try to improve performance under the bonus program?

8) To what extent do the levels of trust, teamwork and cooperation within schools explain their success in accessing the bonus?

9) How does success (or lack of success) in the first year of the bonus program affect levels of trust, teamwork and cooperation within schools?

10) What strategies do schools employ to build trust, teamwork and cooperation?

Outcome indicators will include student test scores (SARESP); student enrollment, promotion and completion rates; and teacher and other school personnel absence rates. For all schools, the Secretary of Education collects rich socioeconomic and other background data on school directors, teachers, supervisors and students via an annual school survey, and the state also surveys parents via an online survey. São Paulo also has good budget data, which we will use to estimate changes in school-level spending and the cost-effectiveness of the reform in producing student learning improvements.

For a sample of schools, we will also try to deepen the analysis in three areas. First, we will collect data on teachers’ instructional strategies and use of time and classroom resources, through direct observation using a standardized classroom observation instrument. Second, we will collect qualitative data from directors, teachers, supervisors, students and parents about perceptions of the bonus program and school-level changes, both prior to and after the first round of bonus payments. Finally, through the application of an innovative set of new instruments, we will develop direct measures of the levels of trust, teamwork and social capital within schools.