Office of Innovation and Improvement

(Attention: Department Priorities Comments),

U.S. Department of Education,

400 Maryland Avenue, SW., Room 4W321,

Washington, DC 20202.

Regarding: DOCKET ID ED-OS-2010-0011

Secretary’s Priorities for Discretionary Grant Programs

To Whom It May Concern:

The Texas Classroom Teachers Association, representing 50,000 classroom teachers and instructional personnel statewide, has the following comments regarding the Secretary’s Proposed Priorities for Discretionary Grant Programs.

Regarding: “Statement of Proposed Priority 3. Projects that are designed to address one or more of the following priority areas:

(a) Increasing the number or percentage of effective and highly effective teachers or principals (as defined in this notice) or reducing the number or percentage of teachers or principals who are ineffective, particularly in high-poverty schools (as defined in this notice).”

Regarding “related proposed definitions:

Effective teacher means a teacher whose students achieve acceptable rates (e.g., at least one grade level in an academic year) of student growth (as defined in this notice). A method for determining if a teacher is effective must include multiple measures, and effectiveness must be evaluated, in significant part, on the basis of student growth (as defined in this notice). Supplemental measures may include, for example, multiple observation-based assessments of teacher performance.

Highly effective teacher means a teacher whose students achieve high rates (e.g., one and one-half grade levels in an academic year) of student growth (as defined in this notice). A method of determining if a teacher is highly effective must include multiple measures, provided that teacher effectiveness is evaluated, in significant part, on the basis of student growth (as defined in this notice). Supplemental measures may include, for example, multiple observation-based assessments of teacher performance or evidence of leadership roles (which may include mentoring or leading professional learning communities) that increase the effectiveness of other teachers in the school or LEA.

Student achievement means--

(a) For tested grades and subjects: (1) a student’s score on the State’s assessments under the ESEA; and, as appropriate, (2) other measures of student learning, such as those described in paragraph (b) of this definition, provided they are rigorous and comparable across schools.

(b) For non-tested grades and subjects: alternative measures of student learning and performance, such as student scores on pre-tests and end-of-course tests; student performance on English language proficiency assessments; and other measures of student achievement that are rigorous and comparable across schools.

Student growth means the change in student achievement (as defined in this notice) for an individual student between two or more points in time. A State may also include other measures that are rigorous and comparable across classrooms.”

As we and others, notably the National Academy of Sciences, have consistently pointed out, the Department’s continued emphasis on assessing teacher effectiveness based in significant part on student achievement on standardized tests is deeply troubling. Now the latest research by the Department’s own National Center for Education Evaluation (NCEE) provides more reason to question the accuracy and validity of measures that claim to ascertain the “value-added” by an individual teacher to his or her students by virtue of student performance on tests. That research makes the Department’s latest attempt to push forward on this front, in this proposed priority for discretionary grants, simply untenable.

The report, Error Rates in Measuring Teacher and School Performance Based on Student Test Score Gains, released by NCEE just last month, found that “Using rigorous statistical methods and realistic performance measurement schemes, value-added estimates for teacher-level analyses are subject to a considerable degree of random error (25-33% error rate for teacher-level analyses with three years of data) when based on the amount of data that are typically used in practice for estimation. This means that in a typical performance measurement system, more than 1 in 4 teachers who are truly average in performance will be erroneously identified for special treatment, and more than 1 in 4 teachers who differ from average performance by 3 months of student learning in math or 4 months in reading will be overlooked. In addition, … error rates will likely decrease by only about one half (from 26 to 12 percent) using 10 years of data.” The report further found that “Our results are largely driven by findings from the literature and new analyses that more than 90 percent of the variation in student gain scores is due to the variation in student-level factors that are not under the control of the teacher.”

The report goes on to state: “A performance measurement system at the school level will likely yield error rates that are about 5 to 10 percentage points lower than at the teacher level. This is because school-level mean gain scores can be estimated more precisely due to larger student sample sizes. Thus, current policy proposals to use value-added models for determining adequate yearly progress (AYP) and other school-level accountability ratings may hold promise from the perspective of statistical precision. An important caveat, however, is that biases may exist for estimating performance differences between schools, due, for instance, to nonrandom student sorting across schools.”

Additionally, as expressed to Secretary Duncan in an October 5, 2009, letter, the Board on Testing and Assessment of the National Research Council (BOTA) “has significant concerns that the Department’s proposal places too much emphasis on measures of growth in student achievement (1) that have not yet been adequately studied for the purposes of evaluating teachers and principals and (2) that face substantial practical barriers to being successfully deployed in an operational personnel system that is fair, reliable, and valid.”

BOTA’s letter elaborated further: “Although the idea has intuitive appeal, a great deal is unknown about the potential and the limitations of alternative statistical models for evaluating teachers’ value-added contributions to student learning. BOTA agrees with other experts who have urged the need for caution and for further research prior to any large-scale, high-stakes reliance on these approaches (e.g., Braun, 2005; McCaffrey and Lockwood, 2008; McCaffrey et al., 2003).” Additionally, BOTA stated that at its November 13-14, 2008, joint workshop with the National Academy of Education to obtain expert judgments and assessments of issues related to the use of value-added methodologies in education, “the considerable majority of experts at the workshop cautioned that although VAM approaches seem promising, particularly as an additional way to evaluate teachers, there is little scientific consensus about the many technical issues that have been raised about the techniques and their use.”

The letter continued: “Teachers are not assigned randomly to schools, and students are not assigned randomly to teachers. Without a way to account for important unobservable differences across students, VAM techniques fail to control fully for those differences and are therefore unable to provide objective comparisons between teachers who work with different populations. As a result, value-added scores that are attributed to a teacher or principal may be affected by other factors, such as student motivation and parental support.”

BOTA’s letter then set out a list of technical problems and important practical difficulties in using value-added measures in high-stakes programs to evaluate teachers and principals in a way that is fair, reliable and valid, concluding that “it is unlikely that any state at this time could make a proposal for using VAM approaches in an operational program for teacher or principal evaluation that adequately addresses all of these concerns…At present, the best use of VAM techniques is in closely studied pilot projects. Even in pilot projects, VAM estimates of teacher effectiveness should not be used as the sole or primary basis for making operational decisions because the extent to which the measures reflect the contribution of teachers themselves, rather than other factors, is not understood.”

Given the strength of these findings, the Department’s Proposed Priority 3 is untenable and should be eliminated. The priority calls for increasing the number of effective and highly effective teachers and reducing the number of ineffective teachers, with effectiveness determined “in significant part, on the basis of student growth,” which is in turn defined as the change in a student’s score on State assessments or on other measures of student learning that are rigorous and comparable across schools. At the very least, it should not be designated an Absolute or even a Competitive priority.

What is especially interesting about the Department’s Proposed Priority 3 is its juxtaposition with several other priorities the Department has proposed:

“Proposed Priority 11--Building Evidence of Effectiveness.

Background. The strongest available empirical evidence should inform decisions about education practices and policies. Evidence accumulates through evaluation of practices and of program performance and, as more robust evidence becomes available, increasingly rigorous evaluations become appropriate. Random assignment and quasi-experimental designs are considered the most rigorous evidence of the impact of a program because these designs are best able to eliminate plausible competing explanations for observed results. The Department’s notice of final priority on scientifically based evaluation methods, published on January 25, 2005 in the Federal Register, has made it possible for the Department to expand the number of programs and projects Department-wide that are evaluated using experimental and quasi-experimental designs. This priority remains in effect; however, recognizing that using such research designs is not always feasible and that, in some cases, other designs are more appropriate to the question being asked, priority 11 would support rigorous evaluation studies consistent with the principles of scientific research in order to enable better understanding of the relationship between intervention, implementation, and student outcomes.

Statement of Proposed Priority 11. Projects that propose evaluation plans that are likely to produce valid and reliable evidence in one or more of the following priority areas:

(a) Improving project design and implementation or designing more effective future projects to improve outcomes.

(b) Identifying and improving practices, strategies, and policies that may contribute to improving outcomes.

Under this priority, at a minimum, the outcome of interest is to be measured multiple times before and after the treatment for project participants and, where feasible, for a comparison group of non-participants.

Proposed Priority 12--Supporting Programs, Practices, or Strategies for Which There is Strong or Moderate Evidence of Effectiveness.

Background. Using good evidence to inform decisionmaking and building better evidence over time are crucial components of continuous program improvement. This proposed priority is designed to support projects that use the best available evidence in designing and implementing programs and strategies.

Statement of Proposed Priority 12. Projects that are supported by strong or moderate evidence (as defined in this notice). A project that is supported by strong evidence (as defined in this notice) will receive more points than a project that is supported by moderate evidence (as defined in this notice).”

We strongly support these priorities and, in fact, suggest that they be identified as Absolute Priorities. But we find it deeply ironic that the Department, in one breath, states that “The strongest available empirical evidence should inform decisions about education practices and policies,” and in the next asserts the need to “support projects designed to increase the number and percentage of effective and highly effective teachers” (or decrease the number of ineffective teachers) based on a methodology that has little, if any, empirical evidence behind it. This conflict should be resolved by eliminating Proposed Priority 3.

Thank you for this opportunity for input.

Holly Eaton

Director of Professional Development and Advocacy