Education: Equity and Excellence Through Testing
Betty Krygsheld
Seattle Pacific University
Compulsory education became law in each state of the union between the late 1800s and the early 1900s. Educating every child is a good and noble goal. This goal was stated most eloquently by Horace Mann when he declared education to be “a great equalizer of the conditions of men,—the balance wheel of the social machinery” (1848, p. 668). Our nation has always struggled to define and reform education so that it would be the “great equalizer” that Mann envisioned. I believe that standardized testing can be one important part of ensuring that all children have the knowledge they need to fulfill their purpose in life. Historically, the struggle to educate our youth has always been linked to some form of accountability. This paper will first give a historical view of testing throughout U.S. history and then analyze the appropriateness of testing to improve student learning.
Because education in colonial times was thought to be a private matter and the responsibility of the parent, equity in education was not a concern of colonial citizens. Where schools did exist, the teacher was responsible for determining the standards, and oral testing (recitation) was used to ensure mastery of concepts. Testing and accountability under this local authority did not strive to make education the “great equalizer.” At its inception, the next era of schooling, the common school era, handled accountability for mastery of standards by recitation, much like the colonial school. Oral recitation was highly subject to the teacher’s bias.
Significant change in school accountability came during the common school era. The State began to weigh in on both the setting of standards and the testing of those standards. The Massachusetts Board of Education, as well as other state boards, established schools for teacher training, thus influencing the standards that teachers set in their classrooms. Standardized written examinations were also introduced at this time. Mann, a proponent of the written examination, argued that written examinations were a better measurement of student learning than oral recitation and were less dependent on chance. In Mann’s words, “Hence examinations by written or printed questions are better than oral; for, in such case, the questions can be put to all; and a comparison of the different answers will be an impartial test of relative attainments” (1845, p. 510). For the first time, written examinations were introduced to ensure equity in education and, significantly, accountability in education was monitored by the State.
Even in the 1800s, the disparity in education between schools was of great concern. Mann wrote about this disparity when he recounted the written examination results for the city of Boston. Thirty-eight percent of the test items were answered incorrectly. The schools were not teaching the standards that the Board and the school committees believed ought to be taught. The names of failing schools and students were published in the hope that this would embarrass the schools and teachers into improving (Jackson, 2009). High stakes testing had begun. The use of these criterion-type written assessments did show concern for inequity in education and the desire of the State to remedy the problem.
By 1918, increased enrollment resulting from compulsory education and immigration, combined with the industrial age focus on efficiency, brought changes to schools in the area of accountability and testing. This emphasis on efficiency ushered in the use of standardized achievement tests on a large scale. In 1893 the first standardized achievement test had been produced by Joseph Mayer Rice for the purpose of comparing progressive teaching methods to traditional methods. Rice’s goal was to use this testing to spur changes that would improve student learning.
By 1929 both the Stanford Achievement Test and the Iowa Program had been developed. These achievement tests were norm-referenced tests. Their results showed whether the student ranked below, at, or above his peers in the wider community. This ‘age of efficiency’ had changed the purpose of testing. Results of the achievement tests were used for accurate educational placement of the student in a graded school. The tests thus had a significant impact on the student. Testing also affected teachers. Student results on achievement tests were used to determine whether the teachers were making efficient use of the tax money that supported the school. Further, testing was now used to compare schools and districts to one another. Lacking the present research on student socioeconomic status and learning, school boards at that time thought the disparity between schools to be simply a matter of less than efficient use of public funds on the part of the teacher and school. Nevertheless, the use of achievement tests did point out the disparity of education from one school to another, just as it had in Mann’s time.
At the same time that standardized achievement tests were being developed, the intelligence test was developed to assess a person’s mental capacity. The emphasis in testing shifted. I.Q. tests were used as a tool to sort students and place them in educational tracks. Those students who lacked aptitude were placed in industrial programs of education in high school. Those students who showed the most aptitude were placed in college preparation programs. These tests were indeed high stakes tests for the student.
Although I.Q. testing, when Conant first proposed it for college entrance testing at Harvard, was meant to assure equity for the poorer student whose schooling had not been as rich as that of the student from an affluent family, it had the opposite effect. I.Q. tests are highly dependent on student vocabulary. Students with a poor schooling background, such as those schooled in poorly funded black schools prior to 1954, students in schools affected by poverty, and students who were not native English speakers, did poorly on I.Q. tests. Nevertheless, tracking in high schools was established on the basis of I.Q., and the ‘sorting’ of students often fell along racial and socioeconomic lines. Testing at this point was not being used to assure equality of education for all students. In his book The Big Test, Nicholas Lemann posits that the SAT, a college entrance test that relies heavily on I.Q., established an elite class in education (Lemann, 1999).
Since accountability for learning and teaching was not the focus, social promotion was common at this time. Students who had not mastered the material at each grade level were promoted with their peers. Later, in high school, these low-performing students would be placed in a lower track suited to the industrial worker. The quest to use education as the “balance wheel of the social machinery” was hindered by I.Q. testing and the disregard for achievement test data.
With the launching of the Soviet satellite Sputnik in 1957, the nation’s goal for education moved away from higher education for a few, as the 1940s and 1950s had envisioned, toward raising the level of education for all students to a point that would surpass the Soviet education system. In 1970 the National Assessment of Educational Progress (NAEP), a norm-referenced test, was developed by a branch of the Federal Department of Education. The goal was to compare the education taking place in each state’s educational system. The movement toward testing in education had begun, but it did not receive a real push until A Nation at Risk was published in 1983. A Nation at Risk suggested that NAEP testing appeared to show not only a disparity in education between the states, but also that our students were functioning below the standards of other nations. School reform was called for. Debate continued and goals were set at both the state and national levels until, in 2002, the No Child Left Behind Act (NCLB) was signed into law. NCLB required high stakes testing in all schools that received federal educational funds.
Historically, the United States has moved toward more standardized testing. From the printed examination questions of the 1830s to the present, the emphasis has been on having an unbiased, fair measure of student learning. At this time our country’s concern for testing focuses on using testing to assure that our students have a base of knowledge, and that this base of knowledge is transferred on an equitable basis without regard to race or socioeconomic status. But can standardized testing be used to improve student learning or assure equity?
In 2002 Martin Carnoy and Susanna Loeb published a study on the question, “Does standardized testing improve student learning?” The focus of their study was math scores as measured by the National Assessment of Educational Progress. Carnoy and Loeb noted a significant positive relationship between the strength of a state’s accountability system and math achievement gains at eighth grade. These findings were adjusted for the exclusion of special needs children. They concluded that states that implemented stronger accountability systems in 1990 saw gains in student performance on the National Assessment of Educational Progress between 1996 and 2000. Researchers E. A. Hanushek and M. E. Raymond (2005) as well as John Bishop (1998) found similar results.
One reason testing appears to lead to greater student learning may be that the standards are laid out more clearly for the teacher and thus for the student. When the student understands the goal, he can move toward attainment. Bandura (as cited in Schunk, 2004, p. 127) posits that “goals enhance learning and performance through their effects on perceptions of progress….” Goals and the knowledge of their attainment could be improving student efficacy.
Carnoy and Loeb offer a different explanation for their results. They noted that in Texas, where scores rose, the state had also increased educational funding and distributed funds more equally. Standards-based testing, a form of criterion-referenced testing, allows the teacher to assess the student’s progress in learning and remediate in the areas that are low. As Carnoy and Loeb pointed out, a test without the funding for remediation is not an effective way of improving student learning. Research suggests that teacher quality is a large factor in improving student learning (Ferguson and Ladd, 1996; Hanushek et al., 2005). Research also shows that teacher effectiveness is improved when teacher development is regular, driven by clear goals, and offered in the buildings in which the teachers work (Supovitz and Christman, 2003; Education Trust, 2005). If the focus is student improvement and not just data collection, funding is needed and teachers must be willing to learn and use new teaching methods.
When standardized tests are used incorrectly they do not improve learning. Diane Ravitch in an interview with Nancy Mitchell for Education News (2010) was quoted as saying,
High stakes testing will not improve student learning. One of the ‘high stakes’ involved is the loss of funding for schools that do not meet the standard. Because the teacher, school, and district are pressured to achieve, the curriculum has been narrowed in many schools (Sacks, 2000). In an effort to improve test scores, some schools have removed subjects that are not tested, such as art, from the curriculum. Improvement in student learning will not happen unless the funds are available to hire more teachers to work with failing students and to pay for teacher retraining. Shaming districts and schools that perform poorly on tests will not improve student learning.
High stakes in some states include grade retention and failure to receive a high school diploma if the standards are not met. There appears to be a link between the dropout rate of low-achieving students and high stakes testing (French, 2003). Student learning is not improved if a student drops out of school. The result of one test should not be used to make an important decision such as retention.
Research has shown an achievement gap between students from varying socioeconomic groups and from different racial groups. Research conducted by David Orlich and Glenn Gifford (2006) showed a high correlation between poverty and scores on the WASL standards-based tests. Several other researchers have found the same correlation with other achievement tests. Orlich and Gifford postulate that children living in poverty have fewer social resources that would support and encourage them. The black achievement gap continues to confound researchers. What we do know is that it exists. We know it because of testing. The goal for educators has been made clearer through testing. We need research to find methods to bridge the achievement gap, and teachers must be willing to use new methods to address learning in all groups of students.
What we cannot do with the achievement gap is make the problem worse by attaching high stakes to standards-based tests. If the stakes attached to testing are grade retention or lack of a diploma, the number of low-income and black students who drop out of high school will increase faster than that of other groups of students. High stakes testing would appear to be responsible for creating an underclass. In 2010, Seattle Public Schools proposed giving pay increases to teachers whose students achieve higher scores on tests. This will most likely place more inexperienced and less qualified teachers in schools with high poverty or large ethnic minority populations. This will not improve student learning.
Presently many states are using criterion-referenced testing to satisfy the mandates set by No Child Left Behind. These tests give each state a measure of students’ mastery of state standards. This is an important part of excellence in education; however, it does not give the nation any assurance that all states have developed standards that will provide excellence in education for all students. The augmented norm-referenced test seems to be a solution to this problem. It combines selected items from a norm-referenced test with items specifically written to assess state content standards. Both norm-referenced and criterion-referenced scores can be obtained from these tests. This newly developed test provides the information needed to diagnose and improve education for each student in the state, and it may also help ensure that education throughout the fifty states remains equitable.
It would seem that testing is not the problem. The focus that has been placed on testing has helped to improve student learning. The problem lies in how we use the results from the test. As Diane Ravitch cautioned, the tests must be used for information and diagnostics. Where we see problems we must research and remediate, not sanction. Testing, when the tests are valid and reliable, is a tool that could move us closer to the balanced, functional society that Mann envisioned in 1845.
References
Bishop, J. (1998). The effect of curriculum-based external exit exam systems on student achievement. Journal of Economic Education, Spring 1998, pp. 171-182.
Carnoy, M. and Loeb, S. (2002). Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis, 24(4), pp. 305-331.
Education Trust. (2005). Gaining traction, gaining ground. Washington, DC.
Ferguson, R., and Ladd, H. (1996). Additional evidence on how and why money matters: A production function analysis of Alabama schools. In H. Ladd (Ed.), Holding schools accountable: Performance-based reform in education. Washington, DC: Brookings Institution Press.
French, D. (2003). A new vision of authentic assessment to overcome the flaws in high stakes testing. Middle School Journal, 35 (1).
Hanushek, E. A., Kain, J. F., and Rivkin, S. G. (2005). Teachers, schools and academic achievement. Econometrica, 73(2), pp. 417-458.
Jackson, G. (2009). Accounting for accountability. Phi Delta Kappan, 90(9). Retrieved from:
Lemann, N. (1999). The big test: The secret history of the American meritocracy. New York: Farrar, Straus and Giroux.
Mann, H. (1868). The life and works of Horace Mann: Annual reports on Education. Boston: Horace B. Fuller.
Mitchell, N. (2010, May 1). Q & A with Diane Ravitch. Education News. Retrieved from:
Orlich, D. and Gifford, G. (2006). Test scores, poverty and ethnicity: The new American dilemma. Phi Delta Kappan, October 2006. Retrieved from: testing_poverty_ethnicity.pdf
Sacks, P. (2000). Standardized minds: The high price of America’s testing culture and what we can do to change it. Cambridge, MA: Perseus Publishing.
Schunk, D. (2004). Learning theories: An educational perspective. New Jersey: Pearson Education, Inc.
Supovitz, J., and Christman, J. (2003). Mapping a course for improved student learning. Philadelphia: Consortium for Policy Research in Education.