AN OBJECTIVE STUDY OF OREGON STUDENT ACHIEVEMENT – ACHIEVEMENT GAPS – ACCOUNTABILITY – NCLB – COMPETITIVENESS INITIATIVE

Dr. Francis Charbonnier, 6/14/06

EXECUTIVE SUMMARY

The central goal of K-12 education is to ensure that every student master the cognitive skills and enhance the non-cognitive skills necessary to prepare for successful post-secondary and adult activities.

Schools have not devised a comprehensive set of assessments to measure overall student achievement. However, we have state assessments in core subjects (language arts, math, and science) which provide quantitative measures of a student’s performance as he progresses from grade 3 to grade 10 (and should be extended to every grade from grade 1 to grade 11).

I. Student Achievement

In this paper I analyze the Oregon assessment results (RIT scores) in reading and math at all grades tested (grades 3-8 and grade 10), first for all students, then for the major ethnic subgroups.

All student groups show a distribution of RIT scores which closely approximates a “normal distribution” characterized by the mean m and the standard deviation sd. We can also define the “range” as the difference between the RIT scores of students at the 5th and 95th percentile. In a normal distribution the range equals 3.3 sd.

A. All students: the range varies from 30 to 40 for all grades and both subjects. The performance standards are not aligned with actual student performance. This misalignment must be understood and reduced.

B. Ethnic subgroups: the ranges of the various distributions are essentially the same as for all students. The differences between subgroups, measured by comparing the means, are fairly small (1-7 points) and are essentially the same from grade 3 to grade 10. There is great overlap, e.g. top Latinos (95th percentile) score above 80% of Whites, while White low performers (5th percentile) score below 90% of Latinos. The mean RIT scores at grade 3 are ranked Asian American, White, American Indian, African American, and Latino. That order never changes from grade 3 to grade 10.

Plausible conclusions:

  1. Race is not a determinant of individual student achievement.
  1. Since the achievement gaps are present at the earliest grade tested, and remain unchanged to grade 10, it appears that achievement gaps stem primarily from differences in average socio-economic and cultural characteristics of the various subgroups. Schools do not create these achievement gaps, and they are mostly unable to close the gaps.

II. Accountability

We then consider accountability. The performance of an individual student is determined first by the student, second by his family, and third by the school. Praise for a high achieving student should be shared between the student, his family, and the school. Conversely, all three are accountable if the student underperforms, and all three must work on remediation, again with the student being the most important.

III. Domestic Achievement Gaps

  1. NCLB

NCLB is inspired by noble goals of social equity and justice. However, there are some serious problems with the implementation:

  1. The goal that all students be proficient by 2014 is utopian and unattainable.
  1. The formula for determining AYP deliberately ignores the progress of the majority of students, and considers only the percentage of failing students in a number of specified subgroups. Marginal failure in one of 30-60 tested areas results in the entire school being declared a failure and subjected to punitive and largely ineffective sanctions.
  1. Since the AYP formula requires the schools to be on a predetermined path to the unattainable 100% goal in 2014, as we get closer to that date the number of failing schools will steadily increase until the entire system collapses.
  1. Only the schools are accountable for closing achievement gaps that they did not create and are powerless to close without full support and commitment of students, families, and communities.
  1. Failing students in low performing subgroups trigger AYP determinations and corrective actions. Yet, in Oregon grade 10 math for example, 74% of all failing students are White and 16% are Latinos, while only 3-4% are American Indian, African American or Asian American. NCLB should be equally concerned with every failing student, regardless of color. Unless drastic changes are made, I fear disaster for NCLB and/or public schools as we approach 2014.

  1. Gender Gap

Recent studies show a growing gender gap, i.e. girls outperform boys both in K-12 and in university. The strong performance of girls is very positive for our country’s future. But the relative underperformance of boys has serious social and economic consequences. We urgently need to address the problem and to develop school strategies which engage more boys and lead to higher performance and postsecondary goals.

IV. International Achievement Gap – Competitiveness Initiative (CI)

While all our attention was focused on failing students in underperforming subgroups, the achievement gap between American and foreign (European and Asian) high schoolers has been growing. For math and science this is well documented by TIMSS (grade 8 math and science in 45 countries) and PISA (15-year-olds in math in 26 countries). One goal of CI is to strengthen instruction in math and science at high schools. Progress on CI could be measured by the percentage of all students (and subgroups if desired) who exceed the performance standard and take AP or IB classes with exams. Considering the data for Oregon grade 10 students in math, we find that the percentage who do not meet the standard ranges from 41% for Asian Americans to 78-79% for African Americans and Latinos. Conversely, the percentage exceeding the standard ranges from 25% for Asian Americans to 2-3% for African Americans and Latinos. When high achieving high schoolers complete university and become high performing adults and parents, they help their kids become high achievers. This may explain the persistence of racial achievement gaps (positive for Asian Americans, negative for the others) in successive generations.

INTRODUCTION

The central goal of K-12 education is to ensure that every student completes high school, earns a diploma by demonstrating proficiency in a diversified and rigorous set of courses, and is prepared and qualified for successful post-secondary and adult activities. We also recognize that success and fulfillment in adult life requires not only academic proficiency but also good character, positive attitudes, values, strong lifelong goals, critical thinking and decision making, work ethic, social responsibility, respect for the law and citizenship, i.e. allegiance to a democratic and participative form of government.

We have developed curriculum content standards, state assessments, and performance standards at seven grades: grade 3 to grade 8 and grade 10. These assessments enable us to monitor and measure some elements of a student’s performance, progress and achievement as the student proceeds from kindergarten to high school graduation.

We have not devised a comprehensive array of assessments which evaluate all the characteristics listed above. This is a serious shortcoming as many studies suggest that character, values, strong goals and high personal standards are very strong determinants of ultimate success in adulthood. Our assessments measure only academic achievement.

The best measure of a graduating student’s academic achievement is his transcript which measures GPA, course intensity, areas of relative strength and weakness, and considers a broad array of subjects. The GPA has the great merit of being an associative measure of student performance, where weakness in some areas can be compensated by great strength in other areas.

Oregon’s state assessments give us a measure of student performance from grade 3 to grade 10. Initially, the Oregon Education Act planned to perform assessments in all major subjects, including social sciences, the arts, and foreign languages. Unfortunately, we gave up on that desirable goal and now perform state assessments only in reading, writing, math multiple choice, math problem solving, and science, which aligns state assessments with the requirements of NCLB.

Whereas the five assessments focus on essential foundation skills, we must always consider that these five tests measure only a part of a student’s academic achievement (10 out of 24 credits required for graduation), which itself is only a part of a student’s overall achievement and preparedness for post-secondary success. So we should refrain from sweeping conclusions on student achievement which are based on very limited tests of academic achievement. Consider for instance, that Van Gogh or even Einstein did not perform well in high school, yet their contributions to civilization were outstanding. In addition, we only have one assessment in high school, and students learn at different paces. If we compare two students, or two groups, and find a difference in achievement in one subject, there could be a much smaller or even opposite achievement gap in other academic or non-academic characteristics which are not measured.

My purpose here is to consider a key component of the Oregon state assessments, specifically RIT scores in reading and math (multiple choice), do an objective analysis of the data, and see what logical conclusions can be drawn, particularly relating to accountability. A discussion of achievement gaps for the subgroups specified by NCLB follows. Other achievement gaps deserve more study. For instance, the growing gender gap which we observe in K-12 (and also in college and adult life) should be a matter of national concern.

I believe that the achievement gap which most threatens America’s future competitiveness is the gap between American and foreign (Asian, European) teenagers. President Bush announced in his 2006 State of the Union Address a proposal to create the “Competitiveness Initiative” and funded it in his proposed 2006-07 budget. This will be discussed in the last section.

I. OREGON STUDENT ACHIEVEMENT BASED ON STATE ASSESSMENTS IN GRADES 3-10

A. All Students

We now measure annually the RIT scores of all Oregon students in grades 3-8 and 10. As expected, there is a wide distribution of individual scores. Statisticians have known for decades that, when you measure a variable (here RIT score) which depends on many uncorrelated factors, the score distribution will closely approximate the normal distribution (aka Gaussian distribution, or bell curve). This is the case for Oregon RIT score distributions. A normal distribution (shown in Figure 1) is fully characterized by two parameters: the mean m (average) and the standard deviation sd. As long as m is much greater than sd, the bell-shaped curve is symmetrical and the mean and median are identical. In Figure 1 the variable x represents individual RIT scores. The function P(x) represents the relative number of students having a RIT score equal to x, and the integral I(x) represents the percentage of all students who have RIT scores below x. For example, 68% of students have RIT scores between m-sd and m+sd, and about 95% of students have RIT scores between m-2sd and m+2sd.
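The coverage figures quoted for the normal distribution can be checked with a few lines of Python. This is only an illustrative sketch: the function name `share_within` is hypothetical, and the calculation assumes an exact normal distribution, which the RIT data only approximate.

```python
import math


def share_within(k: float) -> float:
    """Fraction of a normal distribution lying within k sd of the mean m."""
    # For a normal variable, P(|x - m| <= k*sd) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    print(f"between m-{k}sd and m+{k}sd: {share_within(k):.1%}")
```

The two values come out at roughly 68.3% and 95.4%, in line with the rounded percentages used in the text.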

The Oregon Department of Education (ODE) gave me the mean and standard deviation for all Oregon students in 2004-05 for reading and for math multiple choice, shown in Table I.

Table I: Mean m and standard deviation sd for RIT scores of all Oregon students, math and reading, in 2004-05

Subject / Statistic / Gr 3 / Gr 4 / Gr 5 / Gr 6 / Gr 7 / Gr 8 / Gr 10 / Avg
Math / m / 210.86 / 216.89 / 223.45 / 226.16 / 231.47 / 234.61 / 236.29 / -
Math / sd / 9.90 / 9.94 / 10.71 / 11.39 / 10.65 / 12.22 / 10.78 / 10.8
Math / 100 sd/m / 4.7 / 4.6 / 4.8 / 5.0 / 4.6 / 5.2 / 4.6 / 4.8
Reading / m / 212.70 / 218.86 / 222.47 / 226.87 / 230.40 / 232.56 / 237.20 / -
Reading / sd / 12.0 / 11.13 / 10.09 / 10.51 / 10.15 / 9.78 / 9.78 / 10.5
Reading / 100 sd/m / 5.6 / 5.1 / 4.5 / 4.6 / 4.4 / 4.2 / 4.1 / 4.6

Based on the normal distribution (Figure 1), the students at the 5th percentile and the 25th percentile have RIT scores of approximately m-1.65sd and m-0.65sd. Students at the 75th and 95th percentiles have RIT scores of approximately m+0.65sd and m+1.65sd. Figure 2 shows the results for all Oregon students in math, and Figure 3 for all students in reading. I have added the RIT scores required by ODE to just meet, or to (substantially) exceed, the performance standards. Since the actual test score distribution is approximately, but not exactly, a normal distribution, the calculated scores for the 5th to 95th percentile students may not be exactly correct, but they are very close.
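As a rough check on the percentile arithmetic, the sketch below recomputes the 5th, 25th, 75th and 95th percentile scores for math from the Table I means and standard deviations. It assumes an exactly normal distribution. The function names are hypothetical, and Python's `statistics.NormalDist` uses the exact multipliers (about 1.645 and 0.674) rather than the rounded 1.65 and 0.65, so results may differ from the figures by a fraction of a point.

```python
from statistics import NormalDist

# Math means and standard deviations copied from Table I.
GRADES = ["Gr 3", "Gr 4", "Gr 5", "Gr 6", "Gr 7", "Gr 8", "Gr 10"]
MATH_MEAN = [210.86, 216.89, 223.45, 226.16, 231.47, 234.61, 236.29]
MATH_SD = [9.90, 9.94, 10.71, 11.39, 10.65, 12.22, 10.78]
PERCENTILES = (0.05, 0.25, 0.75, 0.95)


def percentile_scores(mean: float, sd: float) -> dict:
    """RIT score at each percentile, assuming a normal distribution."""
    dist = NormalDist(mean, sd)
    return {p: dist.inv_cdf(p) for p in PERCENTILES}


def score_range(mean: float, sd: float) -> float:
    """The 'range': 95th percentile score minus 5th percentile score."""
    scores = percentile_scores(mean, sd)
    return scores[0.95] - scores[0.05]


for grade, m, sd in zip(GRADES, MATH_MEAN, MATH_SD):
    scores = percentile_scores(m, sd)
    row = " / ".join(f"{scores[p]:.1f}" for p in PERCENTILES)
    print(f"{grade}: {row} / range {score_range(m, sd):.1f}")
```

Because the range is always about 3.3 sd, it depends only on the standard deviation and not on the mean.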

Figures 2 and 3 suggest the following observations:

  1. Test results are very similar for math and reading.
  1. The RIT score distributions are fairly wide and remain fairly constant from grade 3 to grade 10. If we use the difference between the RIT scores at the 95th and 5th percentiles, that difference (range) is between 32 and 39 for math at all grades. For reading, it decreases from a maximum of 38 at grade 3 to a minimum of 31 at grades 8 and 10. The distributions may be wide because “all students” includes several subgroups with different achievement (more on this later).
  1. In elementary school, 15-20% of students do not meet the standard, and more than 25% exceed the standard. Progress is well below expectation in the transition from elementary to middle school. Unfortunately, in high school we have assessments only at grade 10. Assessments at grades 9 and 11 are very desirable and would give us a much better picture of high school student performance. The biggest problem shown by Figures 2 and 3 is that progress from grade 7 to grade 10 is about 6 points, while our performance standards (meet or exceed) expect 13 points, twice as much, and this low progress is observed for all students (from 5th to 95th percentile, i.e. from low achieving to TAG).
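The observed grade 7 to grade 10 progress can be recomputed directly from the Table I means. This is a minimal sketch: the 13-point expected gain is simply the figure stated in the text, and the names `MEANS`, `EXPECTED_GAIN` and `observed_gain` are illustrative.

```python
# Mean RIT scores copied from Table I.
MEANS = {
    "math": {"Gr 7": 231.47, "Gr 10": 236.29},
    "reading": {"Gr 7": 230.40, "Gr 10": 237.20},
}
# Gain implied by the performance standards, as stated in the text.
EXPECTED_GAIN = 13


def observed_gain(subject: str) -> float:
    """Increase in the mean RIT score from grade 7 to grade 10."""
    return MEANS[subject]["Gr 10"] - MEANS[subject]["Gr 7"]


for subject in MEANS:
    gain = observed_gain(subject)
    print(f"{subject}: observed {gain:.2f}, expected {EXPECTED_GAIN}, "
          f"shortfall {EXPECTED_GAIN - gain:.2f}")
```

The gains are roughly 4.8 points in math and 6.8 points in reading, which average out to the "about 6 points" cited.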

As a result, at grade 10 the mean falls below the meet level in both reading and math, which is not acceptable. Now that NCLB dominates K-12 education, this is a very serious problem which causes all Oregon high schools to fail AYP, unless they are small enough to use the statistical exemption for all underperforming subgroups. Many people (wrongly in my view) interpret the steady decrease in mean minus meet from grade 3 to grade 10 as evidence that schools receive bright youngsters in grade 1 and fail to educate them. I personally believe that the educators who set the performance standards had unrealistic expectations of student progress in the mastery of core academic subjects. An urgent task for ODE is to re-examine our performance standards at grades 3 to 10 (or better, 1 to 11) and to achieve a closer alignment with actual student performance. This alignment should not be achieved solely by lowering the performance standards at grades 8-11 (which probably would not be accepted by NCLB).

In the meantime the first priority of schools should be to raise student achievement at all percentile levels (which would not necessarily reduce the width of the RIT score distributions).

B. Ethnic Subgroups – Achievement Subgroups

Because the results are so similar for reading and math, the rest of this discussion will focus on math to avoid needless repetition.

Whereas the previous section considered all Oregon students, ODE has disaggregated the data for various subgroups, particularly the five major ethnic subgroups shown in Table II. In this study I use the better term “Latino” instead of “Hispanic.”

Table II – m/sd of Major Ethnic Subgroups in Math, 2004-05 School Year

Subgroup / Gr 3 / Gr 4 / Gr 5 / Gr 6 / Gr 7 / Gr 8 / Gr 10 / Avg sd
All students / 211/9.9 / 217/9.9 / 223.5/10.7 / 226/11.4 / 231.5/10.7 / 234.6/12.2 / 236.3/10.8 / 10.8
African American / 207/9.4 / 214/8.9 / 220/9.8 / 221/9.7 / 226/9.0 / 228.6/11.0 / 230.5/9.8 / 9.7
Latino / 206/9.4 / 212/8.6 / 218/9.4 / 220/9.2 / 225/9.1 / 228/10.6 / 230.3/9.4 / 9.4
American Indian / 209/8.8 / 214.5/9.2 / 220/9.6 / 223/9.5 / 228/9.1 / 230.5/10.9 / 233/10.1 / 9.6
Asian American / 214/11.0 / 220/11.3 / 228/12.1 / 231/13.0 / 236/12.0 / 240/13.1 / 240/11.7 / 12.0
White / 212/9.6 / 218/9.7 / 224.5/10.5 / 227/11.2 / 233/10.4 / 236/12.0 / 237/10.6 / 10.6

Every mathematician will tell you that, if you want to use a single number to represent and compare distributions, the best choice is the mean. Therefore Figure 4 compares the mean RIT scores for all subgroups at grades 3 to 10. Before a detailed discussion, consider also Figure 5, which compares the distributions for math at grade 5 and grade 10. Figure 4 shows that the achievement gaps, measured by the difference between the mean for all students and the means for the subgroups, are relatively small (1 to 6.5 points) and are fairly constant from grade 3 to grade 10. Indeed, the mean RIT scores are ranked Asian American, White, American Indian, African American, and Latino in grade 3, and this order never changes from grade 3 to grade 10, even though at grade 10 the Latinos essentially catch up with the African Americans. If we had simple math and reading assessments for kids entering kindergarten or first grade, Figure 4 leads us to expect that similar achievement gaps and the same subgroup order would be found at grade 1. In support, I cite the study by Rothstein in his important book, “Class and Schools – Using Social, Economic, and Educational Reform to Close the Black-White Achievement Gap,” which I believe is required reading for anyone wanting to form an informed, objective opinion on this complex subject. Figure 6, from Rothstein, shows that most of the gap in preparedness between black and white children entering kindergarten is not due to race but to differences in socio-economic characteristics.

We must conclude that: 1) the achievement gaps of students entering school stem from the socio-economic and cultural characteristics of the student’s family and environment, and 2) the persistence of these gaps throughout K-12 is due to the continuing impact of these socio-economic and cultural factors. Schools can raise the overall achievement of all their students through good teachers and effective teaching strategies, but they are not likely to reduce ethnic achievement gaps, or improve school climate (discipline gaps), without the active involvement and support of all parents and the entire community.
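The size and stability of the gaps can be recomputed from the means in Table II. The sketch below is illustrative only: the names `gaps` and `order_is_stable` are hypothetical, and the means are the rounded values from the table, so small differences from the figures are possible.

```python
# Mean math RIT scores copied from Table II (grades 3, 4, 5, 6, 7, 8, 10).
ALL_STUDENTS = [211, 217, 223.5, 226, 231.5, 234.6, 236.3]
# Subgroups are listed in their grade 3 rank order, highest mean first.
SUBGROUP_MEANS = {
    "Asian American": [214, 220, 228, 231, 236, 240, 240],
    "White": [212, 218, 224.5, 227, 233, 236, 237],
    "American Indian": [209, 214.5, 220, 223, 228, 230.5, 233],
    "African American": [207, 214, 220, 221, 226, 228.6, 230.5],
    "Latino": [206, 212, 218, 220, 225, 228, 230.3],
}


def gaps(subgroup: str) -> list:
    """Subgroup mean minus all-students mean, grade by grade."""
    return [round(s - a, 1)
            for s, a in zip(SUBGROUP_MEANS[subgroup], ALL_STUDENTS)]


def order_is_stable() -> bool:
    """True if no grade reverses the grade 3 ranking (ties are allowed)."""
    names = list(SUBGROUP_MEANS)
    for i in range(len(ALL_STUDENTS)):
        column = [SUBGROUP_MEANS[name][i] for name in names]
        if any(higher < lower for higher, lower in zip(column, column[1:])):
            return False
    return True


for name in SUBGROUP_MEANS:
    print(f"{name}: {gaps(name)}")
print("ranking never reversed:", order_is_stable())
```

With the rounded table values, the only tie is at grade 5, where the American Indian and African American means are both 220.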

Figure 5 compares the subgroup distributions at grade 5 and grade 10 for the year 2004-05. Consider grade 5 first. The width of the distributions for subgroups, measured by the 95th percentile score minus the 5th percentile score, is about the same for all subgroups as it is for all students. Hence, the wide distribution for all students is not due to the mixing of narrow subgroup distributions with different mean scores but is due to the wide score distribution present in every subgroup. These wide distributions mean that the average achievement gap (1 to 7) is very small compared to the internal score spread within each group (30 to 38). Consequently, there is considerable overlap between the subgroups. For instance, high achieving (95th percentile) African Americans or Latinos score higher than 80% of all students. Conversely, low achieving (5th percentile) Whites or Asian Americans score lower than 92% of all students. We must also keep in mind that while factors like race, poverty, dysfunctional families, etc. have a measurable effect on average subgroup achievement, no factor or even combination of factors is an absolute predictor or determinant of individual student achievement. Similarly, average socio-economic or cultural characteristics for a subgroup do not determine the characteristics of individual families and their effect on student achievement. Study of grade 10 in Figure 5 leads to the same conclusion as grade 5, suggesting that these conclusions apply throughout K-12.
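The overlap claims can be checked approximately from the grade 5 values in Table II. The sketch below treats each subgroup and the all-students group as exactly normal, which is an assumption. The function name `standing_among_all` is illustrative, and the results should be read as rough estimates.

```python
from statistics import NormalDist

# Grade 5 math mean and sd copied from Table II.
ALL_GR5 = NormalDist(223.5, 10.7)
SUBGROUPS_GR5 = {
    "African American": NormalDist(220, 9.8),
    "Latino": NormalDist(218, 9.4),
    "Asian American": NormalDist(228, 12.1),
    "White": NormalDist(224.5, 10.5),
}


def standing_among_all(subgroup: str, percentile: float) -> float:
    """Share of all students scoring below a subgroup member who sits
    at the given percentile of his or her own subgroup."""
    score = SUBGROUPS_GR5[subgroup].inv_cdf(percentile)
    return ALL_GR5.cdf(score)


for name in ("African American", "Latino"):
    share = standing_among_all(name, 0.95)
    print(f"95th percentile {name}: above {share:.0%} of all students")
for name in ("White", "Asian American"):
    share = standing_among_all(name, 0.05)
    print(f"5th percentile {name}: below {1 - share:.0%} of all students")
```

Under these assumptions the high achievers in the two lowest-ranked subgroups land above roughly 80 to 90% of all students, and the low achievers in the two highest-ranked subgroups land below more than 90% of all students, consistent with the overlap described.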