Michigan Department of Education

Growth Model Pilot Application

for Adequate Yearly Progress Determinations under the No Child Left Behind Act

Submitted to the U.S. Department of Education

May 2008

Introduction

Michigan’s assessment system includes the following components:

  • The general assessment for grades 3-8 is the Michigan Educational Assessment Program (MEAP);
  • The high school general assessment, administered to all students in grade 11, is the Michigan Merit Examination (MME);
  • The alternate assessments for students with disabilities, collectively named MI-Access, include three assessments: Functional Independence, Supported Independence, and Participation. These alternate assessments are designed for students with mild, moderate, and severe cognitive impairment, respectively. Students who take the alternate assessments are those with the most significant cognitive disabilities.

Michigan has a long history of support for the assessment program and for instructional use of assessment data. The MEAP and MI-Access were expanded to grades 3-8 beginning in 2005-06. With assessment data for the 2007-08 school year, Michigan now has assessment data for three school years at adjacent grades in both English language arts and in mathematics.

In addition, Michigan has a student data system, the Single Record Student Database (SRSD), which has been used for allocation of State School Aid and for all pupil accounting and student data reporting since 2002-03. Michigan’s system uses a Unique Identification Code (UIC) to track student enrollment between SRSD submissions; the UIC tracks students independently of the student’s name. The use of UICs, supported by a rigorous and highly accurate UIC resolution process, ensures a high degree of reliability in matching scores between assessment administrations. This reliability is important because Michigan’s proposed growth model includes matched data on the general assessment (MEAP) and on the alternate assessments, the MI-Access Functional Independence (FI), Supported Independence (SI), and Participation (P) assessments.
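To illustrate the role of the UIC in principle, the following Python sketch pairs two years of assessment records on a UIC field. The record layout and function name are hypothetical illustrations, not the actual SRSD specification.

    # Hypothetical sketch of UIC-based score matching between two
    # assessment administrations. Field names are illustrative only,
    # not the actual SRSD record layout.

    def match_scores_by_uic(records_2005, records_2006):
        """Pair each student's fall 2005 record with the same
        student's fall 2006 record using the UIC."""
        by_uic_2006 = {r["uic"]: r for r in records_2006}
        matched = []
        for r2005 in records_2005:
            r2006 = by_uic_2006.get(r2005["uic"])
            if r2006 is not None:
                matched.append((r2005, r2006))
        return matched

Because the join key is the UIC rather than the student's name, name changes between administrations do not break the match.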

The high school level assessments are excluded from Michigan’s proposed growth model. The rationale behind this decision is that the high school measurement (at grade 11) is too far removed from the previous measurement occasion (grade 8) to provide useful growth data. Michigan’s proposed growth model also does not include data from the alternate assessments for students with severe and moderate cognitive impairment. Those assessments employ considerably different psychometric and scaling methods and, as a result, do not have the degree of precision present in the other assessments included in the growth model proposal.

Michigan’s Rationale for Using a Growth Model

Michigan has developed a growth model for reporting student achievement in grades 3-8 to be used for its state school accreditation program. Michigan proposes to adapt the state school accreditation model for use in determining whether schools and school districts are making adequate yearly progress (AYP) under the No Child Left Behind Act of 2001 (NCLB). Use of the proposed model will begin with the 2007-08 school year and be carried forward. If adopted, the proposed growth model will add to the current status and safe harbor system that is used under Section 1111 of the Elementary and Secondary Education Act, as amended by NCLB. The Michigan Department of Education (MDE) is prepared to cooperate fully with the United States Department of Education (USED) in evaluating the growth model.

Michigan educators have expressed frustration with the assessment information that forms the foundation of the AYP decision, because the current decision is based on data that classify a student as proficient or not proficient at a single point in time (i.e., classification based on status). Teachers often work with low-scoring students and improve individual students’ achievement, but despite considerable gains, those students may not make it all the way to proficient. Unfortunately, status models alone do not allow such improvement, which may be attributable to teacher intervention, to be tracked in the current system. Michigan’s Growth Model would give credit in the AYP decision for year-to-year growth by demonstrating that the improvement in a student’s achievement is on a trajectory such that the student is expected to attain proficiency within the next three years.

Michigan is in a unique position to participate in the U.S. Department of Education’s Growth Model Pilot because:

  • Michigan meets the U.S. Department of Education's New Equation for NCLB Flexibility;
  • Michigan has all the essential elements in place to implement a Growth Model for the 2007-08 school year; and
  • Michigan is already reporting growth data (currently referred to as performance level change) and is prepared to use growth data for AYP determinations for school year 2007-08.

The baseline for Michigan’s Growth Model is the 2006-07 school year. Michigan reported preliminary growth data to schools in 2006-07, using 2005-06 as a baseline. This initial reporting was critical to helping Michigan educators understand the nature of the growth data and to developing standards for reporting growth data. Additionally, those data informed the decisions made in formulating the proposed model.

Match Rates

Michigan’s Single Record Student Database forms the foundation of the growth model. Because of Michigan’s racial/ethnic and socioeconomic diversity, it is important that the match rate of student scores be sufficiently high for every measured AYP subgroup. The match rates presented in the following table count only student scores with matched data meeting the following criteria:

  • The student had valid scores in the same content area in both fall 2005 and fall 2006;
  • The student had valid scores in the same assessment, MEAP or the same MI-Access assessment (FI, SI, or P), for both years;
  • The student’s scores were based on assessments administered in grade x in fall 2005 and in grade x + 1 in fall 2006.

Some student scores are therefore excluded from growth reporting if they fall into any one of the following categories:

  • The student moved into or out of Michigan public schools between the fall 2005 and fall 2006 tests;
  • The student missed school during the testing window in either 2005 or 2006;
  • The student did not have a score in 2005 or 2006 for some other reason;
  • The student took one assessment (e.g., MEAP, MI-Access FI, MI-Access SI, or MI-Access P) one year and a different assessment the next;
  • The student was retained in grade; or
  • The student was promoted more than one grade.

In addition, Michigan’s growth model proposal would exclude any students from the growth model portion of the AYP calculations if they are excluded from the achievement status calculations based on Full Academic Year (FAY) status.
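As a minimal illustration of how the matching criteria above might be expressed, the following sketch checks a pair of already-UIC-matched records against each criterion. The field names are assumptions for illustration; the actual business rules reside in Michigan's data systems, and the FAY exclusion is applied separately in the AYP calculations.

    # Hypothetical check of the growth-model matching criteria.
    # Field names are illustrative, not the actual SRSD layout.

    def meets_match_criteria(r2005, r2006):
        """Return True if a UIC-matched pair of records qualifies
        for growth reporting under the stated criteria."""
        return (
            r2005["valid_score"] and r2006["valid_score"]       # valid scores both years
            and r2005["content_area"] == r2006["content_area"]  # same content area
            and r2005["assessment"] == r2006["assessment"]      # same assessment (MEAP or same MI-Access)
            and r2006["grade"] == r2005["grade"] + 1            # grade x in 2005, grade x + 1 in 2006
        )

The grade check also excludes students retained in grade or promoted more than one grade, since either condition breaks the grade x to grade x + 1 pattern.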

Data for student score match rates are presented in Table 1, based on matches meeting the above criteria from fall 2005 to fall 2006.

Table 1. Student Score Match Rates.

Michigan Growth Model Score Matching, Fall 2005 to Fall 2006

Student Group / English Language Arts: Number Tested / Number Matched / Percent Matched / Mathematics: Number Tested / Number Matched / Percent Matched
All Students / 629,998 / 573,796 / 92.9% / 628,274 / 580,654 / 93.8%
American Indian / 6,135 / 5,404 / 89.4% / 6,104 / 5,423 / 90.1%
Asian-American / 15,507 / 13,282 / 86.5% / 15,551 / 13,620 / 88.5%
Black / 128,715 / 109,289 / 86.6% / 129,463 / 110,513 / 87.2%
Hispanic / 28,281 / 24,354 / 87.3% / 28,347 / 24,748 / 88.5%
White / 445,071 / 418,220 / 95.2% / 445,523 / 419,847 / 95.4%
Limited English Proficient / 20,302 / 16,501 / 82.7% / 20,089 / 17,780 / 89.1%
Students With Disabilities / 85,975 / 71,787 / 92.3% / 86,335 / 72,214 / 92.7%
Economically Disadvantaged / 235,897 / 217,907 / 94.1% / 236,715 / 215,143 / 94.8%

The data presented in Table 1 show very high levels of matching in both content areas. The match rates are slightly higher in mathematics than in English language arts, but the difference between the content areas is less than one percentage point. Also of note are the relatively high match rates obtained in the critical AYP subgroups. The following observations can be made in analyzing the match rates:

  • The overall match rate is above 92% in both English language arts and in mathematics;
  • The match rates for the American Indian, Asian-American, Black, and Hispanic student groups are lower than the overall match rate;
  • The lowest match rate among racial-ethnic groups is 86.5% among Asian-Americans in English language arts;
  • The match rates for students with disabilities and for economically disadvantaged students are above 92% in both content areas; and
  • The match rates for limited English proficient students are lower in English language arts (ELA) than in mathematics.

Michigan is investigating the lower match rates for limited English proficient students. The gap may be due to student mobility across state and national lines among migrant or otherwise mobile students. It may also be attributable to the flexibility that allows LEP students not to be assessed in ELA during their first year in the country; these students are excused from the general ELA assessment if they have taken an approved English Language Proficiency Assessment.

Michigan has developed a reporting system that provides schools and school districts with access to growth data at several levels of assessment reporting, including individual student reports, parent reports, school and district summary reports, and student data files. School and district assessment administrators have access to both baseline-year and current-year data for all students. The Michigan Department of Education has alerted local school personnel to review reports to ensure appropriate score matching from year to year. Despite the robust nature of the student tracking system, there will be a very small number of student scores that are matched incorrectly, or for which a match that should have taken place was not made. MDE is prepared to investigate these situations and to correct data as necessary. With these procedural safeguards in place, the Michigan Department of Education believes that the high match rates demonstrate a high degree of accuracy in the student tracking system.

Michigan’s Growth Model

Michigan’s system is a comprehensive model of alignment that provides a foundation for meaningful reporting of students’ academic progress over time. The model includes both horizontal and vertical alignment as integral parts of the development of content standards, test blueprints, items, item pools, instruments, performance level descriptors, and performance standards. Comprehensively integrating alignment into these development processes results in procedural efficiencies and gains in validity evidence for the measurement of student progress, with the caution that alignment work should not come at the expense of a focus on content. All parts of the comprehensive alignment model have been or soon will be implemented in Michigan’s assessment program. The processes used for comprehensive alignment in Michigan are described fully in Martineau, Paek, Keene, & Hirsch (2007), a copy of which is attached to this proposal. In addition, the full set of considerations Michigan deliberated in designing its proposed growth model is presented in Attachment A. A summary of those goals is given here:

1. Implement a system that captures significant differences in student progress while minimizing the effect of measurement error on the evaluation of student progress.

2. Implement a system that sets rigorous, attainable expectations for student progress and that, when those expectations are met, identifies students whose score changes put them on track to reach proficiency within the next three years.

3. Integrate MEAP (general assessment) and MI-Access (alternate assessment) scores into a single system that reports growth for all students.

Very briefly, the Growth Model was developed by dividing each of the MEAP performance levels (not proficient, partially proficient, proficient, and advanced) into three sub-levels (low, middle, and high) and tracking students’ transitions from one year to the next (e.g., from the middle of the not proficient category in grade 3 to the top of the partially proficient category the next year in grade 4). The tracking mechanism is called a transition value table. A parallel task was carried out for MI-Access. Because the MI-Access Functional Independence assessment is shorter (and therefore has less precision), it is divided into fewer performance levels and sub-levels. The bottom and top performance levels (emerging and surpassed) were each divided into three sub-levels (low, mid, and high), while the narrower middle performance level (attained) was divided into only two sub-levels (low and high).

Because the MI-Access Supported Independence and Participation assessments are shorter still, the performance levels on those assessments are not further subdivided. In addition, because there is only one non-proficient category (Emerging) on these assessments, any upward move qualifies a student both as proficient in the second year and as on trajectory toward proficiency.
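The sub-level structures described above can be encoded compactly. The sketch below lists each assessment's sub-levels in increasing order of performance and computes a student's transition as the change in sub-level index between years; this encoding is an illustration of the structure, not the official transition value table.

    # Illustrative encoding of the sub-level structures described above.
    # The position of a sub-level in each list serves as its rank.

    MEAP_SUBLEVELS = [
        level + "-" + sub
        for level in ("not proficient", "partially proficient",
                      "proficient", "advanced")
        for sub in ("low", "middle", "high")   # 4 levels x 3 sub-levels = 12
    ]

    MI_ACCESS_FI_SUBLEVELS = (
        ["emerging-" + s for s in ("low", "mid", "high")]     # bottom level: 3 sub-levels
        + ["attained-" + s for s in ("low", "high")]          # narrower middle level: 2 sub-levels
        + ["surpassed-" + s for s in ("low", "mid", "high")]  # top level: 3 sub-levels
    )

    def sublevel_change(sublevels, year1, year2):
        """Number of sub-levels moved between years (positive = upward)."""
        return sublevels.index(year2) - sublevels.index(year1)

    # Example from the text: moving from the middle of not proficient in
    # grade 3 to the top of partially proficient in grade 4 is a gain of
    # four sub-levels.
    assert sublevel_change(MEAP_SUBLEVELS,
                           "not proficient-middle",
                           "partially proficient-high") == 4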

The performance levels were divided into sub-levels in the following manner. First, the conditional standard errors of measurement (CSEM) were graphed for each subject and grade level. For example, for grade 3 mathematics, Figure 1 was produced.

Figure 1. Conditional Standard Error of Measurement (CSEM).

The cut scores were then superimposed on the graph, identifying the scale score range of each performance level as shown in Figure 2.

Figure 2. CSEM with Performance Level Cut Scores Superimposed.

Each performance level was then divided into sub-levels (three per level for MEAP; two or three per level for MI-Access Functional Independence, as described above), as demonstrated in Figure 3, in conjunction with a check of the appropriateness of the sub-level widths as described below. These two simultaneous procedures are presented separately here for clarity.

Figure 3. Performance Levels Divided into Sub-Levels.

The check for the appropriateness of the widths was conducted as follows: the widths of the sub-levels were superimposed on the graph to confirm that each sub-level was at least as wide as the standard error of measurement across that sub-level. This ensured that the sub-levels were small enough to capture significant student growth within a performance level, but large enough that movement across categories is unlikely to be attributable to measurement error. In addition, the consideration of measurement error in the creation of sub-levels, along with the extra-wide sub-levels at the extremes, minimizes the effect that regression to the mean can have on the transitions identified for individual students. Figure 4 demonstrates this check.

Figure 4. Check for Sufficient Sub-Level Width.

In Figure 4, the height of the horizontal bars indicates the width of the sub-levels; in every case shown, the width of the sub-level is greater than the conditional standard error of measurement at every location within the sub-level. This check was performed for all grades and subjects in both MEAP and MI-Access. There were a very small number of sub-levels where the conditional standard error of measurement was slightly larger than the width of the sub-level. These minor deviations were tolerated in order to keep the system transparent by having the same number of sub-levels in every grade.
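A minimal sketch of this width check follows, assuming integer scale scores, a list of sub-level boundary scores, and a csem(score) function tabulated from the assessment data; the boundaries and CSEM curves themselves are assessment-specific and are not reproduced here.

    # Hypothetical sketch of the sub-level width check described above.
    # `boundaries` holds the sub-level cut points in ascending order;
    # `csem(score)` returns the conditional standard error of
    # measurement at a scale score (integer scores assumed).

    def check_sublevel_widths(boundaries, csem):
        """Return the sub-levels whose width falls below the CSEM
        at any point within the sub-level."""
        failures = []
        for lo, hi in zip(boundaries, boundaries[1:]):
            width = hi - lo
            max_csem = max(csem(s) for s in range(lo, hi + 1))
            if width < max_csem:
                failures.append((lo, hi, width, max_csem))
        return failures

An empty return value confirms that every sub-level is at least as wide as the CSEM throughout its score range; the small number of tolerated deviations noted above would appear in the returned list.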

In other states where this type of activity has been carried out, it was decided beforehand that a single evaluation table would apply to all grades and subjects within the regular assessment. Michigan instead investigated having different tables for each grade and subject, but stakeholder panelists were unable to identify content-based or policy-based reasons to differentiate the tables by grade or subject and recommended that a single table be used for all grades and subjects. Therefore, there is only one progress evaluation table for MEAP (for all grades and subjects), and a parallel table to evaluate progress on the MI-Access Functional Independence assessment.

In addition, a value table (such as that submitted by Delaware) was originally intended to be part of Michigan’s proposed growth model. However, after discussion with stakeholder panelists and with educational organizations from around the State, it became clear that the value judgments necessary to create a growth model had already been made in the original standard setting for achievement: the cut scores were set based on Performance Level Descriptors (PLDs), the PLDs were vertically articulated for content, and the cut scores themselves were also vertically articulated. This foundation of vertical articulation makes the ultimate target clear (proficiency now or at some defined point in the future). Leveraging the existing vertically articulated standards allows growth targets to be developed analytically from the achievement standards rather than determined subjectively in a separate standard-setting exercise for growth targets. This analytical approach is explained in further detail below.

As a foundation for the analytical identification of growth targets, a descriptive transition table was created that contains no additional value judgments. The table instead describes the transitions individual students make with respect to increasing expectations for student achievement across grades. In other words, the transition table indicates whether students are declining, holding steady, or improving over time relative to increasing expectations. Students’ change in performance level is classified into five categories (significant decline, decline, no change, improvement, significant improvement), abbreviated SD, D, N, I, and SI, respectively.
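The five categories can be derived mechanically from the sub-level change. In the sketch below, the cutoffs (for example, treating a move of two or more sub-levels as significant) are assumptions for demonstration only; the actual mapping is defined by Michigan's transition table.

    # Illustrative classifier for the five descriptive transition
    # categories. The cutoffs are assumed for demonstration; the
    # official transition table defines the real mapping.

    def classify_transition(delta):
        """Map a sub-level change to SD, D, N, I, or SI."""
        if delta <= -2:
            return "SD"  # significant decline
        if delta == -1:
            return "D"   # decline
        if delta == 0:
            return "N"   # no change
        if delta == 1:
            return "I"   # improvement
        return "SI"      # significant improvement (delta >= 2)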

These transitions could have been labeled evaluatively. For example, the five categories could be labeled Excellent, Good, Minimally Acceptable, Fair, and Poor. By classifying the transitions descriptively instead of evaluatively, the original value judgments made in achievement standard setting are explicitly honored, and an additional layer of complexity is removed from the model.