Program Evaluation

A Methodological Primer for Program Administrators

By

George R. Reinhart, Ph.D.

National Research Council

The National Academies

So, you have been tasked to conduct an evaluation of your program and you don’t want the person to whom you report to tell you to go to hell. Well, I’m here to give you some pointers that will tame the task and enable you to conduct an evaluation that is doable with limited resources.

There are many approaches to evaluation, and each of these approaches requires different levels of resources. I have prepared a handout that you can read at your leisure that describes some of them. The Department of Education has prepared a document, "Identifying and Implementing Educational Practices Supported by Rigorous Evidence: A User Friendly Guide," that describes additional approaches. This guide is available on the Department of Education's website.

Because the User Friendly Guide recommends the use of control groups, randomized designs, and a clinical trial approach, some programs may find it difficult to implement the practices described in the document. Quasi-experimentation with comparison groups is an approach that seeks to overcome the barriers imposed by randomization and can give results comparable to those obtained from randomized trials.

In addition, clinical trials are focused on determining whether an intervention produces the desired results, not on assisting the program to achieve its goals and objectives. While clinical trials may be essential for determining the efficacy of an intervention, the clinical trial approach may not be effective for determining the efficacy of the program as a whole.

Consequently, I advocate an approach based on utilization-focused evaluation, developed by Michael Patton. This approach views the evaluator as a member of the program team who works interactively with staff to ensure that milestones, outcomes, and goals are reached.

Let's take a look at one program as an example: Upward Bound. Upward Bound provides fundamental support to participants in their preparation for college entrance. It serves low-income high school students and those who are the first in their families to seek postsecondary education. The goal of Upward Bound is to increase the rates at which participants enroll in and graduate from institutions of postsecondary education. Upward Bound projects provide instruction in math, laboratory science, composition, literature, and foreign languages, in addition to mentoring, tutoring, and facilitating the transition to college.

The question of interest to the program and to the evaluator is "How well is the program accomplishing its goal of moving the targeted high school students into college?" They are not hypothesizing whether the services offered through Upward Bound enhance entrance into postsecondary education. The question for the program is one of dose and response: is there some relationship between the amount and/or intensity of services provided and the number or percent of students entering into and graduating from postsecondary institutions?

The Upward Bound Program mandates a detailed record structure from all grant recipients. This record structure contains 78 fields for each program participant. The data in the record cover the background characteristics, services offered, short-term outcomes, and long-term outcomes for all participants. These data can be used as the basis for an outcome evaluation that examines the dose-response relationship between services offered through the program and the extent to which the program is meeting its goals. Rather than identifying, collecting, and analyzing new data, the program can conduct an evaluation using extant data routinely collected as part of its normal operation. Consequently, the requirement for new resources is minimized.

A logic model provides a picture of what the program hopes to achieve and a chain of connections showing how the program intends to work. The logic model includes: resources or inputs, program activities, program outputs, short- and long-term outcomes, and the program's impact. The Kellogg Foundation has published a good description of logic models and their use, which can be found on the foundation's website.

The components of the model are linked by "what-if" relationships, such as what happens to a short-term outcome if the number and/or intensity of activities are increased. The flow chart below shows the elements of the logic model.

Resources/Inputs -> Activities -> Outputs -> Short-Term Outcomes -> Long-Term Outcomes -> Impact

Resources include the human, financial, organizational, and community resources available. Inputs may also include factors that may affect resource allocation or outcomes. Activities are what the program does with its resources. Outputs are the direct products of the program and may include types, levels, and targets of services. Outcomes are specific changes in participants' behavior, knowledge, skills, and level of functioning. Impact refers to changes in communities, organizations, or systems that follow from outcomes; in some cases, the program may not be able to assess its impact.
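To make the "what-if" links concrete, here is a toy sketch in Python of a logic model chain. The stage names follow the flow chart, but the linkage multipliers are entirely made up for illustration; a real program would estimate these relationships from its own data.

```python
# Toy logic model chain. The stages mirror the flow chart above; the
# LINK multipliers are hypothetical ("what-if") conversion rates from
# one stage to the next, invented purely for illustration.
CHAIN = ["resources", "activities", "outputs",
         "short_term_outcomes", "long_term_outcomes", "impact"]

LINK = {"resources": 2.0, "activities": 0.5, "outputs": 0.3,
        "short_term_outcomes": 0.8, "long_term_outcomes": 0.4}

def propagate(stage, amount):
    """Follow the chain forward from `stage`, projecting each later stage."""
    projected = {stage: amount}
    start = CHAIN.index(stage)
    for prev, nxt in zip(CHAIN[start:], CHAIN[start + 1:]):
        amount *= LINK[prev]
        projected[nxt] = amount
    return projected

# What-if: increase activities to 100 units and see the projected chain.
projection = propagate("activities", 100)
```

Asking "what happens to a short-term outcome if activities are increased" then amounts to comparing `propagate("activities", 100)` with `propagate("activities", 120)`.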

I classified the data elements required for the Upward Bound record structure into the five types of core elements used in a logic model. Please note that those of you with more experience with Upward Bound may wish to change this classification. This breakdown is shown in the table below.
Data Collection Elements

Upward Bound and Upward Bound Math-Science Projects


Inputs

Number of Students Entering Program

Number of Faculty Participating

Number of Schools

Social Security Number

Gender

Race/Ethnicity

Age

Eligibility

UB Initiative

Targeting

Participant Status

Participation Level

Academic Need

Grade Level at Entry

Date of Entry

High School GPA at entry

High School Cumulative GPA at entry

High School Cumulative GPA at beginning of reporting period

Activities

Mathematics Instruction/Tutorials

Mathematics Instruction Summer Program

Science Instruction/Tutorials

Science Instruction Summer Program

Foreign Language Instruction/Tutorials

Foreign Language Instruction Summer Program

English Instruction/Tutorials

English Instruction Summer Program

Reading Instruction/Tutorials

Computer Instruction/Tutorials

Tutoring

Supplemental Instruction

College Entrance Examination Preparation

Peer Counseling/Mentoring

Professional Mentoring

Study Skills

Cultural Activities

Career Awareness

Campus Visitation

Assistance with College Admissions

Family Activities

Target School Advocacy

Work Study Position

Employment

Math-Science Activities

Outputs

Projected Date of Re-Entry

Date of Last Entry

Reason for Dropout

Grade Level at end

High School GPA at end

High School Cumulative GPA at end

Number of HS Credits Earned

Short-Term Outcomes

College Entrance Exam

Type of Standardized Tests Used

PSAT Test Score

SAT verbal/math scores

ACT Scores

First Postsecondary Enrollment Date

Student Financial Aid Awarded

Postsecondary Enrollment Status

Long-Term Outcomes

Number of Post Secondary Education Credits Earned

Number of Non-Credit Hours Earned

College Grade Level at Program End

Postsecondary Academic Standing

Degree/Certificate Completion

Post Secondary Grading Term


Now, there are way too many data elements in the table to use simultaneously. However, it is possible to identify a number of smaller models. Let’s say you want to examine the effect of tutoring on changes in GPA, SAT scores, and the decision to enter college and its consequences. For this model the variables of interest might include:

Inputs – Number of Students, Number of Faculty, Number of Tutors, Gender, Age, Grade, and GPA at Entry

Activities – Math, Science, English, Reading, and Computer Tutorials

Outputs – Reason for Dropout, Grade Level, GPA at end of Program

Short-Term Outcomes – SAT Scores, ACT Scores, College Entrance Exam

Long-Term Outcomes – Postsecondary Entry Date, Degree/Certification Completion, College Grade Level
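As a sketch, the first practical step is simply pulling this smaller model out of the full 78-field record. The field names below are hypothetical stand-ins (the actual Upward Bound field names will differ), but the pattern of grouping fields by logic model component applies directly.

```python
# Illustrative sketch: reduce a full participant record to the variables
# chosen for the tutoring model. Field names are hypothetical stand-ins
# for the actual Upward Bound record structure fields.
MODEL_FIELDS = {
    "inputs": ["gender", "age", "grade_at_entry", "gpa_at_entry"],
    "activities": ["math_tutorial_hours", "science_tutorial_hours",
                   "english_tutorial_hours", "reading_tutorial_hours",
                   "computer_tutorial_hours"],
    "outputs": ["dropout_reason", "grade_at_end", "gpa_at_end"],
    "short_term_outcomes": ["sat_score", "act_score"],
    "long_term_outcomes": ["postsecondary_entry_date", "degree_completed"],
}

def extract_model_record(full_record):
    """Keep only the fields needed for the tutoring model; missing -> None."""
    wanted = [f for fields in MODEL_FIELDS.values() for f in fields]
    return {field: full_record.get(field) for field in wanted}

# Example with a made-up participant record:
participant = {"ssn": "xxx", "gender": "F", "age": 16, "grade_at_entry": 10,
               "gpa_at_entry": 2.8, "math_tutorial_hours": 24,
               "gpa_at_end": 3.2, "sat_score": 1150}
small = extract_model_record(participant)
```

Note that identifying fields such as the Social Security Number are deliberately dropped; the analysis file carries only the variables the model needs.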

You can measure changes in GPA, SAT scores, and the postsecondary entry decision as a function of the aggregate level of tutoring, controlling for gender, age, grade, and GPA at entry. You would be looking for (1) increases in GPA over time to be positively correlated with the aggregate level of tutoring, (2) high SAT and ACT scores to be associated with increases in GPA, and (3) entry into a postsecondary institution to be associated with high SAT and ACT scores. All relationships should be examined with gender, age, and grade taken into account, e.g., are the correlations the same for both genders?
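A minimal sketch of the first relationship, using ordinary least squares on simulated data. Every number here is invented for illustration; in practice you would fit the same model to the program's own participant records, typically with a statistics package that also reports standard errors and significance tests.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of participants

# Simulated controls and the "dose" (aggregate tutoring hours).
gender = rng.integers(0, 2, n)            # 0/1 indicator
age = rng.integers(14, 19, n)
gpa_entry = rng.uniform(1.5, 4.0, n)
tutoring_hours = rng.uniform(0, 60, n)    # the dose of interest

# Simulated response: GPA change built with a true tutoring effect of
# 0.02 points per hour, plus a small gender effect and random noise.
gpa_change = (0.02 * tutoring_hours + 0.1 * gender
              + rng.normal(0.0, 0.3, n))

# OLS: regress GPA change on tutoring, controlling for the inputs.
X = np.column_stack([np.ones(n), tutoring_hours, gender, age, gpa_entry])
coef, *_ = np.linalg.lstsq(X, gpa_change, rcond=None)
print(f"estimated tutoring effect per hour: {coef[1]:.3f}")
```

The estimated coefficient on `tutoring_hours` recovers something close to the simulated 0.02; a positive, meaningfully large coefficient in the real data would be the dose-response evidence the model is looking for. Fitting the same regression separately by gender is one simple way to check whether the relationship holds for both groups.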

There are many possible interrelationships among variables that could be investigated. It is the role of the evaluator and the program staff to identify the outcomes that are most important to them, the outcomes that the staff believes define success. More importantly, these relationships should be assessed early and often. If evaluation of data at the beginning of the program does not demonstrate the hoped-for relationships, then the staff has time to modify the program to ensure that goals are met. It is important to identify successes and failures early in the program, when there is time to amplify positive aspects and correct negative aspects of the program's design and implementation.

The model presented above does not meet the "gold standard" of the clinical trial. Nevertheless, it provides the program staff with important information to assess the program's effectiveness. Thorough evaluations, evaluations that are targeted to each of the program's outcomes and goals, can provide a complete roadmap of the program's processes and outcomes. Demonstrating the effectiveness of the roadmap provides the program administrators with strong and convincing evidence for continuation of existing funding and justification for new funds.
