Stevens BS-CS Program Evaluation Report for 2007-08

The report is divided into two sections, process and results. The process section evaluates the process used to assess the program. The results section evaluates the extent to which the program is meeting its goals (i.e., objectives and outcomes). Each section contains two subsections: discussion and planned improvements.

1. Process

1.1. Discussion

This year was the first in which the CS department performed quantitative assessment. Prior to this year, the only information gathered about program effectiveness was a per-course survey of student opinion. Students responded to the "Dean Russ questions": questions invented years ago by Dean of Undergraduate Academics Larry Russ and used ever since. The first two Dean Russ questions ask for evaluations, on a 5-point scale, of the instructor and the course. Later questions are more pointed, asking about the value of the textbook, whether work was graded promptly, etc. This year we retained the Dean Russ opinion survey but also added instructor-compiled direct evidence of student success in achieving per-course outcomes.

At the beginning of the school year, the department wrote outcome statements for every course in the curriculum---required or elective, graduate or undergraduate---and also program outcomes for each of our three BS programs: computer science (BS-CS), cybersecurity (BS-CyS) and information systems (BS-IS). The per-course outcomes were written by the course coordinators (course coordinators are identified at ), while the program outcomes were written by the curriculum committee. The two types of outcomes were created largely as separate activities. We neglected to write objectives for any program. Course outcomes are available on the same pages as the course coordinators. Program outcomes are available at

With course outcomes in place, instructors were required to map assessment instruments to course outcomes and measure the degree of student success in achieving course outcomes. This effort was required in every course, required or elective, graduate or undergraduate, for the following reasons:

  1. Although we try to have required undergraduate courses taught year-in and year-out by the same faculty, inevitably there are some personnel changes due to faculty arrivals/departures, sabbaticals and leaves, and faculty desires to attempt new subjects. Therefore, it is desirable for all faculty (we are a small department of only 16 full-time faculty) to know the procedures for quantitative assessment of course outcomes.
  2. Quantitative assessment is a trend throughout college academics; more specifically, Stevens has been mandated by its institutional accreditation agency, the Middle States Commission, to begin assessing all degree levels. Therefore, we anticipate the eventual requirement to assess all programs, undergraduate and graduate. Effort expended now in assessing, say, graduate security courses will ease the transition when, in the future, the graduate cybersecurity degree (for example) must be assessed.
  3. The additional effort to define outcomes, map course outcomes to program outcomes, map assessment instruments to course outcomes, compile student performance data, and retain examples of graded student work is quite substantial. We thought it would be best for morale if everyone went through the process together rather than limiting the effort to the unlucky subset of faculty assigned to required undergraduate courses.

For course-level assessment, we borrowed and adapted two forms from our engineering colleagues who are also doing ABET assessment of their programs: the SPAD (Student Performance Assessment Data) form and the ICA (Instructor Course Assessment) form. We created a third form, the PCR (Pre-Course Review).

The SPAD (available at ) is a table in which each row lists a course outcome, the assessment instrument used to assess that outcome, and the number of students whose performance on the instrument is deemed "unacceptable," "acceptable but not proficient," or "proficient" (aka low/medium/high). On a particular instrument, each student's performance is placed into exactly one category. The instructor, who is best able to make this determination, decides what level of performance counts as low, medium, or high. Allowing per-instrument definitions of low/medium/high gives valuable flexibility; e.g., one instrument can be scored 0-100 while another can be scored A-F, etc. Instructors are responsible for submitting SPAD forms shortly after the end of the course.
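
To make the categorization concrete, the following Python sketch shows one way a single SPAD row's counts could be produced from raw scores, using a per-instrument rule that maps each student's score to one of the three categories. This is a minimal sketch only; the rules, thresholds, and scores are invented for illustration and are not taken from any actual course or form.

    # Hypothetical illustration of producing one SPAD row's counts.
    # Each instrument carries its own instructor-defined rule mapping a raw
    # score to one of the three SPAD categories; this is the per-instrument
    # flexibility described above (0-100 scoring vs. A-F scoring, etc.).

    CATEGORIES = ("unacceptable", "acceptable but not proficient", "proficient")

    def numeric_rule(score):
        # Example rule for an instrument scored 0-100 (thresholds are invented).
        if score < 60:
            return CATEGORIES[0]
        return CATEGORIES[2] if score >= 85 else CATEGORIES[1]

    def letter_rule(grade):
        # Example rule for an instrument graded A-F (mapping is invented).
        if grade in ("A", "A-"):
            return CATEGORIES[2]
        return CATEGORIES[1] if grade in ("B+", "B", "B-", "C+", "C") else CATEGORIES[0]

    def spad_counts(categorize, scores):
        # Tally how many students fall into each category on one instrument.
        counts = {c: 0 for c in CATEGORIES}
        for s in scores:
            counts[categorize(s)] += 1
        return counts

    # One hypothetical SPAD row: an exam question scored 0-100.
    print(spad_counts(numeric_rule, [92, 71, 58, 85, 64]))
    # -> unacceptable: 1, acceptable but not proficient: 2, proficient: 2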

The ICA is a short form, also submitted shortly after the course ends, on which the instructor expresses his/her opinion about how well the course succeeded in establishing its outcomes and what should be improved in the next offering. The ICA is available at

The PCR is a short form submitted by the instructor BEFORE the beginning of the course, on which the instructor indicates what evidence from prior offerings the instructor considered as part of his/her preparation of the course and what improvements he/she plans based on this information. The purpose of the PCR is to "close the loop;" i.e., demonstrate continuous improvement at the per-course level based on past assessment activities. The PCR is available at

Following the end of the school year, SPADs from all required courses were reduced to several spreadsheets, one for each program outcome and one overall for the program. These spreadsheets are available at and the summary spreadsheet is reproduced below:
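
To illustrate how the per-outcome and summary spreadsheets can be derived from individual SPAD rows, the Python sketch below aggregates rows across courses. It assumes the "measured success rate" is the fraction of assessments rated acceptable or better; the report's actual formula lives in the spreadsheets, so this is an assumption, and all course names and counts shown are invented.

    # Hypothetical roll-up of SPAD rows into per-outcome success rates.
    # Assumption: "measured success rate" = (acceptable + proficient) / total.
    from collections import defaultdict

    # Each row: (program outcome, course, instrument, #low, #medium, #high).
    spad_rows = [
        ("outcome X", "Course A", "Exam 1, Q2", 3, 10, 12),
        ("outcome X", "Course B", "Project 2",  1,  8, 15),
        ("outcome Y", "Course C", "Lab 4",      9,  6,  5),
    ]

    def summarize(rows):
        totals = defaultdict(lambda: [0, 0, 0])      # outcome -> [low, med, high]
        for outcome, _course, _instr, low, med, high in rows:
            t = totals[outcome]
            t[0] += low; t[1] += med; t[2] += high
        # Success rate per outcome, as a percentage.
        return {o: round(100.0 * (m + h) / (l + m + h), 1)
                for o, (l, m, h) in totals.items()}

    print(summarize(spad_rows))   # -> {'outcome X': 91.8, 'outcome Y': 55.0}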

The summary spreadsheet indicates the measured success---across all instruments in all courses---of the program in establishing its outcomes. There are 19 outcomes (the eleven ABET outcomes A-K map to our outcomes 1-11, excluding outcome 6 and adding outcome 16), and not all were assessed. However, all ABET-required outcomes were assessed.

In addition to the direct data recorded on the SPADs, we conduct a yearly "senior exit survey." Exit surveys in previous years asked students generally what they thought of the program. This year we asked them 19 questions, one for each program outcome. Each question simply converted a positive outcome statement into question form. For example, outcome #11, "Analyze the local and global impact of computing on individuals, organizations and society," becomes survey question #11, "I am able to analyze the local and global impact of computing on individuals, organizations and society." Students respond on a 10-point scale. The results are available at and are reproduced below, re-scaled from 0-10 to 0-100. The column marked "Student number minus SPAD number" subtracts, for each outcome, the SPAD-produced measured success rate from the result of the senior survey. This is an attempt to determine whether student perceptions about their learning match the measured learning. A negative number in this column indicates that graduating seniors have a less sanguine impression of their learning than is indicated by classroom performance; a positive number indicates that students are more optimistic than they should be that they have achieved a particular outcome. It is not clear that either a negative or a positive number should be considered "good"; rather, we feel that there should be a rough match between perception and reality.
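
The arithmetic behind the comparison column is simple; the sketch below shows it for a single hypothetical outcome (the survey mean and SPAD rate are invented, not taken from the actual survey or SPAD data).

    # Hypothetical comparison of senior-survey perception with SPAD measurement.
    def student_minus_spad(survey_mean_0_to_10, spad_success_pct):
        # Re-scale the 0-10 survey mean to 0-100, then subtract the SPAD rate.
        # Negative: students rate their learning below the measured level.
        # Positive: students are more optimistic than the measurements suggest.
        return round(survey_mean_0_to_10 * 10 - spad_success_pct, 1)

    print(student_minus_spad(7.2, 86.4))   # -> -14.4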

In addition to evaluating outcomes by direct measurement, it was our intention to survey alumni (one year out of school) to ask their opinion, in the light of experience, of whether the program was successful in establishing its outcomes.

Assessment by the curriculum committee following the school year revealed the following weaknesses in our assessment process:

  1. We did not have program objectives.
  2. We did not survey alumni to gather their opinion about how well the program established its outcomes. The effort of creating outcomes and convincing instructors to assess their courses was great enough that we did not have the energy to tackle this requirement.
  3. Our approach to creating outcomes was wrongly bottom-up instead of top-down. We realized only after finishing the course and program outcomes that we should have used a sequential process: first define program objectives, then define program outcomes in support of the objectives, then define course outcomes in support of program outcomes. We instead neglected objectives altogether and defined program and course outcomes virtually in isolation from one another.
  4. Some program outcomes receive light coverage in the curriculum. For example, outcome #9 (ABET H) and outcome #11 (ABET G) are each covered by a single instrument in a single course. While this may be sufficient, it raises the danger that a single change in course delivery might leave the outcome uncovered in that year. Further, coverage by a single instrument in a single course raises the question of whether the program outcome is taught in sufficient depth.

1.2. Planned Improvements in Assessment Process

For the coming school year, we plan to address each of the above process weaknesses as follows:

  1. In concert with stakeholders in the program, we will define program objectives.
  2. We will survey one-year-out alumni regarding outcomes.
  3. We will survey three-years-out alumni regarding objectives. We choose to survey about outcomes after one year because recent graduates are more likely to hold technical positions that test their abilities in the areas covered by our program outcomes. We choose to survey about objectives after three years because objectives are higher-level concepts best evaluated after gaining some experience.
  4. We will survey employers of alumni regarding objectives. Alumni and their employers are the primary stakeholders in the program.
  5. Although we now recognize that our bottom-up approach to creating outcomes was wrong, we will not make changes to course or program outcomes. The existing outcomes meet ABET requirements and are otherwise adequate, and changing after only one year would be too disruptive since most faculty are still getting used to the system.
  6. We saw no simple and logical way to increase coverage of lightly covered outcomes such as outcome #9 (ABET H) and outcome #11 (ABET G). Therefore, we will monitor this point in future years.

2. Results

2.1. Discussion

Ideally, all students would score at 100% even in a demanding curriculum. This is unrealistic, but we have no solid basis for saying what level of achievement is adequate. Therefore, our approach to curriculum/teaching revision is to consider this first year as a base and strive to improve every year hereafter.

In analyzing the summary numbers above, two points stand out:

  1. Two outcomes have much lower SPAD scores than all the others. Outcome #11 (ABET G), "Analyze the local and global impact of computing on individuals, organizations and society," is at only 55.6%, and outcome #14, "Describe network environments including the protocol stacks underlying internet and Web applications, and develop and run distributed programs using sockets and RPC," is at only 50.8%.
  2. For several outcomes, there is substantial disagreement between the SPAD scores and the scores from the senior exit survey. In 6 cases, the student score is less than the SPAD score by double digits.

A third issue, mentioned above in the process section, is the light coverage (only one instrument across the entire curriculum) of outcomes #9 (ABET H) and #11 (ABET G).

2.2. Planned Improvements in Curriculum and Teaching

For the coming school year, we plan to address each of the above issues as follows:

  1. Work with the instructors of the "Computers and Society" course in the Institute's humanities college. This course is used to establish outcome #11, which in 2007-08 had a measured success rate of only 55.6%. The curriculum committee will explore whether the problem lies with the material, the assessment instrument, the instruction, or something else.
  2. Work with the instructor of CS 135 regarding outcome #14 (only 50.8% success).
  3. Monitor the situation of substantially different scores from direct measurement and student opinion in the senior exit survey. While there are several cases where the students evaluated the program lower than indicated by direct measurement, there are also some cases where the student score was substantially higher. We don't know whether these differences have meaning or are somewhat random.
  4. We saw no simple and logical way to alter the curriculum to increase coverage of lightly covered outcomes such as outcome #9 (ABET H) and outcome #11 (ABET G). Therefore, we will monitor this point in future years.