Introduction

Introductory statistics at WPI consists of a two-course sequence, MA 2611-MA 2612. Each course is taught in lecture format for four class hours per week. In addition, students in each course must complete a laboratory sequence. These laboratories are designed to illustrate statistical concepts or methods through hands-on data production, computer simulation, or computerized data analysis. Currently, the statistical software used in the course is SAS version 8. Labs rely on SAS/INSIGHT, an interactive, graphics-driven component of SAS, and a set of SAS macros designed to perform specific data manipulation or graphical display tasks.

While the present lab setup has worked well for a number of years, there is room for improvement in several areas:

  • Access. The labs are supported only on WPI’s Unix system, which is not available off-campus.
  • Interactivity. The SAS macros do not offer real-time graphical interactivity with the user.
  • Interface. The labs do not provide a unified interface. In particular, lab instructions differ from instructor to instructor, and are often given out to students as written worksheets.

This IQP was undertaken with the idea of improving the labs in these areas. The main goal of the project was to improve the technological means (the labs’ presentations) to a societal end (educating students in introductory statistics), which is the essence of an IQP. The specific technological means was the creation of a unified set of web-based labs based on existing SAS-based labs. These labs include written explanations, lab directions, links to glossaries of relevant statistical concepts, and follow-up questions. The main concepts are illustrated in an accompanying set of JAVA applets.

In creating these web-based labs, we first studied educational issues in computer instruction, including ideas about how students learn, and their relation to the design of software and user interfaces. Prototype labs were tested on users of varying computer and statistical sophistication, and modified according to the feedback obtained. As a result, we feel confident that these labs will better meet the needs of students and instructors in introductory statistics.

This project report is organized as follows. Chapter 1 discusses the educational issues we considered in designing the labs. Chapter 2 details the specifics of the labs we modified, and explains the modifications made in creating their web-based analogues. Chapter 3 describes the testing and modification of the labs, and includes our evaluation of the project in terms of both what was produced and what was learned. Chapter 4 is a programmer’s reference, giving the structure and details of both the HTML and JAVA code. It is hoped that this reference, along with the enclosed disk containing the complete programming code, will prove helpful to anyone who undertakes similar modifications of other SAS labs.

1. Educational Issues

1.1 Principles of Laboratory Design

This section describes the results of our research into educational issues in computer instruction. It is organized by the general principles we learned and applied in the design of the labs.

Give Explicit Directions

There should be no misunderstanding about how to operate the computer interface. For example, in Lab 2.1 the student is given the ability to move data points on a box-and-whiskers plot. In order to move a point, the student must first locate that point on the plot. He must then click and drag the point to its desired location. This appeared to be a simple concept, but subsequent testing revealed that the description was not as clear as we thought. To resolve this issue, we included a summary of how to move the data points in the lab’s online introduction. In the introduction we wrote, “First, imagine there is a horizontal axis through the vertical center of the box-and-whiskers plot. As the pointer is moved along this axis, the name of a state along with a value will appear to the right side, underneath the box-and-whiskers plot. These correspond to the data point that the pointer is positioned over. When the mouse is positioned over a point with a value and state, you can click and drag that point to a new value.” In the instructions, we refer to this, stating “Following the instruction given in the introduction, try to move the data point on the plot.” We felt this eliminated any misunderstandings. In testing, the only students who had problems were those who did not read the introduction closely. One student could not move a point and was instructed to re-read the introduction. He was then able to move the point.
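The behavior described in the quoted introduction (a state name and value appear whenever the pointer is over a data point) amounts to a simple hit test. The following is a minimal sketch of that idea, not the applet's actual code: the class PointPicker, its method names, the linear pixel mapping, and the pixel tolerance are all assumptions made for illustration.

```java
/** Hypothetical helper, not the applet's actual code: finds the data point under the pointer. */
final class PointPicker {

    /** Maps a data value to a horizontal pixel position on a linear axis. */
    static double toPixel(double value, double axisMin, double axisMax, int widthPx) {
        return (value - axisMin) / (axisMax - axisMin) * widthPx;
    }

    /**
     * Returns the index of the data point nearest the pointer, or -1 when no
     * point lies within tolerancePx pixels of it.
     */
    static int pick(double[] values, double axisMin, double axisMax, int widthPx,
                    int pointerX, double tolerancePx) {
        int best = -1;
        double bestDistance = tolerancePx;
        for (int i = 0; i < values.length; i++) {
            double distance = Math.abs(toPixel(values[i], axisMin, axisMax, widthPx) - pointerX);
            if (distance <= bestDistance) {
                best = i;
                bestDistance = distance;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up data on an axis running from 0 to 50, drawn 500 pixels wide.
        double[] values = {10.0, 20.0, 40.0};
        System.out.println(pick(values, 0.0, 50.0, 500, 203, 5.0)); // pointer near the second point
        System.out.println(pick(values, 0.0, 50.0, 500, 300, 5.0)); // pointer over empty space
    }
}
```

A returned index of -1 corresponds to the case in which no state name or value is shown, so that nothing can be dragged.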

Keep the Student’s Interest, But Do Not Overwhelm

Keeping the student’s interest without overwhelming her is important. In Lab 2.1, we split the procedure for dragging the points between the introduction and the instructions. We felt that a single instruction encompassing the entire procedure would be too lengthy and would lose the students’ interest. We did not do comparison testing between a single, long instruction and the separated version, but no students had problems with the separated version.

Keep the Student’s Interest, But Do Not Oversimplify

It is equally important to not oversimplify the interface. Interest is easily lost when too little is required of the student. For example, with only one possible command, the student may not think about what he is doing. Introducing several buttons for different commands can help prevent this. With only one choice, the students may lose valuable intuition.[1] In Lab 2.1, there are several buttons, each serving a different purpose, labeled “Remove Outliers”, “Reset” and “Trim”. Even though the instructions told the student which to press, the lab tasks were tailored to make him think about the proper choice.

Compose a Purposeful Introduction and Finish

It is important to have an introduction that is stimulating and a finish that provides closure to the lab.[2] We accomplished this by using a title page and a closing page. Our title page contains a brief summary of the material to be covered and the objectives of the lab. The closing page recaps the important topics covered in the lab in the form of summary questions.

Remember, the Student Must Learn for Herself[3]

This is closely related to the previous topic about endings. The ability to do the lab doesn’t mean the student fully understands the material. The student must reflect on what she did. For this purpose, we have included summary questions with each lab. These questions make the student more productive when they are directed toward what is wanted in a lab report.

Choose Order of Presentation Carefully[4]

Good organization will help the student make connections between the computer display and statistical concepts. For example, Lab 7.1 is concerned with choosing regression lines by eye. Our lab includes two plots: a scatterplot and a residual plot. Both plots have the same scale for the x-axis values, which makes it easy to identify corresponding residual-scatterplot pairs. Through this, the student can gain a better understanding of where residuals come from.

1.2 Why Our Approach Is Better

SAS Has Its Difficulties

Currently at WPI, the computer components of the labs are written in the SAS programming language. SAS macros take the burden of programming off the student. SAS/EIS (Executive Information System) serves as a window interface between the student and the SAS macro. We found several difficulties associated with using SAS-based macros. First, the student must know how to activate the macro. It may seem unlikely that students would have trouble with this, though in our testing of the labs, we found this to be a legitimate problem. We also found that the multiple SAS windows confuse the students and impede their ability to learn.

A Web-Based Interface is Easier to Use

Testing showed that the proposed web-based interface will be easier to use. With a simplified interface, energy used to learn the interface will be conserved and redirected toward understanding the material.[5] In our applets, the input and output share a window, the Internet browser. Our simpler interface and the familiarity of the Internet browser allow the students to concentrate more on the material presented.

The Convenience and Comfort of Our Labs

Through a more personalized learning environment, students can become more comfortable and learn more easily.[6] The present SAS-based labs can only be run on campus, which reduces the options available to students.[7] With JAVA, the labs will be accessible on any computer running the Internet Explorer or Netscape web browser. With this feature, students can work virtually anywhere they choose. This is not so much an advantage for starting the labs, since students will still have to go to lab and be instructed by a TA. However, with this accessibility, the students can experiment and “play” with the labs in their own environment. In this time, the students can truly learn key concepts that they perhaps didn’t comprehend in class.

Real-Time Capabilities

The web-based labs enable students to see real-time results. As the values are altered, the changes can be seen continuously. In the SAS labs, the student can change a data value from one value to another and see the effect this change has on the summary statistics. With the web-based labs, a point can be clicked and dragged and a continuous update in the summary statistics can be observed. This is another way the students can “play” with the data.
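The continuous update can be pictured as recomputing the summary statistics on every drag event. The sketch below illustrates this under stated assumptions: the class LiveSummary and the method onDrag are hypothetical names, and the mean and median stand in for whatever summary statistics a given lab displays. It is not the applets' actual code.

```java
/** Hypothetical sketch: recompute summary statistics each time a point is dragged. */
final class LiveSummary {

    static double mean(double[] data) {
        double sum = 0.0;
        for (double v : data) {
            sum += v;
        }
        return sum / data.length;
    }

    static double median(double[] data) {
        double[] sorted = data.clone();          // leave the caller's ordering untouched
        java.util.Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /** Called for each drag event: stores the new value and returns {mean, median}. */
    static double[] onDrag(double[] data, int index, double newValue) {
        data[index] = newValue;
        return new double[] {mean(data), median(data)};
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0, 10.0};   // made-up values
        // Simulate dragging the largest point further to the right in steps.
        for (double v = 10.0; v <= 100.0; v += 30.0) {
            double[] stats = onDrag(data, 4, v);
            System.out.println("value=" + v + " mean=" + stats[0] + " median=" + stats[1]);
        }
    }
}
```

Running the loop shows the mean changing at every step while the median stays put, which is the kind of effect a student can discover by “playing” with a point.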

Save and Print Capabilities

With SAS, a student can print and save graphs in a wide variety of formats while using a UNIX machine. When using his or her own computer, a student may not have access to SAS, eliminating print and save capabilities.

With the JAVA applets, students may print or save plots as PostScript files when using a UNIX machine. PostScript files can be printed, incorporated into some kinds of documents, such as LaTeX documents, or converted into PC-friendly formats, for example, GIF. Students may also print when using a PC, though at present they can only save graphs as .prn files, which are in a proprietary format.

2. Lab Design and Creation

This section presents our project’s methodology, and the logic behind the design of the web pages and each lab.

2.1 Methodology

Development of each lab proceeded as follows:

  1. Create a storyboard

We started with a blank piece of paper and began drawing what we thought the lab would look like. We created storyboards for the applet and the HTML page, where links to the applet were located. We determined not only the appearance of each lab, but also decided how the students should proceed through the lab.

The storyboarding of the labs made us decide what we wanted as an end result. It gave us direction and allowed us to distribute the workload evenly. It also enabled us to gauge our progress, preventing us from getting hung up on minor issues for too long.

After the first storyboard, it was evident that writing appropriate text for the web site would take as long as the applet programming. One of us worked primarily on the text (glossary, instructions, background, introduction, etc.), while the other worked primarily on the computer programming of the labs. We met daily to discuss the progress made and collaborated whenever necessary.

  2. Discuss the storyboard with our advisor

Before beginning the production of a prototype, we met with our advisor to brainstorm how to improve our design. Upon approval, we would proceed.

  3. Create prototypes

We created non-operable prototypes of the web pages and applets. These were skeletons of what the web pages were going to look like. It was necessary to create prototypes only once, since the structure was reused for all the labs.

  4. Review overall lab design
  5. Produce the lab

We next created the applets and wrote all pertinent text. We made adjustments to the structure of the labs as well. For example, we decided not to include a “Background” section. In lieu of this, we included a more in-depth introduction. We didn’t make any other structural changes to our labs.

  6. Test

To ensure the success of these labs, it was necessary to test them on students. We had students do the labs while we observed. They were then asked to answer questions. This procedure will be discussed in more detail later.

  7. Make final revisions

We took what we learned from the testing, and used it to help us improve what we already had.

Web Page

The goal of these pages was to make it possible for the student to easily navigate through the lab. We felt this would be achieved by using frames to include a menu bar on the left side of the page. By using frames, the student can work on the applet and refer to other parts of the lab at the same time. Had we not used frames, the student would have had to stop the applet in order to refer back to other parts of the lab.

Figure 2.1 is an image of our final design of the web pages:

Figure 2.1

An important issue with the menu bar was what to include on the menu. Originally, the menu was going to consist of the following links:

  1. Introduction
  2. Applets
  3. Background
  4. Glossary
  5. Summary Questions

The Introduction would tell the student, in general terms, what he or she was going to do in the lab. The Applets link would open the applet so the student could start the lab. The Background section would consist of a tutorial on all the concepts presented in the lab. The Glossary would be just that: a glossary of relevant terms, formulas and concepts. Finally, the Summary Questions were intended to encapsulate all the objectives of the lab.

After writing two labs, we decided to omit the background section. Since the labs are designed to correspond to the book, the background section merely repeated material that should already have been studied in the course.

2.2 Lab 7.1

The topic of the present Lab 7.1 is least squares fitting of a line to a set of x-y data. When the SAS macro for this lab is called, a scatterplot appears showing a set of x-y data with a linear association (each call of the macro generates a different set of data). A dialogue window also appears, asking the student to enter a guess of the slope and intercept of the least squares line. Once a guess is entered, the error sum of squares (SSE) of the guessed line is displayed in the dialogue window, and the graphics window contains two graphs: the scatterplot with the guessed line superimposed, and a scatterplot of the residuals from the guessed line versus the x variable. These two graphs cannot be viewed at the same time, but must be scrolled through. The lab instructions ask the student to record the guessed line and its SSE. The student can also print either of the plots.
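For reference, the SSE of a guessed line is the sum of the squared vertical distances between each observed y value and the value the line predicts at the same x. The following is a minimal sketch of that calculation. The class name GuessedLineSse and the sample data are made up for illustration and are not taken from the SAS macro or the applets.

```java
/** Hypothetical sketch: error sum of squares (SSE) for a guessed slope and intercept. */
final class GuessedLineSse {

    /** SSE = sum over all points of (y - (intercept + slope * x))^2. */
    static double sse(double[] x, double[] y, double slope, double intercept) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        double total = 0.0;
        for (int i = 0; i < x.length; i++) {
            double residual = y[i] - (intercept + slope * x[i]);
            total += residual * residual;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};   // made-up data
        double[] y = {2.0, 4.0, 7.0};
        // A guess of slope 2 and intercept 0 misses only the third point, by 1 unit.
        System.out.println(sse(x, y, 2.0, 0.0));
    }
}
```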

The dialogue window next gives the student the option of making another guess, of seeing the equation of the least squares line (the line that minimizes the SSE) or of quitting the macro.

The educational goals of the lab are for the student to:

  1. Internalize the idea that the least squares line minimizes SSE
  2. Get a feeling for what a least squares line looks like
  3. Make the connection between data, fitted line and residuals

Typically, the student makes several guesses for the least squares line and then obtains the true least squares line. It is hoped that the educational goals will be realized through the process of guessing, viewing and comparing that the lab entails.
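The true least squares line that the student's guesses are compared against can be computed in closed form: the slope is the sum of cross-deviations of x and y divided by the sum of squared deviations of x, and the intercept makes the line pass through the point of means. The sketch below shows this standard calculation and compares its SSE with that of a guess. The class LeastSquaresFit and the data are hypothetical, and this is not presented as the code used by the SAS macro or the applets.

```java
/** Hypothetical sketch: closed-form least squares line and its SSE. */
final class LeastSquaresFit {

    /** Returns {slope, intercept} of the line minimizing SSE; x must not be constant. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double xBar = 0.0;
        double yBar = 0.0;
        for (int i = 0; i < n; i++) {
            xBar += x[i];
            yBar += y[i];
        }
        xBar /= n;
        yBar /= n;

        double sxy = 0.0;   // sum of cross-deviations
        double sxx = 0.0;   // sum of squared x deviations
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - xBar) * (y[i] - yBar);
            sxx += (x[i] - xBar) * (x[i] - xBar);
        }
        double slope = sxy / sxx;
        return new double[] {slope, yBar - slope * xBar};
    }

    static double sse(double[] x, double[] y, double slope, double intercept) {
        double total = 0.0;
        for (int i = 0; i < x.length; i++) {
            double residual = y[i] - (intercept + slope * x[i]);
            total += residual * residual;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};   // made-up data
        double[] y = {2.0, 4.0, 7.0};
        double[] line = fit(x, y);
        System.out.println("least squares SSE: " + sse(x, y, line[0], line[1]));
        System.out.println("guessed line SSE:  " + sse(x, y, 2.0, 0.0));
    }
}
```

No guessed line can produce a smaller SSE than the fitted one, which is the first educational goal listed above.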

In developing the web-based lab, we made four fundamental changes to the SAS lab.

First, guessed lines are drawn by using the mouse, rather than by specifying the slope and intercept as in the SAS lab. In addition, the mouse may be used to move the line once it is drawn. There is no similar capability in the SAS lab.
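A line drawn with the mouse still has to be expressed as a slope and an intercept before its statistics can be computed. One plausible way to do this, sketched below, is to take two points on the drawn line (for example, where the mouse was pressed and released) and solve for the line through them. The class TwoPointLine is hypothetical, and the sketch assumes the two points have already been converted from pixels to data coordinates; the applet may well do this differently.

```java
/** Hypothetical sketch: slope and intercept of the line through two points in data coordinates. */
final class TwoPointLine {

    /** Returns {slope, intercept}; a vertical line has no slope and is rejected. */
    static double[] fromPoints(double x1, double y1, double x2, double y2) {
        if (x1 == x2) {
            throw new IllegalArgumentException("a vertical line cannot be written as y = a + b*x");
        }
        double slope = (y2 - y1) / (x2 - x1);
        double intercept = y1 - slope * x1;
        return new double[] {slope, intercept};
    }

    public static void main(String[] args) {
        // Mouse pressed at (1, 3) and released at (3, 7), in data coordinates.
        double[] line = fromPoints(1.0, 3.0, 3.0, 7.0);
        System.out.println("slope=" + line[0] + " intercept=" + line[1]);
    }
}
```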

Second, up to three different guessed lines can be displayed on the scatterplot, as opposed to one in the SAS lab. The statistics associated with each displayed line are also displayed.

Additionally, the student can select a line to see its residuals. This allows students to compare their guesses without scrolling or printing the graphs, though the graphs can be printed, if desired.

Third, in this lab, the two plots are placed side by side. This makes it easier for students to see the relation between the line, scatterplot, and residuals than if they have to scroll between them, as in the present lab. In our research, we found that students learn better when associated objects are grouped together. With this in mind, we gave the scatterplot and the residual plot the same x-axis scale. With this feature, and with the plots side by side, the student can more easily identify the point on the scatterplot that corresponds to a given residual.
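The correspondence between the two plots follows from how residuals are defined: each residual is plotted at the same x value as the data point it came from. The sketch below illustrates this, along with a single x-axis range that both plots could share. The class ResidualPlotData and its methods are hypothetical names, not the applet's actual code.

```java
/** Hypothetical sketch: residuals that share x values, and an x range, with the scatterplot. */
final class ResidualPlotData {

    /** residuals[i] is plotted at x[i], directly alongside the data point (x[i], y[i]). */
    static double[] residuals(double[] x, double[] y, double slope, double intercept) {
        double[] e = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            e[i] = y[i] - (intercept + slope * x[i]);
        }
        return e;
    }

    /** Returns {min, max} of x, to be used as the axis range of both plots. */
    static double[] sharedXRange(double[] x) {
        double min = x[0];
        double max = x[0];
        for (double v : x) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};   // made-up data
        double[] y = {2.0, 4.0, 7.0};
        double[] e = residuals(x, y, 2.0, 0.0);
        for (int i = 0; i < x.length; i++) {
            System.out.println("x=" + x[i] + " y=" + y[i] + " residual=" + e[i]);
        }
    }
}
```

Because index i refers to the same observation in both arrays, selecting a residual identifies its scatterplot point directly.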