Sarah L. Beery
Action Research Paper
Spring 2005
Statement of Research Question
Students in my classes, as a generalization, are overwhelmed at exam time, when they have an entire semester’s worth of information to study. The assessments they are given usually come at the end of a topic or chapter and cover only the new information. I would like to lessen the stress of the final exam and increase their retention of all course topics. Can I improve student semester exam scores by varying the type and usage of unit assessments, making the unit assessments more comprehensive, and providing electronic review materials that students can access anywhere? I teach at a high school of approximately 400 students, the majority of whom are Caucasian. We do have a significant population of Hispanic students, but very few African American students. A large percentage (the exact number is unknown) of our students come from single-parent or second-marriage families, and a significant number receive free or reduced lunch.
This project is aimed at my Physical Science students. This year I have been teaching four different Physical Science classes, each with approximately 20-25 students, for a total of 85-90 students at any one time during the year. In one of these classes the majority of the students are classified as special education students. Most of the students are 9th graders, but there are also several 10th graders. In the past, I have had significant migration in and out of these classes because of students moving in and out of the community, and students moving to the district’s alternative program. The same has been true this year.
Previously, Physical Science was not a requirement, so approximately half of the entering freshmen would opt for Physical Science and the other half would opt for Biology. This year, however, Physical Science was required unless an incoming freshman passed a placement exam. In the past, this class was typically composed of lower-level learners, but this year, because of the testing-out requirement, the class is composed of a wide variety of learners, presenting even more of a challenge for me as a teacher.
Related Research and Information
Improving student retention has been a long-term issue in education, and it continues today. With increasing requirements set at both the national and state levels, student retention and student achievement are becoming more and more important for schools. Also, our students are falling behind in achievement compared to students in other nations. This decline was documented as far back as 1967, and comparisons to students 20 years later continued to show a significant decline in achievement (Bishop, 1990). As time goes on, lack of student achievement becomes a bigger and bigger problem, leading to more and more requirements for schools from both the state (Standards and Benchmarks) and national (NCLB) levels.
Many approaches have been tried to increase student retention (and consequently, student achievement). “Alternative assessment” and “test anxiety” have become keywords in schools. Maybe our students are learning the same information as students in other nations, but our society has such a fear of being tested (and of standardized tests) that we as educators need to become more creative in our assessments. Studies have been done on some of these alternative approaches, including one involving portfolio assessment (Childers, 1997). Although there were positive effects, this study revealed many problems. One of the biggest was lack of student interest: some students viewed the portfolios as busy work. In fact, students' lack of interest in their own achievement has become a problem documented in more and more research (Bishop, 1990; Childers, 1997; Feldman, 2000). Another study (Crow, 1997) lists journals, interviews, observations, surveys, and rubrics as forms of alternative assessment. Although all of these approaches have positive benefits, the approach that seems to capture the students’ interest most effectively is using technology.
The effects of technology on improving student retention and achievement are varied, but a common theme in the published research is most effectively stated as follows: “The effects of technology are dependent on the context in which the technology is applied” (Hawkes & Cambre, 2001). None of the research reviewed disputes that technology is an integral part of our society and that it is extremely important to include in our education system. However, research involving the use of technology in the classroom requires careful interpretation, mainly because of the dynamic nature of technology itself (Kimble, 1999). A study done today may not be accurate in a few short years.
Most of the research reported seems to use standardized tests to measure whether or not technology has made an impact. To determine this effectively, educators must use assessment methods appropriate to the learning outcomes promoted by the technology used (Honey et al., 1999). This leads into the research done here, and raises an important question: will the methods of assessment used match what the technology is promoting?
I really think that Gardner makes valid conclusions when he speaks of multiple intelligences, and I am positive that some of these electronic reviews and assessments addressed some of these different learning styles. Having 8 computers in my classroom, I was able to develop some of these different assessments and implement them easily so that they were available to most (if not all) of my students. Writing the actual lessons was much more difficult. My colleagues offered some much-needed advice, and they reviewed most of the materials I developed. By using technology in these different assessments, I increased the amount of cooperative learning, as well as collaboration between individual students, while the students were using the technology (something I did not emphasize as much previously).
Description of Intervention
The final product of this research is a comparison of the data (assessment scores) from the previous two years' assessments and semester exams to the data taken from this year’s assessments and semester exams. By making the assessments more comprehensive, I hoped to make a significant impact on the semester exams. Therefore, to begin, I needed to go back and get the raw data from the assessments I gave in the last two years. I analyzed the performance of all of the Physical Science students from last year by looking at the averages of the assessments. I then divided these averages according to special education students, regular education students, and each individual class based on the hour of the day that the students took the class. I planned on using the combined averages of the first two years as a control group, and the averages from this year's assessments as my experimental group. However, I found it more useful to compare each year separately.
Before the year began, I looked up the grades that each of my students received in 8th grade science. One of the strategies I tried was to use the data from their previous performance to decide which class (hour of the day) would take part in which form of intervention (Brimijoin, 2003). Therefore, to improve my Physical Science students' retention of the topics studied over the semester, I chose several different methods and tried them out with different classes. One method was simply to increase the number of questions on previous topics on their chapter tests. For example, the Chapter 3 test would contain 5 questions from Chapter 1 material and 5 questions from Chapter 2 material. This pattern continued throughout the semester; however, the number of questions per chapter decreased the further into the semester the class was. Otherwise, the length of the test would have defeated the purpose of the study and instead increased the students' stress levels.
The second method was simply to give the previous chapter tests to the class and then spend considerable time reviewing at the end of the semester. The reviews were electronic (see some of these here) and mainly used Microsoft PowerPoint. Using the technology increased the students' motivation to learn the material, and at the same time became a medium through which they could practice what they had learned. I created several StAIR projects, as well as Jeopardy games, that the students could access online through my website. Not only could they access these at home, but we also spent several class periods going through some of these as a class, and some of these were done as cooperative learning. The students had the choice of working alone or with a partner.
The final method was used with my most disruptive class (the least motivated individuals) and my special education class. I increased the number of questions about previous materials on their chapter tests, and I also spent time doing electronic reviews with the students. We also spent some time working independently, where the students could choose whether to do the electronic reviews, study with a partner, or study on their own. Going into the project, I felt that this approach covered the widest variety of methods, provided the most review, and gave the students the most options for being in control of their learning. I believe that providing students control over their learning and giving them options for how to learn is one of the key motivators for students in society as it is today. If students feel that they have options, they are more motivated to choose an option and work hard at it. This hypothesis was the main reason why I used this approach in two different classes.
Technology played multiple roles, and without the use of technology by both me and my students this study would not have gone as well as it did. First, I used Microsoft Excel to track and record all of my data. Because I wanted to be able to see individual scores as well as averages, Excel was a great tool to help me organize this data. Second, I used StAIRs, created in Microsoft PowerPoint, as a form of review throughout a unit. For several internet research projects, the students used the Internet and Microsoft Word (to type up their results). My classes also used Microsoft Publisher for a project that served as a form of alternative assessment.
Specific Research/Evaluation Questions
The following questions sum up the ultimate goals of this study.
1. How did my students' chapter test scores change based on the method of intervention used?
2. Was there an improvement in student retention (and consequently student achievement) throughout the semester?
3. Was there an improvement in students' semester exam scores?
Data Collection
The test scores for Chapters 1, 2, and 3, as well as the first semester exam, were collected and analyzed. These were compiled and can be seen in the table here. The totals can be found in a table here. The collection process was fairly straightforward: official gradebooks from all three years were found and the scores were taken directly from these books.
The data on which students used the online reviews were collected during the week of January 17-21, when all students were in attendance for that particular hour. They were asked informally whether or not they used the online reviews (both inside and outside of class), how many times they used them, and whether or not they found the online materials useful. They were told that their answers would be completely anonymous, that they were for my graduate school research, and that their answers would not affect their grade in any way.
Data Analysis
Data analysis was fairly straightforward. All scores were averaged, and the averages were then compiled into a Microsoft Excel data table (seen here). Once this was done, it was very easy to compare the averages and determine whether or not the study was effective, data-wise. However, I also analyzed and kept a few notes about the attitude of the students and the morale of the class at exam time.
I also looked at the survey data to see how many students took advantage of the reviews available to them. I then looked at the number of students in each class that used these reviews, and at how their class averages compared to the totals compiled for each year. This comparison was a little complicated, but it produced one of the more useful and important conclusions of this study.
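The averaging described in this section was done in Microsoft Excel. As a rough illustration only, the same kind of breakdown (by school year, and by regular versus special education students) can be sketched in Python. The record layout, the function name `average_by`, and every score below are hypothetical and invented for illustration; they are not the study's data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (school_year, group, test, percent_score).
# These numbers are invented for illustration; they are NOT the study's data.
records = [
    ("03-04", "regular", "Semester Exam", 70.0),
    ("03-04", "regular", "Semester Exam", 74.0),
    ("03-04", "special", "Semester Exam", 60.0),
    ("04-05", "regular", "Semester Exam", 78.0),
    ("04-05", "regular", "Semester Exam", 76.0),
    ("04-05", "special", "Semester Exam", 66.0),
]

def average_by(records, key_fields):
    """Average percent scores grouped by the chosen fields.

    key_fields is a tuple of positions in each record:
    0 = school year, 1 = student group, 2 = test name.
    """
    buckets = defaultdict(list)
    for rec in records:
        key = tuple(rec[i] for i in key_fields)
        buckets[key].append(rec[3])
    # Round to one decimal place, matching how the paper reports percentages.
    return {key: round(mean(scores), 1) for key, scores in buckets.items()}

# Average for each year, split by regular and special education students.
by_year_group = average_by(records, (0, 1))

# Overall average for each year.
by_year = average_by(records, (0,))

for key, avg in sorted(by_year_group.items()):
    print(key, avg)
for key, avg in sorted(by_year.items()):
    print(key, avg)
```

Adding position 2 (the test name) to `key_fields`, or adding a class-hour field to each record, would give the per-test and per-hour breakdowns in the same way.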
Report Findings
Data Collection:
The results of the data collection (seen here) showed a slight increase in this year's scores in all categories except two. I compared the average percentages from the Chapter 1, 2, and 3 tests, as well as the Semester Exam, for the past 3 years (02-03, 03-04, 04-05). I broke down each of these categories into regular education students, special education students, and the overall average. This breakdown was done by which hour of the day each class was taught.
As mentioned above, only 2 categories did not show an increase in the average score. Both of these occurred when comparing 1st hour, 02-03, to this year's data (04-05). The special education students' average on the Chapter 1 test was 1.5% higher two years ago than this year, which contributed to the overall average being 2.7% higher in 02-03. Other than these two exceptions, the average scores for 04-05 were higher in every category.
Based on these averages, all methods used in this study appeared to be effective.
Interview:
The interview results showed that 64 Physical Science students used the online reviews in-class, which (at semester time) was 74.4% of the students. Outside of class, 18 students used the reviews, which was 20.9% of the students. Another 10 students stated that they attempted to use the reviews, but they were unable to access them. The major reason given for this was that their computer did not have MS PowerPoint.
The students were also asked if they found the internet reviews helpful or not. All of the students that used the reviews found them useful, with the exception of three students in the 6th hour class.
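As a quick check on the percentages reported above, the arithmetic can be reproduced in a few lines of Python. The total of 86 students is an assumption: it is not stated in the results, but it is the enrollment implied by 64 students being 74.4%, and it falls within the 85-90 range given earlier. The helper name `percent` is likewise only illustrative.

```python
# Counts reported in the interview results.
used_in_class = 64
used_outside_class = 18
could_not_access = 10

# ASSUMPTION: 86 students in total at semester time.
# This figure is inferred from "64 students ... 74.4%"; the paper does not state it.
total_students = 86

def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

in_class_pct = percent(used_in_class, total_students)
outside_pct = percent(used_outside_class, total_students)
no_access_pct = percent(could_not_access, total_students)

print(in_class_pct, outside_pct, no_access_pct)
```

Under that assumed total, the in-class and outside-of-class figures match the reported 74.4% and 20.9%, and the 10 students who could not open the reviews would be roughly 11.6% of the group.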
Making Sense of It
Along with making the assessments more comprehensive and using technology to review, I originally planned on trying several alternative assessments as well (Childers, 1997; Crow, 1997; Fitch, 1993). The plan for integrating this part of the project was to create alternative assessments to be used throughout the entire unit, as well as to rewrite some of my unit tests to incorporate questions that test the students' knowledge of previous chapters and topics. I had planned to use technology such as StAIRs, WebQuests, and other software (such as the MS Office suite) incorporated into projects, both as a review tool for assessments and as assessments themselves. However, as reality hit, I was able to do only about half of what I had planned. Although I did create StAIRs and reviews, and rewrote some of my unit tests, I was not able to use as many alternative assessments as I had wanted to try.
The overall results of the study did not really surprise me. Even though I made a lot of changes and provided the students with many more review opportunities, I also know that I am a young teacher. This is my third year of teaching; is the fact that I am probably just improving overall as a teacher making more of a difference than the technology? This is one of the big questions that comes up as I look at my results. True, I did have a definite increase in overall averages, but was the difference significant enough to prove that the technology was the reason? To try to answer this question, I looked at the increase (or lack thereof) between 02-03 and 03-04, and compared it to the increase between 03-04 and 04-05 (data seen here). Between 02-03 and 03-04, the change in semester exam scores was -3.5%. The increase in semester exam scores between 03-04 and 04-05 was 4.9%.
Between the first 2 years, the average exam scores actually went down, and by a significant percentage (3.5%). Does this reflect that I actually got worse between my first two years of teaching? I don't believe so, and this led me to another question (statement) about my research. Different students from year to year can make a huge difference. Also, in this research, the number of students studied changed significantly from year to year, which could also be a contributing factor. Therefore, even though my data show that the year I used technology produced the best exam scores, I feel that I cannot make any solid conclusions without a few more years of research.
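To put the two year-over-year changes side by side, a minimal Python sketch follows. It assumes the reported figures (-3.5% and 4.9%) are differences in average percentage score, so that they can simply be added; the paper does not say whether they are percentage-point differences or relative changes, so this reading is an assumption.

```python
# Year-over-year changes in average semester exam score, as reported above.
# ASSUMPTION: both values are differences in average percentage score
# (percentage points), so they can be added directly.
change_0203_to_0304 = -3.5
change_0304_to_0405 = 4.9

# Net change from the first year (02-03) to the technology year (04-05).
net_change = round(change_0203_to_0304 + change_0304_to_0405, 1)

print(net_change)
```

Under that assumption, the 04-05 average sits about 1.4 points above the 02-03 average, which is smaller than either single-year swing and fits the caution about drawing solid conclusions from three years of data.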