GORP Faculty Report
Dear Professor ______,
Once again, thank you so much for participating in the pilot project to test the GORP observation tool here at BU. Below is a summary of the observation data we collected during your class meetings, along with a summary of the results of the pilot project as a whole. A similar summary was prepared for each instructor who was observed. The data we collected from your classroom will not be shared with anyone outside of the GORP team without your express consent.
The GORP/COPUS protocol (see Lund et al., attached) is designed to identify, in a neutral and objective way, what activities the instructor and students are engaged in during a class meeting, using observers with no special knowledge of pedagogical design. During a class meeting, for each two-minute interval, the observer records via a web application what kinds of activities the instructor and the students are involved in. The activities the observers can choose from are given in Table 1 of the attached paper. Note that it is possible to record more than one activity during each time interval, so the percentages of activities will add up to more than 100%.
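As a minimal illustration of how these percentages arise, here is a short sketch that tallies activity codes over two-minute intervals. The interval data and the code abbreviations below are invented for illustration (they merely echo COPUS-style shorthand) and are not taken from your class:

```python
# Hypothetical example: a class meeting is a list of 2-minute intervals,
# and each interval records the set of activity codes observed in it.
from collections import Counter

intervals = [
    {"Lec", "L"},          # instructor lecturing, students listening
    {"Lec", "RtW", "L"},   # lecturing plus real-time writing
    {"CQ", "AnQ"},         # clicker question, students answering
    {"Lec", "L"},
]

counts = Counter()
for codes in intervals:
    counts.update(codes)   # one interval can contribute to several codes

# Percentage of intervals in which each code was observed
percentages = {code: 100.0 * n / len(intervals) for code, n in counts.items()}

# Because multiple codes can occur in a single interval, the
# percentages sum to more than 100%.
print(percentages)
print(sum(percentages.values()))  # 225.0 here, i.e. well over 100
```

This is why the activity percentages in the summary graph should be read individually, not as shares of a whole.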
Based on the data reported in Lund et al., a cluster analysis was completed to identify specific types of classroom instruction. The cluster analysis identified instructor and student activities that tended to occur simultaneously. For example, clusters categorized as “Lecture” included the instructor lecturing, real-time writing, and students listening. These clusters fell into four broad categories: traditional lecture, Socratic method, peer instruction, and collaborative learning.
A summary of the observations is given in the graph below. The values represent averages over all of the class meetings observed.
A total of four class meetings were observed, on 4/17, 4/22, 4/24, and 4/27. Three observers were present on 4/22, which allowed us to estimate inter-rater reliability (IRR) by comparing which items each observer chose during each two-minute interval. For this meeting, the Cohen’s kappa for observations of instructor behavior (a measure of IRR) was 0.66, averaged over the three observer pairs.
This is lower than we would like, but we believe it underestimates the effectiveness of the tool, since the individual observers’ results were all similar to the averages given above.
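For readers curious how such a figure is computed, here is a sketch of Cohen’s kappa for a pair of observers, simplified to one code per interval (GORP allows multiple codes per interval, so the actual calculation is done item by item). All observer data below is invented for illustration:

```python
# Cohen's kappa: agreement between two raters, corrected for the
# agreement expected by chance given each rater's marginal frequencies.
from itertools import combinations

def cohens_kappa(a, b):
    """a, b: lists of categorical codes, one per 2-minute interval."""
    assert len(a) == len(b)
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    cats = set(a) | set(b)
    # expected agreement if the raters coded independently
    p_e = sum((a.count(c) / n) * (b.count(c) / n) for c in cats)
    return (p_o - p_e) / (1 - p_e)

# Invented example: three observers coding the same six intervals
obs = {
    "A": ["Lec", "Lec", "CQ", "Lec", "MG", "Lec"],
    "B": ["Lec", "Lec", "CQ", "MG", "MG", "Lec"],
    "C": ["Lec", "RtW", "CQ", "Lec", "MG", "Lec"],
}

kappas = [cohens_kappa(obs[x], obs[y]) for x, y in combinations(obs, 2)]
avg_kappa = sum(kappas) / len(kappas)  # average over the three pairs
print(avg_kappa)
```

A kappa of 1.0 indicates perfect agreement, 0 indicates agreement no better than chance; values in the 0.6–0.8 range are conventionally read as substantial agreement.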
Keep in mind that each entry above represents the percentage of two-minute intervals in which the behavior was observed. Multiple behaviors (for example, lecturing and clicker questions) can be present in a single time interval. Based on these results and the classifications in Lund et al., your class meetings are consistent with the categories of “Group Work” or “Extensive Peer” teaching.
Additionally, observers estimated the level of student engagement during each two-minute interval, choosing among the categories “low”, “medium”, and “high”. Use of this feature was not consistent across observers, but in the observations that were collected, student engagement was routinely rated between “medium” and “high”, and most often “high”.
Based on the observing experiences in your classroom and others we learned several things that will help improve the future implementation of the tool.
A few of the things we learned include:
• Present and former LAs are a valuable pool of observers.
• How to improve observer training to increase inter-rater reliability.
• How to improve communication with the faculty being observed.
• How to categorize various teaching styles and approaches based on the data.
• The tool is imperfect and does not fit all classrooms equally well, so we will need to develop our own protocols once that feature becomes available.
We plan to continue observations in the Fall of 2015, implement the improvements indicated above, and eventually offer observation as a broader service across the university.
This report is, of course, a general summary of the data collected. More detailed results from your individual class observations, including observer comments, are available as well. Let me know if you would like to look at them in more detail, and perhaps we can schedule a time to discuss what questions you would like answered.
We are planning to write a general report on the pilot program. Do you give your consent to release the summary results of the observations of your course displayed above, together with the Lund et al. classifications? We will anonymize all information.