
Reflective Ability Rubric and User Guide

Patricia S. O’Sullivan, EdD

Professor of Medicine and Director of Educational Research and Faculty Development

Office of Medical Education

School of Medicine

University of California, San Francisco

521 Parnassus Ave Box 0410

San Francisco, CA 94143-0410

(W) 415 514 2281

(F) 415 514 0468

Louise Aronson, MD, MFA

Associate Professor of Medicine and Director, Northern California Geriatric Education Center and UCSF Medical Humanities Initiative

Division of Geriatrics

3333 California Street, Suite 380

San Francisco, CA 94118

(W) 415 514 3154

(F) 415 514 0702

Eva Chittenden, MD

Assistant Professor of Medicine

Harvard Medical School and

Director of Educational Programs

Palliative Care Service

Massachusetts General Hospital

55 Fruit St. FND 600

Boston, MA 02114

(W) 617-724-9197

(F) 617-724-8693

Brian Niehaus, MD

Research Assistant

Office of Medical Education

School of Medicine

University of California, San Francisco

521 Parnassus Ave Box 0410

San Francisco, CA 94143-0410

(W) 415 514 2281

(F) 415 514 0468

Lee A. Learman, MD, PhD

Clarence E. Ehrlich Professor & Chair

Department of Obstetrics and Gynecology

Indiana University School of Medicine

(W) 317-948-8609

(F) 317-948-7417


Table of Contents

1. Background

2. Development

3. Psychometric Evidence

4. Rater Training

5. References

6. Appendix A: Reflective Ability Scoring Rubric

7. Appendix B: Training Examples


1. Background

Over the last decade, numerous organizations (e.g., the UK General Medical Council (General Medical Council, 2009); CanMEDS (Frank, 2005); the ABIM and ACP-ASIM (ABIM Foundation, ACP-ASIM Foundation, & European Federation of Internal Medicine, 2002); the ACGME (Accreditation Council for Graduate Medical Education, 1999)) have called for incorporating reflection and reflective activities into all levels of medical education. In response to this mandate, educators and accrediting bodies have required trainees and practicing professionals to complete reflective activities, most of which have taken the form of written exercises such as critical incident reports, journals, or responses to structured questions (Wald, Davis, Reis, Monroe, & Borkan, 2009). Reflection, the process of analyzing, reconsidering, and questioning experiences and of making an assessment for the purposes of learning, is considered an essential skill for self-directed learning and professional development. It transforms experience into education by helping practitioners identify gaps in their knowledge and skills and by promoting critical reasoning, self-assessment, problem-solving, and professionalism (Boud & Walker, 1998; K. Mann, Gordon, & Macleod, 2007). Recent studies suggest reflection may decrease diagnostic errors and improve clinical performance in complex or uncertain situations (Mamede, Schmidt, & Penaforte, 2008). To date, however, studies of reflection in medical education have generally focused on identifying common themes or responding to content issues rather than on measuring reflective skill.

In order to develop and assess reflective competence in trainees and practicing professionals, a valid measure of reflective skill is required. Few assessment methods have been developed, and those that exist are often either qualitative and hard to generalize (Wald et al., 2009) or cumbersome, requiring extensive rater training (Pee, Woodman, Fry, & Davenport, 2002). Most have been studied only for inter-rater reliability rather than for other forms of validity evidence.

Our objective was to develop, and provide psychometric evidence for, a rubric that others could use to describe a learner’s level of reflective ability as determined from a written essay stimulated by a prompt. Prompts vary but generally provide a topic or focus for the reflection, such as “a patient who taught you the most about treating a patient with dignity and respect” or “a recent clinical or other professional situation where you made a mistake or didn’t have the necessary knowledge or skills.”

2. Development

The reflective ability scoring rubric was adapted from work done in the Centre for Medical Education at the University of Dundee (M. H. Davis, personal communication), where educators developed a scoring schema for reflection across an entire learning portfolio. In our case, we adapted their three-level schema to a six-point rubric and applied it to individual reflections:

  1. Describes procedure/case/setting without mention of lessons learned;
  2. States opinions about lessons learned unsupported by examples;
  3. Superficial justification of lessons learned citing only one’s own perspective;
  4. Reasoned discussion well-supported with examples regarding challenges, techniques and lessons learned and includes obtaining feedback from others or other sources;
  5. Analyzes the influence of past experience on current behavior; and
  6. Integrates all of the above to draw conclusions about learning, provides strategies for future learning or behavior, and indicates evidence for determining the effectiveness of those strategies.

A score of zero was given when the exercise was turned in with no description of a relevant learning event.

The items for the rubric reflect the components of reflection drawn from the literature (Boud & Walker, 1998; Hatton & Smith, 1995; Mezirow, 1998; Moon, 2004; Schön, 1983). The rubric is general, but each higher-level score assumes the level below, and half-point scores are allowed. While various types of rubrics can be developed to allow the rating of independent elements, we chose a single-dimension construct, “reflective ability,” and built our rubric following theoretical guidance. This approach is commonly used in rubric scoring and is holistic rather than analytic. While scoring may merit an alternative approach, ours requires the incorporation of an increasing number of elements to demonstrate reflective ability, rather than rewarding a writer for doing several aspects well while failing to address the full range of skills needed.
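For groups that record scores electronically, the following minimal sketch (in Python) shows one way the rubric levels and the half-point scoring rule might be encoded; the abbreviated level wording and all names here are illustrative conveniences, not part of the rubric itself.

    # Illustrative encoding of the holistic rubric; level wording is
    # abbreviated from the numbered list above.
    RUBRIC_LEVELS = {
        0: "No description of a relevant learning event",
        1: "Describes procedure/case/setting without lessons learned",
        2: "States opinions about lessons learned, unsupported by examples",
        3: "Superficial justification citing only one's own perspective",
        4: "Reasoned, well-supported discussion including feedback from others",
        5: "Analyzes the influence of past experience on current behavior",
        6: "Integrates all levels; strategies and evidence for future learning",
    }

    def validate_score(score: float) -> float:
        """Accept only whole and half-point scores from 0 to 6, per the rubric."""
        if not 0 <= score <= 6:
            raise ValueError(f"Score {score} is outside the 0-6 range")
        if score * 2 != int(score * 2):
            raise ValueError(f"Score {score} is not a whole or half point")
        return score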

Our guiding principles and definitions were the following:
1. Focus on reflection: the response must go beyond a detailed and colorful description of the event itself.
2. Holistic rubric: a gestalt based on reading the entire entry, then matching performance to the scoring guidelines.
3. The score is given according to the preponderance of reflective skill, not just weak evidence of a higher level of performance; scoring is commensurate with the “spirit” of the level rather than each specific detail.
4. The score is given for reflection on the action under consideration; saying that one reflected during the experience is not sufficient.
5. “Reflection on action” is defined as looking back upon performance to identify lessons learned about one’s own behavior.
6. “Reflection in action” is defined as mindfulness of the situation and responding in the moment.

The scoring rubric, with brief examples, is provided in Appendix A.

3. Psychometric Evidence

We have conducted several studies generating psychometric evidence. Learman, Autry, and O’Sullivan (Learman & O’Sullivan, 2007; Learman, Autry, & O’Sullivan, 2008) provided validity evidence using the rubric to study reflective ability in 32 OB/GYN residents. The residents completed six exercises, scored from 0 (no description of event) to 6 (deep reflection) using the rubric, for a total of 183 reflections. Inter-rater reliability with two trained raters was 0.89. Five exercises had adequate internal consistency reliability as a set (0.62); the systems-based practice exercise did not correlate with the other five. Senior residents received higher reflection scores than junior residents; the difference was not statistically significant, and a similar pattern held for other competency measures such as in-training examination performance and ratings by medical students and nurses. Reflection scores correlated with professionalism and communication skill assessments (r = 0.36-0.37, p < 0.01) but not with medical knowledge.
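For readers who wish to compute internal consistency on their own data, a minimal sketch of a Cronbach’s alpha calculation over a residents-by-exercises score matrix follows; the toy data are invented for illustration and are not the study’s data.

    def cronbachs_alpha(scores):
        """Cronbach's alpha for a score matrix: one row per resident,
        one column per exercise."""
        k = len(scores[0])  # number of exercises
        def var(xs):        # sample variance
            m = sum(xs) / len(xs)
            return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
        item_vars = [var([row[j] for row in scores]) for j in range(k)]
        total_var = var([sum(row) for row in scores])
        return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

    # Toy data: four residents by three exercises (illustrative only).
    scores = [[3.0, 3.5, 4.0], [2.0, 2.5, 2.0], [4.0, 4.5, 5.0], [1.0, 2.0, 1.5]]
    print(round(cronbachs_alpha(scores), 2))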

Aronson, Robertson, Lindow, and O’Sullivan (2009) studied third-year medical students who were given a prompt on professionalism about which they were to write a reflection. The control group (n = 37) received no further instruction, while the experimental group (n = 78) received guidelines about reflection. The experimental group scored significantly higher in reflective ability as measured by the rubric (3.6 (SD = 1.2) vs. 2. (SD = 0.8), p < .001, effect size = 1.25). Students given instruction would be expected to score higher, and the rubric’s detection of this difference is evidence of its validity.
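A standardized effect size of this kind is conventionally computed as Cohen’s d: the difference in group means divided by the pooled standard deviation. The sketch below uses the group sizes and standard deviations reported above but an invented control mean, so the printed value is illustrative only.

    import math

    def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
        """Standardized mean difference using the pooled standard deviation."""
        pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
        return (mean1 - mean2) / math.sqrt(pooled_var)

    # Sizes/SDs as reported above; the control group mean (2.4) is invented.
    print(round(cohens_d(3.6, 1.2, 78, 2.4, 0.8, 37), 2))  # -> 1.1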

Using generalizability theory in these studies, we determined that we could consistently obtain reliability exceeding 0.85 with two raters using this rubric. In the course of these analyses, we trained six different raters who scored in various combinations across studies, with two or three raters in any one study. Based on this evidence, we anticipate that reliability with two trained raters would remain above 0.8.
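As a rough guide to how reliability scales with the number of raters, the standard Spearman-Brown prophecy formula can be applied; the single-rater value below is back-calculated from the two-rater figure for illustration and was not directly estimated in our studies.

    def spearman_brown(single_rater_rel, k):
        """Projected reliability of the mean of k raters' scores."""
        return k * single_rater_rel / (1 + (k - 1) * single_rater_rel)

    # If two raters yield reliability of about 0.85, the implied single-rater
    # reliability r solves 2r / (1 + r) = 0.85, i.e. r is about 0.74.
    r1 = 0.85 / (2 - 0.85)
    print(round(spearman_brown(r1, 2), 2))  # -> 0.85
    print(round(spearman_brown(r1, 3), 2))  # -> 0.89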

4. Rater Training

The training process requires several steps. Appendix B provides examples that can be used to train raters; these reflections are based on actual examples but have been modified for use as training materials.

1. First, the review team discusses the rubric: review the guiding principles and then each level, including how each level differs from the preceding and succeeding levels.
2. As a group, the raters then discuss one of the supplied examples, elaborating what score they would give, why, and how that compares to the decision provided with the example. Continue this process until raters are comfortable applying the rubric.
3. At that point, have raters rate five examples and calculate their agreement with the provided scores (one way to do so is sketched below). If agreement is less than 0.8, review and discuss. Rater reliability can also be calculated using intraclass correlation coefficients or generalizability statistics; a reliability of at least 0.8 with two raters is required.

In our experience, this process is less successful when attempted as a single training session than when spread over two in-person sessions: one to achieve comfort with the rubric and a second, a week or two later, to address additional questions that arise as raters score reflections independently. In general, no more than 15 examples are needed to prepare raters to use the rubric consistently. It is also advisable initially to calculate reliability at regular intervals (e.g., every 20 to 50 reflections) to assess for drift and to create opportunities for raters to discuss any reflections they found particularly challenging to score.
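One way to operationalize the agreement check in step 3 is sketched below; treating scores within half a point of the provided score as agreement is an assumption made for illustration, not a rule of the rubric.

    def agreement_rate(trainee, reference, tolerance=0.5):
        """Fraction of reflections on which the trainee's score falls
        within `tolerance` of the provided reference score."""
        pairs = list(zip(trainee, reference))
        hits = sum(abs(t - r) <= tolerance for t, r in pairs)
        return hits / len(pairs)

    # Example: five training reflections scored by a new rater.
    trainee_scores = [2.0, 3.5, 1.0, 4.0, 5.0]
    reference_scores = [2.0, 3.0, 2.5, 4.5, 5.0]
    print(agreement_rate(trainee_scores, reference_scores))  # -> 0.8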


5. References

ABIM Foundation, ACP-ASIM Foundation, & European Federation of Internal Medicine. (2002). Medical professionalism in the new millennium: A physician charter. Obstetrics and Gynecology, 100(1), 170-172.

Accreditation Council for Graduate Medical Education. (1999). Outcomes project. Available at http://www.acgme.org/outcome/Comp/compFull.asp. Accessed January 12, 2010.

Aronson, L., Robertson, P., Lindow, J., & O'Sullivan, P. (2009, November). Guidelines for reflective writing produce higher quality reflections. Paper presented at the AAMC Research in Medical Education sessions, AAMC Annual Meeting.

Boud, D., & Walker, D. (1998). Promoting reflection in professional courses: The challenge of context. Studies in Higher Education, 23(2), 191-206.

Frank, J. R. (Ed.). (2005). The CanMEDS 2005 physician competency framework. Royal College of Physicians and Surgeons of Canada. Accessed October 29, 2009.

General Medical Council. (2009). Tomorrow's doctors. Accessed October 23, 2009.

Hatton, N., & Smith, D. (1995). Reflection in teacher education: Towards definition and implementation. Teaching and Teacher Education, 11(1), 33-49.

Learman, L. A., & O'Sullivan, P. (2007). Resident physicians' ability to reflect. Paper presented at the American Educational Research Association Annual Meeting, Chicago, IL. Contact: Lee Learman.

Learman, L. A., Autry, A. M., & O'Sullivan, P. (2008). Reliability and validity of reflection exercises for obstetrics and gynecology residents. American Journal of Obstetrics and Gynecology, 198(4), 461.e1-461.e8.

Mamede, S., Schmidt, H. G., & Penaforte, J. C. (2008). Effects of reflective practice on the accuracy of medical diagnoses. Medical Education, 42(5), 468-475.

Mann, K., Gordon, J., & Macleod, A. (2007). Reflection and reflective practice in health professions education: A systematic review. Advances in Health Sciences Education: Theory and Practice. doi:10.1007/s10459-007-9090-2

Mann, K. V. (2008). Reflection: Understanding its influence on practice. Medical Education, 42(5), 449-451. doi:10.1111/j.1365-2923.2008.03081.x

Mezirow, J. (1998). On critical reflection. Adult Education Quarterly, 48(3), 185-198.

Moon, J. A. (2004). A handbook of reflective and experiential learning: Theory and practice. London; New York: RoutledgeFalmer.

Pee, B., Woodman, T., Fry, H., & Davenport, E. S. (2002). Appraising and assessing reflection in students' writing on a structured worksheet. Medical Education, 36(6), 575-585.

Schön, D. A. (1983). The reflective practitioner: How professionals think in action. New York: Basic Books.

Wald, H. S., Davis, S. W., Reis, S. P., Monroe, A. D., & Borkan, J. M. (2009). Reflecting on reflections: Enhancement of medical education curriculum with structured field notes and guided feedback. Academic Medicine, 84(7), 830-837. doi:10.1097/ACM.0b013e3181a8592f

6. Appendix A: Reflective Ability Scoring Rubric


7. Appendix B: Training Examples

The following pages provide 20 examples with scores ranging from 0 to 6. These examples are derived from real reflections. In selecting them, we have tried to preserve the range of responses, approaches, and writing styles (including errors) we encounter, while modifying them sufficiently that they do not represent unique or identifiable individuals or situations. While the boxed guidance for rubric scores 0 and 1 mentions prompts, the prompts are not specified because this approach has been used with a wide variety of prompts, including “Select a clinical situation during this rotation that taught you the most about demonstrating integrity, respect and responsiveness to the needs of the patient above your own,” “Critically reflect on your community engagement experience,” and “Reflect on a recent clinical or other professional situation where you made a mistake or didn’t have the necessary knowledge or skills.” Although these examples are presented from lowest to highest scores, they should be chosen at random for training purposes so as not to bias the scoring.


Rubric Score 0

Applying the guiding principles and rubric, these examples scored ‘0’ because they did not follow the prompt and/or spoke only in broad generalities.

Example 1:

Note: I chose this as a topic to focus on, because I found a general “reflection” too broad, and needed a way to focus my thoughts.

I do not underestimate the challenge of teaching medicine, and I appreciate how hard it is to meet everyone’s individual needs, and to connect with people with such different backgrounds and interests. I have found the third year of medical school in regards to teaching and mentorship to be a huge disappointment.

For example, on my first clerkship I didn’t have continuity with any attending for longer than two weeks. The only sort of continuity I had was with one resident for one month. Given these brief periods of time, I don’t see how you can learn a field like medicine that relies on an apprenticeship model, with this reality. Once you find a good teacher you don’t get any time to establish a relationship with them