Portfolios

Instrument description

Snadden(1) describes the portfolio as a collection of evidence that learning has taken place, which in practice includes documentation of learning and progression, an articulation of what has been learned and, importantly, a reflection on these learning events/experiences. Portfolios are used both as a learning tool to stimulate reflective, experiential and deep learning, and as an assessment method to judge progression towards, or achievement of, specific learning objectives, competence and/or fitness to practice. Depending on the particular purpose of the portfolio, its content, the evidence required and the assessment criteria vary from context to context. However, Swing emphasises that any portfolio used for assessment purposes should clearly articulate the amount, type and quality of evidence required to establish proof of competence, and the marking criteria used to evaluate the quality of that evidence(2).

Common portfolio documentation can include descriptive material and graded evidence such as written reports; critical incident reports; samples of performance evaluations, e.g. clinical tutor reports; videotapes/audiotapes of interactions with patients; audit material; clinical records of procedures undertaken; learning plans; and written reflection on the evidence provided, including identification of strengths, weaknesses, and opportunities for improvement and professional growth(1, 3). Online technologies are increasingly being used to support portfolio use and development, and electronic portfolios (commonly referred to as e-portfolios) are becoming more popular due to their added flexibility(4-7).

Portfolio assessment is generally carried out using scoring rubrics, rating scales or checklists developed for a specific learning and assessment context.

Competency domain

A range of learning objectives/outcomes and competencies such as clinical performance and technical skills can be assessed using the portfolio approach. However, portfolios may be best used for the assessment of competencies that are difficult to evaluate using other techniques, e.g. communication, risk management, problem solving, response to feedback, decision making, response to ethical and professional dilemmas, patient advocacy, and information and change management skills.

Competency level

As well-structured portfolios contain evidence of learning and performance, they can provide a measure of the ‘does’ stage of Miller’s pyramid of competence(8). However, due to the predominantly qualitative nature of portfolio evidence, complications arise in trying to assess the quality of this evidence.

Assessment context

Portfolios are used extensively at undergraduate, postgraduate and continuing professional development levels in a range of health professional education settings including medical education(3, 9-15), nursing and midwifery(16-24), dental education(25-30) and allied health education(31, 32) to guide and monitor learning. However, most of the evaluative literature is concentrated in the medical education field, and to a lesser extent in nursing education.

At undergraduate and postgraduate levels portfolios are commonly used for formative assessment purposes, to provide students with feedback regarding their academic and/or professional competencies and to plan for future learning, and less commonly for summative assessment purposes. However, Davis et al describe a summative portfolio assessment system used as part of the final year examination at the Dundee Medical School(33). Similarly, Rees & Sheard document the summative portfolio assessment approach used with undergraduate medical students at the University of Nottingham(34, 35). Portfolio assessment has also been used widely with general practice(36) and specialist(37) registrars and with residents(38, 39) for assessing knowledge and skills gained during training. Webb et al developed and evaluated a Surgical Learning and Instructional Portfolio (SLIP) system to guide the education and professional growth of surgical residents(40).

At the continuing education level, portfolios are used for revalidation purposes, where individuals are required to provide evidence of fitness for continuing practice. The portfolio approach facilitates presentation of evidence of clinical performance, reflection on practice and consequent change in practice. Portfolio learning and assessment has been used in the continuing education of general practitioners(41), to guide and judge the professional competence of general practice trainers(42-44), and in a revalidation pilot scheme for general dental practitioners(28, 29).

Instrument Quality

The key challenge in portfolio assessment is establishing the psychometric qualities of the method, including its reliability and validity. Some propose that, because portfolios are inherently reflective and qualitative in nature, standardisation of portfolio content and strict guidelines and assessment criteria cannot be applied without compromising the validity, authenticity and flexibility of portfolio assessment. These researchers suggest that qualitative measures such as credibility, dependability, transferability and confirmability be applied to the evaluation of portfolio assessment systems(45). Others argue that the psychometric quality of portfolio assessment must be addressed and established if portfolios are to be used as part of high-stakes decision making(46).

Reported below are some of the studies that have investigated the psychometric properties and other qualities of this assessment method.

Reliability

In their study, Melville et al examined the reliability of the portfolio assessment of paediatric specialist registrars (SpRs), compared portfolios with records of in-training assessment (RITA), and investigated whether a change in portfolio quality occurred as a result of providing detailed feedback(37). Portfolios were assessed as part of registrars’ annual RITA review and were marked globally and along several domains including clinical skills, communication, ethics and attitudes, self-learning and teaching, evaluation and creation of evidence, and management. Each domain was marked twice on a scale of 0-10 (where 0-4 was unsatisfactory and 9-10 excellent), with marks given for both the quality of items presented and the quality of documentation.

Results of this study demonstrated moderate to high inter-rater correlation for the global portfolio assessment (coefficient 0.52; Cohen's kappa 0.35) and for the RITA interview (coefficient 0.71; Cohen's kappa 0.38). Domain-specific correlations were lower for both measures. There was moderate inter-assessment correlation between portfolios and RITA interviews (kappa 0.26 in year 1 and 0.29 in year 2). Generalisability data suggested that 5 successive ratings by a single observer, or independent ratings by 4 observers on the same occasion, would be needed to achieve a generalisability coefficient > 0.8 (excellent) for the overall portfolio rating. Results also showed that giving detailed feedback increased the quality of portfolio documentation and presentation from year 1 to year 2 of the study, but had no impact on portfolio scores. The authors concluded that, based on the reliability data from this study, portfolio assessment should be triangulated with other measures, and recommended better assessor training and refinement of assessment protocols to increase assessment reliability.
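The rater numbers reported above follow the logic of a generalisability (decision) study. As a simplified, illustrative approximation only (the study itself used a full generalisability analysis; the single-rating reliability of 0.5 below is an assumed round figure, not a value taken from the paper), the Spearman-Brown prophecy formula shows how the reliability of an averaged rating rises as independent ratings are added:

    R_k = \frac{k\,r}{1 + (k-1)\,r}, \qquad \text{e.g. } r = 0.5,\ k = 4 \;\Rightarrow\; R_4 = \frac{4 \times 0.5}{1 + 3 \times 0.5} = \frac{2.0}{2.5} = 0.8

where r is the reliability of a single rating and R_k is the reliability of the average of k independent ratings. On this logic, a measure with single-rating reliability of around 0.5 requires roughly four to five independent ratings before the composite exceeds the 0.8 threshold.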

Similarly, O’Sullivan et al examined the reliability of portfolios in assessing knowledge and skills during psychiatric residency training(39). The portfolios consisted of 5 topic areas and skills related to curriculum outcomes and program goals, and portfolio entries were to consist of actual work carried out by the residents on their rotations. Evaluation of the portfolio assessment system was carried out using 3 different measures: resident performance as reflected by the portfolio score, clinical evaluation as reflected by faculty global evaluation scores, and psychiatric knowledge as measured by the Psychiatry Resident In-Training Examination (PRITE). Generalisability analysis indicated that to reach a reliability coefficient of 0.7 (moderately high), 3 raters or 6 portfolio entries would be required. There were moderate inter-assessment correlations between portfolio scores and the PRITE measures, and only a low correlation between portfolios and clinical evaluation ratings. The authors note that low inter-rater correlations could be improved by refining the portfolio domain descriptors, better aligning the scoring rubric with curriculum outcomes, and increasing rater training(39).

Intra-rater and inter-rater reliability of assessor evaluations of general practice trainer portfolios was also assessed by Pitts et al, in a study in which twelve GP trainer portfolios were assessed by eight raters(42). The portfolio assessment criteria included reflective learning, awareness of present state and willingness to learn, recognition of effective teaching behaviours, identifying with the learner, awareness of educational resources and drawing conclusions for the future. Assessors also provided a global rating. The portfolio assessment was carried out on 2 separate occasions by the same assessors. Findings of the study demonstrated moderate intra-rater reliability but poor inter-rater reliability (kappa coefficients ranged from 0.1 to 0.41). Replication of this study indicated similar results(43), and the authors concluded that the portfolio approach was insufficient for making summative high-stakes assessments in this context. However, the authors later demonstrated the value of discussion between assessors in increasing inter-rater reliability(44): paired decisions made by 4 pairs of assessors showed greater reliability than separate assessments made by 8 individual assessors.

Davis et al document the use of a summative portfolio assessment system at the Dundee Medical School(33). Portfolios were assembled in years 4 and 5 and consisted of patient presentations, case discussions, practice procedure cards, PRHO attachment learning plans and forms, theme assessment reports and forms, elective reports and the 4th year assignment. Submitted portfolios were assessed for completeness, for the extent to which they met curriculum outcomes, and for students' strengths and weaknesses. Analysis of student results indicated low to moderate correlations (Spearman’s correlations of 0.34 to 0.47) between portfolio and final exam components. Although examiner evaluation questionnaires showed strong support for portfolio assessment and its ability to identify students’ strengths and weaknesses, students were more reserved about the benefit of the portfolio system, and concerns were expressed about the amount of resources required to put the portfolio together and the variable marking standards that were perceived to be applied. The authors conclude that the portfolio system provided a useful way of assessing outcomes not easily assessed by other methods, and that detailed examiner training and student briefing would add to the quality of the portfolios and their assessment.

The reliability of assessment criteria used in a summative portfolio assessment system at the University of Nottingham was evaluated by Rees & Sheard(35). This portfolio was used with 2nd year medical students and was intended to reflect their learning and performance of communication skills. Portfolio evidence consisted of a reflective commentary, personal reflection forms, and peer and teacher observation checklists. As part of this study 100 portfolios were marked by two raters on several dimensions, including the logic and coherence of the portfolio structure, level of critical reflection, level of current or future skills development, use of evidence and use of relevant literature, and these judgements were converted into a global percentage score. The level of agreement between the two raters on the percentage scores was satisfactory (intraclass correlation = 0.771), and the levels of agreement between raters on individual items ranged from fair (kappa = 0.359) to substantial (kappa = 0.693), with high agreement found for more objective dimensions such as use of documentary evidence and use of relevant literature. The authors note that the levels of agreement found in this study were higher than those reported elsewhere, which may be due to several factors including the use of different assessment criteria to rate the portfolios, a smaller number of raters, and the increased frequency of discussion and negotiation between raters in this study. The results of this study showed that summative assessment of portfolios can be carried out reliably, particularly when combined with discussion and negotiation between assessors to enhance inter-rater reliability.
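For readers less familiar with the agreement statistics quoted above, the short sketch below illustrates how Cohen's kappa corrects raw percentage agreement for the agreement expected by chance. It is a minimal illustration only: the ratings are hypothetical and do not come from the Nottingham study, and the pass/fail categories are assumed for simplicity.

    from collections import Counter

    def cohen_kappa(rater_a, rater_b):
        """Cohen's kappa for two raters judging the same set of items."""
        n = len(rater_a)
        # Observed proportion of items on which the two raters agree.
        p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        # Chance agreement: product of the raters' marginal category proportions.
        counts_a, counts_b = Counter(rater_a), Counter(rater_b)
        categories = set(rater_a) | set(rater_b)
        p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
        return (p_observed - p_chance) / (1 - p_chance)

    # Hypothetical judgements of ten portfolios on a single criterion.
    rater_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
    rater_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass", "pass", "pass"]
    print(round(cohen_kappa(rater_1, rater_2), 3))  # 0.524: moderate agreement beyond chance

A kappa of 1 indicates perfect agreement beyond chance and 0 indicates agreement no better than chance; the verbal labels used above (fair, moderate, substantial) follow the bands conventionally attributed to Landis and Koch.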

Validity

Portfolio assessment is generally deemed to have high face and content validity due to the participation of staff and/or students in portfolio development and selection of relevant content for inclusion.

Gadbury-Amyot et al demonstrated the validity of a portfolio assessment in a dental training program using Messick's framework for the validation of performance(26). Content validity was supported via expert validation of the constructs, skills and content measured, and demonstrated via significant correlations (p < .01) between the rubric subscales (internal consistency, measured by Cronbach's alpha, ranged from 0.81 to 0.95). Portfolio scores were also compared with traditional measures of competency including GPA scores and national and regional examination marks. There was a significant relationship between portfolio scores and GPA and examination scores, demonstrating moderate criterion or external validity. Findings of this study indicated that the scoring rubric accounted for 78% of the total variation in portfolio assessment, and that faculty raters were only a very minor source of variability. The authors indicate that a high generalisability coefficient (phi coefficient = .86) could be obtained by increasing the subscales of a portfolio scoring rubric to fourteen and decreasing faculty raters to three(26).
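As a point of reference for the internal consistency figures quoted above, Cronbach's alpha for a rubric of k subscales is

    \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_i}{\sigma^2_{\text{total}}}\right)

where \sigma^2_i is the variance of scores on subscale i and \sigma^2_{\text{total}} is the variance of the total rubric score. As a worked illustration with assumed round numbers (not values from the study), a rubric with k = 4 subscales whose variances sum to 10, and a total-score variance of 25, gives \alpha = (4/3)(1 - 10/25) = 0.8, at the lower end of the range reported by Gadbury-Amyot et al.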

Driessen et al examined a portfolio assessment system at Maastricht University in order to determine which characteristics influenced raters’ assessment of undergraduate medical students’ portfolios(10). The reflective portfolio used in this context is based on students’ performance and outcomes in the roles of medical expert, health care professional, scholar and person, and is assessed summatively. Following consultation with mentors and the literature, a Portfolio Quality Analysis Scoring Inventory was developed and used to assess 40 student portfolios on several dimensions, including the quality of the portfolio evidence, the quality of the analysis of strengths and weaknesses and its link with the evidence, and the formulation and achievement of learning objectives. Results indicated that inter-rater agreement ranged from 0.46 (moderate) to 0.87 (excellent) and was significantly affected by only one variable, the perceived quality of reflection (p < 0.001). The authors conclude that these results lend support to the validity of the portfolio as a method of global assessment of reflective competence. However, they note that study limitations include the small sample size and the focus on general reflective rather than clinical competencies. Similarly, O’Sullivan et al established the validity of portfolios in measuring competence during psychiatric residency training based on the trend for higher portfolio scores by year of training(39).

Acceptability and Feasibility

The feasibility and acceptability of revalidation of general dental practitioners using portfolio assessment is reported by Maidement et al(28, 29). Ten portfolios of evidence of fitness to practice were assessed by a panel of experts using criteria developed specifically for this context, and data regarding participants' and assessors’ perceptions of the revalidation portfolio and its assessment was gathered via interviews, focus groups and a survey. Data was also collected on the time spent compiling the portfolio. Findings show that 8 out of 10 portfolios were found to be of a standard sufficient for a recommendation of revalidation to be made. Both participants and assessors found the revalidation process using portfolios generally acceptable and feasible, but requested more training and support in portfolio use and review.

Fung et al sought to establish the feasibility of a national web-based ultrasound learning portfolio in resident education in obstetrics and gynecology(7). Data entered by 50 residents over a 3-year period was analysed to identify the categories of ultrasound cases entered; critical incidents of learning; the domain and stimulus of questions posed; and the educational resources used for learning. Findings supported the capacity of this portfolio system to facilitate reflection on learning experiences: following their ultrasound learning encounter, most residents asked questions of a cognitive nature (53.3%), and this was significantly associated with identification of a critical incident of learning (p < 0.001). Additionally, the questions of high-volume users were more often stimulated by reflection and self-assessment (81.5%) than those of low-volume users (61.3%), and high-volume users were more likely to report a change in their practice as a result of learning acquired through use of the KOALA system (p < 0.001). Participants identified several barriers to portfolio implementation, including lack of computer and/or Internet access in the clinical environment, lack of complementary faculty development initiatives, and failure to integrate the learning portfolio as a component of the resident evaluation process. Although the authors demonstrated the feasibility of implementing a web-based learning portfolio that facilitated data collection from residents in geographically dispersed areas and programs, a systematic evaluation of this portfolio as a resident assessment tool is recommended.

Lawson et al also assessed the usability and acceptability of an e-portfolio system implemented in a graduate certificate in health professional education by carrying out interviews and written evaluations with participants and tutors(47). Participants expressed mixed levels of confidence in using the technology and aspects of the portfolio due to unfamiliarity and/or technical difficulties. Another barrier cited was a lack of access to computers in the workplace. Tutors reported satisfaction with the easy electronic access to all student assessments and with the central repository of course information. Again, a systematic evaluation of this electronic portfolio as an assessment tool would be valuable and is highly recommended.

Qualitative approach to portfolio assessment

It has been proposed that Lincoln & Guba’s framework of credibility and dependability is a more appropriate measure of the quality of portfolio assessment systems, due to the inherently reflective and qualitative nature of portfolios(48). This framework incorporates a number of strategies including triangulation of information, prolonged engagement by the portfolio assessor/mentor, consulting and testing the data with members of the group from which the data was collected, establishing a documentary audit trail, and conducting an audit check with an external auditor.

Driessen et al have applied the qualitative framework described above to monitor a portfolio assessment system at Maastricht University. The authors demonstrated that incorporating elements such as regular feedback cycles; prolonged involvement of the relevant resource persons, including the student, in the decision-making process; a staggered decision-making process; comprehensive documentation of the decision-making process; and external quality assurance added to the credibility and dependability of the portfolio assessment system used in this context(45).