APPENDIX

Standards for Medical Education Submissions to the Journal of General Internal Medicine

Approved by JGIM Medical Education Deputy Editors

Revision date May 2008

A. Standards applicable to all manuscripts

All manuscripts must meet the broad standard of "scholarship" as defined by Boyer1 and operationalized by Glassick.2, 3

1. Quality questions
Is the research question clear? /
  • A research question (statement of study intent) can be framed as a question, study purpose, objective, specific aim, or hypothesis.
  • Clear statements of study intent for most quantitative studies will include:4
    • The population,
    • The independent variable,
    • The dependent variable, and
    • The relation between these variables (i.e., intent to seek an association or a difference).
  • Study goals and educational objectives often differ.

Is the question important or meaningful? / "Importance" is best supported through the use of a conceptual framework.5, 6
  • A conceptual framework situates the study goals, study design, or instructional methods within a model or theoretical framework that facilitates interpretation of results, application to new settings, and conduct of future research.
  • While these may take the form of formal theories, more often they are models or approaches.

Does this study build on what has already been done? /
  • Scholarly work must build upon prior knowledge and experience. This requires a concise but thorough literature review that highlights strengths and gaps in previous research.
  • The literature review should culminate in a "problem statement" that highlights the gap in past work that this study will fill. This leads directly to the study goal/purpose/hypothesis.
  • Development of the conceptual framework (see item 1) goes hand-in-hand with the literature review – the conceptual framework defines the scope of the review, and the review establishes and clarifies the framework.

2. Methods to match the question
Are methods appropriate for the research question?
Are critical decisions in methods justified? / Justification can be logistical ("it was not feasible to randomize"), logical ("after careful consideration of various options, we decided to … because …"), or supported by literature.
Are interventions (if any) grounded in previous empiric or theoretical work?
Are outcomes studied appropriate and meaningful for the study goals and/or educational objectives? /
  • Outcomes studied should match both the educational objectives and the study goals.
  • "Higher-order" outcomes (outcomes of behavior in practice or patient outcomes) are desirable, but are not yet the standard.
  • Objective outcomes are preferred over subjective (self-assessed) outcomes.

Are methods clearly defined and discussed? / This should include, as a minimum:
  • Study design
  • Setting
  • Participants (population and sampling)
  • Educational intervention (if any) and control intervention (if any) described in sufficient detail for replication
  • Outcome measures, including validity evidence
  • Statistical analyses including sample size
  • Ethical considerations

Are appropriate statistical tests employed? / Authors should:
  • Be explicit about which statistical tests are used for which analyses
  • Adjust for multiple independent comparisons7 when necessary
  • Report confidence intervals and effect sizes rather than p values alone
  • Verify the assumptions of statistical tests used.
    • For example, most survey data (including Likert-type scales) are ordinal and thus fail to strictly meet the assumptions of parametric tests (e.g., t-tests or ANOVA). Authors have two choices: (a) use strictly nonparametric methods (chi-square, Wilcoxon, Mann-Whitney, Kruskal-Wallis), or (b) explicitly state that the assumptions of the parametric model were explored and adequately met. A brief sketch illustrating both choices follows this list.
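To illustrate, here is a minimal Python sketch (hypothetical Likert-scale data; the scipy library is assumed available) showing choice (a), a Mann-Whitney comparison of two groups, along with one way the normality assumption relevant to choice (b) might be checked.

```python
# Minimal sketch, hypothetical data: choice (a), a nonparametric comparison
# of Likert-type responses between two groups.
import numpy as np
from scipy import stats

# Hypothetical 5-point Likert responses from two curriculum groups.
group_a = np.array([4, 5, 3, 4, 4, 5, 2, 4, 3, 5])
group_b = np.array([3, 2, 4, 3, 2, 3, 3, 2, 4, 3])

# Mann-Whitney U test: compares the two distributions without assuming
# normality or interval-level measurement.
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.3f}")

# For choice (b), assumptions of the parametric model should be checked
# explicitly before a t-test is used, e.g., with a normality test.
_, p_normal = stats.shapiro(group_a)
print(f"Shapiro-Wilk p (group A) = {p_normal:.3f}")
```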

3. Insightful interpretation
Is there evidence of reflective critique? / A reflective scholar will
  • Identify shortcomings
  • Identify strengths
  • Situate the work in the context of prior studies
  • Identify immediate applications and directions for future research

What does this study add to the field? / Avoid overstating the scope or importance of the study, but emphasize appropriate implications.
Is this a niche topic (and if so, is it appropriate for JGIM readers)?

4. Transparent reporting
Is writing clear and concise? Was proper grammar employed? / Authors should avoid jargon (including terms and acronyms developed by and unique to the investigator team) and employ a vocabulary that facilitates shared meaning and understanding across a broad readership.
Are quantitative data clearly reported? / Actual quantitative data, not just summaries, should be reported.
  • Continuous variables: means and standard deviations or standard errors
  • Categorical variables: numerator and denominator (not just percentages)
  • Report confidence intervals, not just p values; consider reporting effect sizes (see the sketch after this list).
  • Do not use "trend" when describing a result that does not reach statistical significance.
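As an illustration, the following minimal Python sketch (hypothetical pre/post test scores) reports a mean difference with a 95% confidence interval and an effect size (Cohen's d for paired data) rather than a p value alone.

```python
# Minimal sketch, hypothetical data: report a confidence interval and an
# effect size, not just a p value.
import numpy as np
from scipy import stats

# Hypothetical paired pre/post test scores for eight learners.
pre = np.array([62.0, 71.0, 58.0, 66.0, 74.0, 69.0, 61.0, 65.0])
post = np.array([70.0, 78.0, 64.0, 73.0, 80.0, 75.0, 66.0, 72.0])

diff = post - pre
n = len(diff)
mean_diff = diff.mean()
sd_diff = diff.std(ddof=1)

# 95% confidence interval for the mean paired difference (t distribution).
t_crit = stats.t.ppf(0.975, df=n - 1)
half_width = t_crit * sd_diff / np.sqrt(n)
print(f"Mean difference = {mean_diff:.1f} "
      f"(95% CI {mean_diff - half_width:.1f} to {mean_diff + half_width:.1f})")

# Cohen's d for paired data: mean difference / SD of the differences.
print(f"Cohen's d = {mean_diff / sd_diff:.2f}")
```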

Are qualitative data clearly reported? / Qualitative research should report specific themes along with supporting quotations and excerpts.
Are data reported in the Abstract? / Actual data, not just summaries, should be reported in the Abstract.
  • For quantitative research, means (and standard deviations or standard errors) or percentages should be reported, along with confidence intervals and p values as appropriate.
  • For qualitative studies, data refers to specific themes identified and concise summaries of supporting evidence.

5. Attention to ethical issues
Was appropriate attention paid to the protection of human subjects? / A statement regarding informed consent and IRB review is required for all studies involving human subjects, including students (see JGIM Instructions to Authors).
  • IRB review (approval or exemption, depending on the study) is required in nearly all cases.
  • Informed consent requirements for education research vary across institutions and study designs, and thus consent is not required in all cases. However, investigators should ensure that study participants' confidentiality and autonomy are protected.

Did authors adhere to standards of scientific integrity? / JGIM strongly opposes ghostwriting,8 honorary authorship, undisclosed conflicts of interest,9 duplicate publication, and plagiarism.
Does this submission reflect a reasonably comprehensive report of the study? / Splitting research results into the "smallest publishable unit" (informally called salami slicing10) in order to maximize the number of publications from a single study is inappropriate.

B. Standards for specific study types

Educational innovations
  • "Educational Innovations" are "succinct descriptions of innovative approaches to improving medical education."
  • The key to a successful "Educational Innovation" publication is a novel, well-described idea that addresses an important need and builds on prior work.

Is this innovative? /
  • There must be evidence of a diligent search for similar or relevant prior work.
  • Ideas that have already been described are not "Educational Innovations" but may qualify for publication as "Brief Reports" or "Original Research" if they improve or build on previous research.
  • Even new ideas require reference to prior work; scholarly innovations do not appear from thin air. Even when an idea has never been previously described, a diligent search will invariably identify previous work (empiric and theoretical) to support the approach followed.

Is the innovation clearly described? /
  • Authors must describe the innovation, including both the educational objectives and the specifics of the innovation itself, sufficiently well that a reader could implement it at his/her own institution.

Is the evaluation appropriate? /
  • It is not required that all innovation manuscripts report an evaluation, but it helps immensely. Only the most innovative ideas can get by without an adequate evaluation.
  • As the degree of innovation goes down, the rigor of the evaluation should go up.

Is there evidence of reflective critique? /
  • What went well?
  • What did not work as planned? (Authors are not penalized for honesty and candor.)
  • How and why do results vary from previous research?
  • What areas are identified for improvement and/or further research?

Does it adhere to JGIM "Instructions for Authors"? /
  • Recommended headings for abstract and body
  • Succinct (<2000 words)

Survey research
Is the research question clearly stated and justified? / Justification is best accomplished by a thorough literature review.
Is a survey an appropriate way to answer this research question?
Is the study sample reasonably representative of the target population? /
  • Is the target population defined? (To be published in a peer-reviewed journal, the population must be fairly broad, e.g., national in scope.)
  • Single-institution studies are typically less desirable, but in certain cases (e.g., a large sample likely to be viewed as similar to a group at another institution) this may be acceptable.

Is there evidence to support the "plausible validity" of instrument scores? /
  • Validity evidence should be concisely summarized and include, at a minimum, content evidence and reliability data (see "Development and Evaluation of Assessment Tools").
  • Instrument items should adhere to current standards (for example, multiple-choice questions should adhere to guidelines11).

Are methods described in sufficient detail? /
  • How was the survey administered (mail, Web, phone, other)?
  • What methods were used to encourage follow-up?
  • What data analyses were planned, and how were these targeted to specific questions?

Did authors adequately guard against researcher bias? /
  • Researchers can (intentionally or unintentionally) bias the questionnaire to favor their (conscious or unconscious) interests.
  • Rigorous instrument development methods and/or use of existing instruments can help reduce this bias.
  • Outcome reporting bias occurs when researchers conduct multiple analyses and then report only those that are significant or interesting. Options to minimize this include:
    • State the research questions in advance.
    • Pre-plan all analyses to focus directly on the research questions, and conduct only planned analyses.
    • List all analyses conducted in the Methods and/or Results (including those whose results are not reported).
    • Adjust for multiple comparisons using omnibus tests or Bonferroni's adjustment7 (see the sketch after this list).
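As a brief illustration of the last option, this minimal Python sketch (hypothetical p values; the statsmodels library is assumed available) applies the Bonferroni correction to a set of pre-planned comparisons.

```python
# Minimal sketch, hypothetical data: Bonferroni adjustment for multiple
# pre-planned comparisons.
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p values from five pre-planned survey-item comparisons.
raw_p = [0.012, 0.049, 0.003, 0.210, 0.038]

reject, adjusted_p, _, _ = multipletests(raw_p, alpha=0.05, method="bonferroni")
for raw, adj, sig in zip(raw_p, adjusted_p, reject):
    print(f"raw p = {raw:.3f} -> adjusted p = {adj:.3f} "
          f"({'significant' if sig else 'not significant'})")
```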

Do the Results and Discussion focus on key points? / Results and Discussion should highlight key points that support a clear message.
Is the response rate adequate? / Low response rates leave open the possibility of bias.
  • There is no universal definition of adequate, but authors should consider the response rate as they interpret results.
  • If the response rate is low, it is important to describe the differences between responders and non-responders; a brief sketch of one such comparison follows this list.
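Here is a minimal Python sketch (hypothetical counts) of one such comparison: testing whether responders and non-responders differ on a known characteristic, such as training level, with a chi-square test.

```python
# Minimal sketch, hypothetical data: compare responders with non-responders
# on a known characteristic to probe for non-response bias.
import numpy as np
from scipy import stats

# Rows: responders, non-responders; columns: residents, attendings.
table = np.array([[45, 30],
                  [25, 40]])

chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")
```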

Do authors report a sample of instrument items? /
  • It is generally useful to see the exact questions, either:
    • In a table (reporting questions and results in the same table), or
    • As an appendix.
  • It is not usually necessary to publish the actual instrument; listing the questions as text (e.g., in a table) saves space.
  • If not all questions are provided, authors should report at least a few examples of typical questions.

Needs analyses
  • A "needs analysis" is intended to identify the current state of a specific medical education issue.
  • These frequently address potential deficiencies in student knowledge/skills or curricular content, but can also explore other educational "gaps" such as work hour violations, inequities in academic promotion, or resident well-being.

Is the research question clearly stated and justified?
Is the study sample reasonably representative of the target population? /
  • A knowledge deficiency at one institution does not indicate a national need.
  • Single-institution studies will meet with skepticism.

Is there evidence to support the "plausible validity" of instrument scores? /
  • See "Development and Evaluation of Assessment Tools" below.

Were appropriate methods used? / Methods might include surveys, tests, focus groups, chart audits, task analysis, and review of published and unpublished documents (many other methods are possible).
Do the Results and Discussion focus on key points? / Results and Discussion should highlight key points that support a clear message.
Do authors report a sample of instrument items? / See guidelines for Survey Research for details.

Development and evaluation of assessment tools
The central theme of these studies is the validity of the instrument's scores for the intended purpose: "Is it plausible that the scores are measuring what I want them to measure?"
Do authors present a variety of evidence to support score validity? /
  • Content – how well does the instrument match the intended construct domain?
  • Response process – how do idiosyncrasies of the actual responses affect scores?
  • Internal structure – these are typically psychometric data (reliability, factor analysis); a brief reliability sketch follows this list.
  • Relations to other variables – how do scores relate to other variables that purport to measure a similar or different construct?
    • This includes previous concepts of concurrent, predictive, and construct validity.
  • Consequences – do the scores make a difference (and is it an intended or unintended effect)?
    • An example of consequences evidence in clinical research would be a study investigating chest x-ray screening for lung cancer and looking at mortality rates (does the assessment [chest x-ray] make a difference?).
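As an illustration of internal-structure evidence, the following minimal Python sketch (hypothetical item responses) computes Cronbach's alpha, a common reliability coefficient, from first principles.

```python
# Minimal sketch, hypothetical data: Cronbach's alpha for a five-item scale.
import numpy as np

# Rows = respondents, columns = items (hypothetical 5-point responses).
items = np.array([
    [4, 5, 4, 3, 4],
    [2, 3, 3, 2, 2],
    [5, 5, 4, 5, 5],
    [3, 4, 3, 3, 4],
    [4, 4, 5, 4, 3],
    [2, 2, 3, 2, 3],
])

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance).
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```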

Have authors presented a convincing validity argument? / The validity argument may be constructed as follows:
  1. State the initial "hypothesis" about what the scores should reflect.
  2. Plan and execute studies to collect evidence to support or refute that hypothesis (ideally, test the weakest assumption), and analyze data.
  3. Revise the hypothesis (either the instrument, the construct, or the appropriate context of application) if needed.
  4. Repeat steps 2 and 3 until evidence "sufficient" to support (or reject) the validity argument has been collected. "Sufficient" will vary depending on the application.

Is it plausible that the scores measure what the investigators purport them to measure? / Any validity study should present data from several complementary sources of evidence.12
  • Ideally, the evidence will address the most critical or questionable aspects of the validity argument.
  • Studies need not present evidence from all five sources.

Notes on terminology / We are less interested in terminology than in authors demonstrating an understanding and appropriate application of the concepts embodied above (particularly the framing of validity as a hypothesis to be tested). However, common terminology often helps with clear communication of ideas.
  • Validity refers to how well an instrument's scores reflect an intended underlying construct (e.g., knowledge, skill, attitude).
  • Validity is a property of an instrument's scores, not of the instrument itself. It is inappropriate to speak of valid instruments; rather, one should speak of valid scores.
  • Face validity is not a useful concept.13, 14 When authors talk about face validity, sometimes (but not always) they are alluding to content evidence.

Sources for additional information / We refer authors to other sources14-17 for more information on validity and examples of how validity data might be collected for new and existing instruments.

Evaluation studies
Was there a need for this intervention? / A need can be demonstrated based on a new or published needs analysis, a literature review, or a theory in need of applied testing, among other methods.
Are intervention objectives clearly stated? / Objectives should arise from the needs.
Is the intervention grounded in theory and/or empiric evidence? / The intervention should build on prior work.
Is the intervention aligned with the objectives?
Is the evaluation aligned with the educational objectives and the study goals? /
  • Randomized trials are not required, but authors should consider relevant validity threats.18, 19
  • Outcomes studied should match both the educational intervention and the study goals.

Are other standards of quality met? / Consider consulting published guidelines.20-24

Qualitative research
Is there a focused question?
Are appropriate sampling and data collection methodologies used?
Do the inductive analytic methods promote trustworthiness, credibility, dependability, and transferability? / Some methods to this end include (authors need not employ all of these, and others may be appropriate):
  • Duplicate coding
  • Triangulation
  • Member checks
  • Saturation
  • Peer review

Do results demonstrate a clear logic of inquiry?
Are appropriate data presented in the Results? / Data include themes and supporting quotations/excerpts.
Sources for additional information / We refer investigators to published resources and standards.25-28

Systematic reviews
Did the review address a focused question?
Is there more than one author?
Are eligibility criteria clearly described?
Was the search thorough enough?
Did authors assess the quality of primary studies?
Sources for additional information / We refer investigators to existing published resources and standards.29-31

C. Resources

Design and conduct of education scholarship and research

Journal articles

  • Beckman TJ, Cook DA. Developing scholarly projects in education: a primer for medical teachers. Med Teach. 2007; 29:210-8.
  • Wilkes M, Bligh J. Evaluating educational interventions. BMJ. 1999; 318:1269-72.
  • Britten N. Making sense of qualitative research: a new series. Med Educ. 2005; 39:5-6. (and other articles in this series)
  • Carney PA, Nierenberg DW, Pipas CF, Brooks WB, Stukel TA, Keller AM. Educational Epidemiology: Applying Population-Based Design and Analytic Approaches to Study Medical Education. JAMA. 2004; 292:1044-50.
  • Cook DA, Bordage G, Schmidt HG. Description, Justification, and Clarification: A Framework for Classifying the Purposes of Research in Medical Education. Med Educ. 2008; 42:128-33.
  • Cook DA, Beckman TJ. Reflections on experimental research in medical education. Adv Health Sci Educ Theory Pract. 2008; Epub ahead of print 22 April 2008; doi 10.1007/s10459-008-9117-3.
  • Gerrity MS. Medical Education and Theory-Driven Research. J Gen Intern Med. 1994; 9:354-5.

Books

  • Cronbach LJ. Designing Evaluations of Educational and Social Problems. San Francisco: Jossey-Bass, 1982.
  • Fraenkel JR, Wallen NE. How to design and evaluate research in education. New York, NY: McGraw-Hill, 2003.
  • Green JL, Camilli G, Elmore PB. Handbook of Complementary Methods in Education Research. Mahwah, NJ: Lawrence Erlbaum, 2006.
  • Miles MB, Huberman AM. Qualitative data analysis: an expanded sourcebook. Thousand Oaks, CA: Sage, 1994.
  • Norman G, Van der Vleuten C, Newble D (eds). International Handbook of Research in Medical Education. Dordrecht: Kluwer Academic Publishers, 2002.

Validity, measurement, and statistics