Epidemiology and Biostatistics Basics for Critical Appraisal of Medical Literature


Michael Aldous, MD, MPH

I. Definition. Epidemiology is "the study of the occurrence and distribution of diseases and other health-related conditions in populations." (Kelsey, 1986).

II. Measures of disease frequency

Incidence rate (also incidence density, or incidence). The rate of new disease onsets over time. Reflects both the proportion of persons affected and the time to onset. Calculation: the number of new disease onsets divided by the sum of person-time under observation.

Cumulative incidence. "The proportion of a fixed population that becomes diseased in a stated period of time" (Rothman, 1986). Can be interpreted as the average risk of acquiring disease during the specified interval. It is a dimensionless probability or proportion, and is very commonly reported simply as the proportion of subjects experiencing an outcome (i.e., without the term “cumulative incidence”).

Prevalence. The proportion of a population affected by disease at a given point in time. Emphasizes disease status, without reference to onset of disease. The best measure for diseases whose onset is gradual, making incidence calculation difficult, and for chronic diseases.
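These three measures can be illustrated with a short calculation. The sketch below uses invented follow-up numbers purely for illustration:

```python
# Sketch: the three frequency measures computed from hypothetical
# follow-up data. All counts below are invented for illustration.

# Incidence rate: new onsets per unit of person-time
new_cases = 12
person_years = 480.0                        # sum of each subject's observed time
incidence_rate = new_cases / person_years   # 0.025 cases per person-year

# Cumulative incidence: proportion of a fixed cohort that becomes
# diseased during the interval (dimensionless; an average risk)
cohort_size = 200
cumulative_incidence = new_cases / cohort_size   # 0.06

# Prevalence: proportion diseased at one point in time, regardless
# of when onset occurred
diseased_now = 30
prevalence = diseased_now / cohort_size          # 0.15

print(incidence_rate, cumulative_incidence, prevalence)
```

Note that the incidence rate carries units (per person-year), while cumulative incidence and prevalence are unitless proportions.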

III. Comparisons

Epidemiology almost always involves comparisons between sample groups representing different populations. The comparability of the groups is fundamental to study validity.

IV. Measures of effect

Measures of effect evaluate associations between exposures and outcomes. Exposures include any characteristic that might affect the risk of disease. Outcomes include onset of disease, death, severity of disease, etc.

Absolute effects. Useful for assessing the societal burden of disease. The incidence difference (rate difference) is the arithmetic difference in incidence between the exposed and unexposed groups. The risk difference is the difference in cumulative incidence or prevalence. Risk difference is also known as absolute risk reduction. If an association is thought to be causal, one can calculate the number needed to treat (NNT) as the reciprocal of the risk difference:

NNT = 1/(Risk Difference)

NNT is the number of patients one would need to treat in order to have one additional patient experience a good outcome (e.g., one less death).
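As a sketch, with invented event rates for a hypothetical trial:

```python
# Sketch: risk difference and NNT from hypothetical trial results.
# The two event rates below are invented for illustration.
risk_control = 0.10   # cumulative incidence of death, control group
risk_treated = 0.06   # cumulative incidence of death, treated group

risk_difference = risk_control - risk_treated   # absolute risk reduction, ~0.04
nnt = 1 / risk_difference                       # ~25 patients

print(f"Treat {nnt:.0f} patients to prevent one additional death")
```

A small risk difference thus implies a large NNT: halving the risk difference doubles the number of patients who must be treated per death prevented.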

Relative measures express the disease frequency among exposed subjects divided by that among the non-exposed. They are often reported in clinical and epidemiologic studies and are sometimes preferred for inferring etiology. Relative measures include the rate ratio for incidence rates, risk ratio (or relative risk) for cumulative incidence or prevalence, and odds ratio. The odds of disease are the probability that disease is present divided by the probability that disease is absent. For rare diseases (less than 5% of subjects), the odds ratio approximates the relative risk. Odds ratios are used especially in case-control studies (where relative risk cannot be calculated), and in logistic regression analyses (which yield odds ratios directly).
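The relative measures, and the rare-disease approximation, can be sketched from a hypothetical 2x2 table (all counts invented for illustration):

```python
# Sketch: relative measures from a hypothetical 2x2 table.
#                diseased   healthy
# exposed          a = 9    b = 291
# unexposed        c = 3    d = 297
a, b, c, d = 9, 291, 3, 297

risk_exposed = a / (a + b)       # 0.03
risk_unexposed = c / (c + d)     # 0.01
risk_ratio = risk_exposed / risk_unexposed      # ~3.0

# Odds = P(disease present) / P(disease absent)
odds_exposed = risk_exposed / (1 - risk_exposed)
odds_unexposed = risk_unexposed / (1 - risk_unexposed)
odds_ratio = odds_exposed / odds_unexposed      # equals (a*d)/(b*c), ~3.06

# Disease affects <5% of subjects here, so the odds ratio
# closely approximates the risk ratio.
print(round(risk_ratio, 2), round(odds_ratio, 2))
```

With common diseases the two measures diverge: as the risks approach 1, the odds ratio exaggerates the relative risk.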

V. Hypothesis testing and effect estimation

Hypothesis testing. A question of scientific interest can be formulated into an alternative hypothesis, which can never be directly proven, and a contradictory null hypothesis, which, if rejected on the basis of observed data, lends support to the alternative.

The null hypothesis proposes that exposure is unrelated to outcome.

The alternative hypothesis proposes that exposure affects the risk of developing the outcome.

P-value is the probability, if the null hypothesis is true, of observing, purely by chance, an effect as great as or greater than that found in the data.

Type I (alpha) error occurs when there is no true association between exposure and outcome, but the observed data yield (purely by chance) a P-value less than the cutoff for "statistical significance," and the investigator concludes that an effect is present.

Type II (beta) error occurs when there is a causative relationship between exposure and outcome, but the sample size is inadequate to demonstrate a "significant" difference, and it is concluded that no relationship is present.
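The meaning of alpha error can be demonstrated by simulation. The sketch below (sample sizes, event probability, and the normal-approximation z-test are all arbitrary choices for illustration) draws many pairs of groups from the same population, so the null hypothesis is true by construction; roughly 5% of comparisons are nonetheless "significant" at the 0.05 cutoff:

```python
# Sketch: simulating type I error under a true null hypothesis.
import math
import random

def two_proportion_p(x1, n1, x2, n2):
    """Approximate two-sided p-value comparing two proportions
    (normal approximation to the binomial; fine for illustration)."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = abs(x1 / n1 - x2 / n2) / se
    return math.erfc(z / math.sqrt(2))   # P(|Z| > z), two-sided

random.seed(0)
n, p_true, trials = 200, 0.3, 2000       # both groups share p_true: null is true
false_positives = 0
for _ in range(trials):
    x1 = sum(random.random() < p_true for _ in range(n))
    x2 = sum(random.random() < p_true for _ in range(n))
    if two_proportion_p(x1, n, x2, n) < 0.05:
        false_positives += 1

print(f"Empirical type I error rate: {false_positives / trials:.3f}")  # ~0.05
```

The same simulation, run with unequal true probabilities and a small n, would show type II error: a real difference that "significance" testing frequently fails to detect.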

Effect estimation differs from hypothesis testing in that a dichotomous decision ("significant" versus "not significant") is not forced. The magnitude of an effect is estimated by calculation of a point estimate (risk ratio, odds ratio, etc.), and its precision is expressed as a confidence interval with an arbitrary level of confidence (often 95%). A confidence interval may be interpreted as the range of values within which the true effect probably lies, although this is not strictly true in a statistical sense. Both the magnitude and precision of an effect estimate are considered in interpreting study results (see Users' Guides, II.B.).
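As a sketch, a point estimate and 95% confidence interval for a risk ratio can be computed on the log scale (the standard Katz approximation); the cohort counts below are invented for illustration:

```python
# Sketch: risk ratio with a 95% confidence interval (log-scale
# approximation). Counts are hypothetical.
import math

a, b, c, d = 30, 70, 15, 85   # exposed: 30/100 diseased; unexposed: 15/100

rr = (a / (a + b)) / (c / (c + d))    # point estimate: 2.0
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
lo = math.exp(math.log(rr) - 1.96 * se_log_rr)   # ~1.15
hi = math.exp(math.log(rr) + 1.96 * se_log_rr)   # ~3.48

# The interval excludes 1.0, so the result is "significant" at the
# 5% level, but the width of the interval conveys far more: the data
# are compatible with anything from a modest to a large effect.
print(f"RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```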

VI. Validity

Type I and type II errors are random errors, related to chance and probability. Systematic error (bias) results from improper study design. Validity is the lack of bias. A valid result is one that accurately represents the state of nature.

Selection bias. The process by which subjects are selected results in non-comparable groups.

Misclassification. Inaccurate measurement or categorization of variables (e.g., subjects with disease may be more likely than healthy controls to recall an exposure, resulting in a spurious association [recall bias]).

Confounding. A bias resulting from unequal distribution among exposure groups of a risk factor for the outcome (e.g., if males and females have different smoking rates, a difference in coronary heart disease due to smoking might erroneously be attributed to gender).

VII. Study designs

Randomized trial. Subjects are taken from a single population, assigned to a treatment group by the investigator using a random process, then observed, and outcome is measured. Study groups thus approach the ideal of being alike in every way except for the exposure of interest. This is the strongest design for evaluating therapeutic or preventive measures, but is expensive and not always ethically feasible.

Cohort (follow-up) study. Subjects are grouped according to (naturally occurring) exposure status, observed over time, and outcome is compared. Exposure status and potential confounders can be measured prospectively to minimize misclassification. Cohort studies are expensive, and inefficient for rare outcomes, because many subjects must be followed over time.

Case-control study. Subjects are identified according to disease status, then previous exposure status is determined retrospectively. This design is efficient for rare outcomes. However, recall bias threatens validity, as does the difficulty of selecting cases and controls from comparable populations. As with cross-sectional studies, it can be unclear whether exposure precedes outcome (see Section VIII).

Cross-sectional study. Subjects are sampled without regard to exposure or disease status, then disease prevalence is compared according to exposure status.

Ecological study. The unit of analysis is a group rather than an individual. This is a much weaker design, useful primarily for generating ideas for further study.

VIII. Assessing cause and effect. The following factors should be considered:

Temporal sequence. Cause must precede effect. This is the only sine qua non.

Strength of association. A strong association tends to be more convincing than a weak one.

Consistency. A relationship found in multiple studies of different populations is more likely to be causal; however, some effects are present only in specific circumstances.

Dose-response. A strong dose-response relationship suggests biologic causation but may be absent if the doses studied all elicit the maximum response or are below the effective threshold.

Confounding. Can any confounding factor account for the observed effect (see above)?

Plausibility. Does the effect agree with what is known about the biologic process?

IX. Approach to reading the literature

A. Initial screening of journal articles

Title. Interesting or useful? If not, reject.

Abstract. If the results are valid, are they useful to me? If not, reject.

Site. Are results from this site applicable to my practice? If not, reject. If so, continue.

B. Evaluation of a new therapy or preventive measure (see Users' Guides, II.A. & II.B.).

Primary guides to validity:

Was treatment assignment randomized?

Were all patients accounted for at the end of the study?

Was follow-up complete?

Were they analyzed in the groups to which they were randomized?

Interpretation of the results:

How large was the effect?

How precise was the effect?

Can results be applied to my patients?

Were all clinically important outcomes considered?

Are the likely benefits worth potential harms and costs (consider number needed to treat for each)?

C. Evaluation of a new diagnostic or screening test (see Users' Guides, III.A. & III.B.).

Primary guides to validity:

Was there a blind comparison with an acceptable reference standard?

Was the spectrum of patients typical of usual clinical practice?

Interpretation of the results:

Do the data allow calculation of sensitivity and specificity or likelihood ratios?

Will reproducibility be satisfactory in my setting?

Are the results applicable to my patient?

Will the results change my management?

Will patients be better off if I use the new test?

D. Evaluation of a study about harm (see Users' Guides, IV.).

Primary guides to validity:

Were there clearly defined comparison groups that were similar with respect to determinants of outcome?

Were outcomes and exposures measured in the same way for each group?

Interpretation of the results:

How strong is the association?

How precise is the estimate?

Are the results applicable to my practice?

What is the magnitude of the risk?

Should I attempt to stop the exposure?

E. Evaluation of a study on prognosis (see Users' Guides, V.).

Primary guides to validity:

Was there a representative and well-defined sample of patients at a similar point in the course of disease?

Was follow-up sufficiently long and complete?

Interpretation of the results:

How large is the likelihood of the outcome in a specified time-period?

Were the patients similar to my own?

Will the results lead to changes in therapy?

Are the results useful for counseling or reassuring my patients?

X. Rule of 3 for interpreting zero numerators (VERY USEFUL!)

If no adverse events were observed among n subjects, the probability of a given adverse outcome, within 95% confidence limits, is no more than 3/n (see Hanley, 1983).
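The rule is a close approximation to the exact binomial bound, as a quick sketch shows (n is arbitrary):

```python
# Sketch: the rule of 3 versus the exact upper 95% binomial bound
# it approximates. n is arbitrary.
import math

n = 100                       # subjects observed, zero adverse events
rule_of_three = 3 / n         # approximate upper 95% bound: 0.03

# Exact bound: the largest p with (1 - p)^n >= 0.05,
# i.e. p = 1 - 0.05 ** (1/n)
exact = 1 - 0.05 ** (1 / n)   # ~0.0295

print(f"Rule of 3: {rule_of_three:.4f}; exact bound: {exact:.4f}")
```

So "no events in 100 patients" is still compatible (at 95% confidence) with an adverse-event risk as high as about 3%.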

XI. References

Kelsey JL, Thompson WD, Evans AS. Methods in Observational Epidemiology. New York: Oxford University Press, 1986.

Rothman KJ. Modern Epidemiology. Boston: Little, Brown and Company, 1986.

Hanley JA, Lippman-Hand A. If nothing goes wrong, is everything all right?: Interpreting zero numerators. JAMA 1983;249:1743-1745.

Angell M. The interpretation of epidemiologic studies. N Engl J Med 1990;323:823-5.

Moyer VA. Confusing conclusions and the clinician: an approach to evaluating case-control studies. J Pediatr 1994;124:671-4.

Selected Users' Guides to the Medical Literature ("The JAMA Series")

Oxman AD, Sackett DL, Guyatt GH et al. Users' guides to the medical literature. I. How to get started. JAMA 1993;270:2093-5.

Guyatt GH, Sackett DL, Cook DJ et al. Users' guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? JAMA 1993;270:2598-2601.

Guyatt GH, Sackett DL, Cook DJ et al. Users' guides to the medical literature. II. How to use an article about therapy or prevention. B. What were the results and will they help me in caring for my patients? JAMA 1994;271:59-63.

Jaeschke R, Guyatt G, Sackett DL et al. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? JAMA 1994;271:389-91.

Jaeschke R, Guyatt GH, Sackett DL et al. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? JAMA 1994;271:703-7.

Levine M, Walter S, Lee H et al. Users' guides to the medical literature. IV. How to use an article about harm. JAMA 1994;271:1615-9.

Laupacis A, Wells G, Richardson WS et al. Users' guides to the medical literature. V. How to use an article about prognosis. JAMA 1994;272:234-7.

Oxman AD, Cook DJ, Guyatt GH et al. Users' guides to the medical literature. VI. How to use an overview. JAMA 1994;272:1367-71.