Bias and Confounding

One function of epidemiological research is to examine the relationship between exposure and outcome. To this end, various studies are conducted in which data are gathered, analysed and interpreted.

Bias is any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth. The term also covers any deviation of results or inferences from the truth, and the processes leading to such deviation.

Bias can occur during any stage of a study:

  • during the literature review of the study question
  • during the selection of the study sample
  • during the measurement of exposure and outcome
  • during the analysis of data
  • during the interpretation of the analysis
  • during the publication of the results

Various forms of bias have been described and defined. Most of them, however, can be placed in one of three general categories:

  • Selection bias
  • Information bias
  • Confounding bias

Some biases are specific to a particular type of analytical study, whereas others can be found in all basic study designs (cross-sectional, case-control and cohort).

Selection bias

Selection bias can occur in the design phase of a study. It may also occur during the execution of a study, when the procedures used to select subjects lead to some subjects being included and not others. When the characteristics of the subjects selected for the study differ systematically from those of the target population, the estimate of effect is in error and the measured effect is distorted.

Many varieties of selection bias have been described. Admission bias, prevalence/incidence bias, detection bias, volunteer bias and loss to follow-up bias are common forms. Admission (Berkson’s) bias occurs when case-control and cross-sectional studies are done exclusively in hospital settings, where the population studied does not accurately reflect the target population.

Prevalence/incidence bias happens when mild or asymptomatic cases, as well as short fatal disease episodes, are missed because the study is performed late in the disease process. Volunteer bias occurs when those who volunteer to participate in a study differ systematically, with regard to either exposure or disease status, from those who do not volunteer.

The common element of such biases is that the relation between exposure and disease is different for those who participate in the study and those who would theoretically be eligible for the study but do not participate.

Selection bias is a theoretical possibility whenever correlates of the outcome that can influence study participation are present in some individuals at the beginning of the study. These correlates may be unmeasured or even unrecognised by the investigator.
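As a minimal numerical sketch of this distortion, the Python snippet below applies selection probabilities to a target population in which exposure and disease are unrelated. All counts and probabilities are hypothetical, chosen only so that exposed cases are the group most likely to enter the study; nothing here comes from real data.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed non-cases, d = unexposed non-cases.
    """
    return (a * d) / (b * c)


# Hypothetical target population, in the order (a, b, c, d).
# Exposure and disease are unrelated here: the true odds ratio is 1.
population = (200, 800, 1800, 7200)

# Hypothetical probability that a person in each cell enters the study.
# Exposed cases are assumed to be selected most often.
selection_prob = (0.8, 0.4, 0.2, 0.2)

# Expected number of subjects in each cell of the study sample.
sample = tuple(n * p for n, p in zip(population, selection_prob))

or_population = odds_ratio(*population)
or_sample = odds_ratio(*sample)

print(round(or_population, 2))  # 1.0
print(round(or_sample, 2))      # 2.0
```

In this sketch the sample suggests a doubling of the odds of disease among the exposed although no association exists in the target population. The distortion arises because participation depends on exposure and disease status jointly.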

Information (observation) bias

Information bias occurs in the data collection stage of a study. It happens when the estimated effect is distorted either by an error in measurement or by misclassification of subjects with respect to exposure and/or outcome variables.

The most common types of information bias include interviewer bias, questionnaire bias, recall bias, diagnostic suspicion bias and exposure suspicion bias.

Interviewer bias results when systematic differences occur in the soliciting, recording, or interpreting of information from study subjects. Questionnaire bias results when leading questions or other flaws in the questionnaire produce a difference in accuracy between the compared groups. Recall bias happens, for example, when people who have had adverse health outcomes remember and report past exposure differently from those who have not.
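The effect of recall bias on a case-control estimate can be sketched numerically. The Python snippet below assumes, purely for illustration, that cases report a true exposure more completely than controls (a sensitivity of 0.9 against 0.6) and that nobody reports an exposure they did not have. The counts and the sensitivity and specificity values are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio: a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)


def reported_exposed(true_exposed, true_unexposed, sensitivity, specificity):
    """Expected number of subjects who report exposure, given how well
    the group recalls true exposure (sensitivity) and true non-exposure
    (specificity)."""
    return true_exposed * sensitivity + true_unexposed * (1 - specificity)


# Hypothetical true exposure status: identical in cases and controls,
# so the true odds ratio is 1.
cases_exposed, cases_unexposed = 300, 700
controls_exposed, controls_unexposed = 300, 700

# Assumed recall quality: cases remember past exposure better than controls.
obs_cases_exposed = reported_exposed(cases_exposed, cases_unexposed, 0.9, 1.0)
obs_controls_exposed = reported_exposed(controls_exposed, controls_unexposed, 0.6, 1.0)

n_cases = cases_exposed + cases_unexposed
n_controls = controls_exposed + controls_unexposed

or_true = odds_ratio(cases_exposed, cases_unexposed,
                     controls_exposed, controls_unexposed)
or_observed = odds_ratio(obs_cases_exposed, n_cases - obs_cases_exposed,
                         obs_controls_exposed, n_controls - obs_controls_exposed)

print(round(or_true, 2))      # 1.0
print(round(or_observed, 2))  # 1.68
```

Because the misclassification differs between the compared groups, the observed odds ratio moves away from the true value of 1 under these assumptions.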

Confounding bias (confounding)

Confounding is essentially a mixing of effects that occurs when a factor (confounder) associated with the exposure of interest is also associated with development of the disease or outcome of interest independently of exposure. Therefore, a distorted estimate of the exposure effect results because the exposure effect is mixed with the effect of extraneous variables.

A confounder must be predictive of disease occurrence independently of its association with the exposure of interest, but cannot be an intermediate in the causal chain between exposure and disease development. The confounding variable can affect the association between exposure and disease positively or negatively; the distorted estimate resulting from confounding can overestimate or underestimate the true effect, or even change the apparent direction of the effect.

To be a confounder, an extraneous variable must have the following characteristics:

  • It must be a risk factor for disease
  • It must be associated with the exposure under study in the population studied
  • It must not be an intermediate step in the causal path between the exposure and the disease
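The mixing of effects can be made concrete with a small hypothetical cohort. In the Python sketch below, the invented counts are arranged so that the extraneous variable is a risk factor for disease (risk 0.20 against 0.05) and is more common among the exposed (80% against 20%). It is assumed not to lie on the causal path. Within each stratum of that variable the exposure has no effect.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Hypothetical cohort, stratified by the extraneous variable.
# Each tuple: (cases among exposed, number exposed,
#              cases among unexposed, number unexposed)
strata = {
    "confounder present": (160, 800, 40, 200),
    "confounder absent": (10, 200, 40, 800),
}

# Stratum-specific estimates: the exposure effect with the confounder held fixed.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Crude estimate: collapse the strata by summing each column.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

for name, rr in stratum_rr.items():
    print(name, round(rr, 3))   # 1.0 in both strata
print("crude", round(crude_rr, 3))  # 2.125
```

The crude risk ratio of about 2.1 reflects only the uneven distribution of the confounder between exposed and unexposed subjects, since the exposure has no effect in either stratum of this example.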

In epidemiological studies an investigator would like to minimise both systematic error (bias) and random error (chance). Reducing systematic error increases the validity of the study, while reducing random error increases the power of the study. Knowledge of systematic error (bias) therefore becomes an important issue in epidemiological studies.

By careful use of proper technique in the design, data collection and analysis stages, bias can be prevented or minimised.

______

Hendrik Vermooten

TABLE 4
Prevention of Selection Bias in Basic Study Designs
Study Design
Type of Selection Bias / Cross-Sectional / Case-Control / Retrospective Cohort / Prospective Cohort
Berkson’s / 1. Avoid selecting subjects from hospitals / 1. Use population-based cases and population-based controls / NA / NA
Prevalence/incidence / 1. Include non-surviving subjects in the study through proxy interviews; 2. Use incident cases / NA / NA
Detection / NA / 1. Cases and controls should be restricted to patients who have undergone identical detection manoeuvres / 1. Exposed and unexposed subjects should be subject to identical disease detection
Membership / 1. Difficult to prevent in these four designs; 2. Use multiple comparison cohorts
Healthy worker effect / NA / NA / 1. Use working cohorts for comparison; 2. Use multiple comparison cohorts
Volunteer / 1. Use repeated contacts or questionnaires to achieve a response rate of at least 80%; 2. Compare respondents with a sample of nonrespondents
Loss to follow-up / NA / NA / 1. Maintain a high follow-up rate

TABLE 5
Prevention of Information Bias in Basic Study Designs
Study Design
Type of Information Bias / Cross-Sectional / Case-Control / Retrospective Cohort / Prospective Cohort
Interviewer / 1. “Blinding” of the interviewer with respect to the study hypothesis; 2. Use a trained and experienced interviewer
Inter-interviewer / 1. Use only one interviewer in the study; 2. Train interviewers according to standard protocols; 3. Use the same interviewer for study and comparison groups; 4. Discard data from incompetent interviewers
Questionnaire / 1. Careful wording to avoid leading questions; 2. Pretest the questionnaire several times; 3. Use dummy questions to conceal the hypothesis; 4. Offer categorised values for subjects to select instead of requesting specific values
Recall / 1. Difficult to prevent. May be measured by asking questions whose answers can be checked against records / NA / NA
Diagnostic suspicion / 1. Difficult to prevent / 1. Both exposed and non-exposed groups should be observed using comparable methods
Exposure suspicion / 1. Difficult to prevent / 1. Both cases and controls should be observed using comparable methods / NA / NA

TABLE 6
Prevention of Confounding Bias in Cross-Sectional, Matched and Unmatched Case-Control, and Matched and Unmatched Cohort Studies
Cross-Sectional / Case-Control: Matched / Case-Control: Unmatched / Cohort: Matched / Cohort: Unmatched
Although controlling for covariates ensures unbiasedness, unnecessary control for nonconfounding covariates always reduces the power of the study / Crude estimate is always unbiased / Same as unmatched case-control studies
To maximise both validity and power, an investigator should always perform analyses both controlling for the covariate(s) (adjusted estimates) and not controlling for them (crude estimates).
If both estimates are similar, then the crude estimate is unbiased and should be adopted on power considerations.
If the estimates are not similar, then the adjusted estimate, which is the only unbiased one, should be used.
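
The comparison of crude and adjusted estimates described in Table 6 can be sketched in Python. The snippet is only an illustration under stated assumptions. The Mantel-Haenszel pooled odds ratio is used here as one common way to obtain an adjusted estimate, and "similar" is taken to mean a relative difference of at most 10%, a conventional rule of thumb. Neither choice is prescribed by the table. The stratified counts and the function `choose_estimate` are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio: a = exposed cases, b = unexposed cases,
    c = exposed non-cases, d = unexposed non-cases."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def choose_estimate(crude, adjusted, tolerance=0.10):
    """Pick the estimate to report (hypothetical helper).

    If the crude estimate is within `tolerance` (relative difference,
    an assumed threshold) of the adjusted one, report the crude estimate;
    otherwise report the adjusted estimate.
    """
    relative_change = abs(crude - adjusted) / adjusted
    if relative_change <= tolerance:
        return "crude", crude
    return "adjusted", adjusted


# Hypothetical data: one (a, b, c, d) table per level of the covariate.
strata = [(160, 40, 640, 160), (10, 40, 190, 760)]

# Crude table: sum the strata cell by cell.
crude_or = odds_ratio(*[sum(cell) for cell in zip(*strata)])
adjusted_or = mantel_haenszel_or(strata)

label, value = choose_estimate(crude_or, adjusted_or)
print(round(crude_or, 2))     # 2.36
print(round(adjusted_or, 2))  # 1.0
print(label)                  # adjusted
```

With these invented counts the two estimates differ widely, so the adjusted estimate is reported. Had they agreed within the assumed tolerance, the crude estimate would be retained.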