HW3-200B- Winter 2009

Question 1

The most recent publication from the Interphone study (international case-control studies on brain cancers and cell phone use) addresses the association between use of cell phones and the risk of glioma. Gliomas are brain cancers that originate in neuroglia, usually within the cerebral hemispheres. These are the most common brain cancers.

The study is based upon a combined set of case-control studies from the Nordic countries and the UK. The case-control studies were all population-based, with density sampling of controls from the population that gave rise to the cases. Table 1 provides information on responders, the time between diagnosis and the collection of exposure data, and how the data were collected (the intention was to perform personal interviews at home or in the hospital). The investigators aimed to obtain a lifelong history of cell phone use. The table also shows the percentage of those selected for the study who accepted the invitation.

Table 1. Details of the Cases and Controls

Total
Cases
Included / 1,521
Participation rate / 60%
Number with histopathology / 1,466
Interview lag, median and interquartile range (days) / 92 (39-244)
Interview type
Hospital / 662
Home / 602
Other/missing / 257
Telephone / 166
Controls
Included / 3,301
Participation rate / 50%
Number of telephone interviews / 208

Note: More than one data collection method may have been used.

Based on these data we want you to:

a.Discuss potential problems with selection bias in this study.

Nonresponse was 40% among cases and 50% among controls, which is enough to cause serious bias. Exposed cancer patients have more reason to participate if they know the hypothesis, i.e. they have a chance to self-select according to exposure status. Selection bias is therefore expected.

b. Now, assume there is no association between cell phone use and glioma, and that you have no information on bias or confounding. If the true exposure frequency in the underlying population is 50% (based on a given definition):

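All those selected into the study, back-calculated from the participation rates in Table 1 (cases: 1521 / 0.60 = 2535; controls: 3301 / 0.50 = 6602):

Exp / Cases / Controls
+ / 1267.5 / 3301
- / 1267.5 / 3301
All / 2535 / 6602

OR = (1267.5 x 3301) / (1267.5 x 3301) = 1.0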

i. How much selection bias is needed among the cases to produce an OR of 2, given that there is no selection bias (and no sampling error) among controls?

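OR Bias = OR True x OR Response %

2 = 1 x 2

Let X be the response proportion among exposed cases and Y the response proportion among unexposed cases. With no selection bias among controls, the response OR reduces to X / Y = 2:

1267.5 x X = 2 x 1267.5 x Y

The participating cases must total 1521:

X x 1267.5 + Y x 1267.5 = 1521

X = (1521 – 1267.5Y) / 1267.5

X = 1.2 – Y

Substituting into the first equation:

1267.5 (1.2 – Y) = 2535Y

1521 – 1267.5Y = 2535Y

Y = 0.4

X = 0.8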

Participants would be

Exp / Cases (Pr) / Controls (Pr)
+ / 1014 (80%) / 1650.5 (50%)
- / 507 (40%) / 1650.5 (50%)
All / 1521 / 3301

ORbias = ORtrue x [(0.8 / 0.4) / (0.5 / 0.5)]

2 = 1 x 2

These steps are only needed when the exposure prevalence differs from 50%. In this case it is obvious that the response proportions have to be 40% and 80%, with an average of 60% (equal weights in both groups).
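The algebra in part (i) carries over to exposure prevalences other than 50%. The following is a minimal Python sketch, assuming a true OR of 1 and no selection bias among controls; the function name `case_response_proportions` and its parameters are illustrative and not part of the original solution.

```python
def case_response_proportions(source_cases, exposure_prev, participants, target_or):
    """Return (X, Y): response proportions among exposed and unexposed cases.

    Assumes the true OR is 1 and that controls respond independently of
    exposure, so the observed OR equals X / Y.
    """
    exposed = source_cases * exposure_prev
    unexposed = source_cases * (1 - exposure_prev)
    # Two constraints: X = target_or * Y, and X*exposed + Y*unexposed = participants
    y = participants / (target_or * exposed + unexposed)
    x = target_or * y
    return x, y


# Values from the worked example: 2535 source cases, 50% exposed, 1521 participants
x, y = case_response_proportions(2535, 0.5, 1521, 2)
print(round(x, 2), round(y, 2))  # 0.8 0.4
```

Changing `exposure_prev` shows how the required response proportions shift when exposed and unexposed cases no longer carry equal weight.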

ii. Which direction of selection bias, among controls only, is needed to produce an OR of 2 if there is no selection bias among cases?

A higher response among non exposed controls than for exposed controls.

Question 2

  1. Consider a case-control study of oral contraceptive use and breast cancer. There are 350 cases of breast cancer and 350 controls. 200 cases and 131 controls are known to have ever used oral contraceptives.

a)Calculate the odds ratio.

OC use / Case / Control
OC user / 200 / 131
OC nonuser / 150 / 219
Total / 350 / 350

^OR = (200/150)/(131/219) = 2.23
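The same calculation as a minimal Python sketch; the helper name `odds_ratio` is an assumption made for illustration.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exp_cases / unexp_cases) / (exp_controls / unexp_controls)


print(round(odds_ratio(200, 150, 131, 219), 2))  # 2.23
```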

b)Assume that 10% of the actual OC users were misclassified as nonusers and 10% of the actual nonusers were misclassified as users among both the cases and controls.

OC use / Case / Control
OC user / 200-(.1*200)+(.1*150) = 200-20+15 = 195 / 131-(.1*131)+(.1*219) = 131-13.1+21.9 = 139.8 ≈ 140
OC nonuser / 150+20-15 = 155 / 219+13.1-21.9 = 210.2 ≈ 210

i)What kind of misclassification is this?

This is non-differential misclassification of exposure because misclassification probability is same among breast cancer cases and controls.

ii)Calculate the odds ratio and determine the direction of the bias, if any.

^OR = (195/155)/(140/210) = 1.89

There is negative bias, toward the null (the OR falls from 2.23 to 1.89).

c)Assume that due to recall bias, 10% of OC users among controls were misclassified as nonusers and that 10% of the nonusers among the cases were misclassified as users, but no exposed cases or unexposed controls were misclassified.

OC use / Case / Control
OC user / 200+(.1*150) = 200+15 = 215 / 131-(.1*131) = 131-13.1 = 117.9 ≈ 118
OC nonuser / 150-(.1*150) = 150-15 = 135 / 219+13.1 = 232.1 ≈ 232

i)What kind of misclassification is this?

This is differential misclassification of exposure because misclassification probability is different for cases and controls. Misclassification results in a higher proportion of cases than controls among the OC users.

ii) Calculate the odds ratio and determine the direction of the bias, if any.

^OR = (215/135)/(118/232) = 3.13

There is positive bias away from the null.
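Parts (b) and (c) apply the same bookkeeping with different error rates. A minimal Python sketch, assuming misclassification is expressed as sensitivity and specificity of exposure classification; the function names are illustrative and not part of the original solution.

```python
def observed_counts(users, nonusers, sensitivity, specificity):
    """Apply exposure misclassification to true counts.

    sensitivity: probability a true user is classified as a user
    specificity: probability a true nonuser is classified as a nonuser
    """
    obs_users = users * sensitivity + nonusers * (1 - specificity)
    obs_nonusers = users * (1 - sensitivity) + nonusers * specificity
    return obs_users, obs_nonusers


def odds_ratio(case_counts, control_counts):
    """OR from (users, nonusers) pairs for cases and controls."""
    (a, b), (c, d) = case_counts, control_counts
    return (a / b) / (c / d)


true_cases, true_controls = (200, 150), (131, 219)
or_true = odds_ratio(true_cases, true_controls)

# Part (b): same error rates in cases and controls (non-differential)
or_nondiff = odds_ratio(
    observed_counts(*true_cases, 0.9, 0.9),
    observed_counts(*true_controls, 0.9, 0.9),
)

# Part (c): error rates depend on case status (differential)
or_diff = odds_ratio(
    observed_counts(*true_cases, 1.0, 0.9),     # 10% of nonuser cases recorded as users
    observed_counts(*true_controls, 0.9, 1.0),  # 10% of user controls recorded as nonusers
)

print(round(or_true, 2), round(or_nondiff, 2), round(or_diff, 2))
```

Because the sketch keeps fractional counts (for example 139.8 rather than 140), its results can differ from the hand calculations in the second decimal place. The ordering is what matters: the non-differential OR falls below the true OR, and the differential OR in this scenario rises above it.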

Question 3

1. The capture-recapture methodology was designed to estimate population sizes on the basis of the proportion of subjects (re-)captured by two or more sources. It may be visualized as a 2X2 matrix below.


a. These are the assumptions we discussed for validity of Capture-Recapture technique. Please say in one or two sentences what each means:

i. Population is closed

No immigration, emigration, births or deaths between the release and the recapture times

ii. Individuals captured on both occasions can be matched

Marks (or tags) are not lost and always recognizable

iii. Capture in the second sample is independent of capture in the first

The first capture does not affect the second capture

iv. Probability of capture is homogeneous across individuals

The probabilities of being caught are equal for all individuals

b. If the population is not closed, how could this affect the value of “N” (estimated number of salmon)?

Probability of being recaptured decreases so X11 is decreased and N is overestimated

c. Say that the method used to identify subjects in the capture or recapture samples is social security number. If we further speculate that, in one or the other source, some social security numbers are missing, how could this affect the estimate of “N”?

True matches may be missed so again X11 is decreased and N is overestimated

d. And if the method used to identify subjects was first initial and last name, how could this affect the estimate of “N”?

Potential for creating false matches is increased, so X11 is increased and N is underestimated.

e. In trying to estimate the number of IV drug users in Bangkok, investigators used lists of people attending methadone clinics as the first capture source, and police arrest logs in the same time period as the re-capture source. How might this affect the assumption of independence of sources and how might that affect the estimate of “N” from capture-recapture analysis?

Those attending methadone clinics are less likely to use/need drugs so are less likely to fall prey to a drug-related arrest. Negative dependence of recapture source on capture source – decreases X11 so N is overestimated.
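Parts (b) through (e) all turn on how the overlap X11 drives the estimate. A minimal Python sketch, assuming the simple two-source estimator N = (n1 x n2) / X11; the counts are hypothetical and chosen only for illustration, not taken from the assignment.

```python
def estimate_n(n1, n2, overlap):
    """Two-source capture-recapture estimate: N = n1 * n2 / X11."""
    return n1 * n2 / overlap


n1, n2 = 100, 80  # hypothetical sizes of the capture and recapture samples

# X11 = 20 stands for correct matching; 16 mimics missed matches or
# negative dependence; 25 mimics false matches.
for x11 in (16, 20, 25):
    print(x11, estimate_n(n1, n2, x11))
```

A smaller X11 inflates the estimate and a larger X11 deflates it, which is the pattern described in the answers above.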

f. The following is from an evaluation of the Legionella Reporting System (NS) in France in 1995. Two other sources were used to perform Capture-Recapture: the National Reference Lab (NRL) and Hospital Laboratories (HL). Use the Venn diagram below to calculate the total number of Legionella cases in France in 1995 using the following Capture-Recapture designs:

i) NS – Capture, NRL – Recapture: Total = (50*226)/29 = 389.7

ii) NS – Capture, HL – Recapture: Total = (50*357)/29 = 615.5

iii) NRL – Capture, HL – Recapture: Total = (226*357)/156 = 517.2
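The three estimates can be reproduced with a minimal Python sketch of the two-source (Lincoln-Petersen) estimator. The counts are those used in the answers above; the function name and the dictionary keys are illustrative.

```python
def lincoln_petersen(capture, recapture, both):
    """Estimated total = (capture * recapture) / number found in both sources."""
    return capture * recapture / both


# (capture count, recapture count, overlap) for each design
designs = {
    "NS-NRL": (50, 226, 29),
    "NS-HL": (50, 357, 29),
    "NRL-HL": (226, 357, 156),
}
estimates = {name: round(lincoln_petersen(*counts), 1) for name, counts in designs.items()}
print(estimates)  # {'NS-NRL': 389.7, 'NS-HL': 615.5, 'NRL-HL': 517.2}
```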
