October 14, 2010

Chapter 5

Studies of Diagnostic Tests

Extra Problems

Copyright 2010, Thomas B. Newman and Michael A. Kohn

Supplementary to: Newman TB, Kohn MA. Evidence-based diagnosis. Cambridge; New York: Cambridge University Press; 2009.

7. To study the alveolar-arterial (A-a) O2 gradient[1] as a predictor of pulmonary embolism (PE; a blood clot in the lungs), investigators reviewed records of 78 patients without significant pre-existing cardiopulmonary disease who were diagnosed with a PE by pulmonary angiogram at two medical centers (Cvitanic and Marino 1989). They found an “increased” A-aO2 gradient in 74 out of the 78 patients with PE.

a) Can you calculate sensitivity or specificity of “increased” A-aO2 gradient for angiographically proven PE from the above summary of this study? Do so if you can.

b) The clinical decision to obtain pulmonary angiograms was probably influenced by the A-aO2 gradient calculation. How would this affect the sensitivity/specificity estimates calculated in (a) above?

8. Tanz et al. (Tanz, Gerber et al. 2009) studied rapid strep tests and office-based throat cultures in 1848 children 3 to 18 years old with acute pharyngitis, using hospital-based throat culture as the gold standard. They found that the sensitivity of the rapid strep test was 49% in a group with low (average 19%) prior probability of strep, compared with 78% in a group with higher (average 38%) prior probability of strep. Specificity was less affected (97.8% in the low prior probability group and 96.7% in the high prior probability group).

a. If you assume this result is generally true, name the bias that would occur if you used an LR(-) calculated from a study with high prior probability on a patient whose prior probability was much lower.

b. Similar differences in sensitivity and specificity depending on prior probability were also observed for the office-based throat cultures (again, compared with the gold standard hospital-based culture). What would be a possible biological mechanism that could explain these results?
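For part (a), it may help to see how far apart the negative likelihood ratios are in the two groups. The sketch below computes LR(-) = (1 - sensitivity) / specificity from the rapid strep test figures quoted above, then applies each LR(-) to a hypothetical patient whose prior probability equals the low group's average of 19%. The function names and the choice of patient are illustrative assumptions, not part of the Tanz study.

```python
def lr_negative(sensitivity, specificity):
    """Negative likelihood ratio: P(test- | disease) / P(test- | no disease)."""
    return (1 - sensitivity) / specificity


def posttest_probability(prior, lr):
    """Convert the prior to odds, multiply by the LR, convert back to probability."""
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * lr
    return post_odds / (1 + post_odds)


# Rapid strep test figures quoted in the problem (Tanz, Gerber et al. 2009).
lr_neg_low = lr_negative(sensitivity=0.49, specificity=0.978)
lr_neg_high = lr_negative(sensitivity=0.78, specificity=0.967)

# Hypothetical patient (an assumption for illustration): prior probability 19%,
# with a negative rapid strep test.
prior = 0.19
p_using_low = posttest_probability(prior, lr_neg_low)
p_using_high = posttest_probability(prior, lr_neg_high)

print(f"LR(-) from low-prior group:  {lr_neg_low:.2f}")
print(f"LR(-) from high-prior group: {lr_neg_high:.2f}")
print(f"Post-test probability using low-prior LR(-):  {p_using_low:.1%}")
print(f"Post-test probability using high-prior LR(-): {p_using_high:.1%}")
```

Under these assumptions, the LR(-) borrowed from the high-prior group gives a post-test probability of roughly 5%, compared with roughly 11% when the LR(-) from the matching low-prior group is used.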

9. In Chapter 4, Problem 10, we showed you ROC curves for Troponin T as a test for myocardial infarction from a study by Keller et al. (Keller, Zeller et al. 2009). That problem focused on the performance of Troponin T, but the main objective of the study was to compare a new highly sensitive assay for Troponin I with other markers: not just Troponin T, but also myoglobin, creatine kinase MB, and creatine kinase.

As a review of Chapter 4 material, note that the new Troponin I test is unequivocally better than the other markers at distinguishing between patients with and without myocardial infarction.

The methods section for that paper includes the following text, under the heading: "Adjudication of the Final Diagnosis."

The final discharge diagnosis, which was based on all available clinical, laboratory, and imaging findings, was adjudicated by an expert committee of two independent cardiologists who were unaware of the results of the troponin I assays. If there was disagreement about the final diagnosis, a third cardiologist refereed.

It appears from this that the experts were NOT blinded to the other laboratory tests, such as myoglobin, being evaluated, only to the Troponin I assay.

a) The authors presented an ROC curve representing the ability of the myoglobin assay to discriminate between MI and non-MI patients. If they considered the myoglobin result in making their final diagnosis, the myoglobin ROC curve would be biased. What is the name of this bias?

b) Could that bias explain the very favorable results for Troponin I compared with myoglobin?

c) (Extra credit) What if the expert independent cardiologists, who were not blinded to the results of the myoglobin assay, were trying to make it look bad in order to make the Troponin I test look good? Could they do this? Explain.

10. A paper in the JAMA "Rational Clinical Examination" series reviewed the sensitivity and specificity of various history and physical findings for acute appendicitis. To do this, they reviewed the literature and selected the 11 most relevant studies (Tables 1 and 2). The studies had heterogeneous inclusion criteria:

Inclusion Criteria                                            N of studies

Emergency Department (ED) evaluation for acute abdomen        1
Emergency Department evaluation for suspected appendicitis    1
Admitted for acute abdomen                                    1
Admitted for abdominal pain                                   3
Admitted for suspected appendicitis                           3
Operation for suspected appendicitis                          2

Consider 2 studies, one from the top and one from the bottom of the list above:

Study 1 -- includes all ED patients evaluated for acute abdomen

Study 2 -- includes all patients operated on for suspected appendicitis (i.e., patients who get an appendectomy in the operating room)

a) How would the prevalence (pretest probability) of appendicitis differ between the subjects in Study 1 and Study 2?

b) How would the specificity of “Right Lower Quadrant (RLQ) Pain” differ between Study 1 and Study 2? (Hint: assume that patients with RLQ pain are more likely to be operated upon for suspected appendicitis.)

c) If patients with RLQ pain are likely to go to the operating room sooner, and if appendicitis sometimes resolves spontaneously if given the chance, what bias could this cause, and how would you expect it to affect the sensitivity and specificity of RLQ pain estimated from Study 1?

d) Repeat part c, assuming appendicitis never resolves spontaneously.

11. A subsequent paper (Bundy, Byerley et al. 2007) in the JAMA Rational Clinical Examination series included the following table describing the accuracy of the symptom of right lower quadrant (RLQ) pain for the diagnosis of appendicitis.

In case you can’t read the table, the circled paper by Pearl et al apparently reported a sensitivity of 0.96 and a specificity of 0.05 for RLQ pain as an indicator of appendicitis. The likelihood ratio for RLQ pain was reported to be 1.0 (95% CI 0.98 – 1.0).

It may seem odd that specificity = 5% so 1 – specificity = 95%. In other words, 95% of the children in the study who did not have appendicitis did have RLQ pain.

a) Did the authors report 1 – specificity instead of specificity?

Table 1, on study characteristics, stated that the Pearl study included patients who “underwent nonincidental appendectomy.”

b) Does this explain the low specificity of RLQ pain for appendicitis found by the Pearl study? Explain your answer.
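One way to approach part (a) is a quick arithmetic check. The sketch below computes the positive likelihood ratio, LR(+) = sensitivity / (1 - specificity), under each reading of the reported value of 0.05, so that the two results can be compared with the reported LR of 1.0. The function and variable names are illustrative assumptions.

```python
def lr_positive(sensitivity, specificity):
    """Positive likelihood ratio: P(test+ | disease) / P(test+ | no disease)."""
    return sensitivity / (1 - specificity)


# Sensitivity reported for RLQ pain in the Pearl study, as quoted in the problem.
sensitivity = 0.96

# Reading 1: the reported 0.05 really is the specificity.
lr_if_specificity_is_005 = lr_positive(sensitivity, specificity=0.05)

# Reading 2: the reported 0.05 is actually 1 - specificity, so specificity is 0.95.
lr_if_specificity_is_095 = lr_positive(sensitivity, specificity=0.95)

print(f"LR(+) if specificity = 0.05: {lr_if_specificity_is_005:.2f}")
print(f"LR(+) if specificity = 0.95: {lr_if_specificity_is_095:.2f}")
```

Comparing the two printed values with the reported likelihood ratio indicates which reading is consistent with the rest of the table.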

References

Bundy, D. G., J. S. Byerley, et al. (2007). "Does this child have appendicitis?" JAMA 298(4): 438-51.

Cvitanic, O. and P. L. Marino (1989). "Improved use of arterial blood gas analysis in suspected pulmonary embolism." Chest 95(1): 48-51.

Keller, T., T. Zeller, et al. (2009). "Sensitive troponin I assay in early diagnosis of acute myocardial infarction." N Engl J Med 361(9): 868-77.

Tanz, R. R., M. A. Gerber, et al. (2009). "Performance of a rapid antigen-detection test and throat culture in community pediatric offices: implications for management of pharyngitis." Pediatrics 123(2): 437-44.

[1] The A-a O2 gradient is the difference between the estimated concentration of oxygen in the lungs (the alveolar oxygen concentration, which depends mostly on what percent oxygen the patient is breathing) and the arterial oxygen concentration, which can be measured in the laboratory. An increased gradient suggests there is something interfering with oxygen getting out of the lungs and into the blood. In this study, an "Increased A-a O2 gradient" (in mm Hg) was defined based on age: >7 (ages 20-30), >10 (ages 30-40), >14 (ages 40-50), >17 (ages 50-60), and >20 (ages 60-70).
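The age-based definition in this footnote can be written as a small lookup. The sketch below is illustrative only: the footnote's age brackets overlap at their endpoints (for example, age 30 appears in two brackets), so the code assumes each bracket includes its lower bound and excludes its upper bound, except that age 70 falls in the last bracket. That boundary rule and the function name are assumptions, not something stated in the study summary.

```python
# (lower age bound, cutoff in mm Hg), taken from the footnote's definition.
AGE_CUTOFFS = [(20, 7), (30, 10), (40, 14), (50, 17), (60, 20)]
MIN_AGE, MAX_AGE = 20, 70


def is_increased_gradient(age, gradient_mm_hg):
    """Return True if the A-a O2 gradient exceeds the cutoff for this age.

    Assumption: brackets are [20,30), [30,40), [40,50), [50,60), [60,70].
    """
    if age < MIN_AGE or age > MAX_AGE:
        raise ValueError("age is outside the 20-70 range covered by the definition")
    cutoff = None
    for lower_bound, value in AGE_CUTOFFS:
        if age >= lower_bound:
            cutoff = value  # keep the cutoff of the highest bracket reached
    return gradient_mm_hg > cutoff


print(is_increased_gradient(45, 16))  # cutoff for ages 40-50 is 14
print(is_increased_gradient(65, 18))  # cutoff for ages 60-70 is 20
```

The comparison is strict because the footnote states each cutoff with a ">" sign, so a gradient exactly equal to the cutoff is not counted as increased.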