Department of Biostatistics and Medical Informatics
Introdução à Medicina 2005/2006
A systematic review of the validity of endoscopic ultrasound for rectal carcinoma staging
Adília Rafael, Agostinho Cordeiro, Alberto Lourenço, Alexandre Sarmento, Ana B. Noronha, Ana C. Afonso, Ana C. Gomes, Ana C. Pedrosa, Ana C. Duque, Ana I. Ponte
Adviser: Mário D. Ribeiro MD PhD, Class: 1
Abstract
Introduction Treatment of rectal cancer depends on correct staging, and endoscopic ultrasound (EUS) is one of the most accurate methods that make this possible.
Aim Our objective was to analyze the validity and consistency of EUS for rectal carcinoma staging in relation to surgical specimens, as well as to evaluate the sensitivity and specificity of the test in identifying the patients as T3/T4 and N+.
Methods We searched Medline for publications on the validity of EUS for rectal carcinoma staging. After applying inclusion and exclusion criteria to each abstract obtained, we tried to collect the corresponding full articles. Data were then extracted from the articles obtained so that we could evaluate the quality of each article and manage the information for analysis. After building the database, we built graphs of sensitivity and specificity for T and N staging. Only the graphs that revealed conclusive results were considered.
Results The pooled sensitivity and specificity for T staging were both 0.88. For N staging, the pooled specificity was 0.76 and the pooled sensitivity was 0.64.
Discussion Sensitivity results for T staging were more homogeneous than the specificity results. The high pooled sensitivity and specificity for T staging show that EUS is a very accurate test for this purpose. The sensitivity and specificity graphs for N staging were very heterogeneous. The test was also more specific than sensitive, meaning that it is more useful when the result is positive. The overall accuracy of EUS in staging rectal cancer was satisfactory, especially for T staging.
Key-words: endosonography, colonoscopy, colorectal neoplasms, sensitivity and specificity.
Introduction
Rectal cancer is the most lethal type of cancer in the Portuguese population [1]. Its treatment depends on correct staging, and endoscopic ultrasound (EUS) is one of the methods that make this possible.
More than 2000 published scientific papers have demonstrated the high accuracy of EUS for the diagnosis and staging of rectal cancer [2]. It may be used either to determine candidacy for surgery or as a surveillance method after surgery, because local recurrence of tumors is quite common. Performing EUS before surgery is useful in several ways: it determines not only the type of surgery but also whether preoperative chemotherapy is needed.
Rectal cancer is staged using the Tumor-Node-Metastasis (TNM) staging system, and its management differs according to the EUS stage [3] (see Table 1). Careful assessment of the T and N stages is critical in directing treatment: local resections with curative intent are limited to patients with T1N0 or T2N0 rectal cancers, while patients with more advanced lesions undergo neoadjuvant chemoradiation followed by radical resection [4].
Table 1. Tumor-Node-Metastasis (TNM) staging system and management of cancer according to stage.
Stage / Involves / Management of cancer
T1 / Mucosa/submucosa / Transanal local resection
T2 / Into the muscularis propria / Radical resection and/or postoperative radiation
T3 / Into the perirectal fat / Preoperative chemoradiation before radical resection
T4 / Into adjacent organs
N1 / Metastasis in 1 to 3 regional lymph nodes
N2 / Metastasis in 4 or more regional lymph nodes
In an EUS image, tumors appear as hypoechoic (darker) masses. If a tumor is visualized at an interface between layers, staging may be difficult and requires a more precise description of the tumor. Lymph nodes also appear as hypoechoic round or oval structures. As normal lymph nodes are not perceptible in an EUS image, any lymph node visualized should be considered potentially malignant.
The accuracy of EUS may be limited by its inability to separate the tumor from hypoechoic inflammation around it or by tangential images when the tumor is located in specific places. These limitations usually result in overstaging rather than understaging, which occurs only when the invasion of the tumor into the next layer is still microscopic. Identifying lymph nodes is also difficult, especially when they are small or distant, as is distinguishing inflammatory from metastatic lymph nodes. Despite these limitations, EUS is considered the most accurate method for staging rectal carcinomas compared with other diagnostic techniques, such as digital examination, computed tomography (CT) and magnetic resonance imaging (MRI).
The aim of our systematic review is to evaluate the validity and consistency of EUS for rectal carcinoma staging in relation to surgical specimens, as well as to evaluate the sensitivity and specificity of the test in identifying the patients as T3/T4 and N+.
Participants and Methods
I. Selection and description of studies
1. Bibliographic Research
An extensive search was carried out in Medline to identify publications on the validity of EUS for rectal carcinoma staging. The search was limited to “items with abstracts”, and 167 articles were obtained using the following query:
(((((((((("sensitivity and specificity"[All Fields] OR "sensitivity and specificity/standards"[All Fields]) OR "specificity"[All Fields]) OR "screening"[All Fields]) OR "false positive"[All Fields]) OR "false negative"[All Fields]) OR "accuracy"[All Fields]) OR (((("predictive value"[All Fields] OR "predictive value of tests"[All Fields]) OR "predictive value of tests/standards"[All Fields]) OR "predictive values"[All Fields]) OR "predictive values of tests"[All Fields])) OR (("reference value"[All Fields] OR "reference values"[All Fields]) OR "reference values/standards"[All Fields])) OR ((((((((((("roc"[All Fields] OR "roc analyses"[All Fields]) OR "roc analysis"[All Fields]) OR "roc and"[All Fields]) OR "roc area"[All Fields]) OR "roc auc"[All Fields]) OR "roc characteristics"[All Fields]) OR "roc curve"[All Fields]) OR "roc curve method"[All Fields]) OR "roc curves"[All Fields]) OR "roc estimated"[All Fields]) OR "roc evaluation"[All Fields])) OR "likelihood ratio"[All Fields]) AND (("Endoscopic Ultrasound"[All Fields] OR "Endosonography"[All Fields]) AND ("Rectal neoplasms"[All Fields] OR "Colorectal neoplasms"[All Fields]))
2. Systematic Review
Each abstract of those 167 articles was read by two reviewers, who applied previously defined inclusion and exclusion criteria; 82 articles were considered adequate, of which only 34 full papers were found. Each complete article was read by two reviewers, who again applied the inclusion and exclusion criteria; 20 articles were selected. A third reviewer was always consulted in case of disagreement (see flow chart).
a) Inclusion Criteria
All selected articles describe studies designed to evaluate the accuracy of EUS in rectal carcinoma staging. The accuracy of EUS is evaluated in a sample of patients with rectal carcinoma, and the EUS findings are compared with the surgical specimen (the gold standard); the results must allow the construction of 2×2 tables.
b) Exclusion Criteria
Articles that described systematic reviews, used a different reference standard, or evaluated the accuracy of EUS in staging cancers other than rectal carcinoma were excluded. In addition, studies with a sample of fewer than 10 patients were not considered, in order to make the results more reliable. Finally, articles that could not be found on the Internet, in the facilities of the School of Medicine or in the local libraries were excluded, as well as those written in languages other than English, French, Spanish or Portuguese.
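The requirement that a study's results allow the construction of 2×2 tables can be illustrated with a minimal Python sketch. It assumes that each patient is recorded as a pair of T stages (EUS, surgical specimen) and that T3/T4 counts as a positive result, in line with the stated aim; the function name, the data layout and the example patients are hypothetical and are not taken from any reviewed study.

```python
from collections import Counter

ADVANCED = {"T3", "T4"}  # assumed "positive" category (T3/T4), per the stated aim


def two_by_two(pairs):
    """Count TP, FP, FN, TN from (eus_stage, pathology_stage) pairs.

    The surgical specimen (pathology) is treated as the gold standard.
    """
    counts = Counter()
    for eus_stage, pathology_stage in pairs:
        test_positive = eus_stage in ADVANCED
        disease_positive = pathology_stage in ADVANCED
        if test_positive and disease_positive:
            counts["TP"] += 1
        elif test_positive:
            counts["FP"] += 1
        elif disease_positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


# Hypothetical patients, for illustration only.
example_pairs = [
    ("T3", "T3"),  # correctly staged as advanced
    ("T3", "T2"),  # overstaged by EUS
    ("T2", "T3"),  # understaged by EUS
    ("T1", "T1"),  # correctly staged as early
    ("T4", "T3"),  # still a true positive after dichotomizing
]
table = two_by_two(example_pairs)
```

The same dichotomy could be applied to N staging by treating any N+ result as positive and N0 as negative.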
3. Evaluating the quality of the articles
The STARD checklist [5] was used to evaluate the quality of the selected articles. It includes 25 items relevant to the design and conduct of the study, the execution of tests, and the results. Most of the 20 selected articles exhibited the majority of the items in the STARD checklist.
II. Management of the information
Each selected article was read by two reviewers who extracted relevant data.
1. Definition of variables
Variables were defined according to the characteristics of the study population, such as the initial number of participants, mean age, and number of female and male participants. Details of the examination itself were also recorded as variables, namely the type of instrument and number of operators, as well as the location of the lesion and the number of true positives, true negatives, false positives and false negatives, according to TNM staging.
The quality assessment based on the STARD checklist, described above, was also included: each item of the checklist was added as a variable.
2. Creation of a database
The data from the selected studies were compiled in an electronic database built in “SPSS 13.0 for Windows”.
3. Statistical analysis of data
After building the database in SPSS, the variables expressing false positives, false negatives, true positives and true negatives for both T and N staging were entered into “Meta-Disc 3.1” and used to build graphs of sensitivity and specificity for T and N staging. The articles were ordered according to different variables (date of publication, final number and mean age of participants, brand of the instrument used in the study, and location where it was performed) in an attempt to find homogeneity within a group of articles sharing the same characteristics. Only the graphs that revealed conclusive results were considered.
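As a reading aid, the quantities plotted in this step can be sketched in Python. Sensitivity is TP/(TP+FN) and specificity is TN/(TN+FP). The pooled values below are obtained by summing the counts across studies, which is one simple pooling approach and is an assumption here, since the text does not state which pooling option was used in Meta-Disc. The study counts are hypothetical.

```python
def sensitivity(tp, fn):
    """Proportion of diseased patients correctly identified by the test."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-diseased patients correctly identified by the test."""
    return tn / (tn + fp)


def pooled(studies):
    """Pool sensitivity and specificity by summing counts over all studies."""
    tp = sum(s["TP"] for s in studies)
    fp = sum(s["FP"] for s in studies)
    fn = sum(s["FN"] for s in studies)
    tn = sum(s["TN"] for s in studies)
    return sensitivity(tp, fn), specificity(tn, fp)


# Hypothetical 2x2 counts for two studies (not data from the review).
studies = [
    {"TP": 45, "FP": 5, "FN": 5, "TN": 45},
    {"TP": 30, "FP": 10, "FN": 20, "TN": 40},
]
pooled_sens, pooled_spec = pooled(studies)
```

Summing counts gives larger studies more influence on the pooled estimate than smaller ones, which is why the final number of participants matters when comparing studies.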
Results
Of the 34 selected articles, only 20 could be analysed, as we did not have access to the full paper of the remaining 14 (see Table 2).
The quality of the articles was assessed by the number of STARD checklist items present in each. Of the 25 items in the STARD checklist, the maximum exhibited was 20, for the articles of Akasu T, 1997 [14] and Lee P, 1999 [19], and the minimum was 7, for the article of Gualdi F, 2000 [23]. Eighteen of the 20 articles had more than 13 items.
The final number of participants in a study influences the weight that the study carries in the evaluation of sensitivity and specificity: the more patients analysed, the more reliable the study. The brand of instrument used is also relevant, and Olympus was the brand most often reported.
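The link between sample size and reliability can be illustrated with a confidence interval for a proportion. The sketch below uses the Wilson score interval, chosen only because it needs nothing beyond the Python standard library; it is not necessarily the method behind the intervals reported in this review, and the counts are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1 - p) / total + z**2 / (4 * total**2)
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Same observed sensitivity (0.90) in a small and a large hypothetical study.
small = wilson_interval(9, 10)
large = wilson_interval(90, 100)

small_width = small[1] - small[0]
large_width = large[1] - large[0]
```

With the same observed proportion, the interval for the larger study is narrower, which is the sense in which a study with more patients gives a more reliable estimate.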
The location where the studies were performed and the mean age of participants in each study revealed no conclusive results.
Table 2. Results – different variables analysed.
Article / Location / Final number of participants / Mean age of participants / Quality (STARD checklist items) / Brand of instrument used / T staging sensitivity (CI) / T staging specificity (CI) / N staging sensitivity (CI) / N staging specificity (CI)
Akasu T, 1997 / Japan / 164 / 59 / 20 / Olympus / 0.96 (0.90 – 0.99) / 0.82 (0.71 – 0.91) / 0.77 (0.67 – 0.85) / 0.74 (0.62 – 0.84)
Sailer M, 1997 / Germany / 154 / - / 18 / Combison / 0.83 (0.73 – 0.90) / 0.97 (0.90 – 1.00) / - / -
Maldjian C, 1998 / USA / 14 / 37 / 16 / Olympus / 0.80 (0.28 – 0.99) / 0.78 (0.40 – 0.97) / 0.75 (0.19 – 0.99) / 0.56 (0.21 – 0.86)
Nishimori H, 1998 / Japan / 70 / 62 / 13 / Olympus / 0.94 (0.73 – 1.00) / 1.00 (0.59 – 1.00) / 0.36 (0.17 – 0.59) / 0.90 (0.76 – 0.97)
Blomquist L, 1999 / Sweden / 49 / 70 / 16 / Olympus / 0.91 (0.75 – 0.98) / 0.35 (0.14 – 0.62) / 0.75 (0.43 – 0.95) / 0.54 (0.37 – 0.71)
Lee P, 1999 / USA / 34 / - / 20 / B&K / 1.00 (0.59 – 1.00) / 0.89 (0.71 – 0.98) / - / -
Hunerbein M, 1999 / Germany / 63 / 62 / 15 / B&K / 0.92 (0.73 – 0.99) / 0.97 (0.87 – 1.00) / 0.50 (0.21 – 0.79) / 0.95 (0.84 – 0.99)
Kazuya A, 2000 / Japan / 39 / 70 / 16 / - / 0.81 (0.61 – 0.93) / 0.92 (0.64 – 1.00) / 0.83 (0.52 – 0.98) / 0.65 (0.41 – 0.85)
Akasu T, 2000 / Japan / 154 / 57 / 16 / - / 0.96 (0.92 – 0.99) / 0.82 (0.75 – 0.88) / - / -
Gualdi F, 2000 / Italy / 26 / 69 / 14 / - / 0.93 (0.66 – 1.00) / 0.58 (0.28 – 0.85) / - / -
Hunerbein M, 2000 / Germany / 30 / 65 / 7 / - / 0.67 (0.09 – 0.99) / 0.96 (0.81 – 1.00) / - / -
Akahoshi K, 2001 / Japan / 159 / 68 / 9 / - / 0.90 (0.81 – 0.95) / 0.85 (0.76 – 0.93) / 0.70 (0.55 – 0.83) / 0.64 (0.52 – 0.76)
Kalantzis C, 2002 / Greece / 80 / 70 / 17 / Olympus / 0.97 (0.90 – 1.00) / 1.00 (0.75 – 1.00) / 1.00 (0.92 – 1.00) / 0.86 (0.71 – 0.95)
Starck M, 2002 / Sweden / 60 / 70 / 18 / B&K / 0.50 (0.07 – 0.93) / 0.96 (0.88 – 1.00) / - / -
Scott R, 2002 / USA / 45 / 62 / 18 / - / 0.68 (0.46 – 0.85) / 0.85 (0.62 – 0.97) / 0.60 (0.15 – 0.95) / 0.78 (0.62 – 0.89)
Tseng Y, 2002 / China / 86 / 62 / 18 / - / 0.91 (0.77 – 0.98) / 0.91 (0.71 – 0.99) / - / -
Garcia-Aguilar J, 2002 / USA / 545 / 63 / 16 / Olympus / 0.78 (0.70 – 0.85) / 0.91 (0.87 – 0.93) / 0.33 (0.24 – 0.44) / 0.82 (0.75 – 0.88)
Fuchsjager M, 2002 / Austria / 28 / 65 / 17 / - / 0.93 (0.66 – 1.00) / 0.71 (0.42 – 0.92) / 0.92 (0.64 – 1.00) / 0.71 (0.42 – 0.92)
Bali C, 2004 / Greece / 31 / 70 / 18 / B&K / 0.92 (0.73 – 0.99) / 0.40 (0.05 – 0.85) / 0.50 (0.21 – 0.79) / 0.65 (0.38 – 0.86)
Hurlstone P, 2005 / UK / 52 / 63 / 16 / Olympus / 0.48 (0.29 – 0.68) / 0.80 (0.59 – 0.93) / 0.40 (0.12 – 0.74) / 0.47 (0.23 – 0.72)
T staging