International Biometric Society

STATISTICAL METHODS FOR MASS SPECTROMETRY DATA ANALYSIS AND IDENTIFICATION OF PROSTATE CANCER BIOMARKERS

Padoan A.1,4, La Malfa M.1, Basso D.1, Prayer-Galetti T.2, Di Chiara A.3, Pavanello G.3, Zattoni F.2, Plebani M.1, Bellocco R.4

1Department of Medicine (DIMED) and 2Department of Surgical, Oncological and Gastroenterological Sciences (DISCOG), University of Padova, Padova, Italy; 3SIPRES, Gruppo Pavanello, Padova, Italy; 4Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Milan, Italy.

Background: Patients with prostate cancer (PCa) sometimes report lower urinary tract symptoms (LUTS) and usually undergo medical investigations based on prostate-specific antigen (PSA) and digital rectal examination (DRE). However, because PSA has low sensitivity and specificity in predicting a positive prostate biopsy, new PCa biomarkers are needed. MALDI-TOF/MS protein profiling could be a valuable technology for biomarker identification. To date, however, its use has been hampered by a lack of reproducibility, which confounds scientific inference and limits its broader adoption.

Aims: The overall objective of this study was to evaluate urine collected after DRE from patients reporting LUTS, in order to identify candidate biomarkers for PCa by MALDI-TOF/MS. We focused on key aspects of proteomic profiling: assessing feature reproducibility and proposing appropriate strategies to handle measurement error and limit of detection (LOD) problems.

Methods: In this cross-sectional study, we collected urine after DRE from 205 patients who reported LUTS; all patients subsequently underwent prostate biopsy. Urine samples were dialyzed and analyzed by MALDI-TOF/MS in reflectron mode. To evaluate intra- and inter-run reproducibility, we analyzed a urine pool from 10 reference samples, spiked with 12.58 pmol of a 1589.9 Da internal standard (IS) peptide. To estimate the signal detection limit (sLOD), serial dilutions of the urine pool, up to 1/256, were analyzed in triplicate. We evaluated the sLOD and adjusted the data accordingly; to reduce variability we also applied normalization approaches: mean, median, internal standard, relative intensity, total ion current and linear rescaling normalization. An optimized signal detection strategy was also evaluated. Measurement error was evaluated using an external dataset of urine repeatedly collected from 20 reference subjects. The intraclass correlation coefficient (ICC), regression calibration (RCAL) and SIMEX analyses were used to estimate bias-corrected logistic regression coefficients. Monte Carlo simulations were used to estimate the influence of different LOD adjustment methods on ICC and RCAL.
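As an illustration of the preprocessing steps named above, the following is a minimal Python sketch of median normalization, LOD/2 substitution and the per-feature coefficient of variation (CV). It is not the authors' pipeline: the function names, the toy intensity matrix and the order in which the steps are applied are all assumptions made for the example.

```python
import numpy as np

def median_normalize(spectra):
    """Divide each spectrum (one row per run) by its own median intensity."""
    spectra = np.asarray(spectra, dtype=float)
    medians = np.median(spectra, axis=1, keepdims=True)
    return spectra / medians

def substitute_below_lod(intensities, lod):
    """Replace every value below the detection limit with LOD/2."""
    intensities = np.asarray(intensities, dtype=float)
    return np.where(intensities < lod, lod / 2.0, intensities)

def cv_percent(replicates):
    """Per-feature CV (%) across replicate runs (rows = runs, columns = features)."""
    replicates = np.asarray(replicates, dtype=float)
    mean = replicates.mean(axis=0)
    sd = replicates.std(axis=0, ddof=1)  # sample standard deviation
    return 100.0 * sd / mean

# Hypothetical toy data: three replicate runs of the same pool, three features.
# The runs differ only by an overall intensity scale factor.
raw = np.array([[10.0, 20.0, 30.0],
                [20.0, 40.0, 60.0],
                [5.0, 10.0, 15.0]])

print("CV% raw:       ", cv_percent(raw))
print("CV% normalized:", cv_percent(median_normalize(raw)))
print("LOD/2 example: ", substitute_below_lod([1.0, 4.0, 10.0], lod=4.0))
```

In this toy case the replicates differ only by a scale factor, so median normalization removes all of the between-run variability; real spectra would retain residual variability from peak detection and the biological matrix.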

Results: Initially, we evaluated intra- and inter-run reproducibility using data obtained from automatic peak detection. The normalizations performed similarly in both studies, except IS normalization, which resulted in increased CVs (132% and 212%, respectively). After sLOD adjustment, raw and normalized data showed reduced CVs, and median normalization performed better, especially in the intra-run study. Optimizing peak signal detection drastically decreased overall feature variability. Median normalization with sLOD correction remained the preferred choice for further analyses, with intra- and inter-run CVs of 20% and 23%, respectively. In evaluating measurement error, we found that most of the MALDI-TOF/MS variability is intrinsic to the biological matrix. When values below the LOD were replaced by LOD/2, simulation studies showed that ICC estimates were little affected by the LOD when the measurement error σ was less than 0.36 and fewer than 50% of values were below the LOD. Comparing results from naïve logistic regression, RCAL and SIMEX, measurement error appeared to cause a "bias toward the null". However, SIMEX estimates seemed to correct for a smaller amount of bias than RCAL. Overall, we found eight MALDI-TOF/MS features associated with positive biopsy results.
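For readers unfamiliar with the ICC, the sketch below shows one common way to estimate it from repeated measurements: the one-way random-effects estimator based on ANOVA mean squares. The abstract does not state which ICC form was used, so this choice is an assumption, as are the simulated data (a between-subject standard deviation of 1, with σ = 0.36 borrowed from the text purely as an illustrative error level).

```python
import numpy as np

def icc_oneway(x):
    """One-way random-effects ICC for an (n_subjects, k_replicates) array."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    subject_means = x.mean(axis=1)
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((x - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical simulation: 20 subjects, 3 repeated collections each.
rng = np.random.default_rng(0)
true_level = rng.normal(0.0, 1.0, size=(20, 1))   # between-subject SD = 1 (assumed)
noise = rng.normal(0.0, 0.36, size=(20, 3))       # measurement error SD = 0.36
measured = true_level + noise

print("Estimated ICC:", icc_oneway(measured))
```

Under these assumptions the population ICC is 1 / (1 + 0.36²), roughly 0.89; with only 20 subjects the estimate will vary noticeably around that value.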

Conclusions: Findings from the reproducibility study showed that the major contributing factor to MALDI-TOF/MS variability is the peak-finding process. A new algorithm suited to MALDI-TOF reflectron mode is desirable for profiling studies. Nevertheless, normalization strategies help increase MALDI-TOF/MS data reproducibility, especially when combined with sLOD correction. RCAL and SIMEX appeared to be valuable approaches for obtaining regression coefficients adjusted for biological and instrumental errors in MALDI-TOF/MS features.
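The link between the ICC and regression calibration can be sketched for the simplest case: a single error-prone feature with classical additive error. The notation below (W, X, U, λ) is introduced here for illustration and does not appear in the abstract; the authors' actual models may include several features and covariates.

```latex
\begin{align}
  W_{ij} &= X_i + U_{ij}, \qquad
  X_i \sim N(\mu_x, \sigma_x^2), \quad
  U_{ij} \sim N(0, \sigma_u^2) \\
  \lambda &= \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2} = \mathrm{ICC} \\
  E[X_i \mid W_{ij}] &= \mu_x + \lambda \, (W_{ij} - \mu_x) \\
  \hat\beta_{\text{naive}} &\approx \lambda \, \beta
  \quad\Longrightarrow\quad
  \hat\beta_{\text{RCAL}} = \frac{\hat\beta_{\text{naive}}}{\hat\lambda}
\end{align}
```

Because λ lies between 0 and 1, the naive coefficient is shrunk toward zero, which is the "bias toward the null". The relation is exact for linear regression and only approximate for logistic regression, where it holds best when the measurement error and the effect size are modest.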

International Biometric Conference, Florence, ITALY, 6 – 11 July 2014