Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.

Publication Type:
Journal Article
Bioinformatics (Oxford, England), 2012, 28 (15), pp. 1998 - 2003
Files in This Item:
Filename: bts306.pdf
Description: Published Version
Size: 127.38 kB
Format: Adobe PDF
Protein abundance in quantitative proteomics is often estimated from observed spectral features derived from liquid chromatography-mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and the parametric survival and accelerated failure time (AFT) models with log-normal, log-logistic and Weibull distributions, were used to detect differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets.

Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein-level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening as the proportion of missing data increases.

The testing procedures discussed in this article can all be performed using readily available software such as R. The R code is provided as supplemental material.
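To make the left-censoring idea concrete, the following is a minimal sketch, not the authors' supplemental R code, of a two-group log-normal AFT-style comparison in plain Python. It assumes that missing peaks are left-censored at a single known detection limit, that log-intensities are normal with a variance shared across groups, and that each group has at least one observed peak. The function names (`loglik`, `fit`, `lrt_two_group`, `simulate`), the detection limit and the simulated data are all hypothetical and chosen only for illustration. Observed peaks contribute a density term to the likelihood, while missing peaks contribute the probability of falling below the limit; the fit uses a simple EM iteration and the groups are compared with a likelihood-ratio test.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for pdf/cdf


def loglik(groups, mus, sigma, limit):
    """Left-censored normal log-likelihood on the log-intensity scale."""
    ll = 0.0
    for (obs, n_cens), mu in zip(groups, mus):
        # Observed peaks contribute the density ...
        ll += sum(math.log(STD.pdf((y - mu) / sigma) / sigma) for y in obs)
        # ... missing peaks contribute P(log-intensity < limit).
        if n_cens:
            ll += n_cens * math.log(STD.cdf((limit - mu) / sigma))
    return ll


def fit(groups, limit, n_iter=1000, tol=1e-10):
    """EM fit of one mean per group and a shared sigma.

    groups: list of (observed_log_intensities, n_censored) pairs.
    Returns (mus, sigma, log-likelihood at the final estimates).
    """
    mus = [sum(obs) / len(obs) for obs, _ in groups]
    sigma = 1.0
    n_total = sum(len(obs) + c for obs, c in groups)
    prev = -math.inf
    for _ in range(n_iter):
        new_mus, ss = [], 0.0
        for (obs, c), mu in zip(groups, mus):
            ez = vz = 0.0
            if c:
                # E-step: mean and variance of a normal truncated above at `limit`.
                a = (limit - mu) / sigma
                lam = STD.pdf(a) / STD.cdf(a)
                ez = mu - sigma * lam
                vz = sigma ** 2 * (1.0 - a * lam - lam ** 2)
            # M-step: update the group mean, accumulate expected squared error.
            new_mu = (sum(obs) + c * ez) / (len(obs) + c)
            ss += sum((y - new_mu) ** 2 for y in obs)
            ss += c * (vz + (ez - new_mu) ** 2)
            new_mus.append(new_mu)
        mus, sigma = new_mus, math.sqrt(ss / n_total)
        ll = loglik(groups, mus, sigma, limit)
        if ll - prev < tol:
            break
        prev = ll
    return mus, sigma, ll


def lrt_two_group(group_a, group_b, limit):
    """Likelihood-ratio test of equal group means.

    Returns (statistic, p_value, group_means, sigma).
    """
    pooled = [(group_a[0] + group_b[0], group_a[1] + group_b[1])]
    _, _, ll_null = fit(pooled, limit)
    mus, sigma, ll_alt = fit([group_a, group_b], limit)
    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Chi-square(1) tail probability: P(X > s) = erfc(sqrt(s / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0)), mus, sigma


def simulate(mu, n, limit, rng):
    """Draw n log-intensities; values below `limit` are recorded as missing."""
    draws = [rng.gauss(mu, 1.0) for _ in range(n)]
    obs = [y for y in draws if y >= limit]
    return obs, n - len(obs)


# Hypothetical data: one protein, two conditions, detection limit 9.5 (log scale).
rng = random.Random(0)
limit = 9.5
cond_a = simulate(10.0, 200, limit, rng)
cond_b = simulate(11.0, 200, limit, rng)
stat, p, mus, sigma = lrt_two_group(cond_a, cond_b, limit)
print("censored:", cond_a[1], cond_b[1])
print("means:", mus, "sigma:", sigma)
print("LRT statistic:", stat, "p-value:", p)
```

A naive analysis that simply drops the missing peaks would overestimate the mean of the lower-abundance group, which is one intuition for why censoring-aware models can retain power as missingness grows. In R, models of this family can be fitted with `survreg` from the `survival` package using a left-censored `Surv` response; whether the supplemental code takes that route is not stated in the abstract.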