Identification of important regressor groups, subgroups and individuals via regularization methods: application to gut microbiome data.

Oxford University Press (OUP)
Publication Type:
Journal Article
Bioinformatics, 2014, 30 (6), pp. 831 - 837
MOTIVATION: Gut microbiota can be classified at multiple taxonomy levels. Strategies to use changes in microbiota composition to effect health improvements require knowing at which taxonomy level interventions should be aimed. Identifying these important levels is difficult, however, because most statistical methods consider microbiota classified at only one taxonomy level rather than at several.
RESULTS: Using L1 and L2 regularizations, we developed a new variable selection method that identifies important features at multiple taxonomy levels. The regularization parameters are chosen by a new, data-adaptive, repeated cross-validation approach, which performed well. In simulation studies, our method outperformed competing methods: it selected significant variables more often, and had small false discovery rates and acceptable false-positive rates. Applying our method to gut microbiota data, we identified which taxonomic levels were most altered by specific interventions or physiological status.
AVAILABILITY: The new approach is implemented in an R package, which is freely available from the corresponding author.
CONTACT:
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
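To make the combination of L1 and L2 penalties and repeated cross-validation concrete, the following is a minimal sketch and not the authors' method or their R package. It swaps in a plain elastic net fitted by coordinate descent, ignores the taxonomy hierarchy entirely, and picks the penalty pair by averaging k-fold error over several random splits. All names (`elastic_net`, `repeated_cv`, `lam1`, `lam2`), the toy data, and the grid are assumptions made for illustration; the predictors are assumed to be centered, so no intercept is fitted.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator that arises from the L1 part of the penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam1, lam2, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb; equals y while b is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue
            # Covariance of column j with the residual that excludes feature j.
            rho = sum(col[i] * resid[i] for i in range(n)) / n + norm * beta[j]
            new = soft_threshold(rho, lam1) / (norm + lam2)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                beta[j] = new
    return beta


def mse(X, y, beta):
    """Mean squared prediction error of a fitted coefficient vector."""
    return sum((yi - sum(x * b for x, b in zip(row, beta))) ** 2
               for row, yi in zip(X, y)) / len(y)


def repeated_cv(X, y, grid, k=5, repeats=3, seed=0):
    """Sum k-fold CV error over several random splits; return the best (lam1, lam2)."""
    rng = random.Random(seed)
    n = len(y)
    scores = {pair: 0.0 for pair in grid}
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random partition for each repeat
        for fold in (idx[f::k] for f in range(k)):
            held = set(fold)
            X_train = [X[i] for i in idx if i not in held]
            y_train = [y[i] for i in idx if i not in held]
            X_test = [X[i] for i in fold]
            y_test = [y[i] for i in fold]
            for pair in grid:
                beta = elastic_net(X_train, y_train, *pair)
                scores[pair] += mse(X_test, y_test, beta)
    return min(grid, key=lambda pair: scores[pair])


# Hypothetical toy data: two informative predictors and four noise predictors.
rng = random.Random(1)
n, p = 40, 6
true_beta = [2.0, -1.5, 0.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0, 0.1) for row in X]

grid = [(l1, l2) for l1 in (0.01, 0.1, 1.0) for l2 in (0.0, 0.1)]
best = repeated_cv(X, y, grid)
beta_hat = elastic_net(X, y, *best)
print("selected (lam1, lam2):", best)
print("nonzero coefficients:", [j for j, b in enumerate(beta_hat) if b != 0.0])
```

Averaging over repeated random splits reduces the dependence of the selected penalties on any single partition of the data, which is the general motivation for repeated cross-validation. A method aimed at several taxonomy levels would additionally need a penalty that respects the grouping of features, which this sketch does not attempt.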