Identification of important regressor groups, subgroups and individuals via regularization methods: Application to gut microbiome data

Garcia, TP; Müller, S; Carroll, RJ; Walzem, RL

Publication Type: Journal Article
Citation: Bioinformatics, 2014, 30 (6), pp. 831 - 837
Issue Date: 2014-03-01

Closed Access

	Filename	Description	Size	Format
	watermark (1).pdf	Published Version	200.93 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Garcia, TP	en_US
dc.contributor.author	Müller, S	en_US
dc.contributor.author	Carroll, RJ	en_US
dc.contributor.author	Walzem, RL	en_US
dc.date.issued	2014-03-01	en_US
dc.identifier.citation	Bioinformatics, 2014, 30 (6), pp. 831 - 837	en_US
dc.identifier.issn	1367-4803	en_US
dc.identifier.uri	http://hdl.handle.net/10453/117872
dc.description.abstract	Motivation: Gut microbiota can be classified at multiple taxonomy levels. Strategies to use changes in microbiota composition to effect health improvements require knowing at which taxonomy level interventions should be aimed. Identifying these important levels is difficult, however, because most statistical methods only consider when the microbiota are classified at one taxonomy level, not multiple. Results: Using L1 and L2 regularizations, we developed a new variable selection method that identifies important features at multiple taxonomy levels. The regularization parameters are chosen by a new, data-adaptive, repeated cross-validation approach, which performed well. In simulation studies, our method outperformed competing methods: it more often selected significant variables, and had small false discovery rates and acceptable false-positive rates. Applying our method to gut microbiota data, we found which taxonomic levels were most altered by specific interventions or physiological status. Availability: The new approach is implemented in an R package, which is freely available from the corresponding author. © The Author 2013. Published by Oxford University Press. All rights reserved.	en_US
dc.relation.ispartof	Bioinformatics	en_US
dc.relation.isbasedon	10.1093/bioinformatics/btt608	en_US
dc.subject.classification	Bioinformatics	en_US
dc.subject.mesh	Gastrointestinal Tract	en_US
dc.subject.mesh	Animals	en_US
dc.subject.mesh	Humans	en_US
dc.subject.mesh	Mice	en_US
dc.subject.mesh	Software	en_US
dc.subject.mesh	Microbiota	en_US
dc.title	Identification of important regressor groups, subgroups and individuals via regularization methods: Application to gut microbiome data	en_US
dc.type	Journal Article
utslib.citation.volume	6	en_US
utslib.citation.volume	30	en_US
utslib.for	060102 Bioinformatics	en_US
utslib.for	01 Mathematical Sciences	en_US
utslib.for	06 Biological Sciences	en_US
utslib.for	08 Information and Computing Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Science
pubs.organisational-group	/University of Technology Sydney/Faculty of Science/School of Mathematical and Physical Sciences
utslib.copyright.status	closed_access
pubs.issue	6	en_US
pubs.publication-status	Published	en_US
pubs.volume	30	en_US

Abstract:

Motivation: Gut microbiota can be classified at multiple taxonomy levels. Strategies to use changes in microbiota composition to effect health improvements require knowing at which taxonomy level interventions should be aimed. Identifying these important levels is difficult, however, because most statistical methods only consider when the microbiota are classified at one taxonomy level, not multiple.

Results: Using L1 and L2 regularizations, we developed a new variable selection method that identifies important features at multiple taxonomy levels. The regularization parameters are chosen by a new, data-adaptive, repeated cross-validation approach, which performed well. In simulation studies, our method outperformed competing methods: it more often selected significant variables, and had small false discovery rates and acceptable false-positive rates. Applying our method to gut microbiota data, we found which taxonomic levels were most altered by specific interventions or physiological status.

Availability: The new approach is implemented in an R package, which is freely available from the corresponding author.

© The Author 2013. Published by Oxford University Press. All rights reserved.
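
The abstract names the ingredients of the method (L1 and L2 penalties, with regularization parameters chosen by repeated cross-validation) but gives no detail. As a rough illustration only, and not the authors' R package or their multi-level, data-adaptive procedure, the Python sketch below fits a generic elastic-net regression by coordinate descent and picks the L1 weight by repeated K-fold cross-validation. Every function name, the toy design matrix and the penalty grid are assumptions made for this example.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: shrinks z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam1, lam2, n_iter=200):
    """Coordinate descent for
    (1/2n)*||y - X b||^2 + lam1*||b||_1 + (lam2/2)*||b||_2^2.
    Generic elastic net, not the paper's multi-level penalty."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + lam2
            if denom == 0.0:
                continue  # all-zero column with no L2 penalty: leave at 0
            # Covariance of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam1) / denom
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def make_folds(n, k, rng):
    """Randomly split indices 0..n-1 into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[f::k] for f in range(k)]


def cv_error(X, y, lam1, lam2, folds):
    """Mean squared prediction error over the held-out folds."""
    total = 0.0
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        beta = elastic_net([X[i] for i in train], [y[i] for i in train], lam1, lam2)
        for i in test:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / len(X)


def repeated_cv_select(X, y, lam1_grid, lam2, k=4, repeats=5, seed=0):
    """Return the lam1 in lam1_grid with the lowest CV error summed over
    several random fold partitions (plain repeated K-fold CV)."""
    rng = random.Random(seed)
    totals = [0.0] * len(lam1_grid)
    for _ in range(repeats):
        folds = make_folds(len(X), k, rng)  # same partition for every lam1
        for g, lam1 in enumerate(lam1_grid):
            totals[g] += cv_error(X, y, lam1, lam2, folds)
    best = min(range(len(lam1_grid)), key=lambda g: totals[g])
    return lam1_grid[best]


# Hypothetical toy data: 8 samples, 4 mutually orthogonal +/-1 features,
# of which only the first two affect the response.
X_toy = []
for i in range(8):
    x0 = 1.0 if i & 1 else -1.0
    x1 = 1.0 if i & 2 else -1.0
    x2 = 1.0 if i & 4 else -1.0
    X_toy.append([x0, x1, x2, x0 * x1])
y_toy = [2.0 * row[0] - 1.5 * row[1] for row in X_toy]

beta_toy = elastic_net(X_toy, y_toy, lam1=0.5, lam2=0.0)
selected = [j for j, b in enumerate(beta_toy) if b != 0.0]  # indices kept by the L1 penalty
chosen_lam1 = repeated_cv_select(X_toy, y_toy, [0.05, 0.5, 1.0], lam2=0.0)
```

In this sketch the L1 term sets uninformative coefficients exactly to zero, which is what makes the fit a variable selection step, while the L2 term only shrinks. Averaging the cross-validation error over several random partitions is one simple way to reduce the dependence of the chosen penalty on a single split.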

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/117872