Sparse Regression by Projection and Sparse Discriminant Analysis

Qi, X; Luo, R; Carroll, RJ; Zhao, H

Publication Type: Journal Article
Citation: Journal of Computational and Graphical Statistics, 2015, 24 (2), pp. 416-438
Issue Date: 2015-01-01

Closed Access

Filename	Description	Size	Format
\\utsfs.adsroot.uts.edu.au\homes\staff\108848\Desktop\10618600.2014.pdf	Published Version	719.97 kB	Adobe PDF

Full metadata record

Field	Value	Language
dc.contributor.author	Qi, X	en_US
dc.contributor.author	Luo, R	en_US
dc.contributor.author	Carroll, RJ	en_US
dc.contributor.author	Zhao, H	en_US
dc.date.issued	2015-01-01	en_US
dc.identifier.citation	Journal of Computational and Graphical Statistics, 2015, 24 (2), pp. 416 - 438	en_US
dc.identifier.issn	1061-8600	en_US
dc.identifier.uri	http://hdl.handle.net/10453/118357
dc.relation.ispartof	Journal of Computational and Graphical Statistics	en_US
dc.relation.isbasedon	10.1080/10618600.2014.907094	en_US
dc.subject.classification	Statistics & Probability	en_US
dc.title	Sparse Regression by Projection and Sparse Discriminant Analysis	en_US
dc.type	Journal Article
utslib.citation.volume	2	en_US
utslib.citation.volume	24	en_US
utslib.for	0104 Statistics	en_US
utslib.for	1403 Econometrics	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Science
pubs.organisational-group	/University of Technology Sydney/Faculty of Science/School of Mathematical and Physical Sciences
utslib.copyright.status	closed_access
pubs.issue	2	en_US
pubs.publication-status	Published	en_US
pubs.volume	24	en_US

Abstract:

© 2015 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. Recent years have seen active development of penalized regression methods, such as LASSO and elastic net, for analyzing high-dimensional data. In these approaches, the direction and length of the regression coefficient vector are determined simultaneously. Because of the penalties, the length of the estimates can be far from optimal for accurate prediction. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The distinguishing feature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are then determined by a cross-validation procedure to achieve the highest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimensions. The framework is then generalized so that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt it for discriminant analysis. Compared with existing methods, which offer relatively little control over the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and that the ability to control relationships between the sparse components leads to more accurate classification. The supplementary materials available online provide the details of the algorithms, the theoretical proofs, and R code for all simulation studies.
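
The two-stage idea in the abstract (infer a direction first, then choose the length and the tuning parameter by cross-validation) can be illustrated with a small sketch. This is not the authors' algorithm, which is given in the paper and its supplementary materials. It assumes a stand-in direction step (the normalized, soft-thresholded vector X^T y / n) and a plain grid search; the names soft_threshold, unit_direction, and fit_by_projection, the grids, and the simulated data are all hypothetical choices made for illustration.

```python
import random


def soft_threshold(v, lam):
    """Shrink each entry toward zero by lam; entries with |a| <= lam become 0."""
    return [(abs(a) - lam) * (1.0 if a > 0 else -1.0) if abs(a) > lam else 0.0
            for a in v]


def unit_direction(X, y, lam):
    """Illustrative direction step: normalize the soft-thresholded X^T y / n.

    A stand-in, NOT the estimator from the paper. Returns None when every
    coordinate is thresholded to zero.
    """
    n, p = len(X), len(X[0])
    cov = [sum(X[i][j] * y[i] for i in range(n)) / n for j in range(p)]
    s = soft_threshold(cov, lam)
    norm = sum(a * a for a in s) ** 0.5
    return [a / norm for a in s] if norm > 0 else None


def fit_by_projection(X, y, lams, lengths, k=5):
    """Pick (lam, length) by K-fold CV, then return beta = length * direction."""
    n = len(X)
    best = None  # (cv_error, lam, length)
    for lam in lams:
        sq_err = [0.0] * len(lengths)
        usable = True
        for fold in range(k):
            train = [i for i in range(n) if i % k != fold]
            d = unit_direction([X[i] for i in train], [y[i] for i in train], lam)
            if d is None:  # direction vanished on this fold; skip this lam
                usable = False
                break
            for i in range(fold, n, k):  # held-out rows of this fold
                proj = sum(a * b for a, b in zip(X[i], d))
                for m, length in enumerate(lengths):
                    sq_err[m] += (y[i] - length * proj) ** 2
        if not usable:
            continue
        for m, length in enumerate(lengths):
            if best is None or sq_err[m] / n < best[0]:
                best = (sq_err[m] / n, lam, length)
    if best is None:
        raise ValueError("no tuning value produced a nonzero direction")
    cv_error, lam, length = best
    d = unit_direction(X, y, lam)  # refit the direction on all data
    if d is None:
        raise ValueError("direction vanished on the full data")
    return [length * a for a in d], lam, length, cv_error


# Simulated data (an assumption for this demo): two active predictors out of six.
random.seed(0)
n, true_beta = 200, [3.0, 0.0, 0.0, -3.0, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in true_beta] for _ in range(n)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 0.5) for row in X]

lams = [0.0, 0.1, 0.2, 0.4]                # candidate thresholds
lengths = [0.5 * m for m in range(1, 17)]  # candidate lengths 0.5, 1.0, ..., 8.0
beta, lam, length, cv_error = fit_by_projection(X, y, lams, lengths)
print(lam, length, [round(b, 2) for b in beta])
```

In this sketch the direction is always a unit vector, so the penalty affects only which coordinates are kept and in what proportion; the scale of the fitted coefficients comes entirely from the cross-validated length, which is the separation the abstract describes.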

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/118357