Towards ultrahigh dimensional feature selection for big data

Tan, M; Tsang, IW; Wang, L

Publication Type: Journal Article
Citation: Journal of Machine Learning Research, 2014, 15, pp. 1371-1429
Issue Date: 2014-01-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Tan, M	en_US
dc.contributor.author	Tsang, IW https://orcid.org/0000-0001-8095-4637	en_US
dc.contributor.author	Wang, L	en_US
dc.date.issued	2014-01-01	en_US
dc.identifier.citation	Journal of Machine Learning Research, 2014, 15 pp. 1371 - 1429	en_US
dc.identifier.issn	1532-4435	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121688
dc.description.abstract	In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data, and then reformulate it as a convex semi-infinite programming (SIP) problem. To address the SIP, we propose an efficient feature generating paradigm. Different from traditional gradient-based approaches that conduct optimization on all input features, the proposed paradigm iteratively activates a group of features, and solves a sequence of multiple kernel learning (MKL) subproblems. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Due to this optimization scheme, some efficient cache techniques are also developed. The feature generating paradigm is guaranteed to converge globally under mild conditions, and can achieve lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures, and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world data sets of tens of millions of data points with O(10^14) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training efficiency. © 2014 Mingkui Tan, Ivor W. Tsang and Li Wang.	en_US
dc.relation	http://purl.org/au-research/grants/arc/FT130100746
dc.relation.ispartof	Journal of Machine Learning Research	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Towards ultrahigh dimensional feature selection for big data	en_US
dc.type	Journal Article
utslib.citation.volume	15	en_US
utslib.for	080101 Adaptive Agents and Intelligent Robotics	en_US
utslib.for	080109 Pattern Recognition and Data Mining	en_US
utslib.for	08 Information and Computing Sciences	en_US
utslib.for	17 Psychology and Cognitive Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US
pubs.volume	15	en_US

Abstract:

In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data, and then reformulate it as a convex semi-infinite programming (SIP) problem. To address the SIP, we propose an efficient feature generating paradigm. Different from traditional gradient-based approaches that conduct optimization on all input features, the proposed paradigm iteratively activates a group of features, and solves a sequence of multiple kernel learning (MKL) subproblems. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Due to this optimization scheme, some efficient cache techniques are also developed. The feature generating paradigm is guaranteed to converge globally under mild conditions, and can achieve lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures, and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world data sets of tens of millions of data points with O(10^14) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training efficiency. © 2014 Mingkui Tan, Ivor W. Tsang and Li Wang.
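
The "activate a group of features, then solve a subproblem on the active set" loop described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the function name `feature_generating_sketch`, the activation rule (largest absolute correlation with the current residual), and the use of ordinary least squares in place of the MKL subproblem and its accelerated proximal gradient solver are all assumptions made for illustration. It only shows the general shape of an iterative scheme that never optimizes over all input features at once.

```python
import numpy as np


def feature_generating_sketch(X, y, group_size=3, max_iters=5, tol=1e-6):
    """Hypothetical helper: activate features in groups, refit on the active set."""
    n_samples, n_features = X.shape
    is_active = np.zeros(n_features, dtype=bool)
    w = np.zeros(n_features)
    residual = np.asarray(y, dtype=float).copy()

    for _ in range(max_iters):
        # Score each feature by |x_j^T r| (assumed activation rule, not from the paper).
        scores = np.abs(X.T @ residual)
        scores[is_active] = -np.inf  # never re-activate a feature

        # Activate up to `group_size` of the best-scoring inactive features.
        candidates = np.argsort(scores)[-group_size:]
        new = candidates[scores[candidates] > tol]
        if new.size == 0:
            break  # no inactive feature still correlates with the residual
        is_active[new] = True

        # Subproblem restricted to the active features only. Least squares is
        # a stand-in for the MKL subproblem mentioned in the abstract.
        idx = np.flatnonzero(is_active)
        w_active, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        w = np.zeros(n_features)
        w[idx] = w_active
        residual = y - X[:, idx] @ w_active

    return w, np.flatnonzero(is_active)


# Synthetic, noise-free demo: 3 informative features out of 50.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))
w_true = np.zeros(50)
w_true[[3, 17, 42]] = [5.0, -4.0, 3.0]
y = X @ w_true

w_hat, selected = feature_generating_sketch(X, y)
print("selected features:", selected)
```

Each iteration scores all features once but fits only the active ones, so the cost of the fitting step depends on the number of activated features, not on the full dimensionality.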

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121688