A unified feature selection framework for graph embedding on high dimensional data

Chen, M; Tsang, IW; Tan, M; Cham, TJ

A unified feature selection framework for graph embedding on high dimensional data

Chen, M Tsang, IW

Tan, M Cham, TJ

Permalink

Publication Type:: Journal Article
Citation:: IEEE Transactions on Knowledge and Data Engineering, 2015, 27 (6), pp. 1465 - 1477
Issue Date:: 2015-06-01

Closed Access

	Filename	Description	Size
	a u.pdf	Published Version	960.13 kB	Adobe PDF	View/Open

Copyright Clearance Process

Recently Added
In Progress
Closed Access

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Chen, M	en_US
dc.contributor.author	Tsang, IW https://orcid.org/0000-0001-8095-4637	en_US
dc.contributor.author	Tan, M	en_US
dc.contributor.author	Cham, TJ	en_US
dc.date.issued	2015-06-01	en_US
dc.identifier.citation	IEEE Transactions on Knowledge and Data Engineering, 2015, 27 (6), pp. 1465 - 1477	en_US
dc.identifier.issn	1041-4347	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121742
dc.description.abstract	© 2014 IEEE. Although graph embedding has been a powerful tool for modeling data intrinsic structures, simply employing all features for data structure discovery may result in noise amplification. This is particularly severe for high dimensional data with small samples. To meet this challenge, this paper proposes a novel efficient framework to perform feature selection for graph embedding, in which a category of graph embedding methods is cast as a least squares regression problem. In this framework, a binary feature selector is introduced to naturally handle the feature cardinality in the least squares formulation. The resultant integral programming problem is then relaxed into a convex Quadratically Constrained Quadratic Program (QCQP) learning problem, which can be efficiently solved via a sequence of accelerated proximal gradient (APG) methods. Since each APG optimization is w.r.t. only a subset of features, the proposed method is fast and memory efficient. The proposed framework is applied to several graph embedding learning problems, including supervised, unsupervised, and semi-supervised graph embedding. Experimental results on several high dimensional data demonstrated that the proposed method outperformed the considered state-of-the-art methods.	en_US
dc.relation	http://purl.org/au-research/grants/arc/FT130100746
dc.relation.ispartof	IEEE Transactions on Knowledge and Data Engineering	en_US
dc.relation.isbasedon	10.1109/TKDE.2014.2382599	en_US
dc.subject.classification	Information Systems	en_US
dc.title	A unified feature selection framework for graph embedding on high dimensional data	en_US
dc.type	Journal Article
utslib.citation.volume	6	en_US
utslib.citation.volume	27	en_US
utslib.for	080101 Adaptive Agents and Intelligent Robotics	en_US
utslib.for	080109 Pattern Recognition and Data Mining	en_US
utslib.for	08 Information and Computing Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.issue	6	en_US
pubs.publication-status	Published	en_US
pubs.volume	27	en_US

Abstract:

© 2014 IEEE. Although graph embedding has been a powerful tool for modeling data intrinsic structures, simply employing all features for data structure discovery may result in noise amplification. This is particularly severe for high dimensional data with small samples. To meet this challenge, this paper proposes a novel efficient framework to perform feature selection for graph embedding, in which a category of graph embedding methods is cast as a least squares regression problem. In this framework, a binary feature selector is introduced to naturally handle the feature cardinality in the least squares formulation. The resultant integral programming problem is then relaxed into a convex Quadratically Constrained Quadratic Program (QCQP) learning problem, which can be efficiently solved via a sequence of accelerated proximal gradient (APG) methods. Since each APG optimization is w.r.t. only a subset of features, the proposed method is fast and memory efficient. The proposed framework is applied to several graph embedding learning problems, including supervised, unsupervised, and semi-supervised graph embedding. Experimental results on several high dimensional data demonstrated that the proposed method outperformed the considered state-of-the-art methods.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121742