A refinement approach to handling model misfit in semi-supervised learning

Su, H; Chen, L; Ye, Y; Sun, Z; Wu, Q

Publication Type: Conference Proceeding
Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010, 6441 LNAI (PART 2), pp. 75-86
Issue Date: 2010-12-21

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Su, H	en_US
dc.contributor.author	Chen, L https://orcid.org/0000-0002-6468-5729	en_US
dc.contributor.author	Ye, Y	en_US
dc.contributor.author	Sun, Z	en_US
dc.contributor.author	Wu, Q https://orcid.org/0000-0001-5641-2483	en_US
dc.date.issued	2010-12-21	en_US
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010, 6441 LNAI (PART 2), pp. 75 - 86	en_US
dc.identifier.isbn	3642173128	en_US
dc.identifier.isbn	9783642173127	en_US
dc.identifier.issn	0302-9743	en_US
dc.identifier.uri	http://hdl.handle.net/10453/16187
dc.description.abstract	Semi-supervised learning has been a focus of machine learning and data mining research in the past few years. Various algorithms and techniques have been proposed, from generative models to graph-based algorithms. In this work, we focus on cluster-and-label approaches to semi-supervised classification. Existing cluster-and-label algorithms rest on underlying models and/or assumptions. When the data fit the model well, classification accuracy is high; otherwise, it is low. In this paper, we propose a refinement approach to address the model misfit problem in semi-supervised classification. We show that the cluster-and-label technique itself does not need to change to become more flexible. Instead, we propose successive refinement clustering of the dataset to correct the model misfit. A series of experiments on UCI benchmark data sets shows that the proposed approach outperforms existing cluster-and-label algorithms, as well as traditional semi-supervised classification techniques including Self-training and Tri-training. © 2010 Springer-Verlag.	en_US
dc.relation.ispartof	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en_US
dc.relation.isbasedon	10.1007/978-3-642-17313-4_8	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	A refinement approach to handling model misfit in semi-supervised learning	en_US
dc.type	Conference Proceeding
utslib.citation.volume	PART 2	en_US
utslib.citation.volume	6441 LNAI	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
dc.location.activity	Chongqing, China	en_US
dc.location.activity	ISI:000171459300005
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Strength - INEXT - Innovation in IT Services and Applications
utslib.copyright.status	closed_access
pubs.issue	PART 2	en_US
pubs.publication-status	Published	en_US
pubs.volume	6441 LNAI	en_US

Abstract:

Semi-supervised learning has been a focus of machine learning and data mining research in the past few years. Various algorithms and techniques have been proposed, from generative models to graph-based algorithms. In this work, we focus on cluster-and-label approaches to semi-supervised classification. Existing cluster-and-label algorithms rest on underlying models and/or assumptions. When the data fit the model well, classification accuracy is high; otherwise, it is low. In this paper, we propose a refinement approach to address the model misfit problem in semi-supervised classification. We show that the cluster-and-label technique itself does not need to change to become more flexible. Instead, we propose successive refinement clustering of the dataset to correct the model misfit. A series of experiments on UCI benchmark data sets shows that the proposed approach outperforms existing cluster-and-label algorithms, as well as traditional semi-supervised classification techniques including Self-training and Tri-training. © 2010 Springer-Verlag.
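
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how cluster-and-label with successive refinement might look, not the authors' method. It assumes plain k-means as the base clusterer, majority vote for labelling a cluster, and "the labelled points in a cluster disagree" as the sign of model misfit that triggers re-clustering. All names (`kmeans`, `cluster_and_label`, `k`, `max_depth`) and the toy data are hypothetical.

```python
import random
from collections import Counter

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centre if a cluster went empty
                centers[c] = tuple(sum(xs) / len(members)
                                   for xs in zip(*members))
    return assign

def cluster_and_label(points, labels, idx=None, k=2, depth=0, max_depth=3):
    """labels[i] is a class, or None if point i is unlabelled.
    Returns {point index: predicted label}."""
    if idx is None:
        idx = list(range(len(points)))
    known = Counter(labels[i] for i in idx if labels[i] is not None)
    if not known:  # nothing to vote with; the caller decides
        return {i: None for i in idx}
    majority = known.most_common(1)[0][0]
    by_vote = {i: labels[i] if labels[i] is not None else majority
               for i in idx}
    # Stop when the labelled points agree or no further split is possible.
    if len(known) == 1 or depth >= max_depth or len(idx) <= k:
        return by_vote
    # Assumed misfit signal: labelled points disagree, so re-cluster here.
    assign = kmeans([points[i] for i in idx], k, seed=depth)
    groups = [[i for i, a in zip(idx, assign) if a == c] for c in range(k)]
    if any(len(g) == len(idx) for g in groups):  # split made no progress
        return by_vote
    out = {}
    for g in groups:
        if g:
            sub = cluster_and_label(points, labels, g, k, depth + 1, max_depth)
            for i, y in sub.items():
                # Sub-clusters without labels inherit the parent's majority.
                out[i] = majority if y is None else y
    return out

# Toy data: three groups on a line, classes A, B, A, one label per group.
# A two-cluster model cannot separate them without refinement.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0),
          (10.0, 0.0), (10.5, 0.0), (11.0, 0.0),
          (20.0, 0.0), (20.5, 0.0), (21.0, 0.0)]
labels = ["A", None, None, "B", None, None, "A", None, None]
pred = cluster_and_label(points, labels)
print([pred[i] for i in range(len(points))])
```

In this sketch the base clusterer is never modified. A cluster whose labelled points conflict is simply clustered again, which is one way to read the abstract's claim that the cluster-and-label technique itself need not change.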

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/16187