Multiclass Learning with Partially Corrupted Labels

Wang, R; Liu, T; Tao, D

Publication Type: Journal Article
Citation: IEEE Transactions on Neural Networks and Learning Systems, 2018, 29 (6), pp. 2568 - 2580
Issue Date: 2018-06-01

Closed Access

Filename	Description	Size	Format
07929355.pdf	Published Version	2.2 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Wang, R	en_US
dc.contributor.author	Liu, T https://orcid.org/0000-0002-9640-6472	en_US
dc.contributor.author	Tao, D https://orcid.org/0000-0001-7225-5449	en_US
dc.date.issued	2018-06-01	en_US
dc.identifier.citation	IEEE Transactions on Neural Networks and Learning Systems, 2018, 29 (6), pp. 2568 - 2580	en_US
dc.identifier.issn	2162-237X	en_US
dc.identifier.uri	http://hdl.handle.net/10453/123845
dc.description.abstract	© 2012 IEEE. Traditional classification systems rely heavily on sufficient training data with accurate labels. However, the quality of the collected data depends on the labelers, some of whom may be inexperienced and produce unexpected labels that degrade the performance of a learning system. In this paper, we investigate the multiclass classification problem where a certain proportion of the training examples are randomly labeled. Specifically, we show that this issue can be formulated as a label noise problem. To perform multiclass classification, we employ the widely used importance reweighting strategy so that learning on noisy data more closely reflects the results on noise-free data. We illustrate the applicability of this strategy to any surrogate loss function and to different classification settings. The proportion of randomly labeled examples is proved to be upper bounded and can be estimated under a mild condition. The convergence analysis ensures the consistency of the learned classifier with the optimal classifier with respect to clean data. Two instantiations of the proposed strategy are also introduced. Experiments on synthetic and real data verify that our approach yields improvements over traditional classifiers as well as robust classifiers. Moreover, we empirically demonstrate that the proposed strategy is effective even on asymmetrically noisy data.	en_US
dc.relation.ispartof	IEEE Transactions on Neural Networks and Learning Systems	en_US
dc.relation.isbasedon	10.1109/TNNLS.2017.2699783	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Multiclass Learning with Partially Corrupted Labels	en_US
dc.type	Journal Article
utslib.citation.volume	6	en_US
utslib.citation.volume	29	en_US
utslib.for	0805 Distributed Computing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	closed_access
pubs.issue	6	en_US
pubs.publication-status	Published	en_US
pubs.volume	29	en_US

Abstract:

© 2012 IEEE. Traditional classification systems rely heavily on sufficient training data with accurate labels. However, the quality of the collected data depends on the labelers, some of whom may be inexperienced and produce unexpected labels that degrade the performance of a learning system. In this paper, we investigate the multiclass classification problem where a certain proportion of the training examples are randomly labeled. Specifically, we show that this issue can be formulated as a label noise problem. To perform multiclass classification, we employ the widely used importance reweighting strategy so that learning on noisy data more closely reflects the results on noise-free data. We illustrate the applicability of this strategy to any surrogate loss function and to different classification settings. The proportion of randomly labeled examples is proved to be upper bounded and can be estimated under a mild condition. The convergence analysis ensures the consistency of the learned classifier with the optimal classifier with respect to clean data. Two instantiations of the proposed strategy are also introduced. Experiments on synthetic and real data verify that our approach yields improvements over traditional classifiers as well as robust classifiers. Moreover, we empirically demonstrate that the proposed strategy is effective even on asymmetrically noisy data.
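
The importance reweighting idea mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes one simple noise model, in which a fraction rho of the examples receive a label drawn uniformly from all K classes, independently of the input. Under that assumption the noisy posterior is (1 - rho) * clean + rho / K, so each example's weight is the ratio of clean to noisy posterior at its observed label, and rho can be no larger than K times the smallest noisy posterior entry. All function names below are hypothetical.

```python
import numpy as np

def mix_with_uniform(clean_posterior, rho):
    """Noisy posterior under the ASSUMED model: (1 - rho) * clean + rho / K."""
    k = clean_posterior.shape[1]
    return (1.0 - rho) * clean_posterior + rho / k

def importance_weights(noisy_posterior, noisy_labels, rho):
    """Weight of each example: P_clean(label | x) / P_noisy(label | x)."""
    k = noisy_posterior.shape[1]
    # Noisy posterior evaluated at each example's observed (noisy) label.
    q = noisy_posterior[np.arange(len(noisy_labels)), noisy_labels]
    beta = (q - rho / k) / ((1.0 - rho) * q)
    # Estimated posteriors may fall below rho / K, so clip at zero.
    return np.clip(beta, 0.0, None)

def rho_upper_bound(noisy_posterior):
    """Every noisy posterior entry is at least rho / K, so rho <= K * min."""
    return noisy_posterior.shape[1] * noisy_posterior.min()

def reweighted_loss(per_example_loss, weights):
    """Empirical risk on noisy data, reweighted to mimic the clean risk."""
    return float(np.mean(weights * per_example_loss))

# Toy check with K = 3 classes and rho = 0.3 (illustrative values only).
rho = 0.3
clean = np.array([[0.7, 0.2, 0.1],
                  [1.0, 0.0, 0.0]])
noisy = mix_with_uniform(clean, rho)
labels = np.array([1, 0])                      # observed noisy labels
weights = importance_weights(noisy, labels, rho)
bound = rho_upper_bound(noisy)
# Cross-entropy of a classifier that outputs the noisy posterior itself.
losses = -np.log(noisy[np.arange(2), labels])
risk = reweighted_loss(losses, weights)
```

In practice the noisy posterior would itself be estimated from the noisy training data (for example with a probabilistic classifier), and the weights would then be passed to any learner that accepts per-example weights. The bound is tight in this toy case only because one clean posterior entry is exactly zero.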

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/123845