Self-paced multi-view co-training

Ma, F; Meng, D; Dong, X; Yang, Y

Publication Type: Journal Article
Citation: Journal of Machine Learning Research, 2020, 21, pp. 1-38
Issue Date: 2020-04-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Ma, F
dc.contributor.author	Meng, D
dc.contributor.author	Dong, X https://orcid.org/0000-0001-9272-1590
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546
dc.date.accessioned	2021-03-16T01:02:23Z
dc.date.available	2021-03-16T01:02:23Z
dc.date.issued	2020-04-01
dc.identifier.citation	Journal of Machine Learning Research, 2020, 21, pp. 1-38
dc.identifier.issn	1532-4435
dc.identifier.issn	1533-7928
dc.identifier.uri	http://hdl.handle.net/10453/147218
dc.language	en
dc.relation.ispartof	Journal of Machine Learning Research
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	08 Information and Computing Sciences, 17 Psychology and Cognitive Sciences
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Self-paced multi-view co-training
dc.type	Journal Article
utslib.citation.volume	21
utslib.for	08 Information and Computing Sciences
utslib.for	17 Psychology and Cognitive Sciences
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	open_access	*
pubs.consider-herdc	false
dc.date.updated	2021-03-16T01:02:21Z
pubs.publication-status	Published
pubs.volume	21

Abstract:

© 2020 Fan Ma, Deyu Meng, Xuanyi Dong and Yi Yang. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/18-794.html.

Co-training is a well-known semi-supervised learning approach that trains classifiers on two or more different views and iteratively exchanges pseudo labels of unlabeled instances. During co-training, pseudo labels of unlabeled instances are very likely to be false, especially in the initial training rounds, yet the standard co-training algorithm adopts a “draw without replacement” strategy and does not remove these wrongly labeled instances in later training stages. Moreover, most traditional co-training approaches are implemented for the two-view case, and their extensions to multi-view scenarios are not intuitive. These issues not only degrade performance and narrow the range of applications but also hamper the underlying theory. In addition, there is no optimization model that explains the objective a co-training process optimizes. To address these issues, in this study we design a unified self-paced multi-view co-training (SPamCo) framework that draws unlabeled instances with replacement. Two specific co-regularization terms are formulated to develop different strategies for selecting pseudo-labeled instances during training. Both forms share the same optimization strategy, which is consistent with the iteration process in co-training and can be naturally extended to multi-view scenarios. A distributed optimization strategy is also introduced to train the classifier of each view in parallel, further improving the efficiency of the algorithm. Furthermore, the SPamCo algorithm is proved to be PAC learnable, supporting its theoretical soundness. Experiments conducted on synthetic, text categorization, person re-identification, image recognition and object detection data sets substantiate the superiority of the proposed method.
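
The selection idea described in the abstract can be illustrated with a small toy sketch. It is not the SPamCo algorithm from the paper: the co-regularization terms, the optimization model and the distributed training are all omitted. It assumes one-dimensional features per view, a nearest-centroid classifier, and a pseudo-label budget that grows linearly per round. The function names (`centroids`, `predict`, `co_train`) and the parameters `rounds` and `step` are hypothetical. The point it shows is that each view rebuilds its set of confident pseudo labels from scratch every round, so an instance chosen early can be dropped later, which is one simple reading of drawing "with replacement".

```python
def centroids(xs, ys):
    """Mean feature value per class (1-D features)."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


def predict(cents, x):
    """Return (label, confidence) for x.

    Confidence is the gap between the distances to the two nearest
    centroids, so at least two classes must be present.
    """
    ranked = sorted(cents, key=lambda c: abs(x - cents[c]))
    margin = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
    return ranked[0], margin


def co_train(views_l, y_l, views_u, rounds=3, step=2):
    """Toy multi-view co-training with a growing pseudo-label budget.

    views_l / views_u hold one list of 1-D features per view for the
    labeled / unlabeled instances; y_l holds the labels.
    """
    n_views = len(views_l)
    # selected[v]: unlabeled index -> pseudo label currently chosen by view v
    selected = [dict() for _ in range(n_views)]
    models = [None] * n_views
    for r in range(1, rounds + 1):
        budget = step * r  # assumed schedule: admit more instances each round
        for v in range(n_views):
            # Train view v on the labeled data plus the pseudo labels
            # currently proposed by the other views.
            xs, ys = list(views_l[v]), list(y_l)
            for other in range(n_views):
                if other == v:
                    continue
                for i, label in selected[other].items():
                    xs.append(views_u[v][i])
                    ys.append(label)
            models[v] = centroids(xs, ys)
            # Re-score every unlabeled instance and rebuild the selection,
            # so earlier choices are not kept unless they are still confident.
            scored = [(predict(models[v], x), i)
                      for i, x in enumerate(views_u[v])]
            scored.sort(key=lambda item: -item[0][1])
            selected[v] = {i: label for (label, _), i in scored[:budget]}
    return models, selected
```

Calling `co_train` with two lists of labeled features, their labels, and two lists of unlabeled features returns one centroid model per view together with each view's current pseudo-label selection. With more than two views, the same instance may be proposed by several views and is then simply added more than once in this sketch.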

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/147218