Taming Overconfident Prediction on Unlabeled Data From Hindsight.

Li, J; Pan, Y; Tsang, IW

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type: Journal Article
Citation: IEEE Trans Neural Netw Learn Syst, 2023, PP, (99)
Issue Date: 2023-05-23

Closed Access

Filename	Description	Size	Format
Taming_Overconfident_Prediction_on_Unlabeled_Data_From_Hindsight.pdf	Accepted version	974.26 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, J
dc.contributor.author	Pan, Y
dc.contributor.author	Tsang, IW
dc.date.accessioned	2024-03-04T01:52:34Z
dc.date.available	2024-03-04T01:52:34Z
dc.date.issued	2023-05-23
dc.identifier.citation	IEEE Trans Neural Netw Learn Syst, 2023, PP, (99)
dc.identifier.issn	2162-237X
dc.identifier.issn	2162-2388
dc.identifier.uri	http://hdl.handle.net/10453/176046
dc.format	Print-Electronic
dc.language	eng
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.ispartof	IEEE Trans Neural Netw Learn Syst
dc.relation.isbasedon	10.1109/TNNLS.2023.3274845
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Taming Overconfident Prediction on Unlabeled Data From Hindsight.
dc.type	Journal Article
utslib.citation.volume	PP
utslib.location.activity	United States
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
utslib.copyright.status	closed_access	*
dc.date.updated	2024-03-04T01:52:30Z
pubs.issue	99
pubs.publication-status	Published online
pubs.volume	PP
utslib.citation.issue	99

Abstract:

Minimizing prediction uncertainty on unlabeled data is key to achieving good performance in semi-supervised learning (SSL). Prediction uncertainty is typically expressed as the entropy of the transformed probabilities in the output space. Most existing works distill low-entropy predictions either by accepting the determining class (the one with the largest probability) as the true label or by suppressing subtle predictions (those with smaller probabilities). These distillation strategies are usually heuristic and less informative for model training. Motivated by this observation, this article proposes a dual mechanism, named adaptive sharpening (ADS), which first applies a soft threshold to adaptively mask out determinate and negligible predictions and then seamlessly sharpens the informed predictions, distilling certain predictions with the informed ones only. More importantly, we theoretically analyze the traits of ADS by comparing it with various distillation strategies. Numerous experiments verify that ADS significantly improves state-of-the-art SSL methods when used as a plug-in. Our proposed ADS forges a cornerstone for future distillation-based SSL research.
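
The distillation strategies the abstract contrasts can be illustrated with a small NumPy sketch. The abstract does not give the ADS formulation, so nothing below should be read as the authors' method: `masked_sharpen`, its relative threshold `low`, its confidence cutoff `high`, and the temperature `T` are hypothetical stand-ins chosen only to show the general idea of masking determinate and negligible predictions before sharpening the rest. `pseudo_label` and `temperature_sharpen` are the two common heuristics (accept the top class, or suppress the smaller probabilities).

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of each row of class probabilities."""
    p = np.asarray(p, dtype=float)
    return -(p * np.log(p + eps)).sum(axis=-1)

def pseudo_label(p):
    """Heuristic 1: accept the class with the largest probability as the label."""
    p = np.asarray(p, dtype=float)
    out = np.zeros_like(p)
    out[np.arange(p.shape[0]), p.argmax(axis=-1)] = 1.0
    return out

def temperature_sharpen(p, T=0.5):
    """Heuristic 2: suppress the smaller probabilities with a temperature T < 1."""
    q = np.asarray(p, dtype=float) ** (1.0 / T)
    return q / q.sum(axis=-1, keepdims=True)

def masked_sharpen(p, low=0.2, high=0.95, T=0.5):
    """Illustrative mask-then-sharpen step (an assumption, NOT the paper's ADS).

    - Rows whose top probability exceeds `high` count as determinate and are
      returned unchanged.
    - Within the other rows, classes below `low` times the row maximum count
      as negligible and are set to zero.
    - The remaining ("informed") classes are sharpened and renormalized.
    """
    p = np.asarray(p, dtype=float)
    row_max = p.max(axis=-1, keepdims=True)
    determinate = row_max > high
    informed = p >= low * row_max  # the top class always passes this test
    q = np.where(informed, p, 0.0) ** (1.0 / T)
    q = q / q.sum(axis=-1, keepdims=True)
    return np.where(determinate, p, q)
```

Calling `masked_sharpen` on a batch of softmax outputs returns an array of the same shape whose rows still sum to one. Comparing `entropy` before and after shows how much uncertainty each strategy removes from a given prediction.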

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/176046