Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation

Dang, Z; Luo, M; Jia, C; Dai, G; Chang, X; Wang, J

Publisher: ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE
Publication Type: Conference Proceeding
Citation: Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38, (2), pp. 1463-1471
Issue Date: 2024-03-25

Closed Access

	Filename	Description	Size	Format
	2312.16478v1.pdf	Published version	5.96 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Dang, Z
dc.contributor.author	Luo, M
dc.contributor.author	Jia, C
dc.contributor.author	Dai, G
dc.contributor.author	Chang, X https://orcid.org/0000-0002-7778-8807
dc.contributor.author	Wang, J
dc.contributor.editor	Wooldridge, M
dc.contributor.editor	Dy, J
dc.contributor.editor	Natarajan, S
dc.date	2024-02-20
dc.date.accessioned	2025-01-21T01:31:20Z
dc.date.available	2025-01-21T01:31:20Z
dc.date.issued	2024-03-25
dc.identifier.citation	Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38, (2), pp. 1463-1471
dc.identifier.issn	2159-5399
dc.identifier.issn	2374-3468
dc.identifier.uri	http://hdl.handle.net/10453/183905
dc.language	en
dc.publisher	ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE
dc.relation.ispartof	Proceedings of the AAAI Conference on Artificial Intelligence
dc.relation.ispartof	38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence
dc.relation.ispartofseries	AAAI Conference on Artificial Intelligence
dc.relation.isbasedon	10.1609/aaai.v38i2.27911
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation
dc.type	Conference Proceeding
utslib.citation.volume	38
utslib.location.activity	CANADA, Vancouver
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/UTS Groups
pubs.organisational-group	University of Technology Sydney/UTS Groups/Australian Artificial Intelligence Institute (AAII)
utslib.copyright.status	closed_access	*
dc.date.updated	2025-01-21T01:31:18Z
pubs.finish-date	2024-02-27
pubs.issue	2
pubs.publication-status	Published
pubs.start-date	2024-02-20
pubs.volume	38
utslib.citation.issue	2

Abstract:

Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice. Recently, to alleviate expensive data collection, co-occurring pairs have been automatically harvested from the Internet for training. However, such data inevitably include mismatched pairs, i.e., noisy correspondences, which undermine supervision reliability and degrade performance. Current methods leverage deep neural networks’ memorization effect to address noisy correspondences, but they overconfidently focus on similarity-guided training with hard negatives and suffer from self-reinforcing errors. In light of the above, we introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM). Specifically, by viewing sample matching as classification tasks within the batch, we generate classification logits for the given sample. Instead of relying on a single similarity score, we refine sample filtration through energy uncertainty and estimate the model’s sensitivity to selected clean samples using swapped classification entropy, in view of the overall prediction distribution. Additionally, we propose cross-modal biased complementary learning to leverage negative matches overlooked in hard-negative training, further improving model optimization stability and curbing self-reinforcing errors. Extensive experiments on challenging benchmarks affirm the efficacy and efficiency of SREM.
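
The ideas named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's formulas, so everything below is an assumption rather than SREM itself: the in-batch logits are temperature-scaled cosine similarities, the energy is the standard free-energy score (negative log-sum-exp of the logits), the entropy is plain Shannon entropy rather than the paper's swapped variant, and the complementary loss is an unweighted negative-learning loss rather than the paper's biased one. All function names and the temperature value are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def batch_logits(query, candidates, temperature=0.1):
    """Treat matching as classification: one logit per in-batch candidate.
    Assumption: logits are cosine similarities divided by a temperature."""
    q = normalize(query)
    return [dot(q, normalize(c)) / temperature for c in candidates]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def energy(logits):
    """Standard free-energy score: -logsumexp(logits).
    Lower energy means the query matches some candidate strongly."""
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))

def entropy(probs):
    """Shannon entropy of the whole prediction distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def complementary_loss(probs, positive_index, eps=1e-12):
    """Unweighted negative-learning loss: push probability mass away from
    every in-batch negative, not only the hardest one."""
    return -sum(
        math.log(max(1.0 - p, eps))
        for j, p in enumerate(probs)
        if j != positive_index
    )

# Toy batch: three text embeddings and two image queries (hypothetical data).
texts = [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0]]
clean_image = [1.0, 0.0]   # clearly matches texts[0]
noisy_image = [0.0, -1.0]  # matches no text well

clean_logits = batch_logits(clean_image, texts)
noisy_logits = batch_logits(noisy_image, texts)
print("energy  clean/noisy:", energy(clean_logits), energy(noisy_logits))
print("entropy clean/noisy:", entropy(softmax(clean_logits)), entropy(softmax(noisy_logits)))
```

In this toy batch the well-matched query receives a lower energy and a lower entropy than the query that matches nothing, which is the kind of signal a filtration step could threshold instead of a single similarity score.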

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/183905