Bridging the Web Data and Fine-Grained Visual Recognition via Alleviating Label Noise and Domain Mismatch

Yao, Y; Hua, X; Gao, G; Sun, Z; Li, Z; Zhang, J

Publisher: ACM
Publication Type: Conference Proceeding
Citation: Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 1735-1744
Issue Date: 2020-10-12

Closed Access

	Filename	Description	Size	Format
	3394171.3413851.pdf	Published Version	3.42 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Yao, Y
dc.contributor.author	Hua, X
dc.contributor.author	Gao, G
dc.contributor.author	Sun, Z
dc.contributor.author	Li, Z
dc.contributor.author	Zhang, J https://orcid.org/0000-0002-7240-3541
dc.date	2020-10-12
dc.date.accessioned	2021-04-11T07:05:10Z
dc.date.available	2021-04-11T07:05:10Z
dc.date.issued	2020-10-12
dc.identifier.citation	Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 1735-1744
dc.identifier.isbn	9781450379885
dc.identifier.uri	http://hdl.handle.net/10453/147991
dc.description.abstract	To distinguish the subtle differences among fine-grained categories, a large number of well-labeled images is typically required. However, manually annotating fine-grained categories is extremely difficult because it usually demands professional knowledge. To this end, we propose to directly leverage web images for fine-grained visual recognition. Our work focuses on two critical issues in web images: "label noise" and "domain mismatch". Specifically, we propose an end-to-end deep denoising network (DDN) model to jointly solve these problems during web image selection. To verify the effectiveness of our proposed approach, we first collect web images using the labels in fine-grained datasets. We then apply the proposed DDN model for noise removal and domain mismatch alleviation, and use the selected web images as the training set for learning fine-grained categorization models. Extensive experiments and ablation studies demonstrate the state-of-the-art performance of our proposed approach, which also delivers a new pipeline for fine-grained visual categorization that is expected to be highly effective in real-world applications.
dc.language	en
dc.publisher	ACM
dc.relation.ispartof	Proceedings of the 28th ACM International Conference on Multimedia
dc.relation.ispartof	ACM International Conference on Multimedia
dc.relation.isbasedon	10.1145/3394171.3413851
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Bridging the Web Data and Fine-Grained Visual Recognition via Alleviating Label Noise and Domain Mismatch
dc.type	Conference Proceeding
utslib.location.activity	Virtual
utslib.for	0806 Information Systems
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2021-04-11T07:04:40Z
pubs.finish-date	2020-10-16
pubs.place-of-publication	USA
pubs.publication-status	Published
pubs.start-date	2020-10-12
dc.location	USA

Abstract:

To distinguish the subtle differences among fine-grained categories, a large number of well-labeled images is typically required. However, manually annotating fine-grained categories is extremely difficult because it usually demands professional knowledge. To this end, we propose to directly leverage web images for fine-grained visual recognition. Our work focuses on two critical issues in web images: "label noise" and "domain mismatch". Specifically, we propose an end-to-end deep denoising network (DDN) model to jointly solve these problems during web image selection. To verify the effectiveness of our proposed approach, we first collect web images using the labels in fine-grained datasets. We then apply the proposed DDN model for noise removal and domain mismatch alleviation, and use the selected web images as the training set for learning fine-grained categorization models. Extensive experiments and ablation studies demonstrate the state-of-the-art performance of our proposed approach, which also delivers a new pipeline for fine-grained visual categorization that is expected to be highly effective in real-world applications.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/147991