Enhancing Mixture-of-Experts by Leveraging Attention for Fine-Grained Recognition

Publisher:
Institute of Electrical and Electronics Engineers
Publication Type:
Journal Article
Citation:
IEEE Transactions on Multimedia, 2022, 24, pp. 4409-4421
Issue Date:
2022-01-01
Abstract:
Differentiating subcategories of a common visual category is challenging in fine-grained recognition because different classes share a similar appearance. Existing mixture-of-experts methods divide the fine-grained feature space into specific regions and solve the overall problem by conquering each subspace. However, learning diverse experts directly through a data-partition strategy is not feasible because of the limited data available for fine-grained recognition. To address this issue, we leverage visual attention to learn an enhanced mixture of experts. Specifically, we introduce a gradually-enhanced learning strategy based on model attention. The strategy promotes diversity among experts by feeding each expert full-size data of a distinct granularity. We further promote each expert's learning by providing it with a larger data space, achieved by swapping attentive regions within positive pairs. Our method learns new experts on the dataset sequentially, using the prior knowledge from former experts, and enforces the experts to learn more diverse yet discriminative representations. These enhanced experts are finally combined to make stronger predictions. We conduct extensive experiments on fine-grained benchmarks. The results show that our method consistently outperforms state-of-the-art methods in both weakly supervised localization and fine-grained image classification. Our code is publicly available at https://github.com/lbzhang/Enhanced-Expert-FGVC-Pytorch.git.
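The attentive-region swap described above can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); the function name, patch-based search, and array shapes are assumptions chosen for clarity. Given two same-class images and their attention maps, it locates the most attentive patch in each image and exchanges those patches, producing two new training samples:

```python
import numpy as np

def swap_attentive_regions(img_a, img_b, attn_a, attn_b, size=2):
    """Illustrative sketch: exchange the most attentive size x size patch
    between a positive pair (two images of the same class).

    img_a, img_b : 2-D arrays (single-channel images, same shape)
    attn_a, attn_b : 2-D attention maps aligned with the images
    """
    def top_patch(attn):
        # brute-force search over patch top-left corners for the
        # window with the largest summed attention
        h, w = attn.shape
        best, best_pos = -np.inf, (0, 0)
        for i in range(h - size + 1):
            for j in range(w - size + 1):
                s = attn[i:i + size, j:j + size].sum()
                if s > best:
                    best, best_pos = s, (i, j)
        return best_pos

    ia, ja = top_patch(attn_a)
    ib, jb = top_patch(attn_b)
    out_a, out_b = img_a.copy(), img_b.copy()
    # swap the attentive regions to enlarge the effective data space
    out_a[ia:ia + size, ja:ja + size] = img_b[ib:ib + size, jb:jb + size]
    out_b[ib:ib + size, jb:jb + size] = img_a[ia:ia + size, ja:ja + size]
    return out_a, out_b
```

Because the swapped regions come from a positive pair, the class label of each augmented image is preserved while its appearance changes, which is what lets each expert see a larger data space.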