Multi-Order Adversarial Representation Learning for Composed Query Image Retrieval

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, 00, pp. 1685-1689
Issue Date:
2021-05-13
Full metadata record
This paper targets at a task of composed query image retrieval. Given a composed query consists of a reference image and modification text, the task aims to retrieve images which are generally similar to the reference image but differ according to the given modification text. The task is challenging, due to the complexity of the composed query and cross-modality characteristics between the query and candidate images. The common paradigm for the task is to first obtain fused feature of the reference image and the text, and further project them into a common embedding space with candidate images. However, the majority of works usually only aim for the representation of high level, ignoring the low-level representation which may be complementary to the high-level representation. So this paper proposes a new Multi-order Adversarial Network (MAN) which uses multilevel representations and simultaneously explores their low-order and high-order interactions, obtaining low-order and high-order features. The low-order features reflect the pattern of itself and high-order features contains the interaction between features. Moreover, we further introduce an adversarial module to constrain the fusion of the reference image and the text. Extensive experiments on three datasets verify the effectiveness of our MAN and also demonstrate its state-of-the-art performance.
Please use this identifier to cite or link to this item: