Discovering and Distinguishing Multiple Visual Senses for Web Learning
- Publication Type:
- Journal Article
- IEEE Transactions on Multimedia, 2018
- Issue Date:
IEEE Labeled image datasets have played a critical role in high-level image understanding. However, the process of manual labeling is both time-consuming and labor-intensive. To reduce the dependence on manually labeled data, there have been increasing research efforts on learning visual classifiers by directly exploiting web images. One issue that limits their performance is the problem of polysemy. Existing unsupervised approaches attempt to reduce the influence of visual polysemy by filtering out irrelevant images, but do not directly address polysemy. To this end, in this work, we present a multimodal framework that solves the problem of polysemy by allowing sense-specific diversity in search results. Specifically, we first discover a list of possible semantic senses from untagged corpora to retrieve sense-specific images. Then we merge visual similar semantic senses and prune noise by using the retrieved images. Finally, we train one visual classifier for each selected semantic sense and use the learned sense-specific classifiers to distinguish multiple visual senses. Extensive experiments on classifying images into sense-specific categories and re-ranking search results demonstrate the superiority of our proposed approach.
Please use this identifier to cite or link to this item: