A domain robust approach for image dataset construction
- Publication Type: Conference Proceeding
- MM 2016 - Proceedings of the 2016 ACM Multimedia Conference, 2016, pp. 212 - 216
- Issue Date:
© 2016 ACM. There has been increasing research interest in automatically constructing image datasets by collecting images from the Internet. However, existing methods tend to have weak domain adaptation ability, known as the "dataset bias problem". To address this issue, we propose a novel image dataset construction framework that generalizes well to unseen target domains. Specifically, the given queries are first expanded by searching the Google Books Ngrams Corpora (GBNC) to obtain a richer semantic description, from which noisy query expansions are then filtered out. By treating each expansion as a "bag" and the images retrieved for it as "instances", we formulate image filtering as a multi-instance learning (MIL) problem with constrained positive bags. With this approach, images from different data distributions are kept while noisy images are filtered out. Comprehensive experiments on two challenging tasks demonstrate the effectiveness of the proposed approach.
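The bag-and-instance framing in the abstract can be illustrated with a small sketch. It is not the paper's MIL solver: it assumes each image already carries a relevance score from some classifier (hypothetical here), and it replaces the constrained-positive-bag optimisation with a simple heuristic that keeps every image scoring above a threshold while forcing each bag to retain at least a minimum fraction of its images. The function name `filter_bags`, the parameters `threshold` and `min_positive_fraction`, and the example expansions and file names are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

# (image id, relevance score); the scores are hypothetical inputs.
Instance = Tuple[str, float]


def filter_bags(
    bags: Dict[str, List[Instance]],
    threshold: float = 0.5,
    min_positive_fraction: float = 0.3,
) -> Dict[str, List[str]]:
    """Keep the likely-relevant images in each bag.

    Each query expansion is a bag and its retrieved images are instances.
    A bag keeps every instance scoring at least `threshold`, and never
    fewer than ceil(min_positive_fraction * bag size) of its top-ranked
    instances. This lower bound is a simplified stand-in for a
    "constrained positive bag", not the paper's formulation.
    """
    kept: Dict[str, List[str]] = {}
    for expansion, instances in bags.items():
        # Rank instances from most to least relevant.
        ranked = sorted(instances, key=lambda inst: inst[1], reverse=True)
        min_keep = math.ceil(min_positive_fraction * len(ranked))
        above = sum(1 for _, score in ranked if score >= threshold)
        n_keep = max(min_keep, above)
        kept[expansion] = [image_id for image_id, _ in ranked[:n_keep]]
    return kept


# Hypothetical expansions of the query "dog" with made-up scores.
bags = {
    "sheep dog": [("a.jpg", 0.9), ("b.jpg", 0.2), ("c.jpg", 0.7)],
    "puppy dog": [("d.jpg", 0.1), ("e.jpg", 0.3)],
}
result = filter_bags(bags)
```

Because images are selected per bag rather than from one pooled ranking, every expansion contributes at least some images, which loosely mirrors the stated goal of keeping images from different data distributions.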