Image Co-saliency Detection and Co-segmentation from The Perspective of Commonalities

Publication Type:
Issue Date:
Full metadata record
Image co-saliency detection and image co-segmentation aim to identify the common salient objects and extract them in a group of images. Image co-saliency detection and image co-segmentation are important for many content-based applications such as image retrieval, image editing, and content aware image/video compression. The image co-saliency detection and image co-segmentation are very close works. The most important part in these two works is the definition of the commonality of the common objects. Usually, common objects share similar low-level features, such as appearances, including colours, textures shapes, etc. as well as the high-level semantic features. In this thesis, we explore the commonalities of the common objects in a group of images from low-level features and high-level features, and the way to achieve the commonalities and finally segment the common objects. Three main works are introduced, including an image co-saliency detection model and two image co-segmentation methods. π—™π—Άπ—Ώπ˜€π˜π—Ήπ˜†, an image co-saliency detection model based on region-level fusion and pixel-level refinement is proposed. The commonalities between the common objects are defined by the appearance similarities on the regions from all the images. It discovers the regions that are salient in each individual image as well as salient in the whole image group. Extensive experiments on two benchmark datasets demonstrate that the proposed co-saliency model consistently outperforms the state-of-the-art co-saliency models in both subjective and objective evaluation. π—¦π—²π—°π—Όπ—»π—±π—Ήπ˜†, an unsupervised images co-segmentation method via guidance of simple images is proposed. The commonalities are still defined by hand-crafted features on regions, colours and textures, but not calculated among regions from all the images. It takes advantages of the reliability of simple images, and successfully improves the performance. The experiments on the dataset demonstrate the outperformance and robustness of the proposed method. π—§π—΅π—Άπ—Ώπ—±π—Ήπ˜†, a learned image co-segmentation model based on convolutional neural network with multi-scale feature fusion is proposed. The commonalities between objects are not defined by handcraft features but learned from the training data. When training a neural network with multiple input images simultaneously, the resource cost will increase rapidly with the inputs. To reduce the resource cost, reduced input size, less downsampling and dilation convolution are adopted in the proposed model. Experimental results on the public dataset demonstrate that the proposed model achieves a comparable performance to the state-of-the-art methods while the network has successfully gotten simplified and the resources cost is reduced.
Please use this identifier to cite or link to this item: