One-Shot Learning-Based Animal Video Segmentation

Publisher:
IEEE (Institute of Electrical and Electronics Engineers)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Industrial Informatics, 2022, 18, (6), pp. 3799-3807
Issue Date:
2022-06-01
Deep learning-based video segmentation methods can offer good performance after being trained on large-scale pixel-labeled datasets. However, pixel-wise manual labeling of animal images is challenging and time-consuming due to irregular contours and motion blur. To achieve a desirable trade-off between accuracy and speed, a novel one-shot learning-based approach is proposed in this article to segment animal videos with only one labeled frame. The proposed approach consists of three main modules: guidance frame selection uses BubbleNet to choose one frame for manual labeling, which maximizes the fine-tuning benefit of the single labeled frame; an Xception-based fully convolutional network performs dense prediction using depthwise separable convolutions based on that single labeled frame; and postprocessing removes outliers and sharpens object contours through two submodules, test-time augmentation and a conditional random field. Extensive experiments were conducted on the DAVIS 2016 animal dataset. The proposed video segmentation approach achieved a mean intersection-over-union score of 89.5% on the DAVIS 2016 animal dataset with a shorter run time, outperforming the state-of-the-art methods OSVOS and OSMN. The proposed one-shot learning-based approach achieves real-time, automatic segmentation of animals with only one labeled video frame, and can potentially serve as a baseline for intelligent perception-based monitoring of animals and other domain-specific applications.
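The test-time augmentation submodule and the mean intersection-over-union metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `predict_mask` is a hypothetical stand-in for the Xception-based network, and the augmentation shown (averaging predictions over a horizontal flip) is one common form of test-time augmentation, assumed here for illustration.

```python
import numpy as np

def predict_mask(frame):
    # Hypothetical stand-in for the Xception-based fully convolutional
    # network: returns per-pixel foreground probabilities.
    return (frame > frame.mean()).astype(float)

def tta_segment(frame):
    # Test-time augmentation (assumed variant): average the prediction
    # on the original frame with the un-flipped prediction on its
    # horizontal mirror, then threshold to a binary mask.
    p = predict_mask(frame)
    p_flip = np.fliplr(predict_mask(np.fliplr(frame)))
    return (p + p_flip) / 2.0 > 0.5

def mean_iou(pred, gt):
    # Intersection-over-union, the evaluation metric reported
    # in the abstract (89.5% on the DAVIS 2016 animal subset).
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0
```

A conditional random field refinement step would follow the thresholding in the full pipeline; it is omitted here for brevity.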