Learn to focus on objects for visual detection

Publication Type:
Journal Article
Neurocomputing, 2019, 348 pp. 27 - 39
Issue Date:
Filename:
1-s2.0-S0925231218312785-main.pdf
Description:
Published Version
Size:
4.19 MB
Format:
Adobe PDF
© 2018. State-of-the-art visual detectors use object proposals as references to objects in order to achieve higher efficiency. However, the number of proposals needed to ensure full coverage of potential objects is still large, because proposals are generated indiscriminately, good and bad alike, which makes proposal computation a bottleneck. This paper presents a complementary technique that aims to work with any existing proposal-generating system, amending the workflow from “propose-assess” to “propose-adjust-assess”. Inspired by biological processing, we propose to improve the quality of object proposals by analyzing visual context and gradually focusing proposals on targets. In particular, the proposed method can be employed with existing proposal generation algorithms based on either hand-crafted features or Convolutional Neural Network (CNN) features. For the former, we realize the focusing function with two learning-based transformation models, which are trained to identify generic objects using image cues. For the latter, we develop a Focus Proposal Net (FoPN) with cascaded layers, which can be directly injected into CNN models in an end-to-end manner, to implement the focusing operation. Experiments on real-life image data sets demonstrate that the proposed technique improves proposal quality. In addition, it reduces the number of proposals needed to achieve a high recall rate of objects with both hand-crafted and CNN features, and it can boost the performance of state-of-the-art detectors.
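The “propose-adjust-assess” workflow described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the box-delta parameterization below is the common bounding-box regression form, used here as an assumption, and `predict_delta` is a hypothetical stand-in for whatever learned focusing model (the transformation models or FoPN) supplies the adjustment. All function names are illustrative.

```python
import math
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]    # (x1, y1, x2, y2)
Delta = Tuple[float, float, float, float]  # (dx, dy, dw, dh), assumed parameterization


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def adjust(box: Box, delta: Delta) -> Box:
    """Shift the box centre by (dx*w, dy*h) and scale its size by exp(dw), exp(dh)."""
    dx, dy, dw, dh = delta
    w, h = box[2] - box[0], box[3] - box[1]
    cx, cy = box[0] + 0.5 * w + dx * w, box[1] + 0.5 * h + dy * h
    w, h = w * math.exp(dw), h * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def recall(proposals: List[Box], targets: List[Box], thresh: float = 0.5) -> float:
    """Fraction of target boxes covered by at least one proposal at the IoU threshold."""
    if not targets:
        return 0.0
    hits = sum(1 for t in targets if any(iou(p, t) >= thresh for p in proposals))
    return hits / len(targets)


def propose_adjust_assess(
    proposals: List[Box],
    predict_delta: Callable[[Box], Delta],  # hypothetical learned focusing model
    targets: List[Box],
    steps: int = 2,
) -> float:
    """Refine every proposal for a few steps, then assess recall on the targets."""
    for _ in range(steps):
        proposals = [adjust(p, predict_delta(p)) for p in proposals]
    return recall(proposals, targets)
```

In this sketch, a single proposal that misses its target at the 0.5 IoU threshold can still be counted as a hit after adjustment, which is the sense in which focusing could lower the number of proposals needed for a given recall. The number of refinement steps and the threshold are arbitrary choices for illustration.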
Please use this identifier to cite or link to this item: