Learning for Object Localization with Imperfect Data

Publication Type: Thesis
Issue Date: 2021
Deep learning has achieved remarkable successes in recent years. Training deep neural networks usually requires a very large number of well-labeled examples, which are expensive to collect. A feasible way to reduce this cost is to learn from imperfect data, e.g., noisy data, synthetic data, weak labels, and datasets with few annotated examples. This thesis is dedicated to weakly supervised learning and few-shot learning. The first task addresses the challenging problem of object localization using weak annotations as supervision. Objects in images are expected to be precisely located given only image-level labels, i.e., category information. However, convolutional networks typically find only the most discriminative object regions, which leads to unsatisfactory bounding-box predictions. This thesis approaches the problem from three perspectives: 1) forcing the networks to mine more object areas by erasing the object pixels already discovered; 2) learning pixel correlations within images under the supervision of self-produced object masks; 3) communicating across different images to obtain more consistent features and thereby activate target objects more accurately. The second task is to predict the semantic masks of objects in a few-shot setting. Finding every pixel of a target object can be regarded as the finest-grained form of localization. In the few-shot regime, only a few annotated examples are available for an unseen class, and networks must predict the semantic category of each pixel from minimal information. This thesis presents two approaches to improve the quality of predicted object masks. First, a similarity-guided network is proposed that provides the segmentation process with rough position cues for locating object pixels. Second, to strengthen the guidance and improve robustness, the guidance embeddings are enriched and multiple diverse support vectors are used to generate the similarity maps.
In addition, each of the proposed methods is comprehensively verified and analyzed by conducting various experiments.
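Two of the ideas named in the abstract, erasing discovered object pixels and building a similarity map from a support vector, can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. The function names, the 0.6 threshold, the min-max normalization, the masked average pooling used to obtain the support vector, and the choice of cosine similarity are all assumptions made for illustration.

```python
import numpy as np


def erase_discovered(image, activation, threshold=0.6):
    """Zero out pixels whose normalized activation reaches `threshold`.

    image: (H, W, C) array; activation: (H, W) array.
    The threshold value and the min-max normalization are assumptions.
    """
    a = activation - activation.min()
    span = a.max()
    if span > 0:
        a = a / span
    keep = a < threshold  # True where the pixel has NOT been discovered yet
    return image * keep[..., None]


def similarity_map(query_feat, support_feat, support_mask, eps=1e-8):
    """Cosine similarity between one support vector and every query position.

    query_feat: (C, Hq, Wq); support_feat: (C, Hs, Ws); support_mask: (Hs, Ws).
    The support vector comes from masked average pooling (an assumption).
    """
    m = support_mask.astype(support_feat.dtype)
    # Average the support features over the annotated object pixels only.
    vec = (support_feat * m).sum(axis=(1, 2)) / (m.sum() + eps)  # (C,)
    dot = np.einsum("c,chw->hw", vec, query_feat)                # (Hq, Wq)
    q_norm = np.linalg.norm(query_feat, axis=0) + eps
    v_norm = np.linalg.norm(vec) + eps
    return dot / (q_norm * v_norm)
```

With several support vectors, one such map could be computed per vector and the maps stacked as guidance. How the thesis actually derives and combines the vectors is not specified in this abstract.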