Small Object Detection and Recognition Using Context and Representation Learning

Publication Type: Thesis
Issue Date: 2021
Small object detection and recognition is common in real-world applications such as remote sensing image analysis for Earth Vision, Unmanned Aerial Vehicle vision, and video surveillance for identity recognition. Existing methods have achieved impressive results on large and medium-sized objects, but their detection and recognition performance on small or even tiny objects remains far from satisfactory. The problem is highly challenging because small objects in low-resolution images may contain fewer than a hundred pixels and lack sufficient detail.

Context plays an important role in small object detection and recognition. To boost detection performance, we propose a novel discriminative learning and graph-cut framework that exploits the semantic information among a target object's neighbours. Moreover, to depict the local neighbourhood relationship, we introduce a pairwise constraint into a tiny face detector to improve detection accuracy. Finally, to describe such a constraint, we convert the regression problem of estimating the similarity between different candidates into a classification problem that produces a classification score for each pair of candidates.

In representation learning, we propose an RL-GAN architecture that enhances the discriminability of low-resolution (LR) image representations, achieving classification performance comparable to that obtained on high-resolution (HR) images. In addition, we propose a method based on a Residual Representation to generate a more effective representation of LR images; the Residual Representation is adapted to feed the lost details back into the representation space of LR images. Finally, we produce a new dataset, WIDER-SHIP, which provides paired multi-resolution images of ships in satellite imagery and can be used to evaluate not only LR image classification but also LR object recognition.

In the domain of small-sample training, we explore a novel data augmentation framework that extends a training set to achieve better coverage of the varying orientations of objects in the testing data, thereby improving the performance of CNNs for object detection. We then design a principal-axis orientation descriptor based on superpixel segmentation to represent the orientation of an object in an image, and propose a similarity measure between two datasets based on their principal-axis orientation distributions. We evaluate the performance and show the effectiveness of CNNs for object detection with and without rotating the images in the testing set.

This dissertation is directed by Professor Xiangjian He and Doctor Wenjing Jia of the University of Technology Sydney, Australia, and Professor Jiangbin Zheng of Northwestern Polytechnical University, China.
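To make the pairwise-constraint idea concrete, the following minimal PyTorch sketch shows how estimating the similarity between two detection candidates can be recast from a regression problem into a binary classification over each candidate pair. All module names, feature dimensions, and the training step are illustrative assumptions, not the thesis implementation.

```python
import torch
import torch.nn as nn

class PairwiseScorer(nn.Module):
    """Classifies a pair of candidate features as consistent neighbours or not.

    Illustrative sketch only: the feature dimension and network depth are
    assumptions, as is how candidate features are extracted upstream.
    """
    def __init__(self, feat_dim: int = 256, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden),  # pair features are concatenated
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),             # one logit per pair -> classification score
        )

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        pair = torch.cat([feat_a, feat_b], dim=-1)
        return self.net(pair).squeeze(-1)     # raw logit; sigmoid gives the pair score


# Instead of regressing a continuous similarity value, each candidate pair
# receives a 0/1 label and a standard classification loss.
scorer = PairwiseScorer()
criterion = nn.BCEWithLogitsLoss()
feat_a, feat_b = torch.randn(32, 256), torch.randn(32, 256)  # dummy candidate features
labels = torch.randint(0, 2, (32,)).float()                  # 1 = neighbouring/consistent pair
loss = criterion(scorer(feat_a, feat_b), labels)
loss.backward()
```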
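The Residual Representation idea of feeding lost details back into the LR representation space can be sketched as learning the residual between paired HR and LR embeddings and adding it back to the LR embedding. The generator architecture, embedding dimension, and loss below are assumptions made for illustration, not the architecture reported in the thesis.

```python
import torch
import torch.nn as nn

class ResidualRepresentation(nn.Module):
    """Sketch of a residual-representation module.

    An LR embedding is enhanced by a predicted residual that approximates the
    detail missing relative to the paired HR embedding. Dimensions and layer
    choices here are illustrative guesses.
    """
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        self.residual_gen = nn.Sequential(
            nn.Linear(embed_dim, embed_dim),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, lr_embedding: torch.Tensor) -> torch.Tensor:
        # Enhanced representation = LR embedding + predicted residual detail.
        return lr_embedding + self.residual_gen(lr_embedding)


# Supervision: the enhanced LR embedding is pushed toward the embedding of the
# paired HR image (e.g. a WIDER-SHIP HR/LR pair), here with a simple MSE loss.
model = ResidualRepresentation()
lr_emb = torch.randn(16, 512)   # from an LR-image encoder (assumed)
hr_emb = torch.randn(16, 512)   # from an HR-image encoder on the paired HR image
loss = nn.functional.mse_loss(model(lr_emb), hr_emb)
loss.backward()
```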
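Finally, a rough sketch of the orientation-aware augmentation: since the thesis does not spell out here how the principal axis is derived from superpixels, the function below simply runs SLIC segmentation and applies PCA to the superpixel centroids to obtain an orientation angle, then rotates training images toward orientations observed in the test data. Segment counts, angles, and helper names are hypothetical.

```python
import numpy as np
from skimage.segmentation import slic
from scipy.ndimage import rotate

def principal_axis_orientation(image: np.ndarray, n_segments: int = 50) -> float:
    """Estimate an object's principal-axis orientation (degrees) from superpixels.

    Sketch under assumptions: PCA over superpixel centroid coordinates stands
    in for the descriptor defined in the thesis.
    """
    segments = slic(image, n_segments=n_segments, compactness=10, start_label=1)
    centroids = np.array([
        np.argwhere(segments == label).mean(axis=0)   # (row, col) centroid per superpixel
        for label in np.unique(segments)
    ])
    centered = centroids - centroids.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    major = eigvecs[:, np.argmax(eigvals)]            # direction of largest variance
    return float(np.degrees(np.arctan2(major[0], major[1])))

def augment_with_rotations(image: np.ndarray, target_angles) -> list:
    """Rotate a training image to cover orientations seen in the testing data."""
    return [rotate(image, angle, reshape=False, mode="nearest") for angle in target_angles]

# Example: measure an image's orientation, then add rotated copies toward
# under-represented test-set orientations (placeholder angles).
img = np.random.rand(128, 128, 3)
theta = principal_axis_orientation(img)
augmented = augment_with_rotations(img, target_angles=[15, 45, 90])
```

The per-image angles can be pooled into an orientation histogram per dataset; comparing the training and testing histograms gives the dataset-level similarity measure described in the abstract.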