Effective feature sets and dimensionality reduction for object classification

Otoom, AF

Publication Type: Thesis
Issue Date: 2010

Closed Access

	Filename	Description	Size	Format
	01front.pdf	contents and abstract	6.77 MB	Adobe PDF
	02whole.pdf	thesis	83.56 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Otoom, AF
dc.date.accessioned	2015-07-14T03:00:10Z
dc.date.available	2015-07-14T03:00:10Z
dc.date.issued	2010
dc.identifier.uri	http://hdl.handle.net/10453/36453
dc.description	University of Technology, Sydney. Faculty of Engineering and Information Technology.	en_US
dc.description	NO FULL TEXT AVAILABLE. Access is restricted indefinitely. The hardcopy may be available for consultation at the UTS Library.
dc.description.abstract	Classifying visual objects into classes of interest is a long-explored area of computer vision and pattern recognition. The development of robust object recognition systems is important for a wide range of applications such as video surveillance, medical image analysis, face recognition, and many others. However, it is an extremely challenging task for computers to recognize objects, as they can occur under different viewpoints, scales, illumination conditions, occlusions, and backgrounds. The high variation among objects within the same category adds to these challenges. For effective classification, it is necessary to have an effective feature set for representing the objects and a strong learning method for classification. In this work, we first target the problem of recognizing abandoned objects in video surveillance systems. It is a difficult task where the challenges above become more evident, and thus further research is needed to solve it. To this end, we propose a novel feature set based on statistics of various features such as line segments, circles, corners, and global shape descriptors such as fitted ellipses and bounding boxes. We show the invariance of the proposed feature set to different data set types and learning algorithms. Moreover, to further demonstrate the robustness of the proposed feature set, we compare it with other feature sets that are based on local regions (Scale Invariant Feature Transform (SIFT) keypoints). Classification based on the proposed feature set achieved an 82.8% detection rate and a 5.7% false positive rate in classifying images of the four objects of interest: trolley, bag, person, and group of people. Moreover, classification based on the proposed feature set outperformed that based on SIFT keypoints, providing on average a 23.8% higher detection rate and a 7.9% lower false alarm rate.
These results are promising considering the various challenges in a surveillance environment. Given the high dimensionality of the feature set used to represent the objects (44 dimensions), and because of the different complexity aspects associated with such a high dimensional space, in the second part of this thesis we propose a novel mixture model for reducing dimensionality, MLiT (Mixture of Gaussians under Linear Transformations). Each component in the mixture consists of a linear transformation (which is not restricted to be orthogonal) projecting the original data onto a subspace and a Gaussian distribution fitted on the projected data. Two methods are proposed for optimizing the model: the first is based on a maximum-likelihood approach and the second is based on random projections. To validate the proposed model, we used it for maximum-likelihood classification of five “hard” data sets, including our video surveillance data set and four data sets from the UCI repository. We also compared the accuracy of the proposed model with that of other popular classifiers. The proposed method outperformed other classifiers based on a similar classification approach (generative classifiers based on mixture models) in all cases but one, with improvements ranging from 0.2% to 5.2%. To further improve the classification performance of MLiT, we also propose BoostMLiT (Boosting Mixture of Gaussians under Linear Transformations). It integrates MLiT within the framework of AdaBoost, a widely applied boosting method. In the cases where boosting was feasible (i.e., the cases with low training error), the proposed method proved effective in enhancing the performance of MLiT, with improvements of up to 12.8%. In this work, we have contributed to resolving outstanding issues towards more accurate classification of objects in a video surveillance environment.
Moreover, the developed mixture model for dimensionality reduction has proved successful in solving a number of challenging classification tasks. Beyond the applications considered in this thesis, the proposed mixture model can be used for modelling densities in high dimensional spaces in a variety of other applications, including weighted maximum likelihood, hidden Markov models, and discrete state-space models in general.	en_US
dc.format	Thesis (PhD)	en_US
dc.language.iso	en	en_US
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Effective feature sets and dimensionality reduction for object classification	en_US
dc.type	Thesis
utslib.copyright.status	closed_access
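
Illustrative sketch (not taken from the thesis, whose full text is unavailable): the abstract describes each MLiT component as a linear transformation, not restricted to be orthogonal, followed by a Gaussian fitted on the projected data, with classification by maximum likelihood. The Python code below shows that idea in its simplest possible form and is only an assumption about how such a component could look. It uses a single component per class, one random projection shared by both classes, and synthetic data. The function names, the projected dimension of 10, and the ridge term are all hypothetical choices. It does not implement the optimization methods or the mixture weights of the actual model.

```python
import numpy as np

def fit_component(X, W):
    """Fit a Gaussian to the rows of X after projecting them with W (d x k)."""
    Z = X @ W
    mu = Z.mean(axis=0)
    # A small ridge keeps the covariance invertible on small samples.
    cov = np.cov(Z, rowvar=False) + 1e-6 * np.eye(W.shape[1])
    return mu, cov

def log_density(X, W, mu, cov):
    """Gaussian log-density of the projected rows of X."""
    Z = X @ W - mu
    k = W.shape[1]
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of each projected row.
    maha = np.einsum("ij,ij->i", Z @ np.linalg.inv(cov), Z)
    return -0.5 * (k * np.log(2 * np.pi) + logdet + maha)

def classify(X, W, params):
    """Assign each row to the class whose projected Gaussian scores highest."""
    scores = np.column_stack([log_density(X, W, mu, cov) for mu, cov in params])
    return scores.argmax(axis=1)

rng = np.random.default_rng(0)
d, k = 44, 10                  # 44 matches the abstract; k = 10 is an arbitrary choice
W = rng.normal(size=(d, k))    # random linear transformation, not orthogonal

# Synthetic two-class data (not the thesis data): the class means differ by 2
# in every dimension.
X0 = rng.normal(loc=0.0, size=(200, d))
X1 = rng.normal(loc=2.0, size=(200, d))
params = [fit_component(X0, W), fit_component(X1, W)]

X = np.vstack([X0, X1])
y = np.repeat([0, 1], 200)
pred = classify(X, W, params)
accuracy = (pred == y).mean()  # training accuracy on the synthetic data
```

Sharing one projection across classes keeps the class log-densities directly comparable, since all of them are evaluated in the same 10-dimensional space. A full mixture would add several components per class, each with its own transformation and mixing weight.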

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/36453