Effective feature sets and dimensionality reduction for object classification

Publication Type:
Thesis
Issue Date:
2010
Filename | Description | Size | Format
01front.pdf | contents and abstract | 6.77 MB | Adobe PDF
02whole.pdf | thesis | 83.56 MB | Adobe PDF
NO FULL TEXT AVAILABLE. Access is restricted indefinitely.

Classifying visual objects into classes of interest is a long-explored area of computer vision and pattern recognition. The development of robust object recognition systems is important for a wide range of applications such as video surveillance, medical image analysis, face recognition, and many others. However, recognizing objects is an extremely challenging task for computers, as objects can appear under varying viewpoint, scale, illumination, occlusion, and background. The high variation among objects within the same category adds to these challenges. Effective classification requires both an effective feature set for representing the objects and a strong learning method for classification.

In this work, we first target the problem of recognizing abandoned objects in video surveillance systems. It is a difficult task in which the challenges above become more evident, and further research is needed to solve it. To this end, we propose a novel feature set based on statistics of various features such as line segments, circles, and corners, and on global shape descriptors such as fitted ellipses and bounding boxes. We show that the proposed feature set is robust to the choice of data set type and learning algorithm. Moreover, to further demonstrate the robustness of the proposed feature set, we compare it with other feature sets that are based on local regions (Scale Invariant Feature Transform (SIFT) keypoints). Classification based on the proposed feature set achieved an 82.8% detection rate and a 5.7% false positive rate in classifying images of the four objects of interest: trolley, bag, person, and group of people. It also outperformed classification based on SIFT keypoints, providing on average a 23.8% higher detection rate and a 7.9% lower false alarm rate. These results are promising considering the various challenges in a surveillance environment.

Given the high dimensionality of the feature set used to represent the objects (44 dimensions), and because of the different complexity aspects associated with such a high dimensional space, in the second part of this thesis we propose a novel mixture model for dimensionality reduction, MLiT (Mixture of Gaussians under Linear Transformations). Each component in the mixture consists of a linear transformation (which is not restricted to be orthogonal) projecting the original data onto a subspace, and a Gaussian distribution fitted on the projected data. Two methods are proposed for optimizing the model: the first is based on a maximum-likelihood approach and the second is based on random projections. To validate the proposed model, we used it for maximum-likelihood classification of five “hard” data sets, including our video surveillance data set and four data sets from the UCI repository. We also compared the accuracy of the proposed model with that of other popular classifiers. The proposed method was more accurate than other classifiers based on a similar classification approach (generative classifiers based on mixture models) in all cases but one, with improvements ranging from 0.2% to 5.2%.

To further improve the classification performance of MLiT, we also propose BoostMLiT (Boosting Mixture of Gaussians under Linear Transformations). It integrates MLiT within the framework of AdaBoost, a widely applied boosting method. In the cases where boosting was feasible (i.e., the cases with low training error), the proposed method proved effective in enhancing the performance of MLiT, with improvements of up to 12.8%.

In this work, we have contributed to resolving outstanding issues towards more accurate classification of objects in a video surveillance environment. Moreover, the developed mixture model for dimensionality reduction has proved successful in solving a number of challenging classification tasks. Beyond the applications considered in this thesis, the proposed mixture model can be used for modelling densities in high dimensional spaces in a variety of other applications, including weighted maximum likelihood, hidden Markov models, and discrete state-space models in general.
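The component structure described in the abstract (a linear transformation onto a subspace plus a Gaussian fitted on the projected data, used for maximum-likelihood classification) can be illustrated with a small sketch. This is not the thesis's MLiT implementation or its optimization procedure, since the full text is not available. It assumes one-dimensional subspaces, random unit-norm projection directions shared across classes, equal component weights, and synthetic toy data. All function and variable names are hypothetical.

```python
import math
import random


def project(w, x):
    """Apply a 1-D linear transformation: the dot product of w and x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def random_unit_vector(dim, rng):
    """Draw a random direction (an assumed stand-in for an optimized transformation)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]


def fit_gaussian(values, min_var=1e-6):
    """Maximum-likelihood mean and variance of a 1-D Gaussian (variance floored)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, min_var)


def gaussian_log_pdf(z, mean, var):
    """Log density of a 1-D Gaussian at z."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (z - mean) ** 2 / var)


def fit_class_model(samples, projections):
    """One Gaussian per projection, fitted on the projected samples of one class."""
    return [fit_gaussian([project(w, x) for x in samples]) for w in projections]


def class_log_likelihood(x, projections, gaussians):
    """Log of the equally weighted average of the component densities (log-sum-exp)."""
    logs = [gaussian_log_pdf(project(w, x), m, v)
            for w, (m, v) in zip(projections, gaussians)]
    top = max(logs)
    return top + math.log(sum(math.exp(val - top) for val in logs) / len(logs))


def classify(x, projections, models):
    """Maximum-likelihood decision: the class whose model scores x highest."""
    return max(models,
               key=lambda label: class_log_likelihood(x, projections, models[label]))


rng = random.Random(0)
dim = 44  # feature dimensionality quoted in the abstract
projections = [random_unit_vector(dim, rng) for _ in range(10)]

# Synthetic toy data with well-separated class means (not the thesis data).
train = {
    "a": [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(200)],
    "b": [[rng.gauss(3.0, 1.0) for _ in range(dim)] for _ in range(200)],
}
models = {label: fit_class_model(xs, projections) for label, xs in train.items()}

probe_a = [0.0] * dim  # sits at the mean of class "a"
probe_b = [3.0] * dim  # sits at the mean of class "b"
print(classify(probe_a, projections, models), classify(probe_b, projections, models))
```

Sharing the unit-norm projections across classes is a simplification that keeps the projected densities comparable between class models. A model whose transformations differ per component and are not restricted to be orthogonal, as the abstract describes, would need additional care to compare likelihoods.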
Please use this identifier to cite or link to this item: