Single and multiple instance learning for visual categorisation

Du, R

Publication Type: Thesis
Issue Date: 2013

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Du, R
dc.date.accessioned	2013-09-04T06:46:04Z
dc.date.available	2013-09-04T06:46:04Z
dc.date.issued	2013
dc.identifier.uri	http://hdl.handle.net/10453/23495
dc.description	University of Technology, Sydney. Faculty of Engineering and Information Technology.	en_US
dc.format	Thesis (PhD)	en_US
dc.language.iso	en	en_US
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/23495/11/02whole.pdf
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	au.edu.uts.lib/ppc
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.subject	Human face recognition.	en
dc.subject	Facial pattern recognition.	en
dc.subject	Computer vision research.	en
dc.title	Single and multiple instance learning for visual categorisation	en_US
dc.type	Thesis
utslib.copyright.status	open_access

Abstract:

Huge amounts of visual data, such as videos and images, are now widely accessible. Intelligently categorizing these large and growing collections so that they are convenient to access has therefore become a central goal of modern computer vision research. In this thesis, we describe several newly developed approaches to visual categorization in the single-instance and multiple-instance learning settings.

In single-instance learning (SIL), every training instance is labeled. Here, we focus on the challenging task of facial expression recognition, where manually labeling each training instance, i.e., each face video, is feasible. To capture the distinctive features of expressions, we propose a novel feature representation, the Histogram Variances Face (HVF), which integrates dynamic expression information into a static image that is invariant to illumination and in-plane rotation. Through HVFs, facial expression recognition can be cast as a face recognition problem. We extract HVFs from the well-known Cohn-Kanade AU-Coded Facial Expression database and classify them with face recognition techniques, namely Eigenfaces and Support Vector Machines (SVMs). The recognition accuracy is very encouraging. We further propose an extension of HVFs, Hexagonal Histogram Variance Faces (HHVFs), which applies HVFs on a hexagonal structure. Compared with HVFs, HHVFs not only greatly reduce the computational cost but also improve the recognition accuracy.

In multiple-instance learning (MIL), the training instances are divided into groups, and the instances in the same group share a single label. MIL arises in many applications where labeling training instances individually is expensive. For this setting, we propose a novel algorithm, multiple-instance learning with supervised kernel density estimation (MIL-SKDE), to tackle the labeling ambiguity. Our algorithm extends two closely related techniques, kernel density estimation (KDE) and mean shift, to supervised versions in which the labels of the data points affect the mode seeking. We apply MIL-SKDE to several visual categorization tasks, e.g., image and object categorization, where it outperforms other state-of-the-art methods. Furthermore, to address the computational complexity of MIL-SKDE, we propose MIL-SS (MIL with speed-up SKDE), which speeds up the training process. Experiments show that its performance is comparable to that of MIL-SKDE while training is much more efficient.

Finally, we apply MIL-SS in a “bag-of-words” (BoW) system to learn the visual codebook for object categorization on a more comprehensive dataset. The system consists of four steps: codebook generation, feature coding, feature pooling, and classification. Unlike conventional BoW methods, which learn the codebook from whole images, our method can learn the codebook from only the regions containing target objects, which significantly improves classification accuracy.
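
For readers unfamiliar with the four BoW steps named in the abstract, the following is a minimal illustrative sketch on toy 2-D descriptors. It is not the thesis's method: the thesis learns the codebook with MIL-SS from object regions, whereas this sketch assumes plain k-means over all descriptors, hard-assignment coding, average pooling, and a nearest-class-mean classifier. All function names and the toy data are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda k: sq_dist(point, centers[k]))

def learn_codebook(descriptors, k, iters=20, seed=0):
    """Step 1, codebook generation. Assumption: plain k-means, not MIL-SS."""
    centers = random.Random(seed).sample(descriptors, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def encode_and_pool(descriptors, codebook):
    """Steps 2 and 3: hard-assignment coding, then average pooling."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(d, codebook)] += 1.0
    return [h / len(descriptors) for h in hist]

def classify(hist, class_means):
    """Step 4, classification. Assumption: nearest class-mean histogram."""
    return min(class_means, key=lambda label: sq_dist(hist, class_means[label]))

def make_image(major, minor, rng):
    """Toy 'image': 16 local descriptors near `major` and 4 near `minor`."""
    return [
        [c[0] + rng.gauss(0, 0.3), c[1] + rng.gauss(0, 0.3)]
        for c, n in ((major, 16), (minor, 4))
        for _ in range(n)
    ]

rng = random.Random(1)
A, B = (0.0, 0.0), (5.0, 5.0)
train = [("a", make_image(A, B, rng)) for _ in range(4)]
train += [("b", make_image(B, A, rng)) for _ in range(4)]

# Learn the codebook from every training descriptor, then pool per image.
codebook = learn_codebook([d for _, img in train for d in img], k=2)
hists = {"a": [], "b": []}
for label, img in train:
    hists[label].append(encode_and_pool(img, codebook))
class_means = {
    label: [sum(col) / len(hs) for col in zip(*hs)]
    for label, hs in hists.items()
}

# Classify one unseen toy image per class.
pred_a = classify(encode_and_pool(make_image(A, B, rng), codebook), class_means)
pred_b = classify(encode_and_pool(make_image(B, A, rng), codebook), class_means)
print(pred_a, pred_b)
```

In this toy setup each image is summarised by a two-bin histogram of codeword frequencies, so the two classes differ only in which codeword dominates. Restricting `learn_codebook` to descriptors from object regions, as the abstract describes, would change only the input to step 1.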

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/23495