An End-to-End Hierarchical Classification Approach for Similar Gesture Recognition

Wu, D; Sharma, N; Blumenstein, M

Publication Type: Conference Proceeding
Citation: International Conference Image and Vision Computing New Zealand, 2019, 2018-November
Issue Date: 2019-02-04

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Wu, D	en_US
dc.contributor.author	Sharma, N https://orcid.org/0000-0003-0841-1245	en_US
dc.contributor.author	Blumenstein, M https://orcid.org/0000-0002-9908-3744	en_US
dc.date.available	2021-02-05T18:06:49Z
dc.date.issued	2019-02-04	en_US
dc.identifier.citation	International Conference Image and Vision Computing New Zealand, 2019, 2018-November	en_US
dc.identifier.isbn	9781728101255	en_US
dc.identifier.issn	2151-2191	en_US
dc.identifier.uri	http://hdl.handle.net/10453/134520
dc.relation.ispartof	International Conference Image and Vision Computing New Zealand	en_US
dc.relation.isbasedon	10.1109/IVCNZ.2018.8634660	en_US
dc.title	An End-to-End Hierarchical Classification Approach for Similar Gesture Recognition	en_US
dc.type	Conference Proceeding
utslib.citation.volume	2018-November	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Strength - QSI - Centre for Quantum Software and Information
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	open_access	*
pubs.publication-status	Published	en_US
pubs.volume	2018-November	en_US

Abstract:

© 2018 IEEE. Human action recognition from RGB video is widely applied in various real-world applications. Researchers in computer vision and machine learning have done much work to address the challenges and complexity involved in video-based human action recognition. Deep learning approaches, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), have been introduced in this research area. However, due to the drawbacks of CNNs, recognizing actions with similar gestures and describing complex actions is still very challenging. Hence, this paper proposes an end-to-end hierarchical classification architecture to resolve the confusion between similar gestures. In stage 1, the proposed approach classifies the whole dataset and computes the accuracy for each class. In stage 2, based on the confusion matrix obtained from stage 1, the approach combines the most confused similar gesture pairs into one class and classifies them along with all other classes. In stage 3, the similar gesture pairs are classified by binary classifiers, which increases the performance of each class and the overall accuracy. We apply and evaluate the developed models to recognize similar human actions on both the KTH and UCF101 datasets. The results show that the proposed approach can boost classification performance on both datasets. The proposed architecture is robust, and any classification technique can be used in stage 1 and stage 2.
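
The three stages described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The function names, the pair-selection rule (highest symmetric off-diagonal confusion, restricted to disjoint pairs), the decision to merge each pair into its own class, and the toy stand-in classifiers are all assumptions made for the example. In practice the classifiers would be trained models such as CNNs.

```python
from itertools import combinations


def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts samples of true class t that were predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def most_confused_pairs(cm, max_pairs):
    """Pick up to max_pairs disjoint class pairs with the highest mutual confusion.

    Assumption: confusion between i and j is scored as cm[i][j] + cm[j][i].
    """
    n = len(cm)
    scored = sorted(
        ((cm[i][j] + cm[j][i], i, j) for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    pairs, used = [], set()
    for score, i, j in scored:
        if score == 0 or len(pairs) == max_pairs:
            break
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs


def merge_labels(labels, pairs):
    """Build stage-2 labels: the second class of each pair is relabeled as the first."""
    alias = {j: i for i, j in pairs}
    return [alias.get(y, y) for y in labels]


def hierarchical_predict(x, stage2_clf, binary_clfs):
    """Stage 2 predicts a (possibly merged) class; stage 3 splits merged pairs.

    binary_clfs maps a merged label to a callable that returns one of the
    two original classes of that pair.
    """
    label = stage2_clf(x)
    if label in binary_clfs:
        return binary_clfs[label](x)
    return label


# Toy stage-1 output for 4 classes, where classes 1 and 2 are often confused.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 1, 2, 3, 3]

cm = confusion_matrix(y_true, y_pred, n_classes=4)
pairs = most_confused_pairs(cm, max_pairs=1)   # classes to merge for stage 2
stage2_labels = merge_labels(y_true, pairs)    # training labels for stage 2

# Hypothetical stand-in classifiers: each sample is (coarse_label, fine_score).
stage2_clf = lambda x: x[0]
binary_clfs = {1: lambda x: 1 if x[1] < 0.5 else 2}

print(pairs)
print(stage2_labels)
print(hierarchical_predict((1, 0.9), stage2_clf, binary_clfs))
print(hierarchical_predict((3, 0.1), stage2_clf, binary_clfs))
```

Because the stage-1 and stage-2 classifiers are passed in as plain callables, the sketch reflects the abstract's point that any classification technique can be used in those stages.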

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/134520