Margin-based ensemble classifier for protein fold recognition

dc.contributor.author Yang, T
dc.contributor.author Kecman, V
dc.contributor.author Cao, L
dc.contributor.author Zhang, C
dc.contributor.author Zhexue Huang, J
dc.date.accessioned 2012-02-02T04:11:55Z
dc.date.issued 2011-09-15
dc.identifier.citation Expert Systems with Applications, 2011, 38 (10), pp. 12348 - 12355
dc.identifier.issn 0957-4174
dc.identifier.other C1 en_US
dc.identifier.uri http://hdl.handle.net/10453/14498
dc.description.abstract Recognition of protein folding patterns is an important step in protein structure and function prediction. The traditional sequence similarity-based approach fails to yield convincing predictions when proteins have low sequence identity, while the taxonometric approach is a reliable alternative. From a pattern recognition perspective, protein fold recognition involves a large number of classes with only a small number of training samples, and multiple heterogeneous feature groups derived from different propensities of amino acids. This raises the need for a classification method that can handle the data complexity with high prediction accuracy for practical applications. To this end, this paper proposes a novel ensemble classifier, called MarFold, which combines three margin-based classifiers for protein fold recognition. The effectiveness of the method is demonstrated on the benchmark D-B dataset with 27 classes. The overall prediction accuracy obtained by MarFold is 71.7%, which surpasses the existing fold recognition methods by 3.1-15.7%. Moreover, one component classifier of MarFold, called ALH, obtains a prediction accuracy of 65.5%, which is 4.7-9.5% higher than the prediction accuracies of the published methods using single classifiers. Additionally, the feature set of pairwise frequency information about the amino acids, which is adopted by MarFold, is found to be important for discriminating folding patterns. These results imply that the MarFold method and its operation engine ALH might become useful vehicles for protein fold recognition, as well as other bioinformatics tasks. The MarFold method and the datasets can be obtained from: (http://www-staff.it.uts.edu.au/~lbcao/publication/MarFold.7z). © 2010 Elsevier Ltd. All rights reserved.
dc.language eng
dc.relation.isbasedon 10.1016/j.eswa.2011.04.014
dc.title Margin-based ensemble classifier for protein fold recognition
dc.type Journal Article
dc.parent Expert Systems with Applications
dc.journal.volume 38
dc.journal.number 10 en_US
dc.publocation Oxford en_US
dc.identifier.startpage 12348 en_US
dc.identifier.endpage 12355 en_US
dc.cauo.name FEIT.School of Software en_US
dc.conference Verified OK en_US
dc.for 0102 Applied Mathematics
dc.personcode 011221
dc.personcode 034535
dc.personcode 108195
dc.percentage 100 en_US
dc.classification.name Applied Mathematics en_US
dc.classification.type FOR-08 en_US
dc.location.activity WOS:000292169500038 en_US
dc.description.keywords Support Vector Machine; Hierarchical Learning Architecture; Prediction en_US
dc.description.keywords Adaptive local hyperplane
dc.description.keywords Amino acid sequence
dc.description.keywords Ensemble classifier
dc.description.keywords Protein fold recognition
dc.description.keywords Support vector machine
pubs.embargo.period Not known
pubs.organisational-group /University of Technology Sydney
pubs.organisational-group /University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group /University of Technology Sydney/Strength - Quantum Computation and Intelligent Systems
utslib.copyright.status Closed Access
utslib.copyright.date 2015-04-15 12:17:09.805752+10
utslib.collection.history Closed (ID: 3)

