Computational identification of protein pupylation sites by using profile-based composition of k-spaced amino acid pairs

Hasan, MM; Zhou, Y; Lu, X; Li, J; Song, J; Zhang, Z

Publication Type: Journal Article
Citation: PLoS ONE, 2015, 10 (6)
Issue Date: 2015-06-16

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Hasan, MM	en_US
dc.contributor.author	Zhou, Y	en_US
dc.contributor.author	Lu, X	en_US
dc.contributor.author	Li, J https://orcid.org/0000-0003-1833-7413	en_US
dc.contributor.author	Song, J	en_US
dc.contributor.author	Zhang, Z	en_US
dc.date.available	2015-05-10	en_US
dc.date.issued	2015-06-16	en_US
dc.identifier.citation	PLoS ONE, 2015, 10 (6)	en_US
dc.identifier.uri	http://hdl.handle.net/10453/117264
dc.relation.ispartof	PLoS ONE	en_US
dc.relation.isbasedon	10.1371/journal.pone.0129635	en_US
dc.subject.classification	General Science & Technology	en_US
dc.subject.mesh	Proteasome Endopeptidase Complex	en_US
dc.subject.mesh	Amino Acids	en_US
dc.subject.mesh	Bacterial Proteins	en_US
dc.subject.mesh	Ubiquitins	en_US
dc.subject.mesh	Reproducibility of Results	en_US
dc.subject.mesh	Computational Biology	en_US
dc.subject.mesh	Protein Processing, Post-Translational	en_US
dc.subject.mesh	Binding Sites	en_US
dc.subject.mesh	Amino Acid Sequence	en_US
dc.subject.mesh	Algorithms	en_US
dc.subject.mesh	Internet	en_US
dc.subject.mesh	Molecular Sequence Data	en_US
dc.subject.mesh	Proteolysis	en_US
dc.subject.mesh	Support Vector Machine	en_US
dc.title	Computational identification of protein pupylation sites by using profile-based composition of k-spaced amino acid pairs	en_US
dc.type	Journal Article
utslib.citation.volume	6	en_US
utslib.citation.volume	10	en_US
utslib.for	010299 Applied Mathematics not elsewhere classified	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
pubs.organisational-group	/University of Technology Sydney/Strength - CHT - Health Technologies
utslib.copyright.status	open_access
pubs.issue	6	en_US
pubs.publication-status	Published	en_US
pubs.volume	10	en_US

Abstract:

© 2015 Hasan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Prokaryotic proteins are regulated by pupylation, a type of post-translational modification that contributes to cellular function in bacteria. In the pupylation process, tagging by the prokaryotic ubiquitin-like protein (Pup) is functionally analogous to ubiquitination: it marks target proteins for proteasomal degradation. To date, several experimental methods have been developed to identify pupylated proteins and their pupylation sites, but these methods are generally laborious and costly. Computational methods that can accurately predict potential pupylation sites from protein sequence information are therefore highly desirable. In this paper, a novel predictor termed pbPUP has been developed for accurate prediction of pupylation sites. In particular, a sophisticated sequence encoding scheme, the profile-based composition of k-spaced amino acid pairs (pbCKSAAP), is used to represent the sequence patterns and evolutionary information of the sequence fragments surrounding pupylation sites. A Support Vector Machine (SVM) classifier is then trained using the pbCKSAAP encoding scheme. The final pbPUP predictor achieves an AUC value of 0.849 in 10-fold cross-validation tests and outperforms other existing predictors on a comprehensive independent test dataset. The proposed method is anticipated to be a helpful computational resource for the prediction of pupylation sites. The web server and curated datasets in this study are freely available at http://protein.cau.edu.cn/pbPUP/.
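
To make the encoding named in the abstract more concrete, the sketch below computes the plain, sequence-based composition of k-spaced amino acid pairs (CKSAAP) for one sequence fragment. It is an illustration only, not the authors' implementation: the abstract does not spell out how the profile-based variant (pbCKSAAP) folds evolutionary information into the features, so that step is left out. The function name `cksaap`, the parameter `k_max`, the example fragment, and the normalization by the number of residue pairs at each gap are all assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(fragment, k_max=2):
    """Return the CKSAAP feature vector of a fragment for gaps k = 0..k_max.

    For each gap k, count every pair of residues separated by exactly k
    positions and divide by the number of such pairs in the fragment.
    The result has 400 * (k_max + 1) entries.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(fragment) - k - 1  # number of k-spaced pairs
        for i in range(max(n_pairs, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:  # ignore non-standard residues or padding
                counts[pair] += 1
        denominator = max(n_pairs, 1)  # avoid division by zero on short input
        features.extend(counts[p] / denominator for p in PAIRS)
    return features


# Hypothetical 17-residue fragment with a lysine (K) at the center.
fragment = "MKVLAAGDKTDERLSAA"
vector = cksaap(fragment, k_max=2)
print(len(vector))
```

A vector of this kind, computed for fragments around known pupylated and non-pupylated sites, is the sort of input an SVM classifier could be trained on; the fragment length and the range of k used in the paper are not given in the abstract.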

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/117264