DDI-PULearn: A positive-unlabeled learning method for large-scale prediction of drug-drug interactions

Zheng, Y; Peng, H; Zhang, X; Zhao, Z; Gao, X; Li, J


Publication Type: Journal Article
Citation: BMC Bioinformatics, 2019, 20
Issue Date: 2019-12-24


This item is open access.


Full metadata record

Field	Value	Language
dc.contributor.author	Zheng, Y	en_US
dc.contributor.author	Peng, H https://orcid.org/0000-0002-4379-8097	en_US
dc.contributor.author	Zhang, X https://orcid.org/0000-0002-3783-6560	en_US
dc.contributor.author	Zhao, Z https://orcid.org/0000-0001-5544-4504	en_US
dc.contributor.author	Gao, X	en_US
dc.contributor.author	Li, J https://orcid.org/0000-0003-1833-7413	en_US
dc.date.available	2019-11-12	en_US
dc.date.issued	2019-12-24	en_US
dc.identifier.citation	BMC Bioinformatics, 2019, 20	en_US
dc.identifier.uri	http://hdl.handle.net/10453/138343
dc.relation.ispartof	BMC Bioinformatics	en_US
dc.relation.isbasedon	10.1186/s12859-019-3214-6	en_US
dc.subject.classification	Bioinformatics	en_US
dc.subject.mesh	Humans	en_US
dc.subject.mesh	Cluster Analysis	en_US
dc.subject.mesh	Drug Interactions	en_US
dc.subject.mesh	Support Vector Machine	en_US
dc.title	DDI-PULearn: A positive-unlabeled learning method for large-scale prediction of drug-drug interactions	en_US
dc.type	Journal Article
utslib.citation.volume	20	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	1115 Pharmacology and Pharmaceutical Sciences	en_US
utslib.for	01 Mathematical Sciences	en_US
utslib.for	06 Biological Sciences	en_US
utslib.for	08 Information and Computing Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
pubs.organisational-group	/University of Technology Sydney/Strength - CHT - Health Technologies
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US
pubs.volume	20	en_US

Abstract:

© 2019 The Author(s). Background: Drug-drug interactions (DDIs) are a major concern in patients' medication. It is infeasible to identify all potential DDIs using experimental methods, which are time-consuming and expensive. Computational methods provide an effective strategy, but they face challenges due to the lack of experimentally verified negative samples.
Results: To address this problem, we propose a novel positive-unlabeled learning method named DDI-PULearn for large-scale drug-drug interaction prediction. DDI-PULearn first generates seeds of reliable negatives via OCSVM (one-class support vector machine) under a high-recall constraint and via cosine-similarity-based KNN (k-nearest neighbors). Then, trained with all the labeled positives (i.e., the validated DDIs) and the generated seed negatives, DDI-PULearn employs an iterative SVM to identify the entire set of reliable negatives from the unlabeled samples (i.e., the unobserved DDIs). Following that, DDI-PULearn represents all the labeled positives and the identified negatives as vectors of abundant drug properties using a similarity-based method. Finally, DDI-PULearn transforms these vectors into a lower-dimensional space via PCA (principal component analysis) and uses the compressed vectors as input for binary classification. The performance of DDI-PULearn is evaluated on simulated prediction of 149,878 possible interactions between 548 drugs, in comparison with two baseline methods and five state-of-the-art methods. The experimental results show that the proposed representation of DDIs characterizes them accurately. DDI-PULearn achieves superior performance owing to the identified reliable negatives, significantly outperforming all other methods. In addition, the predicted novel DDIs suggest that DDI-PULearn is capable of identifying novel DDIs.
Conclusions: The results demonstrate that positive-unlabeled learning paves a new way to tackle the problem caused by the lack of experimentally verified negatives in the computational prediction of DDIs.
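
As an illustration only, the following minimal Python sketch shows one plausible reading of the cosine-similarity-based KNN seeding step, and checks that 548 drugs yield 149,878 unordered drug pairs. It is not the authors' implementation: the abstract does not specify the selection rule, so the scoring criterion (mean similarity to the k most similar positives), the function names `cosine` and `knn_seed_negatives`, the parameters `k` and `n_seeds`, and the toy feature vectors are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_seed_negatives(positives, unlabeled, k=3, n_seeds=2):
    """Return indices of the n_seeds unlabeled samples least similar to the positives.

    Assumed criterion (not taken from the paper): score each unlabeled sample
    by its mean cosine similarity to its k most similar labeled positives,
    then keep the lowest-scoring samples as seed negatives.
    """
    scores = []
    for idx, sample in enumerate(unlabeled):
        top_k = sorted((cosine(sample, p) for p in positives), reverse=True)[:k]
        scores.append((sum(top_k) / len(top_k), idx))
    scores.sort()  # ascending: least similar to the positives first
    return [idx for _, idx in scores[:n_seeds]]


# Number of unordered pairs among 548 drugs, matching the count in the abstract.
n_pairs = math.comb(548, 2)

# Toy feature vectors (hypothetical): positives point mostly along the first axis.
positives = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.0], [0.8, 0.0, 0.1]]
unlabeled = [[1.0, 0.1, 0.1], [0.0, 1.0, 0.0], [0.1, 0.9, 0.3], [0.7, 0.2, 0.0]]
seeds = knn_seed_negatives(positives, unlabeled, k=2, n_seeds=2)
```

In this toy setup, the two unlabeled vectors that point away from the positives (indices 1 and 2) are selected as seeds. In the described method, such seeds would then be combined with the OCSVM-derived seeds and passed to the iterative SVM, which this sketch does not cover.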

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/138343