Evaluating network-based missing protein prediction using p-values, Bayes Factors, and probabilities.

Goh, WWB; Kong, W; Wong, L

Publisher: World Scientific Publishing
Publication Type: Journal Article
Citation: J Bioinform Comput Biol, 2023, 21, (1), pp. 2350005
Issue Date: 2023-02

Closed Access

	Filename	Description	Size	Format
	22284249_13340555930005671.pdf	Published version	440.51 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Goh, WWB
dc.contributor.author	Kong, W
dc.contributor.author	Wong, L
dc.date.accessioned	2024-03-24T02:49:24Z
dc.date.available	2024-03-24T02:49:24Z
dc.date.issued	2023-02
dc.identifier.citation	J Bioinform Comput Biol, 2023, 21, (1), pp. 2350005
dc.identifier.issn	0219-7200
dc.identifier.issn	1757-6334
dc.identifier.uri	http://hdl.handle.net/10453/177036
dc.description.abstract	Some prediction methods use probability to rank their predictions, while some other prediction methods do not rank their predictions and instead use p-values to support their predictions. This disparity renders direct cross-comparison of these two kinds of methods difficult. In particular, approaches such as the Bayes Factor upper Bound (BFB) for p-value conversion may not make correct assumptions for this kind of cross-comparison. Here, using a well-established case study on renal cancer proteomics and in the context of missing protein prediction, we demonstrate how to compare these two kinds of prediction methods using two different strategies. The first strategy is based on false discovery rate (FDR) estimation, which does not make the same naïve assumptions as BFB conversions. The second strategy is a powerful approach which we colloquially call "home ground testing". Both strategies perform better than BFB conversions. Thus, we recommend comparing prediction methods by standardization to a common performance benchmark such as a global FDR. And where this is not possible, we recommend reciprocal "home ground testing".
dc.format	Print-Electronic
dc.language	eng
dc.publisher	World Scientific Publishing
dc.relation.ispartof	J Bioinform Comput Biol
dc.relation.isbasedon	10.1142/S0219720023500051
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0601 Biochemistry and Cell Biology, 0801 Artificial Intelligence and Image Processing
dc.subject.classification	Bioinformatics
dc.subject.classification	3102 Bioinformatics and computational biology
dc.subject.classification	4601 Applied computing
dc.subject.mesh	Bayes Theorem
dc.subject.mesh	Probability
dc.subject.mesh	Proteomics
dc.subject.mesh	Proteins
dc.title	Evaluating network-based missing protein prediction using p-values, Bayes Factors, and probabilities.
dc.type	Journal Article
utslib.citation.volume	21
utslib.location.activity	Singapore
utslib.for	0601 Biochemistry and Cell Biology
utslib.for	0801 Artificial Intelligence and Image Processing
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access	*
dc.date.updated	2024-03-24T02:49:23Z
pubs.issue	1
pubs.publication-status	Published
pubs.volume	21
utslib.citation.issue	1

Abstract:

Some prediction methods use probability to rank their predictions, while some other prediction methods do not rank their predictions and instead use p-values to support their predictions. This disparity renders direct cross-comparison of these two kinds of methods difficult. In particular, approaches such as the Bayes Factor upper Bound (BFB) for p-value conversion may not make correct assumptions for this kind of cross-comparison. Here, using a well-established case study on renal cancer proteomics and in the context of missing protein prediction, we demonstrate how to compare these two kinds of prediction methods using two different strategies. The first strategy is based on false discovery rate (FDR) estimation, which does not make the same naïve assumptions as BFB conversions. The second strategy is a powerful approach which we colloquially call "home ground testing". Both strategies perform better than BFB conversions. Thus, we recommend comparing prediction methods by standardization to a common performance benchmark such as a global FDR. And where this is not possible, we recommend reciprocal "home ground testing".
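
For readers unfamiliar with the BFB conversion named in the abstract, the sketch below shows the commonly cited form of the bound, 1 / (-e p ln p) for 0 < p < 1/e, and how it turns a p-value into an upper bound on a posterior probability. This is an illustration only: the full text is closed access, so whether the paper uses exactly this form, and the 1:1 prior odds assumed here, are assumptions. The function names are hypothetical.

```python
import math


def bayes_factor_bound(p):
    """Upper bound on the Bayes factor favouring the alternative.

    Uses the commonly cited form 1 / (-e * p * ln p), which is
    defined only for 0 < p < 1/e.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound is defined only for 0 < p < 1/e")
    return 1.0 / (-math.e * p * math.log(p))


def posterior_upper_bound(p, prior_odds=1.0):
    """Upper bound on the posterior probability of the alternative.

    prior_odds=1.0 (equal prior weight on both hypotheses) is an
    assumption made for illustration, not a value from the paper.
    """
    posterior_odds = prior_odds * bayes_factor_bound(p)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(p, round(bayes_factor_bound(p), 2), round(posterior_upper_bound(p), 3))
```

For p = 0.05 the bound is about 2.46, so even under equal prior odds the posterior probability cannot exceed roughly 0.71. Because the result is an upper bound rather than an estimate, placing it directly alongside a method's own ranking probabilities is the kind of comparison the abstract cautions against.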

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/177036