P2P proteomics - Data sharing for enhanced protein identification

Schorlemmer, M; Abián, J; Sierra, C; De La Cruz, D; Bernacchioni, L; Jaén, E; Perreau De Pinninck, A; Atencia, M

Publication Type: Journal Article
Citation: Automated Experimentation, 2012, 4 (1)
Issue Date: 2012-05-10

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Schorlemmer, M	en_US
dc.contributor.author	Abián, J	en_US
dc.contributor.author	Sierra, C https://orcid.org/0000-0003-0839-6233	en_US
dc.contributor.author	De La Cruz, D	en_US
dc.contributor.author	Bernacchioni, L	en_US
dc.contributor.author	Jaén, E	en_US
dc.contributor.author	Perreau De Pinninck, A	en_US
dc.contributor.author	Atencia, M	en_US
dc.date.available	2012-01-31	en_US
dc.date.issued	2012-05-10	en_US
dc.identifier.citation	Automated Experimentation, 2012, 4 (1)	en_US
dc.identifier.uri	http://hdl.handle.net/10453/115174
dc.relation.ispartof	Automated Experimentation	en_US
dc.relation.isbasedon	10.1186/1759-4499-4-1	en_US
dc.title	P2P proteomics - Data sharing for enhanced protein identification	en_US
dc.type	Journal Article
utslib.citation.volume	1	en_US
utslib.citation.volume	4	en_US
utslib.for	0806 Information Systems	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Software
utslib.copyright.status	open_access
pubs.issue	1	en_US
pubs.publication-status	Published	en_US
pubs.volume	4	en_US

Abstract:

Background: To tackle the important and challenging problem in proteomics of identifying known and new protein sequences using high-throughput methods, we propose a data-sharing platform that uses fully distributed P2P technologies to share specifications of peer-interaction protocols and service components. With such a platform, the information to be searched is no longer centralised in a few repositories but is gathered from experiments in peer proteomics laboratories and can subsequently be searched by fellow researchers.

Methods: The system runs, in a distributed fashion, a data-sharing protocol specified in the Lightweight Communication Calculus underlying the system, through which researchers interact via message passing. Researchers interact with the system through dedicated components that link to database querying systems based on BLAST and/or OMSSA and to GUI-based visualisation environments. We tested the proposed platform with data drawn from preexisting MS/MS data reservoirs from the 2006 ABRF (Association of Biomolecular Resource Facilities) test sample, which was extensively tested during the ABRF Proteomics Standards Research Group 2006 worldwide survey. In particular, we took the data available from a subset of the proteomics laboratories of Spain's National Institute for Proteomics, ProteoRed, a network for the coordination, integration and development of the Spanish proteomics facilities.

Results and Discussion: We performed queries against nine databases: seven ProteoRed proteomics laboratories, the NCBI Swiss-Prot database and the local database of the CSIC/UAB Proteomics Laboratory. A detailed analysis of the results indicated the presence of a protein that was supported by other NCBI matches and by highly scored matches in several proteomics labs. The analysis clearly indicated that the protein was a relatively highly concentrated contaminant that could be present in the ABRF sample. This is evident from the information derived from the proposed P2P proteomics system; however, it is not straightforward to arrive at the same conclusion by conventional means, as it is difficult to rule out organic contamination of samples. The actual presence of this contaminant was stated only after the ABRF study of all the identifications reported by the laboratories.

© 2012 Schorlemmer et al.; licensee BioMed Central Ltd.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/115174