Reinspection of a Clinical Proteomics Tumor Analysis Consortium (CPTAC) Dataset with Cloud Computing Reveals Abundant Post-Translational Modifications and Protein Sequence Variants.

Prakash, A; Taylor, L; Varkey, M; Hoxie, N; Mohammed, Y; Goo, YA; Peterman, S; Moghekar, A; Yuan, Y; Glaros, T; Steele, JR; Faridi, P; Parihari, S; Srivastava, S; Otto, JJ; Nyalwidhe, JO; Semmes, OJ; Moran, MF; Madugundu, A; Mun, DG; Pandey, A; Mahoney, KE; Shabanowitz, J; Saxena, S; Orsburn, BC

Publisher: MDPI AG
Publication Type: Journal Article
Citation: Cancers, 2021, 13, (20), pp. 5034-5034
Issue Date: 2021-10-09

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Prakash, A
dc.contributor.author	Taylor, L
dc.contributor.author	Varkey, M
dc.contributor.author	Hoxie, N
dc.contributor.author	Mohammed, Y
dc.contributor.author	Goo, YA
dc.contributor.author	Peterman, S
dc.contributor.author	Moghekar, A
dc.contributor.author	Yuan, Y
dc.contributor.author	Glaros, T
dc.contributor.author	Steele, JR
dc.contributor.author	Faridi, P
dc.contributor.author	Parihari, S
dc.contributor.author	Srivastava, S
dc.contributor.author	Otto, JJ
dc.contributor.author	Nyalwidhe, JO
dc.contributor.author	Semmes, OJ
dc.contributor.author	Moran, MF
dc.contributor.author	Madugundu, A
dc.contributor.author	Mun, DG
dc.contributor.author	Pandey, A
dc.contributor.author	Mahoney, KE
dc.contributor.author	Shabanowitz, J
dc.contributor.author	Saxena, S
dc.contributor.author	Orsburn, BC
dc.date.accessioned	2022-01-04T05:40:10Z
dc.date.available	2021-10-01
dc.date.available	2022-01-04T05:40:10Z
dc.date.issued	2021-10-09
dc.identifier.citation	Cancers, 2021, 13, (20), pp. 5034-5034
dc.identifier.issn	2072-6694
dc.identifier.uri	http://hdl.handle.net/10453/152665
dc.format	Electronic
dc.language	eng
dc.publisher	MDPI AG
dc.relation.ispartof	Cancers
dc.relation.isbasedon	10.3390/cancers13205034
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	1112 Oncology and Carcinogenesis
dc.title	Reinspection of a Clinical Proteomics Tumor Analysis Consortium (CPTAC) Dataset with Cloud Computing Reveals Abundant Post-Translational Modifications and Protein Sequence Variants.
dc.type	Journal Article
utslib.citation.volume	13
utslib.location.activity	Switzerland
utslib.for	1112 Oncology and Carcinogenesis
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Provost
pubs.organisational-group	/University of Technology Sydney/Provost/Jumbunna
utslib.copyright.status	open_access	*
dc.date.updated	2022-01-04T05:40:03Z
pubs.issue	20
pubs.publication-status	Published
pubs.volume	13
utslib.citation.issue	20

Abstract:

The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has provided some of the most in-depth analyses of the phenotypes of human tumors ever constructed. Today, the majority of proteomic data analysis is still performed using software housed on desktop computers, which limits the number of sequence variants and post-translational modifications (PTMs) that can be considered. The original CPTAC studies limited the search for PTMs to samples that were chemically enriched for those modified peptides. Similarly, the only sequence variants considered were those with strong evidence at the exon or transcript level. In this multi-institutional collaborative reanalysis, we utilized unbiased protein databases containing millions of human sequence variants in conjunction with hundreds of common post-translational modifications. Using these tools, we identified tens of thousands of high-confidence PTMs and sequence variants. We identified 4132 phosphorylated peptides in nonenriched samples, 93% of which were confirmed in the samples that were chemically enriched for phosphopeptides. In addition, our results cover 90% of the high-confidence variants reported by the original proteogenomics study, without the need for sample-specific next-generation sequencing. Finally, we report fivefold more somatic and germline variants that have independent evidence at the peptide level, including mutations in ERBB2 and BCAS1. In this reanalysis of CPTAC proteomic data with cloud computing, we present an openly available and searchable web resource of the highest-coverage proteomic profiling of human tumors described to date.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/152665