PTMscape: An open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes

Ginny X.H. Li, Christine Vogel, Hyungwon Choi

Research output: Contribution to journal › Article

Abstract

While tandem mass spectrometry can detect post-translational modifications (PTMs) at the proteome scale, reported PTM sites are often incomplete and include false positives. Computational approaches can complement these datasets with additional predictions, but most available tools use prediction models pre-trained by the developers for a single PTM type, and it remains difficult to perform large-scale batch prediction for multiple PTMs with flexible user control, including the choice of training data. We developed an R package called PTMscape, which predicts PTM sites across the proteome based on a unified and comprehensive set of descriptors of the physico-chemical microenvironment of modified sites, with additional downstream analysis modules to test enrichment of individual PTMs or pairs of PTMs in protein domains. PTMscape can process any major modification, such as phosphorylation and ubiquitination, while achieving sensitivity and specificity comparable to single-PTM methods and outperforming other multi-PTM tools. Applying this framework, we expanded proteome-wide coverage of five major PTMs affecting different residues by prediction, especially for lysine and arginine modifications. Using a combination of experimentally acquired sites (PSP) and newly predicted sites, we discovered that crosstalk among multiple PTMs occurs more frequently than expected by random chance in key protein domains such as histone, protein kinase, and RNA recognition motifs, spanning various biological processes such as RNA processing, DNA damage response, signal transduction, and regulation of the cell cycle. These results provide a proteome-scale analysis of crosstalk among major PTMs, and the analysis can be easily extended to other types of PTM.

Original language: English (US)
Pages (from-to): 197-209
Number of pages: 13
Journal: Molecular Omics
Volume: 14
Issue number: 3
DOIs: 10.1039/c8mo00027a
State: Published - Jan 1 2018

Fingerprint

Pulse time modulation
Biological Phenomena
Post Translational Protein Processing
Crosstalk
Proteome
Proteins
RNA
Signal transduction
Phosphorylation
Protamine Kinase
Histones
Protein Kinases
Lysine
Mass spectrometry
Arginine
Ubiquitination
Cells
Tandem Mass Spectrometry
Protein Domains
DNA Damage

ASJC Scopus subject areas

  • Biochemistry
  • Genetics
  • Molecular Biology

Cite this

PTMscape: An open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes. / Li, Ginny X.H.; Vogel, Christine; Choi, Hyungwon.

In: Molecular Omics, Vol. 14, No. 3, 01.01.2018, p. 197-209.

@article{f3ee8b28dc3547a08c6518bea1673277,
title = "PTMscape: An open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes",
abstract = "While tandem mass spectrometry can detect post-translational modifications (PTMs) at the proteome scale, reported PTM sites are often incomplete and include false positives. Computational approaches can complement these datasets with additional predictions, but most available tools use prediction models pre-trained by the developers for a single PTM type, and it remains difficult to perform large-scale batch prediction for multiple PTMs with flexible user control, including the choice of training data. We developed an R package called PTMscape, which predicts PTM sites across the proteome based on a unified and comprehensive set of descriptors of the physico-chemical microenvironment of modified sites, with additional downstream analysis modules to test enrichment of individual PTMs or pairs of PTMs in protein domains. PTMscape can process any major modification, such as phosphorylation and ubiquitination, while achieving sensitivity and specificity comparable to single-PTM methods and outperforming other multi-PTM tools. Applying this framework, we expanded proteome-wide coverage of five major PTMs affecting different residues by prediction, especially for lysine and arginine modifications. Using a combination of experimentally acquired sites (PSP) and newly predicted sites, we discovered that crosstalk among multiple PTMs occurs more frequently than expected by random chance in key protein domains such as histone, protein kinase, and RNA recognition motifs, spanning various biological processes such as RNA processing, DNA damage response, signal transduction, and regulation of the cell cycle. These results provide a proteome-scale analysis of crosstalk among major PTMs, and the analysis can be easily extended to other types of PTM.",
author = "Li, {Ginny X.H.} and Christine Vogel and Hyungwon Choi",
year = "2018",
month = "1",
day = "1",
doi = "10.1039/c8mo00027a",
language = "English (US)",
volume = "14",
pages = "197--209",
journal = "Molecular Omics",
issn = "2515-4184",
publisher = "Royal Society of Chemistry",
number = "3",

}

TY - JOUR

T1 - PTMscape

T2 - An open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes

AU - Li, Ginny X.H.

AU - Vogel, Christine

AU - Choi, Hyungwon

PY - 2018/1/1

Y1 - 2018/1/1

AB - While tandem mass spectrometry can detect post-translational modifications (PTMs) at the proteome scale, reported PTM sites are often incomplete and include false positives. Computational approaches can complement these datasets with additional predictions, but most available tools use prediction models pre-trained by the developers for a single PTM type, and it remains difficult to perform large-scale batch prediction for multiple PTMs with flexible user control, including the choice of training data. We developed an R package called PTMscape, which predicts PTM sites across the proteome based on a unified and comprehensive set of descriptors of the physico-chemical microenvironment of modified sites, with additional downstream analysis modules to test enrichment of individual PTMs or pairs of PTMs in protein domains. PTMscape can process any major modification, such as phosphorylation and ubiquitination, while achieving sensitivity and specificity comparable to single-PTM methods and outperforming other multi-PTM tools. Applying this framework, we expanded proteome-wide coverage of five major PTMs affecting different residues by prediction, especially for lysine and arginine modifications. Using a combination of experimentally acquired sites (PSP) and newly predicted sites, we discovered that crosstalk among multiple PTMs occurs more frequently than expected by random chance in key protein domains such as histone, protein kinase, and RNA recognition motifs, spanning various biological processes such as RNA processing, DNA damage response, signal transduction, and regulation of the cell cycle. These results provide a proteome-scale analysis of crosstalk among major PTMs, and the analysis can be easily extended to other types of PTM.

UR - http://www.scopus.com/inward/record.url?scp=85057972589&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057972589&partnerID=8YFLogxK

U2 - 10.1039/c8mo00027a

DO - 10.1039/c8mo00027a

M3 - Article

C2 - 29876573

AN - SCOPUS:85057972589

VL - 14

SP - 197

EP - 209

JO - Molecular Omics

JF - Molecular Omics

SN - 2515-4184

IS - 3

ER -