Integrating shotgun proteomics and mRNA expression data to improve protein identification

Smriti R. Ramakrishnan, Christine Vogel, John T. Prince, Zhihua Li, Luiz O. Penalva, Margaret Myers, Edward M. Marcotte, Daniel P. Miranker, Rong Wang

Research output: Contribution to journal › Article

Abstract

Motivation: Tandem mass spectrometry (MS/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. In a typical shotgun proteomics experiment, it is assumed that all proteins are equally likely to be present. However, there is often other information available, e.g. the probability of a protein's presence is likely to correlate with its mRNA concentration.

Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS/MS experiment at the same error rate, e.g. in yeast, MSpresso increases the number of proteins identified by ∼40%. We apply MSpresso to data from different MS/MS instruments, experimental conditions and organisms (Escherichia coli, human), and predict 19-63% more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun proteomics experiments can substantially improve protein identification scores.
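The general idea of replacing a flat prior with an mRNA-informed one can be illustrated with a small sketch. This is not the MSpresso model itself, whose details are not given in this record. It assumes that the MS/MS pipeline reports an identification probability computed under a flat prior, that the MS/MS evidence is conditionally independent of the mRNA level given the protein's presence, and that some mRNA-derived prior probability of presence is already available. The function name `posterior_presence` and its parameters are hypothetical.

```python
def posterior_presence(p_msms: float, prior_mrna: float, flat_prior: float = 0.5) -> float:
    """Re-weight an MS/MS identification probability with an mRNA-informed prior.

    p_msms     -- identification probability reported under `flat_prior` (assumed input)
    prior_mrna -- prior probability of presence derived from mRNA data (assumed input)
    flat_prior -- the uniform prior the MS/MS score is assumed to have used
    """
    for name, value in (("p_msms", p_msms), ("prior_mrna", prior_mrna), ("flat_prior", flat_prior)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1, got {value}")

    # Likelihood ratio of the MS/MS evidence, recovered by dividing out the flat prior odds.
    likelihood_ratio = (p_msms / (1.0 - p_msms)) / (flat_prior / (1.0 - flat_prior))

    # Bayes' rule in odds form: posterior odds = likelihood ratio * prior odds.
    posterior_odds = likelihood_ratio * prior_mrna / (1.0 - prior_mrna)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Same MS/MS score, two hypothetical mRNA-derived priors.
    print(posterior_presence(0.6, 0.9))  # highly expressed transcript raises the score
    print(posterior_presence(0.6, 0.1))  # weakly expressed transcript lowers it
```

Under these assumptions, a protein with a borderline MS/MS score is promoted when its transcript is abundant and demoted when it is not, which is the qualitative behavior the abstract describes. How the prior is actually estimated from mRNA concentration is left open here.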

Original language: English (US)
Pages (from-to): 1397-1403
Number of pages: 7
Journal: Bioinformatics
Volume: 25
Issue number: 11
DOI: https://doi.org/10.1093/bioinformatics/btp168
State: Published - Jun 2009

Fingerprint

Proteomics
Messenger RNA
Proteins
Experiments
Equally likely
Posterior Probability
Mass Spectrometry
Tandem Mass Spectrometry
Complex Mixtures
Prior Knowledge
Yeasts
Correlate
Escherichia coli
Error Rate

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability

Cite this

Ramakrishnan, S. R., Vogel, C., Prince, J. T., Li, Z., Penalva, L. O., Myers, M., ... Wang, R. (2009). Integrating shotgun proteomics and mRNA expression data to improve protein identification. Bioinformatics, 25(11), 1397-1403. https://doi.org/10.1093/bioinformatics/btp168

@article{134c0575dff44f648ddd4e7eed654729,
title = "Integrating shotgun proteomics and mRNA expression data to improve protein identification",
abstract = "Motivation: Tandem mass spectrometry (MS/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. In a typical shotgun proteomics experiment, it is assumed that all proteins are equally likely to be present. However, there is often other information available, e.g. the probability of a protein's presence is likely to correlate with its mRNA concentration. Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS/MS experiment at the same error rate, e.g. in yeast, MSpresso increases the number of proteins identified by ∼40{\%}. We apply MSpresso to data from different MS/MS instruments, experimental conditions and organisms (Escherichia coli, human), and predict 19-63{\%} more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun proteomics experiments can substantially improve protein identification scores.",
author = "Ramakrishnan, {Smriti R.} and Christine Vogel and Prince, {John T.} and Zhihua Li and Penalva, {Luiz O.} and Margaret Myers and Marcotte, {Edward M.} and Miranker, {Daniel P.} and Rong Wang",
year = "2009",
month = "6",
doi = "10.1093/bioinformatics/btp168",
language = "English (US)",
volume = "25",
pages = "1397--1403",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "11"
}

TY - JOUR
T1 - Integrating shotgun proteomics and mRNA expression data to improve protein identification
AU - Ramakrishnan, Smriti R.
AU - Vogel, Christine
AU - Prince, John T.
AU - Li, Zhihua
AU - Penalva, Luiz O.
AU - Myers, Margaret
AU - Marcotte, Edward M.
AU - Miranker, Daniel P.
AU - Wang, Rong
PY - 2009/6
Y1 - 2009/6
AB - Motivation: Tandem mass spectrometry (MS/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. In a typical shotgun proteomics experiment, it is assumed that all proteins are equally likely to be present. However, there is often other information available, e.g. the probability of a protein's presence is likely to correlate with its mRNA concentration. Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS/MS experiment at the same error rate, e.g. in yeast, MSpresso increases the number of proteins identified by ∼40%. We apply MSpresso to data from different MS/MS instruments, experimental conditions and organisms (Escherichia coli, human), and predict 19-63% more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun proteomics experiments can substantially improve protein identification scores.
UR - http://www.scopus.com/inward/record.url?scp=65649152557&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=65649152557&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btp168
DO - 10.1093/bioinformatics/btp168
M3 - Article
C2 - 19318424
AN - SCOPUS:65649152557
VL - 25
SP - 1397
EP - 1403
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 11
ER -