Statistical approach to protein quantification

Sarah Gerster, Taejoon Kwon, Christina Ludwig, Mariette Matondo, Christine Vogel, Edward M. Marcotte, Ruedi Aebersold, Peter Buhlmann

Research output: Contribution to journal › Article

Abstract

A major goal in proteomics is the comprehensive and accurate description of a proteome. This task includes not only the identification of proteins in a sample, but also the accurate quantification of their abundance. Although mass spectrometry typically provides information on peptide identity and abundance in a sample, it does not directly measure the concentration of the corresponding proteins. Specifically, most mass-spectrometry-based approaches (e.g. shotgun proteomics or selected reaction monitoring) allow one to quantify peptides using chromatographic peak intensities or spectral counting information. Ultimately, based on these measurements, one wants to infer the concentrations of the corresponding proteins. Inferring properties of the proteins based on experimental peptide evidence is often a complex problem because of the ambiguity of peptide assignments and different chemical properties of the peptides that affect the observed concentrations. We present SCAMPI, a novel generic and statistically sound framework for computing protein abundance scores based on quantified peptides. In contrast to most previous approaches, our model explicitly includes information from shared peptides to improve protein quantitation, especially in eukaryotes with many homologous sequences. The model accounts for uncertainty in the input data, leading to statistical prediction intervals for the protein scores. Furthermore, peptides with extreme abundances can be reassessed and classified as either regular data points or actual outliers. We used the proposed model with several datasets and compared its performance to that of other, previously used approaches for protein quantification in bottom-up mass spectrometry.

Original language: English (US)
Pages (from-to): 666-677
Number of pages: 12
Journal: Molecular and Cellular Proteomics
Volume: 13
Issue number: 2
DOIs: https://doi.org/10.1074/mcp.M112.025445
State: Published - Feb 2014

Fingerprint

Peptides
Proteins
Mass spectrometry
Proteomics
Proteome
Sequence Homology
Eukaryota
Chemical properties
Uncertainty
Monitoring

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Analytical Chemistry

Cite this

Gerster, S., Kwon, T., Ludwig, C., Matondo, M., Vogel, C., Marcotte, E. M., ... Buhlmann, P. (2014). Statistical approach to protein quantification. Molecular and Cellular Proteomics, 13(2), 666-677. https://doi.org/10.1074/mcp.M112.025445

Statistical approach to protein quantification. / Gerster, Sarah; Kwon, Taejoon; Ludwig, Christina; Matondo, Mariette; Vogel, Christine; Marcotte, Edward M.; Aebersold, Ruedi; Buhlmann, Peter.

In: Molecular and Cellular Proteomics, Vol. 13, No. 2, 02.2014, p. 666-677.

Gerster, S, Kwon, T, Ludwig, C, Matondo, M, Vogel, C, Marcotte, EM, Aebersold, R & Buhlmann, P 2014, 'Statistical approach to protein quantification', Molecular and Cellular Proteomics, vol. 13, no. 2, pp. 666-677. https://doi.org/10.1074/mcp.M112.025445

Gerster S, Kwon T, Ludwig C, Matondo M, Vogel C, Marcotte EM et al. Statistical approach to protein quantification. Molecular and Cellular Proteomics. 2014 Feb;13(2):666-677. https://doi.org/10.1074/mcp.M112.025445

Gerster, Sarah ; Kwon, Taejoon ; Ludwig, Christina ; Matondo, Mariette ; Vogel, Christine ; Marcotte, Edward M. ; Aebersold, Ruedi ; Buhlmann, Peter. / Statistical approach to protein quantification. In: Molecular and Cellular Proteomics. 2014 ; Vol. 13, No. 2. pp. 666-677.

@article{e44abeff577e4314b8576e0ff4168e23,
title = "Statistical approach to protein quantification",
author = "Sarah Gerster and Taejoon Kwon and Christina Ludwig and Mariette Matondo and Christine Vogel and Marcotte, {Edward M.} and Ruedi Aebersold and Peter Buhlmann",
year = "2014",
month = "2",
doi = "10.1074/mcp.M112.025445",
language = "English (US)",
volume = "13",
pages = "666--677",
journal = "Molecular and Cellular Proteomics",
issn = "1535-9476",
publisher = "American Society for Biochemistry and Molecular Biology Inc.",
number = "2",

}

TY - JOUR
T1 - Statistical approach to protein quantification
AU - Gerster, Sarah
AU - Kwon, Taejoon
AU - Ludwig, Christina
AU - Matondo, Mariette
AU - Vogel, Christine
AU - Marcotte, Edward M.
AU - Aebersold, Ruedi
AU - Buhlmann, Peter
PY - 2014/2
Y1 - 2014/2
UR - http://www.scopus.com/inward/record.url?scp=84893303320&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893303320&partnerID=8YFLogxK
U2 - 10.1074/mcp.M112.025445
DO - 10.1074/mcp.M112.025445
M3 - Article
VL - 13
SP - 666
EP - 677
JO - Molecular and Cellular Proteomics
JF - Molecular and Cellular Proteomics
SN - 1535-9476
IS - 2
ER -