Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets

Kevin Drew, Christian L. Müller, Richard Bonneau, Edward M. Marcotte

Research output: Contribution to journal › Article

Abstract

Determining the three dimensional arrangement of proteins in a complex is highly beneficial for uncovering mechanistic function and interpreting genetic variation in coding genes comprising protein complexes. There are several methods for determining co-complex interactions between proteins, among them co-fractionation / mass spectrometry (CF-MS), but it remains difficult to identify directly contacting subunits within a multi-protein complex. Correlation analysis of CF-MS profiles shows promise in detecting protein complexes as a whole but is limited in its ability to infer direct physical contacts among proteins in sub-complexes. To identify direct protein-protein contacts within human protein complexes we learn a sparse conditional dependency graph from approximately 3,000 CF-MS experiments on human cell lines. We show substantial performance gains in estimating direct interactions compared to correlation analysis on a benchmark of large protein complexes with solved three-dimensional structures. We demonstrate the method’s value in determining the three dimensional arrangement of proteins by making predictions for complexes without known structure (the exocyst and tRNA multi-synthetase complex) and by establishing evidence for the structural position of a recently discovered component of the core human EKC/KEOPS complex, GON7/C14ORF142, providing a more complete 3D model of the complex. Direct contact prediction provides easily calculable additional structural information for large-scale protein complex mapping studies and should be broadly applicable across organisms as more CF-MS datasets become available.
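The abstract's distinction between correlation and conditional dependence can be illustrated with a toy simulation. The sketch below is not the authors' method (which learns a sparse conditional dependency graph across thousands of proteins); it is a minimal three-protein example under assumed conditions: a hypothetical chain A–B–C in which A contacts B and B contacts C, simulated Gaussian "elution profiles", and the textbook first-order partial correlation formula. All function names and parameters here are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after conditioning on z (first-order formula)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_chain(n_samples=3000, noise=0.5, seed=0):
    """Hypothetical chain A-B-C: B depends on A, C depends only on B."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n_samples)]
    b = [ai + rng.gauss(0, noise) for ai in a]
    c = [bi + rng.gauss(0, noise) for bi in b]
    return a, b, c


a, b, c = simulate_chain()
r_ab, r_bc, r_ac = pearson(a, b), pearson(b, c), pearson(a, c)

# A and C never interact directly, yet their marginal correlation is high.
# Conditioning on B removes that indirect association.
pc_ac_given_b = partial_corr(r_ac, r_ab, r_bc)
# The direct A-B contact survives conditioning on C.
pc_ab_given_c = partial_corr(r_ab, r_ac, r_bc)

print(f"corr(A,C)           = {r_ac:.3f}")
print(f"partial corr(A,C|B) = {pc_ac_given_b:.3f}")
print(f"partial corr(A,B|C) = {pc_ab_given_c:.3f}")
```

In this toy setting the marginal correlation between A and C is high even though they share no direct edge, while their partial correlation given B is close to zero. Scaling the same idea to many proteins is what motivates a sparse estimate of the conditional dependency graph rather than pairwise formulas.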

Original language: English (US)
Article number: e1005625
Journal: PLoS Computational Biology
Volume: 13
Issue number: 10
DOI: 10.1371/journal.pcbi.1005625
State: Published - Oct 1 2017

Fingerprint

Proteomics
Protein Subunits
Direct contact
Contact
Proteins
Fractionation
Mass Spectrometry
Correlation Analysis
Datasets
Three-dimensional
Arrangement
Multiprotein complexes
Amino Acyl-tRNA Synthetases

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Ecology
  • Molecular Biology
  • Genetics
  • Cellular and Molecular Neuroscience
  • Computational Theory and Mathematics

Cite this

Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets. / Drew, Kevin; Müller, Christian L.; Bonneau, Richard; Marcotte, Edward M.

In: PLoS Computational Biology, Vol. 13, No. 10, e1005625, 01.10.2017.

@article{d2fd290884764d8094f38997e365a491,
title = "Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets",
abstract = "Determining the three dimensional arrangement of proteins in a complex is highly beneficial for uncovering mechanistic function and interpreting genetic variation in coding genes comprising protein complexes. There are several methods for determining co-complex interactions between proteins, among them co-fractionation / mass spectrometry (CF-MS), but it remains difficult to identify directly contacting subunits within a multi-protein complex. Correlation analysis of CF-MS profiles shows promise in detecting protein complexes as a whole but is limited in its ability to infer direct physical contacts among proteins in sub-complexes. To identify direct protein-protein contacts within human protein complexes we learn a sparse conditional dependency graph from approximately 3,000 CF-MS experiments on human cell lines. We show substantial performance gains in estimating direct interactions compared to correlation analysis on a benchmark of large protein complexes with solved three-dimensional structures. We demonstrate the method’s value in determining the three dimensional arrangement of proteins by making predictions for complexes without known structure (the exocyst and tRNA multi-synthetase complex) and by establishing evidence for the structural position of a recently discovered component of the core human EKC/KEOPS complex, GON7/C14ORF142, providing a more complete 3D model of the complex. Direct contact prediction provides easily calculable additional structural information for large-scale protein complex mapping studies and should be broadly applicable across organisms as more CF-MS datasets become available.",
author = "Drew, Kevin and M{\"u}ller, Christian L. and Bonneau, Richard and Marcotte, Edward M.",
year = "2017",
month = "10",
day = "1",
doi = "10.1371/journal.pcbi.1005625",
language = "English (US)",
volume = "13",
journal = "PLoS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "10",
pages = "e1005625",
}

TY - JOUR

T1 - Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets

AU - Drew, Kevin

AU - Müller, Christian L.

AU - Bonneau, Richard

AU - Marcotte, Edward M.

PY - 2017

Y1 - 2017/10/1

AB - Determining the three dimensional arrangement of proteins in a complex is highly beneficial for uncovering mechanistic function and interpreting genetic variation in coding genes comprising protein complexes. There are several methods for determining co-complex interactions between proteins, among them co-fractionation / mass spectrometry (CF-MS), but it remains difficult to identify directly contacting subunits within a multi-protein complex. Correlation analysis of CF-MS profiles shows promise in detecting protein complexes as a whole but is limited in its ability to infer direct physical contacts among proteins in sub-complexes. To identify direct protein-protein contacts within human protein complexes we learn a sparse conditional dependency graph from approximately 3,000 CF-MS experiments on human cell lines. We show substantial performance gains in estimating direct interactions compared to correlation analysis on a benchmark of large protein complexes with solved three-dimensional structures. We demonstrate the method’s value in determining the three dimensional arrangement of proteins by making predictions for complexes without known structure (the exocyst and tRNA multi-synthetase complex) and by establishing evidence for the structural position of a recently discovered component of the core human EKC/KEOPS complex, GON7/C14ORF142, providing a more complete 3D model of the complex. Direct contact prediction provides easily calculable additional structural information for large-scale protein complex mapping studies and should be broadly applicable across organisms as more CF-MS datasets become available.

UR - http://www.scopus.com/inward/record.url?scp=85031916838&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85031916838&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1005625

DO - 10.1371/journal.pcbi.1005625

M3 - Article

VL - 13

JO - PLoS Computational Biology

JF - PLoS Computational Biology

SN - 1553-734X

IS - 10

M1 - e1005625

ER -