Contact order and ab initio protein structure prediction

Richard Bonneau, Ingo Ruczinski, Jerry Tsai, David Baker

Research output: Contribution to journal › Article

Abstract

Although much of the motivation for experimental studies of protein folding is to obtain insights for improving protein structure prediction, there has been relatively little connection between experimental protein folding studies and computational structure prediction work in recent years. In the present study, we show that the relationship between protein folding rates and the contact order (CO) of the native structure has implications for ab initio protein structure prediction. Rosetta ab initio folding simulations produce a dearth of high CO structures and an excess of low CO structures, as expected if the computer simulations mimic to some extent the actual folding process. Consistent with this, the majority of failures in ab initio prediction in the CASP4 (critical assessment of structure prediction) experiment involved high CO structures likely to fold much more slowly than the lower CO structures for which reasonable predictions were made. This bias against high CO structures can be partially alleviated by performing large numbers of additional simulations, selecting out the higher CO structures, and eliminating the very low CO structures; this leads to a modest improvement in prediction quality. More significant improvements in predictions for proteins with complex topologies may be possible following significant increases in high-performance computing power, which will be required for thoroughly sampling high CO conformations (high CO proteins can take six orders of magnitude longer to fold than low CO proteins). Importantly for such a strategy, simulations performed for high CO structures converge much less strongly than those for low CO structures, and hence, lack of simulation convergence can indicate the need for improved sampling of high CO conformations.
The parallels between Rosetta simulations and folding in vivo may extend to misfolding: the very low CO structures that accumulate in Rosetta simulations consist primarily of local up-down β-sheets that may resemble precursors to amyloid formation.
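
The selection step described in the abstract (keep the higher CO structures, discard the very low CO ones) can be illustrated with a minimal sketch. The abstract does not give a formula or thresholds, so everything below is an assumption: it uses the commonly cited relative contact order, CO = (1 / (L * N)) * sum of |i - j| over the N contacting residue pairs of an L-residue chain, and it approximates contacts with a hypothetical 8 Å cutoff between per-residue coordinates (for example C-alpha atoms), ignoring pairs closer than 3 positions in sequence. The function names, the cutoff, the minimum separation, and the `min_co` threshold are illustrative only and are not taken from Rosetta or from the paper.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=3):
    """Relative contact order of one structure.

    coords: list of (x, y, z) tuples, one per residue (assumed C-alpha).
    cutoff and min_separation are illustrative assumptions, not values
    taken from the paper.
    """
    n_res = len(coords)
    n_contacts = 0
    total_separation = 0
    for i in range(n_res):
        for j in range(i + min_separation, n_res):
            if math.dist(coords[i], coords[j]) <= cutoff:
                n_contacts += 1
                total_separation += j - i
    if n_contacts == 0:
        return 0.0  # no contacts: treat as zero contact order
    return total_separation / (n_res * n_contacts)


def filter_by_contact_order(decoys, min_co):
    """Keep only decoys whose relative CO is at least min_co.

    decoys: list of coordinate lists; min_co is a hypothetical threshold.
    """
    return [d for d in decoys if relative_contact_order(d) >= min_co]


# Toy structures: a fully extended chain and a two-strand hairpin.
extended = [(i * 3.8, 0.0, 0.0) for i in range(10)]
hairpin = [(i * 3.8, 0.0, 0.0) for i in range(5)] + [
    ((9 - i) * 3.8, 5.0, 0.0) for i in range(5, 10)
]

kept = filter_by_contact_order([extended, hairpin], min_co=0.1)
```

With these toy coordinates the extended chain has no long-range contacts and is discarded, while the hairpin is kept. In practice the threshold would have to be chosen relative to the CO distribution of the simulated structures, which this sketch does not attempt.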

Original language: English (US)
Pages (from-to): 1937-1944
Number of pages: 8
Journal: Protein Science
Volume: 11
Issue number: 8
DOIs: https://doi.org/10.1110/ps.3790102
State: Published - 2002

Fingerprint

Protein Folding
Proteins
Computing Methodologies
Conformations
Amyloid
Computer Simulation
Sampling
Topology
Experiments

Keywords

  • Protein folding
  • Rosetta
  • Structure prediction

ASJC Scopus subject areas

  • Biochemistry

Cite this

Contact order and ab initio protein structure prediction. / Bonneau, Richard; Ruczinski, Ingo; Tsai, Jerry; Baker, David.

In: Protein Science, Vol. 11, No. 8, 2002, p. 1937-1944.

Bonneau, R, Ruczinski, I, Tsai, J & Baker, D 2002, 'Contact order and ab initio protein structure prediction', Protein Science, vol. 11, no. 8, pp. 1937-1944. https://doi.org/10.1110/ps.3790102
@article{e560cf6f629a42bfb70151524dbd843a,
title = "Contact order and ab initio protein structure prediction",
keywords = "Protein folding, Rosetta, Structure prediction",
author = "Richard Bonneau and Ingo Ruczinski and Jerry Tsai and David Baker",
year = "2002",
doi = "10.1110/ps.3790102",
language = "English (US)",
volume = "11",
pages = "1937--1944",
journal = "Protein Science",
issn = "0961-8368",
publisher = "Cold Spring Harbor Laboratory Press",
number = "8",

}

TY - JOUR

T1 - Contact order and ab initio protein structure prediction

AU - Bonneau, Richard

AU - Ruczinski, Ingo

AU - Tsai, Jerry

AU - Baker, David

PY - 2002

Y1 - 2002

KW - Protein folding

KW - Rosetta

KW - Structure prediction

UR - http://www.scopus.com/inward/record.url?scp=0036073603&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036073603&partnerID=8YFLogxK

U2 - 10.1110/ps.3790102

DO - 10.1110/ps.3790102

M3 - Article

VL - 11

SP - 1937

EP - 1944

JO - Protein Science

JF - Protein Science

SN - 0961-8368

IS - 8

ER -