Using least median of squares for structural superposition of flexible proteins

Yu Shen Liu, Yi Fang, Karthik Ramani

Research output: Contribution to journal › Article

Abstract

Background: Conventional superposition methods use an ordinary least squares (LS) fit to compare two different conformations of the same protein. The main problem with the LS fit is that it is sensitive to outliers, i.e. large displacements between the superimposed structures. Results: To overcome this problem, we present a new algorithm that superimposes two protein conformations by their atomic coordinates using a robust statistics technique: least median of squares (LMS). The forward search technique is used to approximate the LMS optimization effectively. Our algorithm can automatically detect and superimpose the rigid core regions of two conformations with small or large displacements. In contrast, most existing superposition techniques depend strongly on an initial LS estimate computed over the entire atom sets of the proteins, and they may fail to superimpose two conformations with large displacements. The presented LMS fit can be considered an alternative and complementary tool for structural superposition. Conclusion: The proposed algorithm is robust and does not require any prior knowledge of the flexible regions. Furthermore, we show that the LMS fit can be extended to multiple-level superposition between two conformations with several rigid domains. Our fit tool has produced successful superpositions when applied to proteins for which two conformations are known. The binary executable program for the Windows platform, tested examples, and the database are available from https://engineering.purdue.edu/PRECISE/LMSfit.
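The idea summarized in the abstract can be illustrated with a small sketch. This is not the authors' LMSfit program: it works in 2D so that the rigid fit has a closed form (a 3D version would typically use an SVD-based fit instead), it tries each window of consecutive points as a starting subset, and all names (fit_rigid_2d, lms_forward_search, start_size) and the toy coordinates are assumptions made for illustration. The forward search grows a small subset one point at a time, always keeping the points with the smallest residuals, until it covers just over half the points; the fit with the smallest median squared residual is kept.

```python
import math
from statistics import median


def fit_rigid_2d(src, dst):
    """Least-squares rotation + translation mapping src onto dst (2D points)."""
    n = len(src)
    cx = sum(p[0] for p in src) / n
    cy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py, qx, qy = px - cx, py - cy, qx - dx, qy - dy
        dot += px * qx + py * qy
        cross += px * qy - py * qx
    theta = math.atan2(cross, dot)  # optimal rotation angle in 2D
    c, s = math.cos(theta), math.sin(theta)
    return c, s, dx - (c * cx - s * cy), dy - (s * cx + c * cy)


def sq_residuals(src, dst, transform):
    """Squared distance of every transformed src point to its dst partner."""
    c, s, tx, ty = transform
    return [(c * px - s * py + tx - qx) ** 2 + (s * px + c * py + ty - qy) ** 2
            for (px, py), (qx, qy) in zip(src, dst)]


def lms_forward_search(src, dst, start_size=3):
    """Approximate the LMS fit: grow each start subset to just over half the points."""
    n = len(src)
    h = n // 2 + 1
    best = None
    for start in range(n - start_size + 1):
        subset = list(range(start, start + start_size))
        while True:
            transform = fit_rigid_2d([src[i] for i in subset],
                                     [dst[i] for i in subset])
            res = sq_residuals(src, dst, transform)
            if len(subset) >= h:
                break
            # Keep the points that currently fit best, one more than before.
            order = sorted(range(n), key=res.__getitem__)
            subset = order[:len(subset) + 1]
        score = median(res)  # LMS criterion: median squared residual
        if best is None or score < best[0]:
            best = (score, transform, res)
    return best


def rotate(p, deg, about=(0.0, 0.0)):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    x, y = p[0] - about[0], p[1] - about[1]
    return (c * x - s * y + about[0], s * x + c * y + about[1])


# Toy data (made up): an 8-point rigid core plus a 4-point arm that swings
# by 70 degrees about a hinge before the whole structure is moved rigidly.
core = [(0, 0), (1, 0), (2, 0.5), (3, 0), (3.5, 1), (2.5, 2), (1.5, 2.5), (0.5, 2)]
arm = [(4.5, 1.5), (5.5, 2), (6.5, 2), (7.5, 2.5)]
src = core + arm
moved = core + [rotate(p, 70, about=(3.5, 1.0)) for p in arm]
dst = [(x + 2.0, y - 1.0) for x, y in (rotate(p, 30) for p in moved)]

score, transform, residuals = lms_forward_search(src, dst)
ls_residuals = sq_residuals(src, dst, fit_rigid_2d(src, dst))
print("LMS worst core residual:", max(residuals[:8]))
print("LS  worst core residual:", max(ls_residuals[:8]))
```

In this toy case the robust fit lines up the core almost exactly and leaves large residuals only on the swinging arm, whereas the plain LS fit over all points spreads the error across the core as well. Points with large residuals under the robust fit are the candidates for flexible regions.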

Original language: English (US)
Article number: 29
Journal: BMC Bioinformatics
Volume: 10
DOI: 10.1186/1471-2105-10-29
State: Published - Jan 22 2009

Fingerprint

Least Median of Squares
Least-Squares Analysis
Conformation
Superposition
Conformations
Proteins
Protein
Large Displacements
Protein Conformation
Least Squares
Forward Search
Robust Statistics
Ordinary Least Squares
Prior Knowledge
Outlier
Overlap
Statistics
Entire
Binary
Engineering

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Using least median of squares for structural superposition of flexible proteins. / Liu, Yu Shen; Fang, Yi; Ramani, Karthik.

In: BMC Bioinformatics, Vol. 10, 29, 22.01.2009.

@article{58b6862cc4494fea9f6f1a7e01f8e3ec,
title = "Using least median of squares for structural superposition of flexible proteins",
abstract = "Background: Conventional superposition methods use an ordinary least squares (LS) fit to compare two different conformations of the same protein. The main problem with the LS fit is that it is sensitive to outliers, i.e. large displacements between the superimposed structures. Results: To overcome this problem, we present a new algorithm that superimposes two protein conformations by their atomic coordinates using a robust statistics technique: least median of squares (LMS). The forward search technique is used to approximate the LMS optimization effectively. Our algorithm can automatically detect and superimpose the rigid core regions of two conformations with small or large displacements. In contrast, most existing superposition techniques depend strongly on an initial LS estimate computed over the entire atom sets of the proteins, and they may fail to superimpose two conformations with large displacements. The presented LMS fit can be considered an alternative and complementary tool for structural superposition. Conclusion: The proposed algorithm is robust and does not require any prior knowledge of the flexible regions. Furthermore, we show that the LMS fit can be extended to multiple-level superposition between two conformations with several rigid domains. Our fit tool has produced successful superpositions when applied to proteins for which two conformations are known. The binary executable program for the Windows platform, tested examples, and the database are available from https://engineering.purdue.edu/PRECISE/LMSfit.",
author = "Liu, Yu Shen and Fang, Yi and Ramani, Karthik",
year = "2009",
month = "1",
day = "22",
doi = "10.1186/1471-2105-10-29",
language = "English (US)",
volume = "10",
pages = "29",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Using least median of squares for structural superposition of flexible proteins

AU - Liu, Yu Shen

AU - Fang, Yi

AU - Ramani, Karthik

PY - 2009/1/22

Y1 - 2009/1/22

N2 - Background: Conventional superposition methods use an ordinary least squares (LS) fit to compare two different conformations of the same protein. The main problem with the LS fit is that it is sensitive to outliers, i.e. large displacements between the superimposed structures. Results: To overcome this problem, we present a new algorithm that superimposes two protein conformations by their atomic coordinates using a robust statistics technique: least median of squares (LMS). The forward search technique is used to approximate the LMS optimization effectively. Our algorithm can automatically detect and superimpose the rigid core regions of two conformations with small or large displacements. In contrast, most existing superposition techniques depend strongly on an initial LS estimate computed over the entire atom sets of the proteins, and they may fail to superimpose two conformations with large displacements. The presented LMS fit can be considered an alternative and complementary tool for structural superposition. Conclusion: The proposed algorithm is robust and does not require any prior knowledge of the flexible regions. Furthermore, we show that the LMS fit can be extended to multiple-level superposition between two conformations with several rigid domains. Our fit tool has produced successful superpositions when applied to proteins for which two conformations are known. The binary executable program for the Windows platform, tested examples, and the database are available from https://engineering.purdue.edu/PRECISE/LMSfit.

UR - http://www.scopus.com/inward/record.url?scp=60649092330&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=60649092330&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-10-29

DO - 10.1186/1471-2105-10-29

M3 - Article

C2 - 19159484

AN - SCOPUS:60649092330

VL - 10

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 29

ER -