AntiClustal

Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

C. Di Pietro, V. Di Pietro, G. Emmanuele, A. Ferro, T. Maugeri, E. Modica, G. Pigola, A. Pulvirenti, M. Purrello, M. Ragusa, M. Scalia, D. Shasha, S. Travali, V. Zimmitti

Research output: Contribution to journal (Article)

Abstract

In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClustal. The method follows the commonly used idea of aligning homologous sequences belonging to classes generated by a clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment within each cluster uses progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S that minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to use a simple and natural algorithmic technique based on randomized tournaments, which has been successfully applied to large search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear-time 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running times with fully comparable alignment quality. A successful biological application showing high amino acid conservation during the evolution of Xenopus laevis SOD2 is also cited.
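The 1-median definition and the tournament idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names, the `group_size` parameter, and the toy Hamming distance are all assumptions made for illustration, and the paper's actual tournament rules and sequence distance may differ. The exact version compares every pair (quadratic in the number of sequences), while the tournament version repeatedly splits the survivors into small random groups and keeps only each group's local 1-median, so the total number of distance evaluations grows roughly linearly for a fixed group size.

```python
import random
from typing import Callable, List, Optional, Sequence

Distance = Callable[[str, str], float]


def hamming(a: str, b: str) -> int:
    """Toy distance for equal-length strings (an assumption for this sketch;
    a real MSA tool would use an alignment-based distance)."""
    return sum(x != y for x, y in zip(a, b))


def exact_1_median(items: Sequence[str], dist: Distance) -> str:
    """Return the element with the smallest total distance to all others.

    Minimizing the total is the same as minimizing the average, since the
    set size is fixed. Needs O(n^2) distance evaluations.
    """
    return min(items, key=lambda c: sum(dist(c, other) for other in items))


def approx_1_median(
    items: Sequence[str],
    dist: Distance,
    group_size: int = 3,  # hypothetical parameter, not taken from the paper
    rng: Optional[random.Random] = None,
) -> str:
    """Approximate the 1-median with a randomized tournament.

    Each round shuffles the survivors, splits them into groups of at most
    `group_size`, and promotes the exact 1-median of each group. The final
    small set is resolved exactly.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rng = rng or random.Random()
    survivors: List[str] = list(items)
    while len(survivors) > group_size:
        rng.shuffle(survivors)
        survivors = [
            exact_1_median(survivors[i:i + group_size], dist)
            for i in range(0, len(survivors), group_size)
        ]
    return exact_1_median(survivors, dist)


if __name__ == "__main__":
    seqs = ["AAAA", "AAAT", "AAAC", "TTTT"]
    print(exact_1_median(seqs, hamming))
    print(approx_1_median(seqs, hamming, rng=random.Random(0)))
```

The tournament result always belongs to the input set but is not guaranteed to equal the exact 1-median; that loss of accuracy is the price of the lower running time.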

Original language: English (US)
Pages (from-to): 326-336
Number of pages: 11
Journal: Proceedings / IEEE Computer Society Bioinformatics Conference. IEEE Computer Society Bioinformatics Conference.
Volume: 2
State: Published - 2003

Fingerprint

Sequence Alignment
Cluster Analysis
Xenopus laevis
Sequence Homology
Running

Cite this

AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation. / Di Pietro, C.; Di Pietro, V.; Emmanuele, G.; Ferro, A.; Maugeri, T.; Modica, E.; Pigola, G.; Pulvirenti, A.; Purrello, M.; Ragusa, M.; Scalia, M.; Shasha, D.; Travali, S.; Zimmitti, V.

In: Proceedings / IEEE Computer Society Bioinformatics Conference. IEEE Computer Society Bioinformatics Conference., Vol. 2, 2003, p. 326-336.

@article{2c7d78a11f684624955880385300de4b,
title = "AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.",
abstract = "In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClustal. The method follows the commonly used idea of aligning homologous sequences belonging to classes generated by a clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment within each cluster uses progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S that minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to use a simple and natural algorithmic technique based on randomized tournaments, which has been successfully applied to large search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear-time 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running times with fully comparable alignment quality. A successful biological application showing high amino acid conservation during the evolution of Xenopus laevis SOD2 is also cited.",
author = "{Di Pietro}, C. and {Di Pietro}, V. and G. Emmanuele and A. Ferro and T. Maugeri and E. Modica and G. Pigola and A. Pulvirenti and M. Purrello and M. Ragusa and M. Scalia and D. Shasha and S. Travali and V. Zimmitti",
year = "2003",
language = "English (US)",
volume = "2",
pages = "326--336",
journal = "Proceedings / IEEE Computer Society Bioinformatics Conference. IEEE Computer Society Bioinformatics Conference.",
issn = "1555-3930",

}

TY - JOUR

T1 - AntiClustal

T2 - Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

AU - Di Pietro, C.

AU - Di Pietro, V.

AU - Emmanuele, G.

AU - Ferro, A.

AU - Maugeri, T.

AU - Modica, E.

AU - Pigola, G.

AU - Pulvirenti, A.

AU - Purrello, M.

AU - Ragusa, M.

AU - Scalia, M.

AU - Shasha, D.

AU - Travali, S.

AU - Zimmitti, V.

PY - 2003

Y1 - 2003

AB - In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClustal. The method follows the commonly used idea of aligning homologous sequences belonging to classes generated by a clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment within each cluster uses progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S that minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to use a simple and natural algorithmic technique based on randomized tournaments, which has been successfully applied to large search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear-time 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running times with fully comparable alignment quality. A successful biological application showing high amino acid conservation during the evolution of Xenopus laevis SOD2 is also cited.

UR - http://www.scopus.com/inward/record.url?scp=33746838927&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33746838927&partnerID=8YFLogxK

M3 - Article

VL - 2

SP - 326

EP - 336

JO - Proceedings / IEEE Computer Society Bioinformatics Conference. IEEE Computer Society Bioinformatics Conference.

JF - Proceedings / IEEE Computer Society Bioinformatics Conference. IEEE Computer Society Bioinformatics Conference.

SN - 1555-3930

ER -