Fast and cheap genome wide haplotype construction via optical mapping

T. S. Anantharaman, V. Mysore, Bhubaneswar Mishra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We describe an efficient algorithm to construct genome wide haplotype restriction maps of an individual by aligning single molecule DNA fragments collected with Optical Mapping technology. Using this algorithm and a small amount of genomic material, we can construct the parental haplotypes for each diploid chromosome for any individual. Since such haplotype maps reveal the polymorphisms due to single nucleotide differences (SNPs) and small insertions and deletions (RFLPs), they are useful in association studies, studies involving genomic instabilities in cancer, and genetics, and yet incur relatively low cost and provide high throughput. If the underlying problem is formulated as a combinatorial optimization problem, it can be shown to be NP-complete (a special case of the K-population problem). But by effectively exploiting the structure of the underlying error processes and using a novel analog of the Baum-Welch algorithm for HMM models, we devise a probabilistic algorithm with a time complexity that is linear in the number of markers for an ε-approximate solution. The algorithms were tested by constructing the first genome wide haplotype restriction map of the microbe T. pseudonana, as well as constructing a haplotype restriction map of a 120 Mb region of Human chromosome 4. The frequency of false positives and false negatives was estimated using simulated data. The empirical results were found to be very promising.

Original language: English (US)
Title of host publication: Proceedings of the Pacific Symposium on Biocomputing 2005, PSB 2005
Pages: 385-396
Number of pages: 12
State: Published - 2005
Event: 10th Pacific Symposium on Biocomputing, PSB 2005 - Big Island of Hawaii, United States
Duration: Jan 4 2005 to Jan 8 2005

Other

Other: 10th Pacific Symposium on Biocomputing, PSB 2005
Country: United States
City: Big Island of Hawaii
Period: 1/4/05 to 1/8/05

Fingerprint

Genes
Chromosomes
Combinatorial optimization
Nucleotides
Polymorphism
DNA
Throughput
Molecules
Costs

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Biomedical Engineering

Cite this

Anantharaman, T. S., Mysore, V., & Mishra, B. (2005). Fast and cheap genome wide haplotype construction via optical mapping. In Proceedings of the Pacific Symposium on Biocomputing 2005, PSB 2005 (pp. 385-396).

@inproceedings{21968a753d764205a6c45cd13f8e96a2,
title = "Fast and cheap genome wide haplotype construction via optical mapping",
author = "Anantharaman, {T. S.} and V. Mysore and Bhubaneswar Mishra",
year = "2005",
language = "English (US)",
isbn = "9812560467",
pages = "385--396",
booktitle = "Proceedings of the Pacific Symposium on Biocomputing 2005, PSB 2005",

}

TY - GEN

T1 - Fast and cheap genome wide haplotype construction via optical mapping

AU - Anantharaman, T. S.

AU - Mysore, V.

AU - Mishra, Bhubaneswar

PY - 2005

Y1 - 2005

UR - http://www.scopus.com/inward/record.url?scp=15944371003&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=15944371003&partnerID=8YFLogxK

M3 - Conference contribution

C2 - 15759644

AN - SCOPUS:15944371003

SN - 9812560467

SN - 9789812560469

SP - 385

EP - 396

BT - Proceedings of the Pacific Symposium on Biocomputing 2005, PSB 2005

ER -