Jitterbug

Somatic and germline transposon insertion detection at single-nucleotide resolution

Elizabeth Henaff, Luís Zapata, Josep M. Casacuberta, Stephan Ossowski

Research output: Contribution to journal (Article)

Abstract

Background: Transposable elements are major players in genome evolution. Transposon insertion polymorphisms can translate into phenotypic differences in plants and animals and are linked to different diseases including human cancer, making their characterization highly relevant to the study of genome evolution and genetic diseases.

Results: Here we present Jitterbug, a novel tool that identifies transposable element insertion sites at single-nucleotide resolution based on the paired-end mapping and clipped-read signatures produced by NGS alignments. Jitterbug can be easily integrated into existing NGS analysis pipelines, using the standard BAM format produced by frequently applied alignment tools (e.g. bwa, bowtie2), with no need to realign reads to a set of consensus transposon sequences. Jitterbug is highly sensitive and able to recall transposon insertions with a very high specificity, as demonstrated by benchmarks in the human and Arabidopsis genomes, and validation using long PacBio reads. In addition, Jitterbug estimates the zygosity of transposon insertions with high accuracy and can also identify somatic insertions.

Conclusions: We demonstrate that Jitterbug can identify mosaic somatic transposon movement using sequenced tumor-normal sample pairs and allows for estimating the cancer cell fraction of clones containing a somatic TE insertion. We suggest that the independent methods we use to evaluate performance are a step towards creating a gold standard dataset for benchmarking structural variant prediction tools.
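To make the term "clipped-read signature" concrete, the following is a minimal sketch of how soft-clipped alignments can point to a breakpoint at single-nucleotide resolution. It is an illustration only and not Jitterbug's implementation: the function names `clip_breakpoints` and `candidate_sites`, the `min_support` threshold, and the toy reads are all assumptions introduced here. The only things taken as given are the SAM/BAM conventions that POS is the 1-based leftmost aligned position and that `S` in a CIGAR string marks soft-clipped bases.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_breakpoints(pos, cigar):
    """Return (side, reference_position) pairs for soft-clipped read ends.

    pos is the 1-based leftmost aligned position (SAM POS field).
    A "left" breakpoint is the first aligned base after a leading soft clip;
    a "right" breakpoint is the last aligned base before a trailing soft clip.
    """
    # Hard clips (H) carry no sequence, so ignore them when looking at read ends.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    found = []
    if ops and ops[0][1] == "S":
        found.append(("left", pos))
    # Operations that consume reference bases: M, D, N, = and X.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    if len(ops) > 1 and ops[-1][1] == "S":
        found.append(("right", pos + ref_len - 1))
    return found


def candidate_sites(reads, min_support=2):
    """Count clip breakpoints over (pos, cigar) pairs and keep well-supported ones.

    min_support is a hypothetical threshold chosen for this illustration.
    """
    counts = Counter()
    for pos, cigar in reads:
        counts.update(clip_breakpoints(pos, cigar))
    return {site: n for site, n in counts.items() if n >= min_support}


# Toy alignments (invented for illustration): (POS, CIGAR)
toy_reads = [
    (31, "70M30S"),   # aligned bases 31..100, then clipped
    (41, "60M40S"),   # aligned bases 41..100, then clipped
    (101, "25S75M"),  # clipped, then aligned from 101
    (101, "40S60M"),  # clipped, then aligned from 101
    (500, "100M"),    # fully aligned, no signature
]
sites = candidate_sites(toy_reads)
```

In this toy input, two reads are clipped after position 100 and two are clipped before position 101, so both positions pass the threshold. A real caller would additionally need to check that the clipped sequence and the discordant mates map to an annotated transposable element, which this sketch does not attempt.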

Original language: English (US)
Article number: 768
Journal: BMC Genomics
Volume: 16
Issue number: 1
DOI: 10.1186/s12864-015-1975-5
State: Published - Oct 12 2015

Fingerprint

Benchmarking
DNA Transposable Elements
Nucleotides
Genome
Neoplasms
Inborn Genetic Diseases
Consensus Sequence
Human Genome
Arabidopsis
Clone Cells
Datasets

Keywords

  • Cancer
  • Evolution
  • Genomics
  • Mobile elements
  • NGS
  • Somatic mutation
  • Structural variation
  • Transposons

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Jitterbug: Somatic and germline transposon insertion detection at single-nucleotide resolution. / Henaff, Elizabeth; Zapata, Luís; Casacuberta, Josep M.; Ossowski, Stephan.

In: BMC Genomics, Vol. 16, No. 1, 768, 12.10.2015.

Henaff, Elizabeth; Zapata, Luís; Casacuberta, Josep M.; Ossowski, Stephan. / Jitterbug: Somatic and germline transposon insertion detection at single-nucleotide resolution. In: BMC Genomics. 2015; Vol. 16, No. 1.
@article{c04820c855654381a97ea513adc16327,
title = "Jitterbug: Somatic and germline transposon insertion detection at single-nucleotide resolution",
abstract = "Background: Transposable elements are major players in genome evolution. Transposon insertion polymorphisms can translate into phenotypic differences in plants and animals and are linked to different diseases including human cancer, making their characterization highly relevant to the study of genome evolution and genetic diseases. Results: Here we present Jitterbug, a novel tool that identifies transposable element insertion sites at single-nucleotide resolution based on the paired-end mapping and clipped-read signatures produced by NGS alignments. Jitterbug can be easily integrated into existing NGS analysis pipelines, using the standard BAM format produced by frequently applied alignment tools (e.g. bwa, bowtie2), with no need to realign reads to a set of consensus transposon sequences. Jitterbug is highly sensitive and able to recall transposon insertions with a very high specificity, as demonstrated by benchmarks in the human and Arabidopsis genomes, and validation using long PacBio reads. In addition, Jitterbug estimates the zygosity of transposon insertions with high accuracy and can also identify somatic insertions. Conclusions: We demonstrate that Jitterbug can identify mosaic somatic transposon movement using sequenced tumor-normal sample pairs and allows for estimating the cancer cell fraction of clones containing a somatic TE insertion. We suggest that the independent methods we use to evaluate performance are a step towards creating a gold standard dataset for benchmarking structural variant prediction tools.",
keywords = "Cancer, Evolution, Genomics, Mobile elements, NGS, Somatic mutation, Structural variation, Transposons",
author = "Elizabeth Henaff and Lu{\'i}s Zapata and Casacuberta, {Josep M.} and Stephan Ossowski",
year = "2015",
month = "10",
day = "12",
doi = "10.1186/s12864-015-1975-5",
language = "English (US)",
volume = "16",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Jitterbug

T2 - Somatic and germline transposon insertion detection at single-nucleotide resolution

AU - Henaff, Elizabeth

AU - Zapata, Luís

AU - Casacuberta, Josep M.

AU - Ossowski, Stephan

PY - 2015/10/12

Y1 - 2015/10/12

AB - Background: Transposable elements are major players in genome evolution. Transposon insertion polymorphisms can translate into phenotypic differences in plants and animals and are linked to different diseases including human cancer, making their characterization highly relevant to the study of genome evolution and genetic diseases. Results: Here we present Jitterbug, a novel tool that identifies transposable element insertion sites at single-nucleotide resolution based on the paired-end mapping and clipped-read signatures produced by NGS alignments. Jitterbug can be easily integrated into existing NGS analysis pipelines, using the standard BAM format produced by frequently applied alignment tools (e.g. bwa, bowtie2), with no need to realign reads to a set of consensus transposon sequences. Jitterbug is highly sensitive and able to recall transposon insertions with a very high specificity, as demonstrated by benchmarks in the human and Arabidopsis genomes, and validation using long PacBio reads. In addition, Jitterbug estimates the zygosity of transposon insertions with high accuracy and can also identify somatic insertions. Conclusions: We demonstrate that Jitterbug can identify mosaic somatic transposon movement using sequenced tumor-normal sample pairs and allows for estimating the cancer cell fraction of clones containing a somatic TE insertion. We suggest that the independent methods we use to evaluate performance are a step towards creating a gold standard dataset for benchmarking structural variant prediction tools.

KW - Cancer

KW - Evolution

KW - Genomics

KW - Mobile elements

KW - NGS

KW - Somatic mutation

KW - Structural variation

KW - Transposons

UR - http://www.scopus.com/inward/record.url?scp=84959103969&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959103969&partnerID=8YFLogxK

U2 - 10.1186/s12864-015-1975-5

DO - 10.1186/s12864-015-1975-5

M3 - Article

VL - 16

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

IS - 1

M1 - 768

ER -