4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments

Ramya Raviram, Pedro P. Rocha, Christian L. Müller, Emily R. Miraldi, Sana Badri, Yi Fu, Emily Swanzey, Charlotte Proudhon, Valentina Snetkova, Richard Bonneau, Jane A. Skok

Research output: Contribution to journal (Article)

Abstract

4C-Seq has proven to be a powerful technique for identifying genome-wide interactions with a single locus of interest (or “bait”) that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the difference in signal resolution across the genome that results from differences in 3D distance from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it cuts the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis identify interactions either in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals at different length scales. In addition, some methods fail in experiments where chromatin fragments are generated using frequent-cutter restriction enzymes. Here, we describe 4C-ker, a Hidden Markov Model-based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-Seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis regions, and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with enzymes of different resolution.
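As a rough illustration of the adaptive-window idea mentioned in the abstract, the following Python sketch groups observed fragments into windows of k consecutive fragments, so that window width grows as coverage falls with distance from the bait. The function name `adaptive_windows`, the parameter `k`, and the toy coordinates are assumptions made for this example; the abstract does not specify how 4C-ker sizes its windows, so this should not be read as the authors' implementation.

```python
from typing import List, Tuple


def adaptive_windows(positions: List[int], k: int) -> List[Tuple[int, int]]:
    """Group fragment positions into windows of k observed fragments each.

    Hypothetical helper: each window runs from the first to the last of k
    consecutive observed fragments, so its genomic width adapts to local
    coverage (narrow where fragments are dense, wide where they are sparse).
    Trailing fragments that cannot fill a complete window are dropped.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(positions)
    windows = []
    for start in range(0, len(ordered) - k + 1, k):
        chunk = ordered[start:start + k]
        windows.append((chunk[0], chunk[-1]))
    return windows


# Toy coordinates (made up): dense near a bait at ~1000, sparse farther away.
observed = [900, 950, 1000, 1050, 1100, 1150, 5000, 9000, 20000]
wins = adaptive_windows(observed, k=3)
widths = [end - start for start, end in wins]
print(wins)    # [(900, 1000), (1050, 1150), (5000, 20000)]
print(widths)  # [100, 100, 15000]
```

In this toy input the two windows near the bait each span 100 bp, while the distal window spans 15,000 bp, which is the qualitative behavior one would want from coverage-adaptive windows.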

Original language: English (US)
Article number: e1004780
Journal: PLoS Computational Biology
Volume: 12
Issue number: 3
DOI: https://doi.org/10.1371/journal.pcbi.1004780
State: Published - Mar 1 2016

Fingerprint

bait
baits
Genome
genome
Genes
Enzymes
Interaction
enzyme
Experiment
experiment
Experiments
Locus
Pipelines
enzymes
Restriction
methodology
Gene Regulation
Chromatin
Hidden Markov models
Chromosomes

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Modeling and Simulation
  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Molecular Biology
  • Ecology
  • Cellular and Molecular Neuroscience

Cite this

Raviram, R., Rocha, P. P., Müller, C. L., Miraldi, E. R., Badri, S., Fu, Y., ... Skok, J. A. (2016). 4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments. PLoS Computational Biology, 12(3), [e1004780]. https://doi.org/10.1371/journal.pcbi.1004780

4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments. / Raviram, Ramya; Rocha, Pedro P.; Müller, Christian L.; Miraldi, Emily R.; Badri, Sana; Fu, Yi; Swanzey, Emily; Proudhon, Charlotte; Snetkova, Valentina; Bonneau, Richard; Skok, Jane A.

In: PLoS Computational Biology, Vol. 12, No. 3, e1004780, 01.03.2016.

Raviram, R, Rocha, PP, Müller, CL, Miraldi, ER, Badri, S, Fu, Y, Swanzey, E, Proudhon, C, Snetkova, V, Bonneau, R & Skok, JA 2016, '4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments', PLoS Computational Biology, vol. 12, no. 3, e1004780. https://doi.org/10.1371/journal.pcbi.1004780

Raviram, Ramya; Rocha, Pedro P.; Müller, Christian L.; Miraldi, Emily R.; Badri, Sana; Fu, Yi; Swanzey, Emily; Proudhon, Charlotte; Snetkova, Valentina; Bonneau, Richard; Skok, Jane A. / 4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments. In: PLoS Computational Biology. 2016; Vol. 12, No. 3.

@article{ed71fba132cc4860922451aad38762d2,
title = "4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments",
abstract = "4C-Seq has proven to be a powerful technique to identify genome-wide interactions with a single locus of interest (or “bait”) that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the differences in resolution of signal across the genome that result from differences in 3D distance separation from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it can cut the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis only identify interactions in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals of different length scales. In addition, some methods also fail in experiments where chromatin fragments are generated using frequent cutter restriction enzymes. Here, we describe 4C-ker, a Hidden-Markov Model based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with different resolution enzymes.",
author = "Ramya Raviram and Rocha, {Pedro P.} and M{\"u}ller, {Christian L.} and Miraldi, {Emily R.} and Sana Badri and Yi Fu and Emily Swanzey and Charlotte Proudhon and Valentina Snetkova and Richard Bonneau and Skok, {Jane A.}",
year = "2016",
month = "3",
day = "1",
doi = "10.1371/journal.pcbi.1004780",
language = "English (US)",
volume = "12",
journal = "PLoS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "3",

}

TY - JOUR
T1 - 4C-ker
T2 - A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments
AU - Raviram, Ramya
AU - Rocha, Pedro P.
AU - Müller, Christian L.
AU - Miraldi, Emily R.
AU - Badri, Sana
AU - Fu, Yi
AU - Swanzey, Emily
AU - Proudhon, Charlotte
AU - Snetkova, Valentina
AU - Bonneau, Richard
AU - Skok, Jane A.
PY - 2016/3/1
Y1 - 2016/3/1
N2 - 4C-Seq has proven to be a powerful technique to identify genome-wide interactions with a single locus of interest (or “bait”) that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the differences in resolution of signal across the genome that result from differences in 3D distance separation from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it can cut the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis only identify interactions in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals of different length scales. In addition, some methods also fail in experiments where chromatin fragments are generated using frequent cutter restriction enzymes. Here, we describe 4C-ker, a Hidden-Markov Model based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with different resolution enzymes.
UR - http://www.scopus.com/inward/record.url?scp=84962106630&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962106630&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1004780
DO - 10.1371/journal.pcbi.1004780
M3 - Article
C2 - 26938081
AN - SCOPUS:84962106630
VL - 12
JO - PLoS Computational Biology
JF - PLoS Computational Biology
SN - 1553-734X
IS - 3
M1 - e1004780
ER -