Mocap

Large-scale inference of transcription factor binding sites from chromatin accessibility

Xi Chen, Bowen Yu, Nicholas Carriero, Claudio Silva, Richard Bonneau

Research output: Contribution to journal (Article)

Abstract

Differential binding of transcription factors (TFs) at cis-regulatory loci drives the differentiation and function of diverse cellular lineages. Understanding the regulatory interactions that underlie cell fate decisions requires characterizing TF binding sites (TFBS) across multiple cell types and conditions. Techniques such as ChIP-seq can reveal genome-wide patterns of TF binding, but typically require laborious and costly experiments for each TF-cell-type (TFCT) condition of interest. Chromosomal accessibility assays can connect accessible chromatin in one cell type to many TFs through sequence motif mapping. Such methods, however, rarely take into account that the genomic context preferred by each factor differs from TF to TF, and from cell type to cell type. To address the differences in TF behaviors, we developed Mocap, a method that integrates chromatin accessibility, motif scores, TF footprints, CpG/GC content, evolutionary conservation and other factors in an ensemble of TFCT-specific classifiers. We show that integration of genomic features, such as CpG islands, improves TFBS prediction in some TFCTs. Further, we describe a method for mapping new TFCTs, for which no ChIP-seq data exist, onto our ensemble of classifiers and show that our cross-sample TFBS prediction method outperforms several previously described methods.
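One of the sequence features named in the abstract, CpG/GC content, can be illustrated with a short Python sketch. This is not Mocap's implementation: the function names and the example sequences are hypothetical, and the observed/expected CpG ratio uses the commonly cited definition (number of CG dinucleotides times sequence length, divided by the product of the C and G counts), which is assumed here rather than taken from the paper.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, since the ratio
    is undefined in that case (an assumption made for this sketch).
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    # Hypothetical window around a candidate motif site.
    window = "ACGTCGCGAATT"
    print(gc_content(window), cpg_obs_exp(window))
```

In a feature table for a classifier, values like these would typically be computed per candidate site over a fixed-width window; the window width is a modeling choice that this sketch does not fix.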

Original language: English (US)
Pages (from-to): 4315-4329
Number of pages: 15
Journal: Nucleic Acids Research
Volume: 45
Issue number: 8
DOIs: 10.1093/nar/gkx174
State: Published - 2017

Fingerprint

Chromatin
Transcription Factors
Binding Sites
CpG Islands
Base Composition
Cell Communication
Genome

ASJC Scopus subject areas

  • Genetics

Cite this

Mocap: Large-scale inference of transcription factor binding sites from chromatin accessibility. / Chen, Xi; Yu, Bowen; Carriero, Nicholas; Silva, Claudio; Bonneau, Richard.

In: Nucleic Acids Research, Vol. 45, No. 8, 2017, p. 4315-4329.


@article{eefa399262b446dfac43d48182a889b4,
title = "Mocap: Large-scale inference of transcription factor binding sites from chromatin accessibility",
abstract = "Differential binding of transcription factors (TFs) at cis-regulatory loci drives the differentiation and function of diverse cellular lineages. Understanding the regulatory interactions that underlie cell fate decisions requires characterizing TF binding sites (TFBS) across multiple cell types and conditions. Techniques such as ChIP-seq can reveal genome-wide patterns of TF binding, but typically require laborious and costly experiments for each TF-cell-type (TFCT) condition of interest. Chromosomal accessibility assays can connect accessible chromatin in one cell type to many TFs through sequence motif mapping. Such methods, however, rarely take into account that the genomic context preferred by each factor differs from TF to TF, and from cell type to cell type. To address the differences in TF behaviors, we developed Mocap, a method that integrates chromatin accessibility, motif scores, TF footprints, CpG/GC content, evolutionary conservation and other factors in an ensemble of TFCT-specific classifiers. We show that integration of genomic features, such as CpG islands, improves TFBS prediction in some TFCTs. Further, we describe a method for mapping new TFCTs, for which no ChIP-seq data exist, onto our ensemble of classifiers and show that our cross-sample TFBS prediction method outperforms several previously described methods.",
author = "Xi Chen and Bowen Yu and Nicholas Carriero and Claudio Silva and Richard Bonneau",
year = "2017",
doi = "10.1093/nar/gkx174",
language = "English (US)",
volume = "45",
pages = "4315--4329",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "8",

}

TY - JOUR

T1 - Mocap

T2 - Large-scale inference of transcription factor binding sites from chromatin accessibility

AU - Chen, Xi

AU - Yu, Bowen

AU - Carriero, Nicholas

AU - Silva, Claudio

AU - Bonneau, Richard

PY - 2017

Y1 - 2017

AB - Differential binding of transcription factors (TFs) at cis-regulatory loci drives the differentiation and function of diverse cellular lineages. Understanding the regulatory interactions that underlie cell fate decisions requires characterizing TF binding sites (TFBS) across multiple cell types and conditions. Techniques such as ChIP-seq can reveal genome-wide patterns of TF binding, but typically require laborious and costly experiments for each TF-cell-type (TFCT) condition of interest. Chromosomal accessibility assays can connect accessible chromatin in one cell type to many TFs through sequence motif mapping. Such methods, however, rarely take into account that the genomic context preferred by each factor differs from TF to TF, and from cell type to cell type. To address the differences in TF behaviors, we developed Mocap, a method that integrates chromatin accessibility, motif scores, TF footprints, CpG/GC content, evolutionary conservation and other factors in an ensemble of TFCT-specific classifiers. We show that integration of genomic features, such as CpG islands, improves TFBS prediction in some TFCTs. Further, we describe a method for mapping new TFCTs, for which no ChIP-seq data exist, onto our ensemble of classifiers and show that our cross-sample TFBS prediction method outperforms several previously described methods.

UR - http://www.scopus.com/inward/record.url?scp=85020166402&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85020166402&partnerID=8YFLogxK

U2 - 10.1093/nar/gkx174

DO - 10.1093/nar/gkx174

M3 - Article

VL - 45

SP - 4315

EP - 4329

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 8

ER -