Introduction

Detecting in vivo transcription factor (TF) binding is important for understanding gene regulatory circuitry. ChIP-seq is a powerful technique for empirically defining TF binding in vivo, but the multitude of distinct TFs makes genome-wide profiling of all of them labor-intensive and costly. Algorithms for in silico prediction of TF binding have been developed, based mostly on histone modification or DNase I hypersensitivity data in conjunction with DNA motifs and other genomic features. However, technical limitations prevent these methods from being applied broadly, especially in clinical settings. We conducted a comprehensive survey involving multiple cell lines, TFs, and methylation types and found a close relationship between TF binding and changes in methylation level around the binding sites. Exploiting this connection between DNA methylation and TF binding, we propose a novel supervised learning approach that predicts TF-DNA interaction using data from base-resolution whole-genome methylation sequencing experiments. We devised beta-binomial models to characterize methylation data around TF binding sites and in the background. Combining these models with other static genomic features, we adopted a random forest framework to predict TF-DNA interaction. In comprehensive tests, the proposed method accurately predicted TF binding and compared favorably with competing methods.
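The beta-binomial idea mentioned above can be illustrated with a small sketch. This is not the tool's own code (the tool itself is listed as an R application), and the way the two models are combined here, as a log-likelihood ratio summed over CpG sites, is an assumption made for illustration. The function names and the shape parameters (`bound_params`, `background_params`) are hypothetical; in practice such parameters would be estimated from training data, and the resulting score would be one feature among several passed to a classifier such as a random forest.

```python
from math import lgamma, exp


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def betabinom_logpmf(k, n, alpha, beta):
    """Log P(k methylated reads out of n) under Beta-Binomial(alpha, beta)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return (log_choose(n, k)
            + log_beta(k + alpha, n - k + beta)
            - log_beta(alpha, beta))


def log_likelihood_ratio(sites, bound_params, background_params):
    """Sum over CpG sites of log P(site | bound) - log P(site | background).

    sites: list of (methylated_reads, total_reads) pairs near a candidate site.
    bound_params, background_params: (alpha, beta) tuples; hypothetical values.
    """
    total = 0.0
    for k, n in sites:
        total += betabinom_logpmf(k, n, *bound_params)
        total -= betabinom_logpmf(k, n, *background_params)
    return total


# Hypothetical parameters: low methylation near bound sites (mean 0.2),
# high methylation in the background (mean 0.8).
bound_params = (1.0, 4.0)
background_params = (4.0, 1.0)

# Hypothetical read counts at three CpG sites: (methylated, total).
low_methylation_sites = [(1, 10), (0, 8), (2, 12)]
score = log_likelihood_ratio(low_methylation_sites, bound_params, background_params)
```

Under these assumed parameters a positive `score` means the observed counts are better explained by the "bound" model than by the background model. A beta-binomial is a natural choice for read-count data because it accounts for both sequencing depth and site-to-site variability in the underlying methylation level.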

Publications

  1. Base-resolution methylation patterns accurately predict transcription factor bindings in vivo.
    Xu T, Li B, Zhao M, Szulwach KE, Street RC, Lin L, Yao B, Zhang F, Jin P, Wu H, Qin ZS. Nucleic Acids Research, 2015-03-01.

Credits

  1. Tianlei Xu
    Developer

    Department of Mathematics and Computer Science, Emory University, United States of America

  2. Ben Li
    Developer

    Department of Biostatistics and Bioinformatics, Rollins School of Public Health, United States of America

  3. Meng Zhao
    Developer

    Department of Biostatistics and Bioinformatics, Rollins School of Public Health, United States of America

  4. Keith E Szulwach
    Developer

    Department of Human Genetics, Emory University, United States of America

  5. R Craig Street
    Developer

    Department of Human Genetics, Emory University, United States of America

  6. Li Lin
    Developer

    Department of Human Genetics, Emory University, United States of America

  7. Bing Yao
    Developer

    Department of Human Genetics, Emory University, United States of America

  8. Feiran Zhang
    Developer

    Department of Human Genetics, Emory University, United States of America

  9. Peng Jin
    Developer

    Department of Human Genetics, Emory University, United States of America

  10. Hao Wu
    Developer

    Department of Biostatistics and Bioinformatics, Rollins School of Public Health, United States of America

  11. Zhaohui S Qin
    Investigator

    Department of Biostatistics and Bioinformatics, Rollins School of Public Health, United States of America

Summary

Accession: BT000955
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies: R
User Interface: Terminal Command Line
Download Count: 0
Country/Region: United States of America
Submitted By: Zhaohui S Qin