Introduction

Sequence discovery tools play a central role in several fields of computational biology. In transcription factor binding studies, most existing motif finding algorithms are computationally demanding and may not scale to the increasingly large datasets produced by modern high-throughput sequencing technologies.

We present FastMotif, a new motif discovery algorithm built on a recent machine learning technique known as the Method of Moments. Because it is based on spectral decompositions, the method is robust to model misspecification and is not prone to locally optimal solutions. The resulting algorithm is extremely fast and is designed for the analysis of big sequencing data. On HT-SELEX data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but runs one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm's robustness and discuss its sensitivity with respect to the free parameters.

The MATLAB code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics.

Contact: vlassis@adobe.com

Supplementary data are available at Bioinformatics online.
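To give a rough feel for what "moments" and "spectral decomposition" mean for sequence data, the sketch below one-hot encodes a few DNA sequences, computes their empirical first- and second-order moments, and extracts the leading eigenvector of the covariance matrix by power iteration from a fixed starting vector. It is a generic illustration under simplifying assumptions, not the FastMotif algorithm: the function names (`one_hot`, `moments`, `top_eigenpair`) and the toy sequences are hypothetical, and the released tool is written in MATLAB, not Python.

```python
# Illustrative sketch only: generic moment estimation plus a spectral step.
# This is NOT the FastMotif algorithm; all names here are hypothetical.

ALPHABET = "ACGT"


def one_hot(seq):
    """Flatten a DNA string into a 4*len(seq) vector of 0/1 indicators."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == letter else 0.0 for letter in ALPHABET)
    return vec


def moments(seqs):
    """Return the empirical mean vector and covariance matrix of the encodings."""
    X = [one_hot(s) for s in seqs]
    n, d = len(X), len(X[0])
    mean = [sum(x[j] for x in X) / n for j in range(d)]
    cov = [
        [sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in X) / n for j in range(d)]
        for i in range(d)
    ]
    return mean, cov


def top_eigenpair(M, iters=100):
    """Leading eigenpair by power iteration from a fixed (non-random) start."""
    d = len(M)
    v = [float(j + 1) for j in range(d)]
    norm = sum(c * c for c in v) ** 0.5
    v = [c / norm for c in v]
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:  # start vector lies in the null space of M
            break
        v = [c / norm for c in w]
    Mv = [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigval = sum(v[i] * Mv[i] for i in range(d))  # Rayleigh quotient
    return eigval, v


# Toy data: two made-up sequence patterns in a 2:1 ratio.
seqs = ["ACGT"] * 4 + ["TTGA"] * 2
mean, cov = moments(seqs)
eigval, eigvec = top_eigenpair(cov)
```

In this toy case the mean vector holds the per-position base frequencies, and the leading eigenvector has opposite signs on the positions where the two patterns differ and is zero where they agree. Because the starting vector is fixed, repeated runs give the same result, which is the sense in which a spectral step avoids dependence on random initialization.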

Publications

  1. FastMotif: spectral sequence motif discovery.
    Colombo N, Vlassis N, 2015-08-01 - Bioinformatics (Oxford, England)

Credits

  1. Nicoló Colombo
    Developer

    Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg

  2. Nikos Vlassis
    Investigator

    Adobe Research, San Jose, United States of America

Summary

  Accession: BT006859
  Tool Type: Application
  Category:
  Platforms: Linux/Unix
  Technologies:
  User Interface: Terminal Command Line
  Download Count: 0
  Country/Region: United States of America
  Submitted By: Nikos Vlassis