HeliCis: a DNA motif discovery tool for colocalized motif pairs with periodic spacing.

Erik Larsson, Per Lindahl, Petter Mostad
Author Information
  1. Erik Larsson: Wallenberg Laboratory for Cardiovascular Research, Bruna Stråket 16, Sahlgrenska University Hospital, SE-413 45 Göteborg, SWEDEN. erik.larsson@wlab.gu.se

Abstract

BACKGROUND: Correct temporal and spatial gene expression during metazoan development relies on combinatorial interactions between different transcription factors. As a consequence, cis-regulatory elements often colocalize in clusters termed cis-regulatory modules. These modules may impose requirements on organizational features such as spacing, order and helical phasing (periodic spacing) between binding sites. Because the DNA helix turns, a small change in the distance between a pair of sites may sometimes drastically disrupt function, while insertion of a full helical turn of DNA (10-11 bp) between cis elements may restore it. De novo motif discovery methods that incorporate organizational properties such as colocalization and order preferences have recently been developed, but no existing tool incorporates periodic spacing into the model.
RESULTS: We have developed a web-based motif discovery tool, HeliCis, whose flexible model allows de novo detection of motifs with periodic spacing. Depending on the parameter settings, it may also be used to discover colocalized motifs without periodicity, or motifs separated by a fixed gap of known or unknown length. We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides a simple-to-use web interface that interactively visualizes the current settings, making the parameters and the model structure easy to understand.
CONCLUSION: HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing. Our evaluations show that it can detect weak periodic patterns that are not easily discovered using a sequential approach, i.e., first finding the binding sites and then analyzing the properties of their pairwise distances.
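
The helical phasing argument in the background can be illustrated with a little arithmetic. The following is a minimal sketch, not the HeliCis model: it assumes a helical period of 10.5 bp (the abstract states only 10-11 bp), and the function name same_face_score is hypothetical. It scores a pair of sites by the cosine of their relative rotation around the helix, so that distances differing by a whole turn receive the same score.

```python
import math

# Assumed helical period in base pairs; the abstract gives only the range 10-11 bp.
HELICAL_PERIOD_BP = 10.5


def same_face_score(distance_bp, period=HELICAL_PERIOD_BP):
    """Cosine of the rotation between two sites separated by distance_bp.

    Values near 1.0 mean the sites lie on the same face of the helix,
    values near -1.0 mean they lie on opposite faces.
    """
    return math.cos(2.0 * math.pi * distance_bp / period)


if __name__ == "__main__":
    # 21 bp is two full turns; 26 bp is roughly two and a half turns;
    # 31.5 bp adds one full turn to 21 bp.
    for distance in (21, 26, 31.5):
        print(distance, round(same_face_score(distance), 2))
```

Under this toy scoring, shifting a site by about half a turn flips the score from near 1 to near -1, while adding a full turn leaves it unchanged, which mirrors the disruption and restoration of function described above.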

MeSH Terms

Algorithms
Amino Acid Motifs
Animals
Base Sequence
Binding Sites
Computer Simulation
DNA
Databases, Genetic
Models, Genetic
Models, Structural
Pattern Recognition, Automated
Regulatory Elements, Transcriptional
Regulatory Sequences, Nucleic Acid
Sensitivity and Specificity
Sequence Analysis, DNA
Software

Chemicals

DNA
