The SiteSeeker motif discovery tool.

Klaus Ecker, Jens Lichtenberg, Lonnie Welch
Author Information
  1. Klaus Ecker: Russ College of Engineering and Technology, Ohio University, Athens, Ohio, USA. ecker@ohio.edu

Abstract

This paper describes conditions of use for a recently published tool that offers two basic functions for the classical problem of discovering motifs in a set of promoter sequences. The first function, CHECKPROMOTER, assumes that not necessarily all of the sequences possess a common motif of a given length l; in this case it allows exact identification of maximal subsets of related promoters. Its purpose is to recognize putatively co-regulated genes. The second function, CHECKMOTIF, checks whether the given promoters have a common motif. It uses a fast approximation algorithm for which we were able to derive non-trivial, low performance bounds for the computed outputs, where a performance bound is defined as the ratio of the Hamming distance of the obtained solution to that of a theoretically best solution. Both programs use a novel weighted Hamming distance paradigm for evaluating the similarity of sets of l-mers, and performance bounds can be computed for the proposed motifs. A set of Arabidopsis thaliana (At) promoters was used as a benchmark in a comparative test against five known tools. The test verified that SiteSeeker significantly outperformed these tools.
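The performance ratio mentioned in the abstract can be illustrated with a small sketch. This is not SiteSeeker's code: the abstract does not define the weighted Hamming distance, so the per-position weights, the column-wise consensus scoring, the exhaustive search for the best solution, and all function names below are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import product


def weighted_hamming(a, b, weights=None):
    """Sum of the weights of positions where a and b differ.

    ASSUMPTION: per-position weights; with weights=None this is the
    ordinary Hamming distance.
    """
    if len(a) != len(b):
        raise ValueError("l-mers must have equal length")
    if weights is None:
        weights = [1.0] * len(a)
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def consensus(lmers):
    """Column-wise majority string of equally long l-mers."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*lmers))


def total_distance(lmers, weights=None):
    """Total distance of a set of l-mers to their own consensus (assumed score)."""
    c = consensus(lmers)
    return sum(weighted_hamming(m, c, weights) for m in lmers)


def best_total_distance(promoters, l, weights=None):
    """Exhaustive reference optimum: one l-mer per promoter.

    Exponential in the number of promoters, so only usable on toy inputs.
    """
    windows = [[p[i:i + l] for i in range(len(p) - l + 1)] for p in promoters]
    return min(total_distance(choice, weights) for choice in product(*windows))


def performance_ratio(obtained, best):
    """Ratio of the obtained distance to the best distance (1.0 is optimal)."""
    if best == 0:
        return 1.0 if obtained == 0 else float("inf")
    return obtained / best


if __name__ == "__main__":
    promoters = ["ACGTAA", "TTACGA", "ACGTTT"]  # toy sequences, not real data
    best = best_total_distance(promoters, 4)
    candidate = total_distance(["ACGT", "TACG", "ACGT"])
    print(performance_ratio(candidate, best))
```

A ratio close to 1 means the approximate solution is nearly as good as the best one. A derived bound on this ratio would make the exhaustive search in `best_total_distance` unnecessary, which matters because that search is infeasible for realistic promoter sets.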

MeSH Term

Algorithms
Arabidopsis
Computational Biology
Gene Expression Regulation, Plant
Promoter Regions, Genetic
Regulatory Sequences, Nucleic Acid

