Introduction

BACKGROUND: Horizontal gene transfer (HGT) is considered a strong evolutionary force that substantially shapes the content of microbial genomes. What distinguishes HGT from gene genesis, duplication, or mutation is its speed, which enables rapid adaptation to changing environmental demands. Precise characterization requires algorithms that identify transfer events with high reliability. The transferred pieces of DNA are frequently long and comprise several genes; they are called genomic islands (GIs) or, more specifically, pathogenicity or symbiotic islands.

RESULTS: We have implemented the program SIGI-HMM, which predicts GIs and the putative donor of each individual alien gene. It is based on an analysis of the codon usage (CU) of each gene in the genome under study. The CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors, and to mask putatively highly expressed genes. In this way, we determine the states and emission probabilities of an inhomogeneous hidden Markov model that works at the gene level. For the transition probabilities, we draw on classical test theory in order to integrate a sensitivity controller in a consistent manner. SIGI-HMM was written in Java and is publicly available. It accepts as input any file in EMBL format and generates output in the common GFF format, which genome browsers can read. Benchmark tests showed that the output of SIGI-HMM agrees with known findings: its predictions were consistent both with annotated GIs and with predictions generated by other methods.

CONCLUSION: SIGI-HMM is a sensitive tool for identifying GIs in microbial genomes. It allows users to analyze genomes interactively and in detail, and to generate or test hypotheses about the origin of acquired genes.
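The codon usage comparison described in the abstract can be illustrated with a small Python sketch. This is not the SIGI-HMM scoring scheme, which is not specified here. It is a simplified log-odds comparison of one gene against a single host table and a single donor table, with no masking of highly expressed genes and no hidden Markov model. The function names (`codon_counts`, `codon_frequencies`, `log_odds_score`), the pseudocount, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter

# All 64 codons over the DNA alphabet.
CODONS = [a + b + c for a in "ACGT" for b in "ACGT" for c in "ACGT"]


def codon_counts(cds):
    """Count codons in a coding sequence; trailing bases that do not
    fill a complete codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(counts, pseudocount=1.0):
    """Convert codon counts into relative frequencies over all 64 codons.
    The pseudocount (an assumption of this sketch) avoids zero frequencies."""
    total = sum(counts.get(c, 0) for c in CODONS) + pseudocount * len(CODONS)
    return {c: (counts.get(c, 0) + pseudocount) / total for c in CODONS}


def log_odds_score(cds, host_freq, donor_freq):
    """Sum log(donor / host) over the codons of one gene.
    A positive score means the gene's codon usage resembles the donor
    table more than the host table."""
    score = 0.0
    for codon, n in codon_counts(cds).items():
        # Skip codons containing ambiguous bases such as N.
        if codon in host_freq and codon in donor_freq:
            score += n * math.log(donor_freq[codon] / host_freq[codon])
    return score


# Toy tables built from made-up sequences (hypothetical data).
host_table = codon_frequencies(codon_counts("GCTGCTGCTGCTAAAAAA"))
donor_table = codon_frequencies(codon_counts("GCCGCCGCCGCCAAGAAG"))

gene = "GCCGCCAAG"
score = log_odds_score(gene, host_table, donor_table)
print("donor-like" if score > 0 else "host-like")
```

In this toy setup the gene uses only codons that are frequent in the donor table, so its score is positive. A gene-level score of this kind is only the raw ingredient; deciding which neighboring genes form an island is the part the abstract assigns to the hidden Markov model.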

Publications

  1. Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models.
    Waack S, Keller O, Asper R, Brodag T, Damm C, Fricke WF, Surovcik K, Meinicke P, Merkl R. BMC Bioinformatics, 2006-01-01.

Credits

  1. Stephan Waack
    Developer

  2. Oliver Keller
    Developer

  3. Roman Asper
    Developer

  4. Thomas Brodag
    Developer

  5. Carsten Damm
    Developer

  6. Wolfgang Florian Fricke
    Developer

  7. Katharina Surovcik
    Developer

  8. Peter Meinicke
    Developer

  9. Rainer Merkl
    Investigator

Summary

  Accession: BT001709
  Tool Type: Application
  Category:
  Platforms: Linux/Unix
  Technologies: Perl
  User Interface: Terminal Command Line
  Download Count: 0
  Submitted By: Rainer Merkl