Flexibility and symmetry of prokaryotic genome rearrangement reveal lineage-associated core-gene-defined genome organizational frameworks.

Yu Kang, Chaohao Gu, Lina Yuan, Yue Wang, Yanmin Zhu, Xinna Li, Qibin Luo, Jingfa Xiao, Daquan Jiang, Minping Qian, Aftab Ahmed Khan, Fei Chen, Zhang Zhang, Jun Yu
Author Information
  1. Yu Kang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  2. Chaohao Gu: College of Computer Science, Sichuan University, Chengdu, People's Republic of China.
  3. Lina Yuan: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  4. Yue Wang: LMAM, School of Mathematical Sciences, Peking University, Beijing, People's Republic of China.
  5. Yanmin Zhu: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  6. Xinna Li: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  7. Qibin Luo: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  8. Jingfa Xiao: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  9. Aftab Ahmed Khan: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  10. Fei Chen: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China.
  11. Zhang Zhang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China. Correspondence: junyu@big.ac.cn, zhangzhang@big.ac.cn.
  12. Jun Yu: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China. Correspondence: junyu@big.ac.cn, zhangzhang@big.ac.cn.

Abstract

The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, although generally assumed to be stable under selection, is frequently interrupted by horizontal gene transfer and rearrangement, and how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Using data from 425 genomes of 30 species spanning six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegmental cGOFs were further classified as symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, whereas those of the Gram-negative bacteria are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOFs provides efficient indexes for scaffold orientation, as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analyses.
IMPORTANCE: Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. It is important to know whether a given species has a set of genes that is both conserved in position among isolates and functionally essential, and to evaluate the stability or flexibility of such genome structures across lineages. Based on a large body of multi-isolate pangenomic data, our analysis reveals that a subset of core genes is organized into a core-gene-defined genome organizational framework, or cGOF. Furthermore, the lineage-associated cGOFs of Gram-positive and Gram-negative bacteria behave differently: the former, composed of 2 to 4 segments, have their fragments symmetrically rearranged around the origin-terminus axis, whereas the latter show more complex segmentation and are partitioned asymmetrically into chromosomal structures. The definition of cGOFs provides new insights into prokaryotic genome organization and efficient guidance for genome assembly and analysis.
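
The grouping of core genes into syntenic blocks described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes each isolate is given as a circular list of single-copy gene identifiers, treats a gene as core if it occurs in every isolate, ignores strand, and calls two core genes syntenic when they are neighbors in all isolates. The function names `adjacency_set` and `syntenic_blocks` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def adjacency_set(order: List[str], core: Set[str]) -> Set[FrozenSet[str]]:
    """Unordered neighbor pairs of core genes on one circular chromosome.

    Non-core (dispensable) genes are dropped first, so an insertion between
    two core genes does not break their adjacency. Assumption: single-copy genes.
    """
    genes = [g for g in order if g in core]
    n = len(genes)
    return {frozenset((genes[i], genes[(i + 1) % n])) for i in range(n)}


def syntenic_blocks(genomes: Dict[str, List[str]]) -> List[List[str]]:
    """Split the core-gene order of the first isolate into blocks whose
    internal adjacencies are conserved in every isolate."""
    core = set.intersection(*(set(order) for order in genomes.values()))
    if not core:
        return []
    shared = set.intersection(
        *(adjacency_set(order, core) for order in genomes.values())
    )
    # The first isolate serves as the (arbitrary) reference order.
    ref = [g for g in next(iter(genomes.values())) if g in core]
    blocks = [[ref[0]]]
    for prev, cur in zip(ref, ref[1:]):
        if frozenset((prev, cur)) in shared:
            blocks[-1].append(cur)   # adjacency conserved: extend the block
        else:
            blocks.append([cur])     # breakpoint: start a new block
    # The chromosome is circular, so join the last and first block if the
    # wrap-around adjacency is conserved as well.
    if len(blocks) > 1 and frozenset((ref[-1], ref[0])) in shared:
        blocks[0] = blocks.pop() + blocks[0]
    return blocks


if __name__ == "__main__":
    isolates = {
        "iso1": ["a", "b", "c", "x", "d", "e", "f"],  # x is dispensable
        "iso2": ["a", "b", "c", "f", "e", "d", "y"],  # d-e-f segment inverted
        "iso3": ["d", "e", "f", "a", "b", "c"],       # same circle, rotated
    }
    print(syntenic_blocks(isolates))
```

In this toy input the inversion in `iso2` leaves the internal order of each segment intact but breaks the adjacencies at its ends, so the six core genes fall into two blocks. A real analysis would also need ortholog assignment, strand information, and a tolerance for isolates that deviate, none of which is modeled here.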


MeSH Term

Archaea
Bacteria
Computational Biology
Gene Rearrangement
Genes, Essential
Genome, Archaeal
Genome, Bacterial
Genomic Instability
Genomic Structural Variation
Synteny
