PGAP-X: extension on pan-genome analysis pipeline.

Yongbing Zhao, Chen Sun, Dongyu Zhao, Yadong Zhang, Yang You, Xinmiao Jia, Junhui Yang, Lingping Wang, Jinyue Wang, Haohuan Fu, Yu Kang, Fei Chen, Jun Yu, Jiayan Wu, Jingfa Xiao
Author Information
  1. Yongbing Zhao: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  2. Chen Sun: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  3. Dongyu Zhao: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  4. Yadong Zhang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  5. Yang You: Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, People's Republic of China.
  6. Xinmiao Jia: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  7. Junhui Yang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  8. Lingping Wang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  9. Jinyue Wang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  10. Haohuan Fu: Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, People's Republic of China.
  11. Yu Kang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  12. Fei Chen: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  13. Jun Yu: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
  14. Jiayan Wu: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China. wujy@big.ac.cn.
  15. Jingfa Xiao: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China. xiaojingfa@big.ac.cn.

Abstract

BACKGROUND: Since PGAP (pan-genome analysis pipeline) was published in 2012, it has been widely employed in bacterial genomics research. Although PGAP integrates several modules for pan-genome analysis, properly and effectively interpreting and visualizing its results remains a challenge.
RESULT: To better present bacterial genomic characteristics, we developed PGAP-X, a novel cross-platform software package. It integrates four data analysis modules: whole-genome sequence alignment, orthologous gene clustering, pan-genome profile analysis, and genetic variant analysis. The results of these analyses can be visualized directly in PGAP-X. Its visualization modules cover comparison of genome structure, gene distribution by conservation, the pan-genome profile curve, and variation in genic and genomic regions. Result data produced by other programs with similar functions can also be imported into PGAP-X for further analysis and visualization. To test the performance of PGAP-X, we comprehensively analyzed 14 Streptococcus pneumoniae strains and 14 Chlamydia trachomatis strains. The results show that S. pneumoniae strains have higher diversity in genome structure and gene content than C. trachomatis strains. In addition, S. pneumoniae strains might have undergone many evolutionary events, such as genomic rearrangements, frequent horizontal gene transfer, homologous recombination, and other evolutionary processes.
CONCLUSION: PGAP-X directly presents the characteristics of bacterial genomic diversity through several visualization methods, which can help researchers intuitively understand the dynamics and evolution of bacterial genomes. The source code and pre-compiled executable programs are freely available from http://pgapx.ybzhao.com.
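
As an illustration of what a pan-genome profile curve summarizes, the following minimal Python sketch computes average pan-genome and core-genome sizes for every subset size from per-strain sets of orthologous gene cluster IDs. It is not PGAP-X's implementation: the function name `pan_core_profile`, the input layout, and the toy strain and cluster names are all assumptions made for this example, and it enumerates every strain combination exhaustively, whereas sampling combinations is a common alternative for larger strain collections.

```python
from itertools import combinations


def pan_core_profile(gene_sets):
    """Return (k, mean pan-genome size, mean core-genome size) for k = 1..N.

    gene_sets maps a strain name to the set of ortholog cluster IDs it carries
    (a hypothetical input layout chosen for this sketch).
    """
    names = sorted(gene_sets)
    profile = []
    for k in range(1, len(names) + 1):
        pan_sizes, core_sizes = [], []
        for combo in combinations(names, k):
            sets = [gene_sets[name] for name in combo]
            # Pan-genome: clusters present in at least one strain of the subset.
            pan_sizes.append(len(set.union(*sets)))
            # Core genome: clusters present in every strain of the subset.
            core_sizes.append(len(set.intersection(*sets)))
        profile.append(
            (k, sum(pan_sizes) / len(pan_sizes), sum(core_sizes) / len(core_sizes))
        )
    return profile


# Toy data: three hypothetical strains and five hypothetical gene clusters.
toy = {
    "strain_A": {"c1", "c2", "c3"},
    "strain_B": {"c1", "c2", "c4"},
    "strain_C": {"c1", "c3", "c5"},
}
profile = pan_core_profile(toy)
for k, pan, core in profile:
    print(f"{k} genomes: pan={pan:.2f} core={core:.2f}")
```

Plotting the pan and core columns against k gives the two curves of a pan-genome profile: a pan-genome curve that keeps rising with added strains indicates high gene-content diversity, while a curve that flattens quickly indicates a more conserved gene repertoire.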

MeSH Term

Chlamydia trachomatis
Computer Graphics
Evolution, Molecular
Genetic Variation
Genome, Bacterial
High-Throughput Nucleotide Sequencing
Software
Streptococcus pneumoniae
