Interactive Web-Based Services for Metagenomic Data Analysis and Comparisons.

Nehal Adel Abdelsalam, Hajar Elshora, Mohamed El-Hadidi
Author Information
  1. Nehal Adel Abdelsalam: University of Science and Technology, Zewail City, Giza, Egypt.
  2. Hajar Elshora: Bioinformatics Group, Center for Informatics Sciences (CIS), Nile University, Giza, Egypt.
  3. Mohamed El-Hadidi: Bioinformatics Group, Center for Informatics Sciences (CIS), Nile University, Giza, Egypt. melhadidi@nu.edu.eg.

Abstract

Sequencing technologies have recently become readily available, and scientists are increasingly motivated to conduct metagenomic research to unveil the potential of a myriad of ecosystems and biomes. Metagenomics studies the composition and functions of microbial communities and paves the way for multiple applications in medicine, industry, and ecology. Nonetheless, the immense volume of sequencing data generated by metagenomics research, combined with the scarcity of user-friendly analysis tools and pipelines, poses a new challenge for data analysis. Web-based bioinformatics tools are now being developed to facilitate the analysis of complex metagenomic data without prior knowledge of any programming language or any special installation. Specialized web tools help answer researchers' main questions about taxonomic classification, functional capabilities, differences between two ecosystems, and probable functional correlations between the members of a specific microbial community. With an Internet connection and a few clicks, researchers can conveniently and efficiently analyze metagenomic datasets, summarize results, and visualize key information on the composition and functional potential of the metagenomic samples under study. This chapter provides a simple guide to a few of the fundamental web-based services used for metagenomic data analysis, such as BV-BRC, RDP, MG-RAST, MicrobiomeAnalyst, METAGENassist, and MGnify.

MeSH Terms

Metagenomics
Metagenome
Microbiota
Ecology
Computational Biology
Data Analysis
