BZINB Model-Based Pathway Analysis and Module Identification Facilitates Integration of Microbiome and Metabolome Data.

Bridget M Lin, Hunyong Cho, Chuwen Liu, Jeff Roach, Apoena Aguiar Ribeiro, Kimon Divaris, Di Wu
Author Information
  1. Bridget M Lin: Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
  2. Hunyong Cho: Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
  3. Chuwen Liu: Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
  4. Jeff Roach: Research Computing, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
  5. Apoena Aguiar Ribeiro: Division of Diagnostic Sciences, Adams School of Dentistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
  6. Kimon Divaris: Division of Pediatric and Public Health, Adams School of Dentistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
  7. Di Wu: Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.

Abstract

Integration of multi-omics data is a challenging but necessary step to advance our understanding of the biology underlying human health and disease processes. To date, investigations seeking to integrate multi-omics data (e.g., microbiome and metabolome) have employed simple correlation-based network analyses; however, these methods are not always well-suited for microbiome analyses because they do not accommodate the excess zeros typically present in these data. In this paper, we introduce a bivariate zero-inflated negative binomial (BZINB) model-based network and module analysis method that addresses this limitation by accommodating excess zeros, thereby improving model fit in microbiome-metabolome correlation analyses. We use real and simulated data based on a multi-omics study of childhood oral health (ZOE 2.0; investigating early childhood dental caries, ECC) and find that the BZINB model-based correlation method approximates the underlying relationships between microbial taxa and metabolites more accurately than Spearman's rank and Pearson correlations. The new method, BZINB-iMMPath, facilitates the construction of metabolite-species and species-species correlation networks using BZINB and identifies modules (i.e., groups of correlated species) by combining BZINB and similarity-based clustering. Perturbations in correlation networks and modules can be efficiently tested between groups (e.g., healthy and diseased study participants). Upon application of the new method to the ZOE 2.0 study microbiome-metabolome data, we find that several biologically relevant correlations of ECC-associated microbial taxa with carbohydrate metabolites differ between healthy and dental caries-affected participants. In sum, we find that the BZINB model is a useful alternative to Spearman or Pearson correlations for estimating the underlying correlation of zero-inflated bivariate count data and thus is suitable for integrative analyses of multi-omics data such as those encountered in microbiome and metabolome studies.
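
The motivation stated in the abstract, that excess zeros distort simple correlation estimates, can be illustrated with a small simulation. The following is a minimal sketch only and does not implement the BZINB estimator or BZINB-iMMPath: it assumes a hypothetical data-generating process in which two counts share a gamma-distributed latent abundance and are then zeroed out independently with a fixed probability. All function names and parameter values (`simulate`, `zero_prob`, the gamma shapes, the scaling factor of 3) are illustrative assumptions, not taken from the paper.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, zero_prob, seed=0):
    """Return (clean, inflated) lists of count pairs.

    Hypothetical model: both counts are Poisson draws whose rates share one
    gamma-distributed component, which induces a positive correlation.
    Each count is then independently replaced by zero with probability
    zero_prob to mimic excess zeros.
    """
    rng = random.Random(seed)
    clean, inflated = [], []
    for _ in range(n):
        shared = rng.gammavariate(2.0, 1.0)  # component common to both counts
        x = poisson(rng, 3.0 * (shared + rng.gammavariate(1.0, 1.0)))
        y = poisson(rng, 3.0 * (shared + rng.gammavariate(1.0, 1.0)))
        clean.append((x, y))
        zx = 0 if rng.random() < zero_prob else x
        zy = 0 if rng.random() < zero_prob else y
        inflated.append((zx, zy))
    return clean, inflated


clean, inflated = simulate(n=5000, zero_prob=0.4)
r_clean = pearson(*zip(*clean))
r_inflated = pearson(*zip(*inflated))
print(f"Pearson on underlying counts:    {r_clean:.3f}")
print(f"Pearson on zero-inflated counts: {r_inflated:.3f}")
```

Under these assumptions, the Pearson correlation computed on the zero-inflated counts is pulled toward zero relative to the correlation of the underlying counts. A model that represents the zero-inflation component explicitly, as the BZINB model is described as doing, is intended to recover the underlying correlation instead.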

Grants

  1. P30 ES010126/NIEHS NIH HHS
  2. R03 DE028983/NIDCR NIH HHS
  3. U01 DE025046/NIDCR NIH HHS
