Benchmarking differential abundance analysis methods for correlated microbiome sequencing data.

Lu Yang, Jun Chen
Author Information
  1. Lu Yang: Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55901, USA.
  2. Jun Chen: Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55901, USA.

Abstract

Differential abundance analysis (DAA) is a central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify high-confidence microbial candidates for further biological validation. Current microbiome studies frequently generate correlated samples through sampling schemes such as spatial and temporal sampling. In the past decade, a number of DAA tools for correlated microbiome data (DAA-c) have been proposed. Disturbingly, different DAA-c tools can sometimes produce quite discordant results. To recommend best practice to the field, we performed the first comprehensive evaluation of existing DAA-c tools using real data-based simulations. Overall, the linear model-based methods LinDA, MaAsLin2 and LDM are more robust than methods based on generalized linear models. LinDA is the only method that maintains reasonable performance in the presence of strong compositional effects.

Grants

  1. R01 GM144351/NIGMS NIH HHS
  2. R21 HG011662/NHGRI NIH HHS

MeSH Terms

Microbiota
Linear Models
Databases, Factual
Metagenomics
