Bayesian compositional regression with structured priors for microbiome feature selection.

Liangliang Zhang, Yushu Shi, Robert R Jenq, Kim-Anh Do, Christine B Peterson
Author Information
  1. Liangliang Zhang: Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas.
  2. Yushu Shi: Department of Statistics, University of Missouri, Columbia, Missouri.
  3. Robert R Jenq: Department of Genomic Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas.
  4. Kim-Anh Do: Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas.
  5. Christine B Peterson: Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas.

Abstract

The microbiome plays a critical role in human health and disease, and there is strong scientific interest in linking specific features of the microbiome to clinical outcomes. Key aspects of microbiome data, however, limit the applicability of standard variable selection methods. In particular, the observed data are compositional, as the counts within each sample are subject to a fixed-sum constraint. In addition, microbiome features, typically quantified as operational taxonomic units, often reflect microorganisms that are similar in function and may therefore have a similar influence on the response variable. To address the challenges posed by these aspects of the data structure, we propose a variable selection technique with the following novel features: a generalized transformation and z-prior to handle the compositional constraint, and an Ising prior that encourages the joint selection of microbiome features that are closely related in terms of their genetic sequence similarity. We demonstrate that our proposed method outperforms existing penalized approaches for microbiome variable selection, both in simulation studies and in an analysis of real data exploring the relationship of the gut microbiome to body mass index.
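
The two structural ideas in the abstract, the fixed-sum constraint and a prior that favors joint selection of related features, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the centered log-ratio transform below is a common stand-in and not the paper's generalized transformation, the z-prior is not shown, and the function names, the toy counts, the edge list, and the hyperparameters `a` and `b` are all hypothetical. The Ising prior is written in a generic unnormalized form, log p(gamma) = a * sum_j gamma_j + b * sum over edges (j, k) of gamma_j * gamma_k, where an edge links two features assumed to be similar in sequence.

```python
import math


def to_composition(counts):
    """Convert raw counts to relative abundances, which sum to 1 by construction."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(composition, pseudo=1e-6):
    """Centered log-ratio transform.

    Assumption: used here only as a familiar stand-in for handling
    compositional data; it is NOT the paper's generalized transformation.
    A small pseudo-value avoids log(0) for unobserved features.
    """
    logs = [math.log(p + pseudo) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def ising_log_prior(gamma, edges, a=-2.0, b=0.5):
    """Unnormalized log-probability of inclusion indicators under an Ising prior.

    gamma: list of 0/1 indicators (1 = feature selected).
    edges: pairs (j, k) of features treated as closely related (hypothetical graph).
    a < 0 favors sparsity; b > 0 rewards selecting both ends of an edge.
    """
    sparsity = a * sum(gamma)
    smoothing = b * sum(gamma[j] * gamma[k] for j, k in edges)
    return sparsity + smoothing


# Toy data: four features in one sample (values are illustrative only).
counts = [120, 30, 0, 50]
comp = to_composition(counts)
z = clr(comp)

# Hypothetical similarity graph: features 0-1 are related, as are 2-3.
edges = [(0, 1), (2, 3)]
joint = ising_log_prior([1, 1, 0, 0], edges)  # two related features selected
split = ising_log_prior([1, 0, 1, 0], edges)  # two unrelated features selected
```

Both configurations select two features, so they incur the same sparsity cost; the configuration that selects two linked features receives the extra `b` term, which is the sense in which an Ising prior encourages joint selection of related features.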

Grants

  1. P50 CA140388/NCI NIH HHS
  2. R01 HL124112/NHLBI NIH HHS
  3. P30 CA016672/NCI NIH HHS
  4. UL1 TR000371/NCATS NIH HHS

MeSH Terms

Bayes Theorem
Computer Simulation
Gastrointestinal Microbiome
Humans
Microbiota