GMQN: A Reference-Based Method for Correcting Batch Effects and Probe Bias in HumanMethylation BeadChip.

Zhuang Xiong, Mengwei Li, Yingke Ma, Rujiao Li, Yiming Bao
Author Information
  1. Zhuang Xiong: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, Beijing, China.
  2. Mengwei Li: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, Beijing, China.
  3. Yingke Ma: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, Beijing, China.
  4. Rujiao Li: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, Beijing, China.
  5. Yiming Bao: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, Beijing, China.

Abstract

The Illumina HumanMethylation BeadChip is one of the most cost-effective methods for quantifying DNA methylation levels at single-base resolution across the human genome, which makes it a routine platform for epigenome-wide association studies. Tens of thousands of DNA methylation array samples have accumulated in public databases, providing a valuable resource for data integration and further analysis. However, the majority of public DNA methylation data are deposited as processed data without the background probes that are widely used in data normalization. Here, we present Gaussian mixture quantile normalization (GMQN), a reference-based method for correcting batch effects as well as probe bias in the HumanMethylation BeadChip.

Availability and implementation: https://github.com/MengweiLi-project/gmqn
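
The abstract names the technique but does not spell out its steps. The sketch below is one plausible reading of "Gaussian mixture quantile normalization", not the authors' implementation. It assumes that a sample's values form a two-component Gaussian mixture and that a reference mixture is already known. The function names `fit_two_gaussians` and `normalize_to_reference`, the choice of two components, and the `(weight, mean, sd)` layout of the reference are all illustrative assumptions. Each value is assigned to its most probable component and moved to the same quantile of the matching reference component.

```python
import math
import random
from statistics import NormalDist


def fit_two_gaussians(values, iters=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM (illustrative helper)."""
    xs = sorted(values)
    half = len(xs) // 2
    params = []
    # Initialise each component from the lower / upper half of the sorted data.
    for part in (xs[:half], xs[half:]):
        mu = sum(part) / len(part)
        var = sum((x - mu) ** 2 for x in part) / len(part)
        params.append((0.5, mu, max(math.sqrt(var), 1e-6)))
    for _ in range(iters):
        # E-step: responsibility of component 0 for every value.
        resp = []
        for x in values:
            p = [w * NormalDist(mu, sd).pdf(x) for w, mu, sd in params]
            total = p[0] + p[1]
            resp.append(p[0] / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and sd of each component.
        for k in (0, 1):
            r = [rk if k == 0 else 1.0 - rk for rk in resp]
            n_k = sum(r)
            if n_k < 1e-9:
                continue  # keep the old parameters for an empty component
            mu = sum(ri * x for ri, x in zip(r, values)) / n_k
            var = sum(ri * (x - mu) ** 2 for ri, x in zip(r, values)) / n_k
            params[k] = (n_k / len(values), mu, max(math.sqrt(var), 1e-6))
    # Return components ordered by mean so they line up with the reference.
    return sorted(params, key=lambda p: p[1])


def normalize_to_reference(values, reference):
    """Map values onto a reference mixture, component by component.

    `reference` is a list of two (weight, mean, sd) tuples ordered by mean;
    this layout is an assumption made for the sketch.
    """
    sample = fit_two_gaussians(values)
    normalized = []
    for x in values:
        # Assign the value to its most probable sample component.
        dens = [w * NormalDist(mu, sd).pdf(x) for w, mu, sd in sample]
        k = 0 if dens[0] >= dens[1] else 1
        _, mu, sd = sample[k]
        _, ref_mu, ref_sd = reference[k]
        # Quantile within the sample component, clamped so inv_cdf stays defined.
        q = min(max(NormalDist(mu, sd).cdf(x), 1e-9), 1.0 - 1e-9)
        # Same quantile in the corresponding reference component.
        normalized.append(NormalDist(ref_mu, ref_sd).inv_cdf(q))
    return normalized
```

For Gaussian components, this quantile mapping reduces to shifting and rescaling each value by the fitted and reference means and standard deviations. The point of the sketch is that the reference distribution stands in for the background probes that processed public data lack. How GMQN builds its reference and how it handles probe bias are not covered here.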
