mLiftOver: harmonizing data across Infinium DNA methylation platforms.

Brian H Chen, Wanding Zhou
Author Information
  1. Brian H Chen: California Pacific Medical Center Research Institute, Sutter Health, San Francisco, CA 94143, United States.
  2. Wanding Zhou: Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, United States.

Abstract

MOTIVATION: Infinium DNA methylation BeadChips are widely used for genome-wide DNA methylation profiling at the population scale. Recent updates to probe content and naming conventions in the EPIC version 2 (EPICv2) arrays have complicated integrating new data with previous Infinium array platforms, such as the MethylationEPIC (EPIC) and the HumanMethylation450 (HM450) BeadChip.
RESULTS: We present mLiftOver, a user-friendly tool that harmonizes probe ID, methylation level, and signal intensity data across different Infinium platforms. It manages probe replicates, missing data imputation, and platform-specific bias for accurate data conversion. We validated the tool by applying HM450-based cancer classifiers to EPICv2 cancer data, achieving high accuracy. Additionally, we successfully integrated EPICv2 healthy tissue data with legacy HM450 data for tissue identity analysis and produced consistent copy number profiles in cancer cells.
AVAILABILITY AND IMPLEMENTATION: mLiftOver is implemented in R and available in the Bioconductor package SeSAMe (version 1.21.13+): https://bioconductor.org/packages/release/bioc/html/sesame.html. Analysis of EPIC and EPICv2 platform-specific bias and high-confidence mapping is available at https://github.com/zhou-lab/InfiniumAnnotationV1/raw/main/Anno/EPICv2/EPICv2ToEPIC_conversion.tsv.gz. The source code is available at https://github.com/zwdzwd/sesame/blob/devel/R/mLiftOver.R under the MIT license.
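
The probe ID harmonization and replicate handling described in RESULTS can be illustrated with a minimal sketch. It is not the SeSAMe implementation, and it is written in Python rather than R purely for illustration. It assumes that EPICv2 probe IDs carry a suffix after an underscore (for example `_TC21`), that replicates are collapsed by their mean, and that probes missing from the source are filled with a constant. The function names, probe IDs, and beta values are all hypothetical.

```python
import math
from collections import defaultdict


def probe_prefix(probe_id):
    """Strip an EPICv2-style suffix (e.g. '_TC21') to get the legacy-style ID."""
    return probe_id.split("_", 1)[0]


def lift_over(betas, target_probes, fill=math.nan):
    """Collapse replicate probes by mean, then reorder onto the target probe list.

    betas: dict mapping source probe ID -> beta value (NaN/None = missing).
    target_probes: probe IDs of the target platform, in the desired order.
    fill: value used for target probes with no usable source measurement
          (a constant here; an assumption, not the tool's imputation method).
    """
    groups = defaultdict(list)
    for probe_id, beta in betas.items():
        if beta is not None and not math.isnan(beta):
            groups[probe_prefix(probe_id)].append(beta)
    collapsed = {p: sum(v) / len(v) for p, v in groups.items()}
    # Probes absent from the target list are dropped; missing ones get `fill`.
    return {p: collapsed.get(p, fill) for p in target_probes}


# Illustrative EPICv2-style input (values are made up).
epicv2 = {
    "cg00000029_TC21": 0.80,
    "cg00000109_TC21": 0.10,
    "cg00000109_TC22": 0.30,  # hypothetical replicate of the probe above
    "cg99999999_BC11": 0.55,  # hypothetical probe absent from the target
}
hm450_probes = ["cg00000029", "cg00000109", "cg00000165"]

lifted = lift_over(epicv2, hm450_probes, fill=0.5)
for probe, beta in lifted.items():
    print(probe, round(beta, 3))
```

The sketch covers only ID mapping, replicate collapsing, and a placeholder fill. The platform-specific bias handling mentioned in the abstract is left out, since the abstract does not describe how it is done.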

Grants

  1. R35 GM146978/NIGMS NIH HHS

MeSH Term

DNA Methylation
Humans
Software
Neoplasms
Oligonucleotide Array Sequence Analysis
Genome, Human
