A novel approach toward optimal workflow selection for DNA methylation biomarker discovery.

Naghme Nazer, Mohammad Hossein Sepehri, Hoda Mohammadzade, Mahya Mehrmohamadi
Author Information
  1. Naghme Nazer: Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran.
  2. Mohammad Hossein Sepehri: Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran.
  3. Hoda Mohammadzade: Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran.
  4. Mahya Mehrmohamadi: Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran. mehrmohamadi@ut.ac.ir.

Abstract

DNA methylation is a major epigenetic modification involved in many physiological processes. Normal methylation patterns are disrupted in many diseases, and methylation-based biomarkers have shown promise in several contexts. Marker discovery typically involves the analysis of publicly available DNA methylation data from high-throughput assays. Numerous methods for identifying differentially methylated biomarkers have been developed, creating a pressing need for best-practice guidelines and context-specific analysis workflows. To this end, we propose TASA, a novel method for simulating methylation array data in various scenarios. We then comprehensively assess different data analysis workflows using real and simulated data and suggest optimal start-to-finish analysis workflows. Our study demonstrates that the choice of analysis pipeline for DNA methylation-based marker discovery is crucial and that the optimal choice differs across contexts.

Grants

  1. 99012262/Iran National Science Foundation

MeSH Term

DNA Methylation
Workflow
Epigenesis, Genetic
Biomedical Research
Data Analysis
