Machine Learning Models for Genetic Risk Assessment of Infants with Non-syndromic Orofacial Cleft.

Shi-Jian Zhang, Peiqi Meng, Jieni Zhang, Peizeng Jia, Jiuxiang Lin, Xiangfeng Wang, Feng Chen, Xiaoxing Wei
Author Information
  1. Shi-Jian Zhang: Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China.
  2. Peiqi Meng: Department of Orthodontics & Central Laboratory, Peking University School and Hospital of Stomatology, Beijing 100081, China.
  3. Jieni Zhang: Department of Orthodontics & Central Laboratory, Peking University School and Hospital of Stomatology, Beijing 100081, China.
  4. Peizeng Jia: Department of Orthodontics & Central Laboratory, Peking University School and Hospital of Stomatology, Beijing 100081, China.
  5. Jiuxiang Lin: Department of Orthodontics & Central Laboratory, Peking University School and Hospital of Stomatology, Beijing 100081, China.
  6. Xiangfeng Wang: Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China.
  7. Feng Chen: Department of Orthodontics & Central Laboratory, Peking University School and Hospital of Stomatology, Beijing 100081, China. Electronic address: chenfeng2011@hsc.pku.edu.cn.
  8. Xiaoxing Wei: State Key Laboratory of Plateau Ecology and Agriculture, Medical College of Qinghai University, Xining 810016, China. Electronic address: weixiaoxing@tsinghua.org.cn.

Abstract

The isolated type of orofacial cleft, termed non-syndromic cleft lip with or without cleft palate (NSCL/P), is the second most common birth defect in China, and Asians have the highest incidence in the world. NSCL/P involves multiple genes and complex interactions between genetic and environmental factors, which makes genetic assessment difficult for an unborn fetus carrying multiple NSCL/P-susceptibility variants. Although genome-wide association studies (GWAS) have uncovered dozens of single nucleotide polymorphism (SNP) loci in different ethnic populations, the genetic diagnostic effectiveness of these SNPs requires further experimental validation in Chinese populations before a diagnostic panel or a predictive model covering multiple SNPs can be built. In this study, we collected blood samples from control and NSCL/P infants in Han and Uyghur Chinese populations to validate the diagnostic effectiveness of 43 candidate SNPs previously detected using GWAS. We then built predictive models with the validated SNPs using different machine learning algorithms and evaluated their prediction performance. Our results showed that logistic regression had the best performance for risk assessment according to the area under the curve (AUC). Notably, defective variants in MTHFR and RBP4, two genes involved in folic acid and vitamin A biosynthesis, contributed strongly to NSCL/P incidence based on feature importance evaluation with logistic regression. This is consistent with the notion that folic acid and vitamin A are both essential nutritional supplements for pregnant women to reduce the risk of having a baby with NSCL/P. Moreover, we observed lower predictive power in Uyghur than in Han cases, likely due to differences in genetic background between these two ethnic populations. Thus, our study highlights the urgent need to generate a HapMap for the Uyghur population and to perform resequencing-based screening of Uyghur-specific NSCL/P markers.
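
The workflow described in the abstract (fit a logistic regression on SNP genotypes, score it by AUC, then rank SNPs by their contribution) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state the software, genotype encoding, or hyperparameters used. The additive 0/1/2 genotype coding, the simulated data, the effect sizes, the minor allele frequency of 0.3, the train/test split, and all function names below are assumptions made only for illustration.

```python
import math
import random


def simulate_genotypes(n_samples, n_snps, effect_sizes, seed=0):
    """Simulate additive genotype codes (0/1/2 minor-allele copies) and case labels.

    Purely synthetic data; the allele frequency (0.3) and intercept are assumptions.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        g = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_snps)]
        logit = -1.0 + sum(b * v for b, v in zip(effect_sizes, g))
        p_case = 1.0 / (1.0 + math.exp(-logit))
        X.append(g)
        y.append(1 if rng.random() < p_case else 0)
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of being a case for each sample."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auc(y_true, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as 0.5.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical effect sizes: two risk SNPs and three SNPs with no effect.
effects = [0.9, 0.7, 0.0, 0.0, 0.0]
X, y = simulate_genotypes(600, len(effects), effects, seed=1)
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

w, b = fit_logistic(X_train, y_train)
test_auc = auc(y_test, predict_proba(X_test, w, b))

# Rank SNPs by absolute coefficient as a simple feature-importance proxy.
ranking = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
print("held-out AUC:", round(test_auc, 3))
print("SNP indices by |coefficient|:", ranking)
```

Ranking by absolute coefficient is only one possible reading of "feature importance evaluation with logistic regression"; it is comparable across SNPs here because every feature shares the same 0/1/2 scale. On real genotype data, a held-out or cross-validated AUC is the relevant figure, since an AUC computed on the training samples would overstate predictive power.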

MeSH Terms

Asian People
China
Cleft Lip
Cleft Palate
Genome-Wide Association Study
Humans
Infant
Logistic Models
Machine Learning
Methylenetetrahydrofolate Reductase (NADPH2)
Polymorphism, Single Nucleotide
Retinol-Binding Proteins, Plasma
Risk Assessment

Chemicals

RBP4 protein, human
Retinol-Binding Proteins, Plasma
MTHFR protein, human
Methylenetetrahydrofolate Reductase (NADPH2)
