Single nucleotide polymorphism marker combinations for classifying Yeonsan Ogye chicken using a machine learning approach.

Eunjin Cho, Sunghyun Cho, Minjun Kim, Thisarani Kalhari Ediriweera, Dongwon Seo, Seung-Sook Lee, Jihye Cha, Daehyeok Jin, Young-Kuk Kim, Jun Heon Lee
Author Information
  1. Eunjin Cho: Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea.
  2. Sunghyun Cho: Research and Development Center, Insilicogen Inc., Yongin 19654, Korea.
  3. Minjun Kim: Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Korea.
  4. Thisarani Kalhari Ediriweera: Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea.
  5. Dongwon Seo: Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea.
  6. Seung-Sook Lee: Yeonsan Ogye Foundation, Nonsan 32910, Korea.
  7. Jihye Cha: Animal Genome & Bioinformatics, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Korea.
  8. Daehyeok Jin: Animal Genetic Resources Research Center, National Institute of Animal Science, Rural Development Administration, Hamyang 50000, Korea.
  9. Young-Kuk Kim: Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea.
  10. Jun Heon Lee: Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea.

Abstract

Genetic analysis has great potential as a tool for differentiating between species and breeds of livestock. In this study, the optimal combinations of single nucleotide polymorphism (SNP) markers for discriminating the Yeonsan Ogye chicken breed were identified using high-density 600K SNP array data. In 3,904 individuals from 198 chicken breeds, SNP markers specific to the target population were discovered through a case-control genome-wide association study (GWAS) and then filtered based on linkage disequilibrium blocks. Significant SNP markers were selected by feature selection using two machine learning algorithms: Random Forest (RF) and AdaBoost (AB). With this machine learning approach, the 38 (RF) and 43 (AB) optimal SNP marker combinations for the Yeonsan Ogye chicken population demonstrated 100% classification accuracy. Hence, the GWAS and machine learning models used in this study can be used efficiently to identify the optimal combination of SNP markers for discriminating target populations.
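
As an illustration of the case-control step described above, the following is a minimal sketch of a per-SNP allelic chi-square test, with the target breed treated as cases and all other breeds as controls. It is not the authors' pipeline: the function names (allelic_chi_square, rank_snps), the 0/1/2 genotype coding, and the toy genotypes are assumptions made for this example, and the linkage disequilibrium filtering and the RF/AB feature selection are not shown.

```python
from typing import List, Sequence, Tuple


def allelic_chi_square(case_genotypes: Sequence[int],
                       control_genotypes: Sequence[int]) -> float:
    """Return the 2x2 allelic chi-square statistic (1 d.f.) for one SNP.

    Genotypes are assumed to be coded as the number of copies of the
    alternate allele (0, 1 or 2) carried by each diploid individual.
    """
    case_alt = sum(case_genotypes)
    case_ref = 2 * len(case_genotypes) - case_alt
    ctrl_alt = sum(control_genotypes)
    ctrl_ref = 2 * len(control_genotypes) - ctrl_alt

    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    denom = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
             * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    if denom == 0:
        # Monomorphic SNP or an empty group: no association signal.
        return 0.0
    return n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / denom


def rank_snps(genotypes: Sequence[Sequence[int]],
              is_case: Sequence[bool]) -> List[Tuple[int, float]]:
    """Rank SNP columns by association with case status, strongest first.

    genotypes[i][j] is the genotype of individual i at SNP j.
    """
    n_snps = len(genotypes[0])
    scores = []
    for j in range(n_snps):
        cases = [row[j] for row, c in zip(genotypes, is_case) if c]
        controls = [row[j] for row, c in zip(genotypes, is_case) if not c]
        scores.append((j, allelic_chi_square(cases, controls)))
    # Sort by statistic, descending; ties keep their original SNP order.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): 4 target-breed individuals, 4 from other breeds,
# genotyped at 3 SNPs.
toy_genotypes = [
    [2, 1, 2],  # target breed
    [2, 1, 2],
    [2, 1, 1],
    [2, 1, 1],
    [0, 1, 0],  # other breeds
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 1],
]
toy_is_case = [True] * 4 + [False] * 4

ranked = rank_snps(toy_genotypes, toy_is_case)
for snp_index, chi2 in ranked:
    print(f"SNP {snp_index}: chi-square = {chi2:.2f}")
```

In this toy example, the SNP that is fixed for the alternate allele in the target breed and absent elsewhere ranks first, and the SNP with identical allele counts in both groups ranks last. In a workflow like the one described, a ranking of this kind would only supply candidate markers for the later filtering and feature-selection steps.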
