DTi2Vec: Drug-target interaction prediction using network embedding and ensemble learning.

Maha A Thafar, Rawan S Olayan, Somayah Albaradei, Vladimir B Bajic, Takashi Gojobori, Magbubah Essack, Xin Gao
Author Information
  1. Maha A Thafar: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
  2. Rawan S Olayan: The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
  3. Somayah Albaradei: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
  4. Vladimir B Bajic: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
  5. Takashi Gojobori: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
  6. Magbubah Essack: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia. magbubah.essack@kaust.edu.sa.
  7. Xin Gao: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia. xin.gao@kaust.edu.sa.

Abstract

Drug-target interaction (DTI) prediction is a crucial step in drug discovery and repositioning because accurate predictions reduce the cost of experimental validation. Developing in-silico methods to predict potential DTIs has therefore become a competitive research area, with improving prediction accuracy as one of its main focuses. Machine learning (ML) models, specifically network-based approaches, are effective for this task and have shown clear advantages over other computational methods. However, ML model development involves upstream hand-crafted feature extraction and other processes that affect prediction accuracy. Network-based representation learning techniques, which automate feature extraction, combined with traditional ML classifiers for the downstream link prediction task may therefore be a better-suited paradigm. Here, we present such a method, DTi2Vec, which identifies DTIs using network representation learning and ensemble learning techniques. DTi2Vec constructs a heterogeneous network and then automatically generates features for each drug and target using a node embedding technique. Compared with several state-of-the-art network-based methods, DTi2Vec demonstrated its ability in drug-target link prediction on four benchmark datasets and on large-scale data compiled from DrugBank, showing a statistically significant increase in prediction performance in terms of the area under the precision-recall curve (AUPR). We verified the "novel" predicted DTIs using several databases and the scientific literature. DTi2Vec is a simple yet effective method that provides high DTI prediction performance while being scalable and computationally efficient, making it a powerful drug repositioning tool.
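
The pipeline outlined in the abstract (heterogeneous network, node embeddings, pair-level features for a link classifier) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The function names, the uniform random walks, the Hadamard-product pair feature and the toy node labels are all assumptions made for illustration. In a full pipeline the walks would typically be passed to a skip-gram model to learn the embeddings, and the pair features to an ensemble classifier; neither step is shown here.

```python
import random
from collections import defaultdict


def build_heterogeneous_graph(dti_edges, drug_sim_edges, target_sim_edges):
    """Merge three undirected edge lists into one adjacency map.

    Assumption: similarity information has already been thresholded into
    plain (node, node) edges; the abstract does not specify how.
    """
    graph = defaultdict(set)
    for u, v in list(dti_edges) + list(drug_sim_edges) + list(target_sim_edges):
        graph[u].add(v)
        graph[v].add(u)
    # Sorted neighbour lists keep the walks reproducible for a fixed seed.
    return {node: sorted(neighbours) for node, neighbours in graph.items()}


def random_walks(graph, walk_length=5, walks_per_node=2, seed=0):
    """Generate uniform random walks starting from every node.

    Assumption: unbiased walks stand in for whichever walk strategy the
    embedding technique actually uses.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(graph):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


def pair_features(drug_vector, target_vector):
    """Combine two node vectors into one drug-target pair feature vector.

    Assumption: element-wise (Hadamard) product, one common choice for
    turning node embeddings into edge features.
    """
    return [d * t for d, t in zip(drug_vector, target_vector)]


# Hypothetical toy network: two drugs, two targets, one known interaction.
toy_graph = build_heterogeneous_graph(
    dti_edges=[("D1", "T1")],
    drug_sim_edges=[("D1", "D2")],
    target_sim_edges=[("T1", "T2")],
)
toy_walks = random_walks(toy_graph, walk_length=4, walks_per_node=3, seed=1)
```

Each walk is a sequence of node identifiers that mixes drugs and targets, so an embedding learned from these sequences places a drug near the targets it can reach through known interactions and similarity edges.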

Grants

  1. FCC/1/1976-26-01/King Abdullah University of Science and Technology
  2. BAS/1/1606-01-01/King Abdullah University of Science and Technology
  3. BAS/1/1059-01-01/King Abdullah University of Science and Technology
  4. BAS/1/1624-01-01/King Abdullah University of Science and Technology
  5. FCC/1/1976-17-01/King Abdullah University of Science and Technology
