Drug-target Interaction Prediction By Combining Transformer and Graph Neural Networks



Abstract

Background: The prediction of drug-target interactions (DTIs) plays an essential role in drug discovery. Recently, deep learning methods have been widely applied to DTI prediction. However, most existing approaches do not fully exploit the molecular structures of drug compounds or the sequence structures of proteins, which prevents these models from learning precise and effective feature representations.

Methods: In this study, we propose a novel deep learning framework that combines transformer and graph neural networks to predict DTIs. Our model uses graph convolutional neural networks to capture the global and local structural information of drugs, and convolutional neural networks to capture the sequence features of targets. The resulting drug and protein representations are then fed into separate multi-layer transformer encoders, which integrate their features and generate the final representations.
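To make the drug branch concrete, the following is a minimal sketch of one graph-convolution step on a toy molecular graph. It assumes the widely used symmetric-normalized propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 X W); the abstract does not state which variant the authors use, so this is an illustration and not their implementation. The function name `gcn_layer`, the three-atom graph, the one-hot atom features, the random weights, and the mean-pooling readout are all assumptions made for this example.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    # Add self-loops so each atom keeps its own features.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    # Symmetric degree normalization of the self-looped adjacency.
    degree = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(degree))
    normalized = d_inv_sqrt @ a_hat @ d_inv_sqrt
    # Aggregate neighbor features, project them, then apply ReLU.
    return np.maximum(normalized @ features @ weights, 0.0)


# Hypothetical toy molecule: three atoms bonded in a chain (0-1-2).
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
features = np.eye(3)                      # assumed one-hot atom features
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 4))         # assumed 4-dimensional embedding

atom_embeddings = gcn_layer(adjacency, features, weights)
# Mean pooling as a simple graph-level readout (an assumption of this sketch).
drug_embedding = atom_embeddings.mean(axis=0)
```

Stacking several such layers lets each atom aggregate information from progressively larger neighborhoods, which is one common way to combine local and global structure.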

Results: Experiments on benchmark datasets demonstrate that our model outperforms previous graph-based and transformer-based methods, improving precision by 1.5% and 1.8% and recall by 0.2% and 1.0%, respectively. The results indicate that the transformer encoders effectively extract feature information from both drug compounds and proteins.
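For readers less familiar with the two reported metrics, the sketch below shows how precision and recall are computed for binary interaction labels. The function name `precision_recall` and the toy labels are hypothetical and bear no relation to the benchmark datasets or the reported numbers.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = interaction."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example: six drug-target pairs with made-up labels and predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Precision is the fraction of predicted interactions that are real, and recall is the fraction of real interactions that the model recovers.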

Conclusion: Overall, our proposed method validates the applicability of combining graph neural networks with the transformer architecture in drug discovery. Owing to its attention mechanisms, it can extract deep structural features of drugs and proteins.

About the authors

Junkai Liu

School of Electronic and Information Engineering, Suzhou University of Science and Technology

Email: info@benthamscience.net

Yaoyao Lu

School of Electronic and Information Engineering, Suzhou University of Science and Technology

Email: info@benthamscience.net

Shixuan Guan

School of Electronic and Information Engineering, Suzhou University of Science and Technology

Email: info@benthamscience.net

Tengsheng Jiang

Gusu School, Nanjing Medical University

Email: info@benthamscience.net

Yijie Ding

Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China

Email: info@benthamscience.net

Qiming Fu

School of Electronic and Information Engineering, Suzhou University of Science and Technology

Email: info@benthamscience.net

Zhiming Cui

School of Electronic and Information Engineering, Suzhou University of Science and Technology

Email: info@benthamscience.net

Hongjie Wu

School of Electronic and Information Engineering, Suzhou University of Science and Technology

Author for correspondence.
Email: info@benthamscience.net



Copyright (c) 2024 Bentham Science Publishers