Open Access
ARTICLE
Ligand Based Virtual Screening of Molecular Compounds in Drug Discovery Using GCAN Fingerprint and Ensemble Machine Learning Algorithm
1 Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, India
2 Department of Mathematics, Amrita Vishwa Vidyapeetham, Coimbatore, India
* Corresponding Author: R. Ani. Email:
Computer Systems Science and Engineering 2023, 47(3), 3033-3048. https://doi.org/10.32604/csse.2023.033807
Received 28 June 2022; Accepted 31 October 2022; Issue published 09 November 2023
Abstract
Drug development is slow because it requires screening a large collection of candidate compounds, most of them inactive, to find the few that can bind to a disease protein. Virtual screening is increasingly used in pharmaceutical research and is crucial during the early phases of drug research and development, where it allows more narrowly targeted searches for chemical compounds. Because the databases contain more and more ligands, the method needs to be fast and accurate. Neural network fingerprints have proven more effective than the well-known Extended Connectivity Fingerprint (ECFP). Although a conventional graph network generates a better-encoded fingerprint, it considers only the largest sub-graph when learning the representation, and it retains irrelevant information when an average or maximum pooling layer is used. To address these problems, this article proposes the Graph Convolutional Attention Network (GCAN), a graph neural network with an attention mechanism that gives greater weight to the nodes or sub-graphs that matter most when creating the molecular fingerprint. The generated fingerprint is then used to classify drugs with ensemble learning: a stacking ensemble is built over Support Vector Machine (SVM), Random Forest, Naïve Bayes, Decision Tree, AdaBoost, and Gradient Boosting base classifiers. Compared with existing models, the proposed GCAN fingerprint with the ensemble model achieves relatively high accuracy, sensitivity, specificity, and area under the curve, reaching 91% accuracy and outperforming earlier approaches.
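The contrast the abstract draws between average pooling and an attention-based readout can be illustrated with a minimal sketch. This is not the GCAN architecture itself: the function names, the dot-product scoring, the `query` vector, and the toy node embeddings are all assumptions made for illustration. It only shows how attention weights let a few relevant nodes dominate the pooled fingerprint, whereas a mean readout gives every node the same weight.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mean_readout(node_embeddings):
    """Average pooling: every node contributes equally."""
    n = len(node_embeddings)
    dim = len(node_embeddings[0])
    return [sum(h[d] for h in node_embeddings) / n for d in range(dim)]

def attention_readout(node_embeddings, query):
    """Attention pooling (illustrative): score each node by its dot
    product with a query vector, then take the softmax-weighted sum.
    In a trained model the query would be learned; here it is fixed."""
    scores = [sum(q * x for q, x in zip(query, h)) for h in node_embeddings]
    weights = softmax(scores)
    dim = len(node_embeddings[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, node_embeddings))
              for d in range(dim)]
    return pooled, weights

# Toy molecule with three atoms and 2-dimensional node embeddings
# (hypothetical values, not taken from the article).
nodes = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
query = [2.0, 0.0]  # assumed query that favours the first feature

mean_fp = mean_readout(nodes)
att_fp, att_w = attention_readout(nodes, query)
print("mean readout:     ", mean_fp)
print("attention readout:", att_fp)
print("attention weights:", att_w)
```

In this toy case the first node receives the largest attention weight, so its feature is represented more strongly in the pooled vector than under mean pooling. A vector of this kind would then be passed to the stacked classifiers as the molecular fingerprint.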
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.