Open Access
ARTICLE
Sequence-Based Predicting Bacterial Essential ncRNAs Algorithm by Machine Learning
1 Bioinformatics and Biomedical Big data Mining Laboratory, Department of Medical Informatics, School of Big Health, Guizhou Medical University, Guiyang, 550025, China
2 Cells and Antibody Engineering Research Center of Guizhou Province, Key Laboratory of Biology and Medical Engineering, School of Biology and Engineering, Guizhou Medical University, Guiyang, 550025, China
3 Key Laboratory of Environmental Pollution Monitoring and Disease Control, Ministry of Education, Guizhou Medical University, Guiyang, 550025, China
4 College of Computational and Natural Sciences, Dilla University, Dilla, 419, Ethiopia
* Corresponding Authors: Yuan-Nong Ye; Zhu Zeng
Intelligent Automation & Soft Computing 2023, 36(3), 2731-2741. https://doi.org/10.32604/iasc.2023.026761
Received 04 January 2022; Accepted 27 February 2022; Issue published 15 March 2023
Abstract
Essential ncRNAs are non-coding RNAs that are indispensable for the survival of an organism. Although they do not encode proteins, they are as biologically important as essential protein-coding genes. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction, and evolutionary analysis. At present, essential ncRNAs have been determined at the whole-genome scale for very few species, because traditional experimental methods are time-consuming, laborious, and costly. In addition, these methods are limited to culturable organisms, and less than 1% of bacteria can be cultured in the laboratory. It is therefore important to develop theories and methods for recognizing essential ncRNAs. In this paper, we present a novel method for predicting essential ncRNAs that uses both compositional features and derived features calculated from ncRNA sequences using information theory. The method was built on a Support Vector Machine (SVM). Its accuracy, evaluated through cross-species cross-validation, was between 0.69 and 0.81. This shows that the selected features perform well for predicting essential ncRNAs with an SVM. Thus, the method can be applied to discover essential ncRNAs in bacteria.
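The abstract does not specify the exact features or the validation protocol, so the following is only a minimal sketch of what sequence-based compositional and information-theoretic features, together with a cross-species split, might look like. The choice of mono- and dinucleotide frequencies, Shannon entropy as the derived feature, and a leave-one-species-out scheme are all assumptions made for illustration, as are the function names `kmer_frequencies`, `shannon_entropy`, `features`, and `leave_one_species_out`.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is mapped to "U" below


def kmer_frequencies(seq, k=1):
    """Relative frequency of every k-mer over ACGU, in a fixed order.

    Assumption: k-mer composition stands in for the paper's
    "compositional features", which the abstract does not enumerate.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def shannon_entropy(freqs):
    """Shannon entropy in bits of a frequency vector.

    Assumption: entropy is one plausible information-theoretic
    "derived feature"; the paper may use different quantities.
    """
    return -sum(p * math.log2(p) for p in freqs if p > 0)


def features(seq):
    """Feature vector: 4 mono- and 16 dinucleotide frequencies plus 2 entropies."""
    mono = kmer_frequencies(seq, 1)
    di = kmer_frequencies(seq, 2)
    return mono + di + [shannon_entropy(mono), shannon_entropy(di)]


def leave_one_species_out(records):
    """Yield (held_out_species, train, test) splits.

    `records` is a list of (species, sequence, label) tuples.
    Assumption: "cross-species cross-validation" is read here as
    training on all species but one and testing on the held-out one.
    """
    for held in sorted({r[0] for r in records}):
        train = [r for r in records if r[0] != held]
        test = [r for r in records if r[0] == held]
        yield held, train, test
```

The resulting feature vectors could then be passed to any SVM implementation (for example, scikit-learn's `SVC`), fitting on each `train` split and scoring accuracy on the corresponding `test` split.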
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.