Open Access
ARTICLE
BDLR: lncRNA identification using ensemble learning
1 Jiangsu Key Lab of Big Data Security & Intelligent Processing School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210046, China
2 Smart Health Big Data Analysis and Location Services Engineering Laboratory of Jiangsu Province, Nanjing, 210046, China
3 Zhejiang Engineering Research Center of Intelligent Medicine, Wenzhou, 325035, China
4 College of Computer Science and Technology, Nanjing Forestry University, Nanjing, 210037, China
* Corresponding Author: LEJUN GONG. Email:
(This article belongs to the Special Issue: Decoding Gene (including circRNA, lincRNA miRNA and mRNA) Expression)
BIOCELL 2022, 46(4), 951-960. https://doi.org/10.32604/biocell.2022.016625
Received 12 March 2021; Accepted 25 May 2021; Issue published 15 December 2021
Abstract
Long non-coding RNAs (lncRNAs) play important roles in many biological processes, such as epigenetic regulation, cell cycle regulation, dosage compensation, and cell differentiation, and are associated with many human diseases. Traditional biological experimental methods for identifying and annotating lncRNAs have many limitations. With the development of high-throughput sequencing technology, identifying lncRNAs from massive RNA sequence data using machine learning methods is of great practical significance. Based on the Bagging method from ensemble learning and the Decision Tree algorithm, this paper proposes an lncRNA sequence identification method called BDLR. Its identification results are compared with those of several other models, including Bayes, Support Vector Machine, Logistic Regression, Decision Tree, and Random Forest. The experimental results show that BDLR achieves an accuracy of 86.61% on the human test set and 90.34% on the mouse test set, higher than the accuracy of the other methods. Moreover, the proposed method offers a reference for researchers identifying lncRNAs with ensemble learning.
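The abstract names Bagging and Decision Tree as the building blocks of BDLR but does not give features or hyperparameters. As an illustration of the general technique only, the following is a minimal sketch in plain Python: it trains depth-1 decision trees (decision stumps, swapped in for full decision trees to keep the example short) on bootstrap resamples and combines them by majority vote. The k-mer frequency features, the number of estimators, the function names, and the toy sequences are all assumptions made for this example, not details taken from the paper.

```python
import random
from collections import Counter
from itertools import product


def kmer_features(seq, k=2):
    """Normalized k-mer frequencies over ACGT (feature choice is an assumption)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)
    return [counts[km] / total for km in kmers]


def fit_stump(X, y):
    """Depth-1 decision tree: the (feature, threshold, leaf labels) with fewest errors."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
        for t in thresholds:
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[j] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    return best[1:]


def predict_stump(stump, row):
    j, t, left, right = stump
    return left if row[j] <= t else right


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Bagging: fit one base learner per bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Synthetic toy strings, NOT real transcripts; labels 0/1 are arbitrary stand-ins
# for the two classes (e.g. other RNA vs. lncRNA).
seqs = ["AT" * m for m in range(2, 8)] + ["GC" * m for m in range(2, 8)]
labels = [0] * 6 + [1] * 6
X = [kmer_features(s) for s in seqs]
models = fit_bagging(X, labels)
print(predict_bagging(models, kmer_features("GCGCGCGCGC")))
```

A real experiment would replace the toy strings with labeled transcript sequences, use deeper trees, and measure accuracy on a held-out test set, as the abstract reports for the human and mouse data.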
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.