Open Access

ARTICLE

Bidirectional Long Short-Term Memory Network for Taxonomic Classification

Naglaa. F. Soliman1,*, Samia M. Abd Alhalem2, Walid El-Shafai2, Salah Eldin S. E. Abdulrahman3, N. Ismaiel3, El-Sayed M. El-Rabaie2, Abeer D. Algarni1, Fatimah Algarni4, Fathi E. Abd El-Samie1,2

1 Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
2 Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menoufia, 32952, Egypt
3 Department of Computer Science and Engineering, Faculty of Electronic Engineering, Menoufia University, Menoufia, 32952, Egypt
4 Ministry of Education, Riyadh, Saudi Arabia

* Corresponding Author: Naglaa. F. Soliman. Email: email

Intelligent Automation & Soft Computing 2022, 33(1), 103-116. https://doi.org/10.32604/iasc.2022.017691

Abstract

Identifying and classifying Deoxyribonucleic Acid (DNA) sequences and their functions are considered among the main challenges in bioinformatics. Advances in machine learning and Deep Learning (DL) techniques are expected to improve DNA sequence classification. Since DNA sequence classification depends on analyzing textual data, Bidirectional Long Short-Term Memory (BLSTM) networks are well suited to this task. In general, classifier performance depends on the patterns to be processed and on the pre-processing method. This paper proposes a classification framework based on Frequency Chaos Game Representation (FCGR) followed by the Discrete Wavelet Transform (DWT) and a BLSTM. Firstly, DNA strings are transformed into numerical matrices by FCGR. Then, the DWT is used instead of a pooling layer as a data compression tool. The benefit of using the DWT is two-fold. First, it preserves only the useful information, which enables effective training of the subsequent BLSTM. Second, it adds important details to the encoded sequences by capturing effective features in the DNA fragments. Finally, the BLSTM model is trained to classify the DNA sequences. Evaluation metrics such as F1 score and accuracy show that the proposed framework outperforms state-of-the-art algorithms. Hence, it can be used in DNA classification applications.
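The first two stages described in the abstract (FCGR encoding, then DWT-based compression) can be illustrated with a minimal sketch. It is not the authors' implementation: the k-mer length `k`, the quadrant layout in `CORNERS`, the convention that the first nucleotide of a k-mer selects the coarsest quadrant, and the choice of a one-level Haar wavelet are all assumptions made for illustration, since the abstract does not specify them. The function names `fcgr` and `haar_approximation` are hypothetical.

```python
from typing import List

# Assumed quadrant layout (conventions differ between papers):
# each nucleotide contributes one (row_bit, col_bit) pair.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq: str, k: int) -> List[List[int]]:
    """Count every k-mer of `seq` into a 2^k x 2^k frequency matrix."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(ch not in CORNERS for ch in kmer):
            continue  # skip k-mers containing ambiguous symbols such as N
        row = col = 0
        for ch in kmer:
            r, c = CORNERS[ch]
            # Assumption: earlier nucleotides select coarser quadrants.
            row = (row << 1) | r
            col = (col << 1) | c
        grid[row][col] += 1
    return grid


def haar_approximation(matrix: List[List[int]]) -> List[List[float]]:
    """One-level 2D Haar DWT, approximation (LL) sub-band only.

    Each 2x2 block is replaced by (a + b + c + d) / 2, the orthonormal
    Haar low-pass coefficient, halving both dimensions much as a
    pooling layer would. Requires even row and column counts.
    """
    rows, cols = len(matrix), len(matrix[0])
    return [
        [
            (matrix[r][c] + matrix[r][c + 1]
             + matrix[r + 1][c] + matrix[r + 1][c + 1]) / 2.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


if __name__ == "__main__":
    encoded = fcgr("ACGTACGTTGCA", k=3)      # 8 x 8 k-mer count matrix
    compressed = haar_approximation(encoded)  # 4 x 4 approximation band
    for row in compressed:
        print(row)
```

In a pipeline of this kind, the rows of the compressed matrix could then be fed to a bidirectional LSTM as a sequence of feature vectors; that pairing is likewise an assumption here, not a detail given in the abstract.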

Cite This Article

N. F. Soliman, S. M. Abd Alhalem, W. El-Shafai, S. Eldin S. E. Abdulrahman, N. Ismaiel et al., "Bidirectional long short-term memory network for taxonomic classification," Intelligent Automation & Soft Computing, vol. 33, no. 1, pp. 103–116, 2022. https://doi.org/10.32604/iasc.2022.017691



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.