Open Access

ARTICLE

DLBT: Deep Learning-Based Transformer to Generate Pseudo-Code from Source Code

Walaa Gad1,*, Anas Alokla1, Waleed Nazih2, Mustafa Aref1, Abdel-badeeh Salem1

1 Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo, 11566, Egypt
2 College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al Kharj, 11942, Saudi Arabia

* Corresponding Author: Walaa Gad. Email: email.e.g

(This article belongs to the Special Issue: Emerging Applications of Artificial Intelligence, Machine learning and Data Science)

Computers, Materials & Continua 2022, 70(2), 3117-3132. https://doi.org/10.32604/cmc.2022.019884

Abstract

Understanding source code and its regular expressions is very difficult when they are written in an unfamiliar language. Pseudo-code explains and describes the content of the code without relying on the syntax or technologies of a particular programming language. However, writing pseudo-code for each code instruction is laborious. Recently, neural machine translation has been used to generate textual descriptions of source code. In this paper, a novel deep learning-based transformer (DLBT) model is proposed for automatic pseudo-code generation from source code. The proposed model uses deep learning based on Neural Machine Translation (NMT) to work as a language translator. The DLBT is based on the transformer, which has an encoder-decoder structure. It has three major components: tokenizer and embeddings, transformer, and post-processing. Each code line is tokenized and mapped to a dense vector. The transformer then captures the relatedness between the source code and the matching pseudo-code without the need for a Recurrent Neural Network (RNN). In the post-processing step, the generated pseudo-code is optimized. The proposed model is assessed on a real Python dataset that contains more than 18,800 lines of source code. The experiments show promising performance compared with other machine translation methods such as the Recurrent Neural Network (RNN). The proposed DLBT achieves an accuracy of 47.32 and a BLEU score of 68.49.
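To make the first component more concrete, the following is a minimal sketch of how a single line of Python source might be split into tokens and mapped to integer ids before an embedding layer turns them into dense vectors. The abstract does not describe the tokenizer actually used in DLBT, so this is an assumption for illustration only: it relies on Python's standard `tokenize` module, and the helper names `tokenize_code_line`, `build_vocab`, and `encode` are hypothetical.

```python
import io
import tokenize

# Token types that carry no content when a single line is the translation unit.
_SKIP = {
    tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
    tokenize.INDENT, tokenize.DEDENT, tokenize.COMMENT,
}


def tokenize_code_line(line):
    """Split one line of Python source into a list of token strings.

    Hypothetical helper: a stand-in for whatever tokenizer the paper uses.
    """
    text = line.strip()
    reader = io.StringIO(text + "\n").readline
    try:
        return [tok.string for tok in tokenize.generate_tokens(reader)
                if tok.type not in _SKIP]
    except (tokenize.TokenError, SyntaxError):
        # A lone line may be lexically incomplete (e.g. an unclosed bracket);
        # fall back to plain whitespace splitting in that case.
        return text.split()


def build_vocab(token_lists, specials=("<pad>", "<unk>")):
    """Assign an integer id to every distinct token, reserving special ids first."""
    vocab = {s: i for i, s in enumerate(specials)}
    for tokens in token_lists:
        for t in tokens:
            vocab.setdefault(t, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Map tokens to ids; unseen tokens map to the <unk> id."""
    return [vocab.get(t, vocab["<unk>"]) for t in tokens]


tokens = tokenize_code_line("x = foo(a, 1)")
vocab = build_vocab([tokens])
ids = encode(tokens, vocab)
```

In a full model, the resulting id sequence would index an embedding table to produce the dense vectors that the transformer encoder consumes; that step, and the post-processing of generated pseudo-code, are outside the scope of this sketch.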

Cite This Article

W. Gad, A. Alokla, W. Nazih, M. Aref and A. Salem, "DLBT: deep learning-based transformer to generate pseudo-code from source code," Computers, Materials & Continua, vol. 70, no. 2, pp. 3117–3132, 2022.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.