Open Access
ARTICLE
DLBT: Deep Learning-Based Transformer to Generate Pseudo-Code from Source Code
1 Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo, 11566, Egypt
2 College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al Kharj, 11942, Saudi Arabia
* Corresponding Author: Walaa Gad.
(This article belongs to the Special Issue: Emerging Applications of Artificial Intelligence, Machine learning and Data Science)
Computers, Materials & Continua 2022, 70(2), 3117-3132. https://doi.org/10.32604/cmc.2022.019884
Received 29 April 2021; Accepted 21 June 2021; Issue published 27 September 2021
Abstract
Understanding source code and its regular expressions is difficult when they are written in an unfamiliar language. Pseudo-code describes what the code does without relying on the syntax or features of a particular programming language. However, writing pseudo-code for each code instruction is laborious. Recently, neural machine translation has been used to generate textual descriptions of source code. In this paper, a novel deep learning-based transformer (DLBT) model is proposed for automatic pseudo-code generation from source code. The proposed model uses deep learning based on Neural Machine Translation (NMT) to work as a language translator. The DLBT is based on the transformer, which is an encoder-decoder structure. It has three major components: tokenizer and embeddings, transformer, and post-processing. Each code line is tokenized and mapped to dense vectors. The transformer then captures the relatedness between the source code and the matching pseudo-code without the need for a Recurrent Neural Network (RNN). In the post-processing step, the generated pseudo-code is optimized. The proposed model is assessed on a real Python dataset that contains more than 18,800 lines of source code. The experiments show promising performance compared with other machine translation methods such as RNN. The proposed DLBT achieves an accuracy of 47.32 and a BLEU score of 68.49.
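The first component named in the abstract (tokenizer and embeddings) can be illustrated with a minimal sketch. It is not the authors' implementation: the use of Python's standard `tokenize` module, the helper names `tokenize_code_line`, `build_vocab`, and `embed`, the embedding size `EMBED_DIM`, and the randomly initialized lookup table are all assumptions made for illustration. In a trained model the table entries would be learned parameters.

```python
import io
import random
import tokenize

EMBED_DIM = 8  # hypothetical embedding size, chosen only for illustration

# Token types that carry no content for a single code line.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
         tokenize.INDENT, tokenize.DEDENT, tokenize.COMMENT}


def tokenize_code_line(line):
    """Split one line of Python source into its lexical token strings."""
    reader = io.StringIO(line.strip() + "\n").readline
    return [tok.string for tok in tokenize.generate_tokens(reader)
            if tok.type not in _SKIP]


def build_vocab(token_lists):
    """Assign an integer id to every distinct token; ids 0 and 1 are reserved."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def embed(tokens, vocab, table):
    """Look up one dense vector per token; unknown tokens map to <unk>."""
    return [table[vocab.get(token, vocab["<unk>"])] for token in tokens]


tokens = tokenize_code_line("x = y + 1")
vocab = build_vocab([tokens])

# Randomly initialized lookup table (assumption: stands in for learned weights).
rng = random.Random(0)
table = [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
         for _ in range(len(vocab))]

vectors = embed(tokens, vocab, table)
```

Here each of the five tokens of `x = y + 1` is mapped to one vector of length `EMBED_DIM`, and the resulting sequence of vectors is the kind of input an encoder-decoder transformer would consume.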
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.