Open Access
ARTICLE
Semantic Malware Classification Using Artificial Intelligence Techniques
1 Systems Development Center, Brazilian Army, QGEx, Bloco G, 2° Piso-SMU, Brasilia, 70630-901, DF, Brazil
2 School of Engineering and Technology, International University of La Rioja, Avda. de La Paz, 137, Logroño, 26006, La Rioja, Spain
3 Faculty of Technology and Science, Camilo José Cela University, Castillo de Alarcón 49, Villanueva de la Cañada, Madrid, 28692, Spain
* Corresponding Author: Javier Bermejo Higuera. Email:
(This article belongs to the Special Issue: Emerging Technologies in Information Security)
Computer Modeling in Engineering & Sciences 2025, 142(3), 3031-3067. https://doi.org/10.32604/cmes.2025.061080
Received 16 November 2024; Accepted 08 February 2025; Issue published 03 March 2025
Abstract
The growing threat of malware, particularly in the Portable Executable (PE) format, demands more effective methods for detection and classification. Machine learning-based approaches show promise but often neglect semantic segmentation of malware files, which can improve classification performance. This research applies deep learning to malware detection, using Convolutional Neural Network (CNN) architectures adapted to work with semantically extracted data to classify malware into families. Starting from the MalConv model, this study introduces modifications that adapt it to multi-class classification and improve its performance. It proposes a method that extracts bytes from PE malware files based on their semantic location, yielding higher classification accuracy than traditional methods that use full-byte sequences. The approach also evaluates the importance of each semantic segment to classification accuracy. The results show that the header segment of PE files provides the most valuable information for malware identification, outperforming the other sections and achieving an average classification accuracy of 99.54%. These results reaffirm the effectiveness of the semantic segmentation approach and highlight the critical role that header data plays in improving malware detection and classification accuracy.
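To make the idea of extracting bytes by semantic location concrete, the following is a minimal sketch that splits a PE file into a header segment and one byte segment per section, using only the documented PE/COFF layout and the Python standard library. It is an illustration under stated assumptions, not the authors' extraction pipeline: the function name `split_pe_segments`, the choice to end the header segment at the end of the section table, and the dictionary output format are all hypothetical.

```python
import struct


def split_pe_segments(data: bytes) -> dict:
    """Split raw PE bytes into a 'header' segment plus one segment per section.

    Hypothetical helper for illustration; segment boundaries are an assumption.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")

    # e_lfanew at offset 0x3C of the DOS header points to the PE signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # COFF file header: NumberOfSections at +6, SizeOfOptionalHeader at +20
    # (both offsets measured from the start of the PE signature).
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)

    # The section table follows the optional header; each entry is 40 bytes.
    table_off = pe_off + 24 + opt_size
    header_end = table_off + 40 * num_sections

    # Assumption: the "header" segment is everything up to the end of the
    # section table (DOS header, PE headers, and section headers).
    segments = {"header": data[:header_end]}

    for i in range(num_sections):
        entry = table_off + 40 * i
        name = data[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        # SizeOfRawData at +16 and PointerToRawData at +20 within the entry.
        raw_size, raw_ptr = struct.unpack_from("<II", data, entry + 16)
        segments[name] = data[raw_ptr:raw_ptr + raw_size]

    return segments
```

Calling `split_pe_segments(open(path, "rb").read())` on a PE file would return a mapping such as `"header"`, `".text"`, `".data"` to their raw bytes, each of which could then be fed to a byte-level classifier separately. Sections that share a name would overwrite one another in this sketch, and malformed or packed samples may need more defensive parsing.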

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.