Open Access
ARTICLE
LEGF-DST: LLMs-Enhanced Graph-Fusion Dual-Stream Transformer for Fine-Grained Chinese Malicious SMS Detection
1 School of Information and Cybersecurity, People’s Public Security University of China, Beijing, 100038, China
2 Cyber Investigation Technology Research and Development Center, The Third Research Institute of the Ministry of Public Security, Shanghai, 201204, China
3 Department of Cybersecurity Defense, Beijing Police College, Beijing, 102202, China
4 School of Computer Science, Henan Institute of Engineering, Zhengzhou, 451191, China
* Corresponding Author: Jingya Wang. Email:
Computers, Materials & Continua 2025, 82(2), 1901-1924. https://doi.org/10.32604/cmc.2024.059018
Received 26 September 2024; Accepted 20 November 2024; Issue published 17 February 2025
Abstract
With the widespread use of SMS (Short Message Service), the proliferation of malicious SMS has become a pressing societal issue. Deep learning-based text classifiers are promising, but they often perform poorly on fine-grained detection tasks, primarily because of imbalanced datasets and insufficient model representation capability. To address this challenge, this paper proposes an LLMs-enhanced graph-fusion dual-stream Transformer model for fine-grained Chinese malicious SMS detection. In the data processing stage, Large Language Models (LLMs) are used for data augmentation to mitigate dataset imbalance. In the data input stage, both word-level and character-level features serve as model inputs, which enriches the features and prevents information loss. In the representation learning stage, a dual-stream Transformer serves as the backbone network, complemented by a graph-based feature fusion mechanism. In the output stage, a supervised classification cross-entropy loss and a supervised contrastive learning loss are used together as multi-task optimization objectives, further strengthening the model's feature representation. Experimental results demonstrate that the proposed method significantly outperforms baselines on a publicly available Chinese malicious SMS dataset.
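The multi-task objective described in the abstract can be illustrated with a minimal sketch. It assumes the two losses are combined as a weighted sum and that the contrastive term follows the common supervised contrastive formulation over L2-normalized embeddings. The function names, the `weight` parameter, and the `temperature` value are illustrative assumptions, not details taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of logit rows."""
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[y]
    return total / len(labels)


def supervised_contrastive(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: pull same-label samples together.

    Anchors without any same-label partner in the batch are skipped.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarities to every other sample in the batch.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        losses.append(-sum(sims[p] - log_denom for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


def multitask_loss(logits, embeddings, labels, weight=0.5, temperature=0.1):
    """Assumed combination: cross-entropy plus a weighted contrastive term."""
    ce = cross_entropy(logits, labels)
    scl = supervised_contrastive(embeddings, labels, temperature)
    return ce + weight * scl
```

In this sketch, embeddings that already cluster by class yield a contrastive term close to zero, so the total is dominated by the classification loss. How the two terms are actually balanced in LEGF-DST is not stated in the abstract.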
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.