Open Access
ARTICLE
Unsupervised Log Anomaly Detection Method Based on Multi-Feature
1 School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha, 410114, China
2 School of Systems Engineering, The University of Reading, RG6 6AY, UK
* Corresponding Author: Jin Wang. Email:
Computers, Materials & Continua 2023, 76(1), 517-541. https://doi.org/10.32604/cmc.2023.037392
Received 02 November 2022; Accepted 08 February 2023; Issue published 08 June 2023
Abstract
Log anomaly detection is an important paradigm for system troubleshooting. Existing log anomaly detection based on Long Short-Term Memory (LSTM) networks is time-consuming to handle long sequences. Transformer model is introduced to promote efficiency. However, most existing Transformer-based log anomaly detection methods convert unstructured log messages into structured templates by log parsing, which introduces parsing errors. They only extract simple semantic feature, which ignores other features, and are generally supervised, relying on the amount of labeled data. To overcome the limitations of existing methods, this paper proposes a novel unsupervised log anomaly detection method based on multi-feature (UMFLog). UMFLog includes two sub-models to consider two kinds of features: semantic feature and statistical feature, respectively. UMFLog applies the log original content with detailed parameters instead of templates or template IDs to avoid log parsing errors. In the first sub-model, UMFLog uses Bidirectional Encoder Representations from Transformers (BERT) instead of random initialization to extract effective semantic feature, and an unsupervised hypersphere-based Transformer model to learn compact log sequence representations and obtain anomaly candidates. In the second sub-model, UMFLog exploits a statistical feature-based Variational Autoencoder (VAE) about word occurrence times to identify the final anomaly from anomaly candidates. Extensive experiments and evaluations are conducted on three real public log datasets. The results show that UMFLog significantly improves F1-scores compared to the state-of-the-art (SOTA) methods because of the multi-feature.Keywords
Cite This Article
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.