Unsupervised Log Anomaly Detection Method Based on Multi-Feature

Shiming He; Tuo Deng; Bowen Chen; R. Sherratt; Jin Wang

doi:10.32604/cmc.2023.037392

Open Access

ARTICLE

Shiming He¹, Tuo Deng¹, Bowen Chen¹, R. Simon Sherratt², Jin Wang¹,*

1 School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha, 410114, China
2 School of Systems Engineering, The University of Reading, RG6 6AY, UK

* Corresponding Author: Jin Wang.

Computers, Materials & Continua 2023, 76(1), 517-541. https://doi.org/10.32604/cmc.2023.037392

Received 02 November 2022; Accepted 08 February 2023; Issue published 08 June 2023

Abstract

Log anomaly detection is an important paradigm for system troubleshooting. Existing log anomaly detection methods based on Long Short-Term Memory (LSTM) networks are slow on long sequences. The Transformer model has been introduced to improve efficiency. However, most existing Transformer-based log anomaly detection methods convert unstructured log messages into structured templates through log parsing, which introduces parsing errors. They also extract only simple semantic features and ignore other features, and they are generally supervised, so they depend on the amount of labeled data available. To overcome these limitations, this paper proposes a novel unsupervised log anomaly detection method based on multi-feature (UMFLog). UMFLog comprises two sub-models that consider two kinds of features: semantic features and statistical features, respectively. UMFLog uses the original log content with detailed parameters, rather than templates or template IDs, to avoid log parsing errors. In the first sub-model, UMFLog uses Bidirectional Encoder Representations from Transformers (BERT) instead of random initialization to extract effective semantic features, and an unsupervised hypersphere-based Transformer model to learn compact log sequence representations and obtain anomaly candidates. In the second sub-model, UMFLog uses a Variational Autoencoder (VAE) built on a statistical feature, word occurrence counts, to identify the final anomalies among the anomaly candidates. Extensive experiments and evaluations are conducted on three real public log datasets. The results show that UMFLog significantly improves F1-scores compared with state-of-the-art (SOTA) methods because of its use of multiple features.
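
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BERT encoder, the Transformer, and the VAE are all omitted. The sketch only shows (1) scoring a sequence embedding by its squared distance to a hypersphere center to pick anomaly candidates, and (2) building a word occurrence count vector for a log sequence. The function names, the choice of the mean embedding as the center, the fixed squared radius, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def hypersphere_center(embeddings):
    """Assumption: use the mean of the training embeddings as the center."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


def anomaly_score(embedding, center):
    """Squared Euclidean distance from an embedding to the center."""
    return sum((x - c) ** 2 for x, c in zip(embedding, center))


def select_candidates(embeddings, center, radius_sq):
    """Indices of embeddings that fall outside the hypersphere."""
    return [
        i for i, e in enumerate(embeddings)
        if anomaly_score(e, center) > radius_sq
    ]


def word_count_vector(log_sequence, vocabulary):
    """Count how often each vocabulary word occurs in a sequence of raw log lines."""
    counts = Counter(word for line in log_sequence for word in line.split())
    return [counts[w] for w in vocabulary]


# Toy stand-ins for sequence embeddings (hypothetical values).
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
test = [[0.5, 0.6], [3.0, 3.0]]

center = hypersphere_center(train)
candidates = select_candidates(test, center, radius_sq=1.0)

# Statistical feature for one hypothetical log sequence.
features = word_count_vector(
    ["open file a", "close file a"],
    vocabulary=["file", "open", "error"],
)
```

In the method described above, a vector such as `features` would be the input to the second sub-model, which is applied only to the sequences selected as candidates by the first.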

Keywords

System log; anomaly detection; semantic features; statistical features; Transformer

Cite This Article

APA Style

He, S., Deng, T., Chen, B., Sherratt, R.S., Wang, J. (2023). Unsupervised Log Anomaly Detection Method Based on Multi-Feature. Computers, Materials & Continua, 76(1), 517–541. https://doi.org/10.32604/cmc.2023.037392

Vancouver Style

He S, Deng T, Chen B, Sherratt RS, Wang J. Unsupervised Log Anomaly Detection Method Based on Multi-Feature. Comput Mater Contin. 2023;76(1):517–541. https://doi.org/10.32604/cmc.2023.037392

IEEE Style

S. He, T. Deng, B. Chen, R. S. Sherratt, and J. Wang, “Unsupervised Log Anomaly Detection Method Based on Multi-Feature,” Comput. Mater. Contin., vol. 76, no. 1, pp. 517–541, 2023. https://doi.org/10.32604/cmc.2023.037392

Copyright © 2023 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
