Open Access
ARTICLE
Performance Evaluation of Machine Learning Algorithms in Reduced Dimensional Spaces
1 Department of Electrical Engineering and Computer Science, Alabama A&M University, Huntsville, AL, 35811–7500, USA
2 U.S. Army Aviation and Missile Command, Huntsville, AL, 35898–5000, USA
* Corresponding Author: Kaveh Heidary. Email:
Journal of Cyber Security 2024, 6, 69-87. https://doi.org/10.32604/jcs.2024.051196
Received 29 February 2024; Accepted 02 August 2024; Issue published 28 August 2024
Abstract
This paper investigates the impact of reducing feature-vector dimensionality on the performance of machine learning (ML) models. Dimensionality reduction and feature selection techniques can improve the computational efficiency, accuracy, robustness, transparency, and interpretability of ML models. In high-dimensional data, where features outnumber training instances, redundant or irrelevant features introduce noise that hinders model generalization and accuracy. This study explores the effects of dimensionality reduction methods on binary classifier performance using network traffic data for cybersecurity applications, examining how these techniques influence the operation of seven ML models as assessed by a diverse set of performance metrics. Four dimensionality reduction methods are evaluated: principal component analysis (PCA), singular value decomposition (SVD), univariate feature selection (UFS) using chi-square statistics, and feature selection based on mutual information (MI). The results suggest that, in some applications, direct feature selection is more effective than data projection, offering lower computational complexity and, in some cases, superior classifier performance. The study emphasizes that the evaluation and comparison of binary classifiers depend on the specific performance metrics used, each providing insight into a different aspect of ML model operation. Using open-source network traffic data, the paper demonstrates that dimensionality reduction can reduce computational overhead, enhance model interpretability and transparency, and maintain or even improve the performance of trained classifiers. It also shows that, in certain scenarios, direct feature selection is a more effective strategy than feature engineering.
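The abstract names four dimensionality-reduction strategies, each feeding a downstream binary classifier. The sketch below, which is not the authors' code, illustrates how such a comparison might be set up with scikit-learn; the synthetic data, the logistic-regression classifier, and the choice of k = 15 retained dimensions are illustrative assumptions rather than details from the paper.

# Minimal sketch (not the authors' pipeline): the four reduction methods named
# in the abstract -- PCA, SVD, chi-square UFS, and MI-based selection -- applied
# ahead of a binary classifier. The synthetic data, logistic regression, and
# k = 15 retained dimensions are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for high-dimensional network-traffic feature vectors.
X, y = make_classification(n_samples=2000, n_features=100, n_informative=15,
                           n_redundant=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

k = 15  # target dimensionality after reduction/selection
reducers = {
    "PCA": PCA(n_components=k),
    "SVD": TruncatedSVD(n_components=k),
    # chi-square scores require non-negative features, hence the MinMaxScaler.
    "UFS (chi2)": make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=k)),
    "MI": SelectKBest(mutual_info_classif, k=k),
}

for name, reducer in reducers.items():
    clf = make_pipeline(reducer, LogisticRegression(max_iter=1000))
    clf.fit(X_train, y_train)  # reduce/select features, then train the classifier
    y_pred = clf.predict(X_test)
    print(f"{name:10s} accuracy={accuracy_score(y_test, y_pred):.3f}  "
          f"F1={f1_score(y_test, y_pred):.3f}")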
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.