Open Access

ARTICLE

Ensemble Filter-Wrapper Text Feature Selection Methods for Text Classification

Oluwaseun Peter Ige1,2, Keng Hoon Gan1,*

1 School of Computer Sciences, Universiti Sains Malaysia, Gelugor, 11800, Malaysia
2 Universal Basic Education Commission, Abuja, 900284, Nigeria

* Corresponding Author: Keng Hoon Gan. Email: email

(This article belongs to the Special Issue: Lightweight Methods and Resource-efficient Computing Solutions)

Computer Modeling in Engineering & Sciences 2024, 141(2), 1847-1865. https://doi.org/10.32604/cmes.2024.053373

Abstract

Feature selection is a crucial technique in text classification for improving the efficiency and effectiveness of classifiers or machine learning techniques by reducing the dataset’s dimensionality. This involves eliminating irrelevant, redundant, and noisy features to streamline the classification process. Various methods, from single feature selection techniques to ensemble filter-wrapper methods, have been used in the literature. Metaheuristic algorithms have become popular due to their ability to handle optimization complexity and the continuous influx of text documents. Feature selection is inherently multi-objective, balancing the enhancement of feature relevance, accuracy, and the reduction of redundant features. This research presents a two-fold objective for feature selection. The first objective is to identify the top-ranked features using an ensemble of three multi-univariate filter methods: Information Gain (Infogain), Chi-Square (Chi2), and Analysis of Variance (ANOVA). This aims to maximize feature relevance while minimizing redundancy. The second objective involves reducing the number of selected features and increasing accuracy through a hybrid approach combining Artificial Bee Colony (ABC) and Genetic Algorithms (GA). This hybrid method operates in a wrapper framework to identify the most informative subset of text features. Support Vector Machine (SVM) was employed as the performance evaluator for the proposed model, tested on two high-dimensional multiclass datasets. The experimental results demonstrated that the ensemble filter combined with the ABC+GA hybrid approach is a promising solution for text feature selection, offering superior performance compared to other existing feature selection algorithms.
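The ensemble filter stage described in the abstract can be illustrated with a small sketch. The abstract does not state how the three filter rankings are combined, so the mean-rank aggregation below is an assumption made for illustration, not the authors' method. The score values and the helper names `ranks` and `ensemble_top_k` are likewise hypothetical. In practice the scores would come from Infogain, Chi2, and ANOVA computed on a term-document matrix.

```python
from statistics import mean


def ranks(scores):
    """Map each feature to its rank under one filter (1 = highest score)."""
    # Ties are broken alphabetically so the result is deterministic.
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {f: i + 1 for i, f in enumerate(ordered)}


def ensemble_top_k(score_tables, k):
    """Average the per-filter ranks and return the k best-ranked features.

    Assumption: mean-rank aggregation; the paper may combine filters differently.
    """
    per_filter = [ranks(table) for table in score_tables]
    features = list(per_filter[0])
    mean_rank = {f: mean(r[f] for r in per_filter) for f in features}
    return sorted(features, key=lambda f: (mean_rank[f], f))[:k]


# Hypothetical filter scores for four terms (illustrative values only).
infogain = {"goal": 0.42, "match": 0.35, "the": 0.01, "vote": 0.30}
chi2 = {"goal": 18.0, "match": 9.5, "the": 0.2, "vote": 12.1}
anova = {"goal": 25.3, "match": 14.0, "the": 0.4, "vote": 11.2}

selected = ensemble_top_k([infogain, chi2, anova], k=2)
print(selected)
```

The reduced feature list produced by a step like this is what a wrapper stage (ABC+GA in the paper) would then search over, scoring each candidate subset with the SVM evaluator.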

Cite This Article

APA Style
Ige, O.P., Gan, K.H. (2024). Ensemble filter-wrapper text feature selection methods for text classification. Computer Modeling in Engineering & Sciences, 141(2), 1847-1865. https://doi.org/10.32604/cmes.2024.053373
Vancouver Style
Ige OP, Gan KH. Ensemble filter-wrapper text feature selection methods for text classification. Comput Model Eng Sci. 2024;141(2):1847-1865. https://doi.org/10.32604/cmes.2024.053373
IEEE Style
O.P. Ige and K.H. Gan, “Ensemble Filter-Wrapper Text Feature Selection Methods for Text Classification,” Comput. Model. Eng. Sci., vol. 141, no. 2, pp. 1847-1865, 2024. https://doi.org/10.32604/cmes.2024.053373



Copyright © 2024 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.