
Open Access

ARTICLE

DCS-SOCP-SVM: A Novel Integrated Sampling and Classification Algorithm for Imbalanced Datasets

Xuewen Mu*, Bingcong Zhao
School of Mathematics and Statistics, Xidian University, Xi’an, 710071, China
* Corresponding Author: Xuewen Mu.

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.060739

Received 08 November 2024; Accepted 28 January 2025; Published online 20 March 2025

Abstract

When applied to imbalanced datasets, the traditional support vector machine (SVM) tends to produce a classification hyperplane that is biased towards the majority class and therefore exhibits poor robustness. This paper proposes a classification algorithm designed specifically for imbalanced datasets. The proposed method first uses a biased second-order cone programming support vector machine (B-SOCP-SVM) to identify the support vectors (SVs) and non-support vectors (NSVs) in the imbalanced data. It then applies the synthetic minority over-sampling technique to the support vectors of the minority class (SV-SMOTE) and repeatedly applies random under-sampling to the non-support vectors of the majority class (NSV-RUS). Combining the resulting minority-class dataset with each of the under-sampled majority-class datasets yields multiple new balanced datasets. Finally, SOCP-SVM is used to classify each balanced dataset, and the final result is obtained through an integrated (ensemble) algorithm. Experimental results demonstrate that the proposed method performs well on imbalanced datasets.
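
The resampling stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the SV/NSV split has already been produced by some classifier (B-SOCP-SVM itself is not implemented here), that the majority-class support vectors are kept in full, that a common class size `target` is chosen by the user, and that the ensemble combines labels in {+1, -1} by simple majority voting. All function names and parameters (`sv_smote`, `nsv_rus`, `build_balanced_sets`, `majority_vote`, `k`, `target`) are hypothetical.

```python
import random


def sv_smote(minority_svs, n_new, k, rng):
    """SMOTE-style oversampling restricted to minority support vectors.

    Each synthetic point lies on the segment between a randomly chosen SV
    and one of its k nearest SV neighbours. Needs at least two SVs.
    """
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_svs)
        others = sorted(
            (p for p in minority_svs if p is not base),
            key=lambda p: sum((x - y) ** 2 for x, y in zip(base, p)),
        )
        neighbor = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(base, neighbor)])
    return synthetic


def nsv_rus(majority_svs, majority_nsvs, target, rng):
    """Random under-sampling of majority NSVs (assumption: all SVs are kept)."""
    n_keep = max(0, min(len(majority_nsvs), target - len(majority_svs)))
    return majority_svs + rng.sample(majority_nsvs, n_keep)


def build_balanced_sets(min_svs, min_nsvs, maj_svs, maj_nsvs,
                        target, n_sets, k=3, seed=0):
    """Pair one oversampled minority set with n_sets under-sampled majority sets."""
    rng = random.Random(seed)
    n_new = max(0, target - len(min_svs) - len(min_nsvs))
    minority = min_svs + min_nsvs + sv_smote(min_svs, n_new, k, rng)
    return [(minority, nsv_rus(maj_svs, maj_nsvs, target, rng))
            for _ in range(n_sets)]


def majority_vote(labels):
    """Combine base-classifier labels in {+1, -1}; ties go to +1 (assumption)."""
    return 1 if sum(labels) >= 0 else -1
```

In this sketch each returned pair would be passed to one base classifier (SOCP-SVM in the paper), and the per-classifier predictions for a test point would then be combined with `majority_vote`. The oversampled minority set is shared across all pairs, which mirrors the abstract's description of combining one minority-class dataset with multiple majority-class datasets.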

Keywords

DCS-SOCP-SVM; imbalanced datasets; sampling method; ensemble method; integrated algorithm