Tech Science Press - Publisher of Open Access Journals

Open Access

ARTICLE

Data-Driven Decision-Making for Bank Target Marketing Using Supervised Learning Classifiers on Imbalanced Big Data

Fahim Nasir¹, Abdulghani Ali Ahmed¹,*, Mehmet Sabir Kiraz¹, Iryna Yevseyeva¹, Mubarak Saif²

CMC-Computers, Materials & Continua, Vol.81, No.1, pp. 1703-1728, 2024, DOI:10.32604/cmc.2024.055192 - 15 October 2024

Abstract: Integrating machine learning and data mining is crucial for processing big data and extracting valuable insights to enhance decision-making. However, imbalanced target variables within big data present technical challenges that hinder the performance of supervised learning classifiers on key evaluation metrics, limiting their overall effectiveness. This study presents a comprehensive review of both common and recently developed Supervised Learning Classifiers (SLCs) and evaluates their performance in data-driven decision-making. The evaluation uses various metrics, with a particular focus on the F1 score (the harmonic mean of precision and recall), on an imbalanced real-world bank target marketing dataset. The findings indicate…
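As an illustration of why the abstract above emphasizes the F1 score rather than accuracy, the following minimal sketch computes both from confusion-matrix counts. The counts are hypothetical and are not taken from the paper or its bank marketing dataset; the function names are likewise assumptions made for this example.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical classifier on 1,000 samples, of which only 50 are positive.
tp, fp, fn, tn = 10, 5, 40, 945

acc = accuracy(tp, fp, fn, tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={acc:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With these made-up counts the accuracy looks high because the majority class dominates, while the F1 score stays low because most minority-class samples are missed.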

Open Access

ARTICLE

A Novel Framework for Learning and Classifying the Imbalanced Multi-Label Data

P. K. A. Chitra¹, S. Appavu alias Balamurugan², S. Geetha³, Seifedine Kadry⁴,⁵,⁶, Jungeun Kim⁷,*, Keejun Han⁸

Computer Systems Science and Engineering, Vol.48, No.5, pp. 1367-1385, 2024, DOI:10.32604/csse.2023.034373 - 13 September 2024

Abstract: Multi-label learning generalizes supervised single-label learning by assuming that each sample in a dataset may belong to more than one class simultaneously. The main objective of this work is to create a novel framework for learning and classifying imbalanced multi-label data. The proposed framework has two phases. In phase 1, the imbalanced distribution of the multi-label dataset is addressed through the proposed Borderline MLSMOTE resampling method. In phase 2, an adaptive weighted ℓ₂,₁-norm regularized (Elastic-net) multi-label logistic regression is used to predict unseen samples. The proposed…

Open Access

ARTICLE

Cost-Sensitive Dual-Stream Residual Networks for Imbalanced Classification

Congcong Ma¹,², Jiaqi Mi¹, Wanlin Gao¹,², Sha Tao¹,²,*

CMC-Computers, Materials & Continua, Vol.80, No.3, pp. 4243-4261, 2024, DOI:10.32604/cmc.2024.054506 - 12 September 2024

Abstract: Imbalanced data classification is the task of classifying datasets where there is a significant disparity in the number of samples between different classes. This task is prevalent in practical scenarios such as industrial fault diagnosis, network intrusion detection, and cancer detection. In imbalanced classification tasks, the focus is typically on achieving high recognition accuracy for the minority class. However, due to the challenges presented by imbalanced multi-class datasets, such as the scarcity of samples in minority classes and complex inter-class relationships with overlapping boundaries, existing methods often do not perform well in multi-class imbalanced data…

Open Access

ARTICLE

Learning Vector Quantization-Based Fuzzy Rules Oversampling Method

Jiqiang Chen, Ranran Han, Dongqing Zhang, Litao Ma*

CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 5067-5082, 2024, DOI:10.32604/cmc.2024.051494 - 20 June 2024

Abstract: Imbalanced datasets are common in practical applications, and oversampling methods using fuzzy rules have been shown to enhance the classification performance of imbalanced data by taking into account the relationship between data attributes. However, the creation of fuzzy rules typically depends on expert knowledge, which may not fully leverage the label information in training data and may be subjective. To address this issue, a novel fuzzy rule oversampling approach is developed based on the learning vector quantization (LVQ) algorithm. In this method, the label information of the training data is utilized to determine the antecedent…

Open Access

ARTICLE

An Imbalanced Data Classification Method Based on Hybrid Resampling and Fine Cost Sensitive Support Vector Machine

Bo Zhu*, Xiaona Jing, Lan Qiu, Runbo Li

CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 3977-3999, 2024, DOI:10.32604/cmc.2024.048062 - 20 June 2024

Abstract: When building a classification model, data imbalance is the scenario in which one class has significantly more samples than the other. Data imbalance causes the trained classification model to favor the majority class (usually defined as the negative class), which can harm accuracy on the minority class (usually defined as the positive class) and in turn lead to poor overall model performance. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, which is based on a new hybrid…

Open Access

ARTICLE

A Stacked Ensemble Deep Learning Approach for Imbalanced Multi-Class Water Quality Index Prediction

Wen Yee Wong¹, Khairunnisa Hasikin¹,*, Anis Salwa Mohd Khairuddin², Sarah Abdul Razak³, Hanee Farzana Hizaddin⁴, Mohd Istajib Mokhtar⁵, Muhammad Mokhzaini Azizan⁶

CMC-Computers, Materials & Continua, Vol.76, No.2, pp. 1361-1384, 2023, DOI:10.32604/cmc.2023.038045 - 30 August 2023

Abstract: A common difficulty in building prediction models with real-world environmental datasets is the skewed distribution of classes. There are significantly more samples for day-to-day classes, while rare events such as polluted classes are uncommon. Consequently, the limited availability of minority outcomes lowers the classifier's overall reliability. This study assesses the capability of machine learning (ML) algorithms in tackling imbalanced water quality data based on the metrics of precision, recall, and F1 score. It aims to counter accuracy figures that are misleadingly skewed toward the majority class. To that end, the performance of 10 ML algorithms is compared. The classifiers…

Open Access

ARTICLE

Machine Learning and Synthetic Minority Oversampling Techniques for Imbalanced Data: Improving Machine Failure Prediction

Yap Bee Wah¹,⁵,*, Azlan Ismail¹,², Nur Niswah Naslina Azid³, Jafreezal Jaafar⁴, Izzatdin Abdul Aziz⁴, Mohd Hilmi Hasan⁴, Jasni Mohamad Zain¹,²

CMC-Computers, Materials & Continua, Vol.75, No.3, pp. 4821-4841, 2023, DOI:10.32604/cmc.2023.034470 - 29 April 2023

Abstract: Prediction of machine failure is challenging because the dataset is often imbalanced, with a low failure rate. The common approach to classification involving imbalanced data is to balance the data using a sampling approach such as random undersampling, random oversampling, or Synthetic Minority Oversampling Technique (SMOTE) algorithms. This paper compares the classification performance of three popular classifiers (Logistic Regression, Gaussian Naïve Bayes, and Support Vector Machine) in predicting machine failure in the Oil and Gas industry. The original machine failure dataset consists of 20,473 hourly records and is imbalanced with 19,945 (97%) 'non-failure' and…
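Random oversampling, one of the sampling approaches named in the abstract above, can be sketched in a few lines of standard-library Python. This is an illustrative sketch on toy data, not the paper's implementation; the function name `random_oversample`, the seed, and the 3-versus-97 toy split are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [x for x, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap to the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y


# Toy data (hypothetical): 3 'failure' rows and 97 'non-failure' rows.
X = [[float(i)] for i in range(100)]
y = ["failure" if i < 3 else "non-failure" for i in range(100)]

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

Resampling of this kind is normally applied to the training split only, so that duplicated minority samples do not leak into the evaluation data.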

Open Access

ARTICLE

Fault Diagnosis of Power Transformer Based on Improved ACGAN Under Imbalanced Data

Tusongjiang. Kari¹, Lin Du¹, Aisikaer. Rouzi², Xiaojing Ma¹,*, Zhichao Liu¹, Bo Li¹

CMC-Computers, Materials & Continua, Vol.75, No.2, pp. 4573-4592, 2023, DOI:10.32604/cmc.2023.037954 - 31 March 2023

Abstract: The imbalance of dissolved gas analysis (DGA) data leads to over-fitting, weak generalization, and poor recognition performance in deep learning-based fault diagnosis models. To handle this problem, this paper proposes a novel transformer fault diagnosis method based on an improved auxiliary classifier generative adversarial network (ACGAN) under imbalanced data, which both balances DGA data and supplies accurate diagnosis results. The generator combines one-dimensional convolutional neural networks (1D-CNN) and long short-term memory (LSTM) networks, which can deeply extract the features from DGA samples and be greatly beneficial to ACGAN's…

Open Access

ARTICLE

Imbalanced Data Classification Using SVM Based on Improved Simulated Annealing Featuring Synthetic Data Generation and Reduction

Hussein Ibrahim Hussein¹, Said Amirul Anwar²,*, Muhammad Imran Ahmad²

CMC-Computers, Materials & Continua, Vol.75, No.1, pp. 547-564, 2023, DOI:10.32604/cmc.2023.036025 - 06 February 2023

Abstract: Imbalanced data classification is one of the major problems in machine learning. An imbalanced dataset typically has significant differences in the number of data samples between its classes. In most cases, the performance of a machine learning algorithm such as the Support Vector Machine (SVM) is affected when dealing with an imbalanced dataset. The classification accuracy is mostly skewed toward the majority class, and poor results are exhibited in the prediction of minority-class samples. In this paper, a hybrid approach combining a data pre-processing technique and an SVM algorithm based on improved Simulated Annealing (SA) is proposed. Firstly,…

Open Access

ARTICLE

LexDeep: Hybrid Lexicon and Deep Learning Sentiment Analysis Using Twitter for Unemployment-Related Discussions During COVID-19

Azlinah Mohamed¹,³,*, Zuhaira Muhammad Zain², Hadil Shaiba²,*, Nazik Alturki², Ghadah Aldehim², Sapiah Sakri², Saiful Farik Mat Yatin¹, Jasni Mohamad Zain¹

CMC-Computers, Materials & Continua, Vol.75, No.1, pp. 1577-1601, 2023, DOI:10.32604/cmc.2023.034746 - 06 February 2023

Abstract: The COVID-19 pandemic has spread globally, resulting in financial instability in many countries and reductions in the per capita gross domestic product. Sentiment analysis is a cost-effective method for acquiring sentiments based on household income loss, as expressed on social media. However, limited research has been conducted in this domain using the LexDeep approach. This study aimed to explore social trend analytics using LexDeep, which is a hybrid sentiment analysis technique, on Twitter to capture the risk of household income loss during the COVID-19 pandemic. First, tweet data were collected using Twint with relevant keywords…
