Search Results (22)
  • Open Access

    ARTICLE

    Learning Vector Quantization-Based Fuzzy Rules Oversampling Method

    Jiqiang Chen, Ranran Han, Dongqing Zhang, Litao Ma*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 5067-5082, 2024, DOI:10.32604/cmc.2024.051494

    Abstract: Imbalanced datasets are common in practical applications, and oversampling methods using fuzzy rules have been shown to enhance the classification performance of imbalanced data by taking into account the relationship between data attributes. However, the creation of fuzzy rules typically depends on expert knowledge, which may not fully leverage the label information in training data and may be subjective. To address this issue, a novel fuzzy rule oversampling approach is developed based on the learning vector quantization (LVQ) algorithm. In this method, the label information of the training data is utilized to determine the antecedent…

  • Open Access

    ARTICLE

    An Imbalanced Data Classification Method Based on Hybrid Resampling and Fine Cost Sensitive Support Vector Machine

    Bo Zhu*, Xiaona Jing, Lan Qiu, Runbo Li

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 3977-3999, 2024, DOI:10.32604/cmc.2024.048062

    Abstract: When building a classification model, the scenario in which one class has significantly more samples than the other is called data imbalance. Data imbalance causes the trained classification model to favor the majority class (usually defined as the negative class), which may harm the accuracy on the minority class (usually defined as the positive class) and in turn lead to poor overall performance of the model. A method called MSHR-FCSSVM for solving imbalanced data classification is proposed in this article, which is based on a new hybrid…

  • Open Access

    ARTICLE

    A Stacked Ensemble Deep Learning Approach for Imbalanced Multi-Class Water Quality Index Prediction

    Wen Yee Wong, Khairunnisa Hasikin*, Anis Salwa Mohd Khairuddin, Sarah Abdul Razak, Hanee Farzana Hizaddin, Mohd Istajib Mokhtar, Muhammad Mokhzaini Azizan

    CMC-Computers, Materials & Continua, Vol.76, No.2, pp. 1361-1384, 2023, DOI:10.32604/cmc.2023.038045

    Abstract: A common difficulty in building prediction models with real-world environmental datasets is the skewed distribution of classes. There are significantly more samples for day-to-day classes, while rare events such as polluted classes are uncommon. Consequently, the limited availability of minority outcomes lowers the classifier’s overall reliability. This study assesses the capability of machine learning (ML) algorithms in tackling imbalanced water quality data based on the metrics of precision, recall, and F1 score. It aims to correct for accuracy that is misleadingly skewed toward the majority class. Hence, the performance of 10 ML algorithms is compared. The classifiers…
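The abstract above names precision, recall, and F1 score as its evaluation metrics. As a minimal sketch of why these are preferred over plain accuracy on skewed data, the following computes them for a single class from label lists. The function name and the "clean"/"polluted" labels are illustrative assumptions, not taken from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F1) for the class `positive`.

    Hypothetical helper for illustration; ratios with a zero
    denominator are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels: a classifier that always predicts the majority class
# is 90% accurate here, yet never finds the rare class.
y_true = ["clean"] * 9 + ["polluted"]
y_pred = ["clean"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
minority_scores = precision_recall_f1(y_true, y_pred, "polluted")
```

In this toy case the accuracy looks high while all three minority-class scores are zero, which is the kind of misleading accuracy the abstract refers to.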

  • Open Access

    ARTICLE

    Machine Learning and Synthetic Minority Oversampling Techniques for Imbalanced Data: Improving Machine Failure Prediction

    Yap Bee Wah*, Azlan Ismail, Nur Niswah Naslina Azid, Jafreezal Jaafar, Izzatdin Abdul Aziz, Mohd Hilmi Hasan, Jasni Mohamad Zain

    CMC-Computers, Materials & Continua, Vol.75, No.3, pp. 4821-4841, 2023, DOI:10.32604/cmc.2023.034470

    Abstract: Prediction of machine failure is challenging as the dataset is often imbalanced with a low failure rate. The common approach to handle classification involving imbalanced data is to balance the data using a sampling approach such as random undersampling, random oversampling, or Synthetic Minority Oversampling Technique (SMOTE) algorithms. This paper compared the classification performance of three popular classifiers (Logistic Regression, Gaussian Naïve Bayes, and Support Vector Machine) in predicting machine failure in the Oil and Gas industry. The original machine failure dataset consists of 20,473 hourly records and is imbalanced, with 19,945 (97%) ‘non-failure’ and…
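The abstract above mentions random oversampling and SMOTE as ways to balance data. The sketch below illustrates both ideas in plain Python under stated assumptions: a binary problem, numeric feature lists, and a simplified SMOTE-style interpolation between a minority sample and one of its nearest minority neighbours. The function names, parameters, and toy data are hypothetical; this is not the paper's implementation or any library's API.

```python
import random


def random_oversample(X, y, minority, seed=0):
    """Duplicate random minority samples until both classes are equal in size.

    Assumes a binary problem: every label other than `minority` counts
    as the majority class.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    if not minority_idx:
        raise ValueError("no minority samples to duplicate")
    n_majority = len(y) - len(minority_idx)
    X_out, y_out = list(X), list(y)
    for _ in range(n_majority - len(minority_idx)):
        i = rng.choice(minority_idx)
        X_out.append(X[i])
        y_out.append(minority)
    return X_out, y_out


def smote_like(X_min, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a random
    minority sample and one of its k nearest minority neighbours.

    Simplified illustration of the SMOTE idea, not a full implementation.
    """
    if len(X_min) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_i = rng.randrange(len(X_min))
        base = X_min[base_i]
        others = [p for j, p in enumerate(X_min) if j != base_i]
        # Sort the remaining minority points by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data (made up): four majority samples, two minority samples.
X = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = random_oversample(X, y, minority=1)
new_points = smote_like([[10.0], [11.0]], n_new=5, k=1)
```

Random oversampling only repeats existing minority rows, whereas the interpolation step produces new points that lie between existing minority samples.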

  • Open Access

    ARTICLE

    Fault Diagnosis of Power Transformer Based on Improved ACGAN Under Imbalanced Data

    Tusongjiang. Kari, Lin Du, Aisikaer. Rouzi, Xiaojing Ma*, Zhichao Liu, Bo Li

    CMC-Computers, Materials & Continua, Vol.75, No.2, pp. 4573-4592, 2023, DOI:10.32604/cmc.2023.037954

    Abstract: The imbalance of dissolved gas analysis (DGA) data will lead to over-fitting, weak generalization and poor recognition performance for fault diagnosis models based on deep learning. To handle this problem, a novel transformer fault diagnosis method based on improved auxiliary classifier generative adversarial network (ACGAN) under imbalanced data is proposed in this paper, which meets both the requirements of balancing DGA data and supplying accurate diagnosis results. The generator combines one-dimensional convolutional neural networks (1D-CNN) and long short-term memory (LSTM), which can deeply extract the features from DGA samples and be greatly beneficial to ACGAN’s…

  • Open Access

    ARTICLE

    Imbalanced Data Classification Using SVM Based on Improved Simulated Annealing Featuring Synthetic Data Generation and Reduction

    Hussein Ibrahim Hussein, Said Amirul Anwar*, Muhammad Imran Ahmad

    CMC-Computers, Materials & Continua, Vol.75, No.1, pp. 547-564, 2023, DOI:10.32604/cmc.2023.036025

    Abstract: Imbalanced data classification is one of the major problems in machine learning. An imbalanced dataset typically has significant differences in the number of data samples between its classes. In most cases, the performance of a machine learning algorithm such as the Support Vector Machine (SVM) is affected when dealing with an imbalanced dataset. The classification accuracy is mostly skewed toward the majority class and poor results are exhibited in the prediction of minority-class samples. In this paper, a hybrid approach combining a data pre-processing technique and an SVM algorithm based on improved Simulated Annealing (SA) is proposed. Firstly,…

  • Open Access

    ARTICLE

    LexDeep: Hybrid Lexicon and Deep Learning Sentiment Analysis Using Twitter for Unemployment-Related Discussions During COVID-19

    Azlinah Mohamed*, Zuhaira Muhammad Zain, Hadil Shaiba*, Nazik Alturki, Ghadah Aldehim, Sapiah Sakri, Saiful Farik Mat Yatin, Jasni Mohamad Zain

    CMC-Computers, Materials & Continua, Vol.75, No.1, pp. 1577-1601, 2023, DOI:10.32604/cmc.2023.034746

    Abstract: The COVID-19 pandemic has spread globally, resulting in financial instability in many countries and reductions in the per capita gross domestic product. Sentiment analysis is a cost-effective method for acquiring sentiments based on household income loss, as expressed on social media. However, limited research has been conducted in this domain using the LexDeep approach. This study aimed to explore social trend analytics using LexDeep, which is a hybrid sentiment analysis technique, on Twitter to capture the risk of household income loss during the COVID-19 pandemic. First, tweet data were collected using Twint with relevant keywords…

  • Open Access

    ARTICLE

    An Effective Classifier Model for Imbalanced Network Attack Data

    Gürcan Çetin*

    CMC-Computers, Materials & Continua, Vol.73, No.3, pp. 4519-4539, 2022, DOI:10.32604/cmc.2022.031734

    Abstract: Recently, machine learning algorithms have been used in the detection and classification of network attacks. The performance of the algorithms has been evaluated by using benchmark network intrusion datasets such as DARPA98, KDD’99, NSL-KDD, UNSW-NB15, and Caida DDoS. However, these datasets have two major challenges: imbalanced data and high-dimensional data. Obtaining high accuracy for all attack types in the dataset allows for high accuracy in imbalanced datasets. On the other hand, having a large number of features increases the runtime load on the algorithms. A novel model is proposed in this paper to overcome these…

  • Open Access

    ARTICLE

    MCBC-SMOTE: A Majority Clustering Model for Classification of Imbalanced Data

    Jyoti Arora, Meena Tushir, Keshav Sharma, Lalit Mohan, Aman Singh*, Abdullah Alharbi, Wael Alosaimi

    CMC-Computers, Materials & Continua, Vol.73, No.3, pp. 4801-4817, 2022, DOI:10.32604/cmc.2022.025960

    Abstract: Datasets with an imbalanced class distribution are difficult to handle with standard classification algorithms. In supervised learning, dealing with class imbalance is still considered a challenging research problem. Various machine learning techniques are designed to operate on balanced datasets; therefore, different under-sampling, over-sampling and hybrid strategies have been proposed in the state of the art to deal with imbalanced datasets, but highly skewed datasets still pose the problem of generalization and noise generation during resampling. To overcome these problems, this paper proposes a majority clustering model for…

  • Open Access

    ARTICLE

    An Imbalanced Dataset and Class Overlapping Classification Model for Big Data

    Mini Prince*, P. M. Joe Prathap

    Computer Systems Science and Engineering, Vol.44, No.2, pp. 1009-1024, 2023, DOI:10.32604/csse.2023.024277

    Abstract: Most modern technologies, such as social media, smart cities, and the internet of things (IoT), rely on big data. When big data is used in real-world applications, two data challenges arise: class overlap and class imbalance. When dealing with large datasets, most traditional classifiers are stuck in the local optimum problem. As a result, it’s necessary to look into new methods for dealing with large data collections. Several solutions have been proposed for overcoming this issue. The rapid growth of the available data threatens to limit the usefulness of many traditional methods.…

Displaying results 1-10 of 22 (page 1).