ARTICLE

Using Link-Based Consensus Clustering for Mixed-Type Data Analysis

by Tossapon Boongoen, Natthakan Iam-On*

Center of Excellence in Artificial Intelligence and Emerging Technologies, School of Information Technology, Mae Fah Luang University, Chiang Rai, 57100, Thailand

* Corresponding Author: Natthakan Iam-On.

(This article belongs to the Special Issue: Emerging Trends in Artificial Intelligence and Machine Learning)

Computers, Materials & Continua 2022, 70(1), 1993-2011. https://doi.org/10.32604/cmc.2022.019776

Abstract

Many modern data collections contain a mix of numerical and nominal data types. Examples include banking data, sales history and healthcare records, where both continuous attributes such as age and nominal ones such as blood type are used to characterize account details, business transactions or individuals. However, only a few standard clustering techniques and consensus clustering methods are available for examining such data thus far. Given this insight, the paper introduces novel extensions of the link-based cluster ensemble that are accurate for analyzing mixed-type data. They promote diversity within an ensemble by using different initializations of the k-prototypes algorithm as base clusterings, and then refine the summarized data using a link-based approach. Based on NMI (Normalized Mutual Information) averaged across different combinations of benchmark datasets and experimental settings, these new models reach an improved level of 0.34, while the best model found in the literature obtains only around 0.24. In addition, the parameter analysis included herein helps to enhance their performance even further, given the relations between clustering quality and algorithmic variables specific to the underlying link-based models. Moreover, another significant factor, ensemble size, is examined to justify a tradeoff between complexity and accuracy.
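
For readers unfamiliar with the evaluation metric named above, the following is a minimal sketch of how NMI between a clustering result and a set of reference labels can be computed. It is an illustration only and not the authors' evaluation code: the function names `entropy` and `nmi` are hypothetical, and the normalization by the geometric mean of the two entropies is an assumption, since the abstract does not state which NMI variant is used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Normalized mutual information between two label assignments.

    Assumption: mutual information is divided by the geometric mean of
    the two entropies; other normalizations exist in the literature.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # Mutual information summed over the non-empty cells of the contingency table.
    mi = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )

    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a == 0.0 and h_b == 0.0:
        return 1.0  # both assignments place every object in one cluster
    if h_a == 0.0 or h_b == 0.0:
        return 0.0  # one trivial assignment carries no information
    return mi / math.sqrt(h_a * h_b)


if __name__ == "__main__":
    # Toy labels, invented for illustration; not data from the paper.
    predicted = [0, 0, 1, 1, 2, 2]
    reference = ["a", "a", "b", "b", "b", "c"]
    print(round(nmi(predicted, reference), 3))
```

The score lies between 0 and 1 and does not depend on how clusters are named, so a consensus partition can be compared with class labels directly, and averaged scores such as those quoted above can be compared across methods.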

Copyright © 2022 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.