Open Access
ARTICLE
Adaptive Density-Based Spatial Clustering of Applications with Noise (ADBSCAN) for Clusters of Different Densities
1 Computer Science Department, Prince Sattam bin Abdulaziz University, Aflaj, Saudi Arabia
2 Computer Science Department, Faculty of Computers and Information, Suez University, Suez, Egypt
* Corresponding Author: Ahmed Fahim.
Computers, Materials & Continua 2023, 75(2), 3695-3712. https://doi.org/10.32604/cmc.2023.036820
Received 12 October 2022; Accepted 30 January 2023; Issue published 31 March 2023
Abstract
Density-based methods form a significant class of clustering algorithms. These methods can discover clusters of various shapes and sizes. The most studied algorithm in this class is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). It identifies clusters by grouping densely connected objects into one group and discarding noise objects. It requires two input parameters: epsilon (a fixed neighborhood radius) and MinPts (the minimum number of objects within the epsilon neighborhood). However, it cannot handle clusters of varied densities because it uses a single global value for epsilon. This article proposes an adaptation of DBSCAN that discovers clusters of varied densities while reducing the number of required input parameters to one. The only user input in the proposed method is MinPts; epsilon is computed automatically from statistical information about the dataset. The proposed method finds the core distance of each object in the dataset, takes the average of these distances as the first value of epsilon, and finds the clusters that satisfy this density level. The remaining unclustered objects are then clustered using a new value of epsilon equal to the average core distance of the unclustered objects. This process continues until all objects have been clustered or the remaining unclustered objects number fewer than 0.006 of the dataset's size. Benchmark datasets were used to evaluate the effectiveness of the proposed method, which produced promising results. Practical experiments demonstrate the ability of the proposed method to detect clusters of different densities even when there is no separation between them. The accuracy of the method ranges from 92% to 100% on the tested datasets.
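The iterative procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not settle: the core distance is taken to be the distance to the MinPts-th nearest neighbour with the object itself counted (the OPTICS convention); core distances are computed once on the full dataset; each pass searches neighbourhoods among the still-unclustered objects only; and a no-progress check plus a `max_levels` cap are added so the loop always terminates. The names `core_distances`, `dbscan_pass`, `adbscan_sketch`, `stop_fraction`, and `max_levels` are hypothetical.

```python
import math


def core_distances(points, min_pts):
    """Distance from each point to its min_pts-th nearest neighbour.

    Assumption: the point itself counts as the first neighbour.
    Requires len(points) >= min_pts.
    """
    result = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        result.append(dists[min_pts - 1])
    return result


def dbscan_pass(points, candidates, eps, min_pts, labels, next_label):
    """One DBSCAN pass over the unclustered `candidates` (assumed scope)."""

    def neighbours(i):
        return [j for j in candidates
                if math.dist(points[i], points[j]) <= eps]

    for i in candidates:
        if labels[i] != -1:
            continue  # already absorbed into a cluster during this pass
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            continue  # not a core object at this density level
        labels[i] = next_label
        stack = list(seeds)
        while stack:
            j = stack.pop()
            if labels[j] != -1:
                continue
            labels[j] = next_label
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is core, so keep expanding
                stack.extend(reachable)
        next_label += 1
    return next_label


def adbscan_sketch(points, min_pts, stop_fraction=0.006, max_levels=50):
    """Cluster level by level; epsilon is the mean core distance of the
    objects that are still unclustered. Label -1 means unclustered."""
    n = len(points)
    core = core_distances(points, min_pts)
    labels = [-1] * n
    next_label = 0
    eps_levels = []
    for _ in range(max_levels):  # safety cap, not part of the abstract
        remaining = [i for i in range(n) if labels[i] == -1]
        if len(remaining) < min_pts or len(remaining) < stop_fraction * n:
            break
        eps = sum(core[i] for i in remaining) / len(remaining)
        eps_levels.append(eps)
        before = next_label
        next_label = dbscan_pass(points, remaining, eps, min_pts,
                                 labels, next_label)
        if next_label == before:
            break  # no new cluster at this level (added termination guard)
    return labels, eps_levels


if __name__ == "__main__":
    # Toy data: a dense 5x5 grid and a sparse 4x4 grid far away from it.
    dense = [(float(x), float(y)) for x in range(5) for y in range(5)]
    sparse = [(100.0 + 10 * x, 100.0 + 10 * y)
              for x in range(4) for y in range(4)]
    labels, eps_levels = adbscan_sketch(dense + sparse, min_pts=4)
    print("epsilon per level:", eps_levels)
    print("labels:", labels)
```

On the toy data, the first epsilon is small enough that only the dense grid qualifies as a cluster; the second epsilon, averaged over the sparse objects alone, is larger and picks up the sparse grid. This is the behaviour that a single global epsilon cannot reproduce. The sketch computes all pairwise distances directly and is intended for illustration on small inputs only.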
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.