Open Access
ARTICLE
An Automatic Threshold Selection Using ALO for Healthcare Duplicate Record Detection with Reciprocal Neuro-Fuzzy Inference System
1 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, KSA
2 Department of Electronics and Communication Engineering, National Institute of Technical Teacher Training and Research, Chandigarh, 160019, India
3 Department of Periodontology, JSS Dental College & Hospital, JSS Academy of Higher Education and Research, Mysuru, 570015, India
4 Department of Computer Science and Engineering, Amity University, Dubai, 345019, United Arab Emirates
5 Department of Management/Economics, Prin LN Welingkar Institute of Management Development and Research, Mumbai, 400019, India
6 Department of Management Studies, Indian Institute of Information Technology, Allahabad, 211012, India
* Corresponding Author: Ala Saleh Alluhaidan. Email:
Computers, Materials & Continua 2023, 74(3), 5821-5836. https://doi.org/10.32604/cmc.2023.033995
Received 03 July 2022; Accepted 28 September 2022; Issue published 28 December 2022
Abstract
Systems based on electronic health records (EHRs) have been in use for many years, and their impact has been increasingly felt in recent years. They remain pioneering collections of massive volumes of health data. Duplicate detection involves discovering records that refer to the same real-world entity, a task that generally depends on several input parameters supplied by experts. Record linkage addresses the problem of finding identical records across different data sources. The similarity between two records is characterized by domain-based similarity functions over different features. De-duplication of a single dataset, or the linkage of multiple datasets, has become a highly significant operation in the data processing stages of many data mining programmes. The objective is to match all records associated with the same entity. Various measures have been used to represent the quality and complexity of data linkage algorithms, and many novel metrics have been introduced. An outline of the problems in measuring data linkage and de-duplication quality and complexity is presented. This article focuses on the reprocessing of health data that is horizontally divided among data custodians, where the custodians provide similar features for sets of patients. The first step of the technique is the automatic selection of high-quality training examples from the compared record pairs, and the second step is training the reciprocal neuro-fuzzy inference system (RANFIS) classifier. For the optimal threshold classifier, it is presumed that the original match status is known for all compared record pairs, so an optimal threshold can be computed, here using Ant Lion Optimization (ALO), based on the respective RANFIS.
The proposed method is evaluated on the Febrl, Clinical Decision (CD), and Cork Open Research Archive (CORA) data repositories and benchmarked against current techniques.
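The threshold idea described in the abstract can be illustrated with a minimal sketch. It assumes the true match status of every compared record pair is known, scores each pair with a simple mean string similarity, and picks the threshold that maximizes the F-measure. The exhaustive sweep stands in for the Ant Lion Optimization search, and the string similarity stands in for the RANFIS classifier; neither is the authors' implementation, and all function names and the toy data are hypothetical.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two field values."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_similarity(rec_a: dict, rec_b: dict, fields: list) -> float:
    """Mean per-field similarity (a stand-in for a learned classifier score)."""
    return sum(field_similarity(rec_a[f], rec_b[f]) for f in fields) / len(fields)


def f_measure(scores, labels, threshold) -> float:
    """F1 when pairs scoring at or above the threshold are called matches."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Exhaustive sweep over observed scores (assumption: replaces ALO).

    Returns (threshold, f1); ties go to the lowest threshold.
    """
    candidates = sorted(set(scores))
    return max(((t, f_measure(scores, labels, t)) for t in candidates),
               key=lambda pair: pair[1])


# Hypothetical toy data: pair scores with known match status.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [True, True, False, False]
toy_result = best_threshold(toy_scores, toy_labels)  # (0.8, 1.0)
```

On real data the known match status would come from a labelled benchmark, and a metaheuristic such as ALO becomes useful once the search space is larger than a single scalar threshold.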
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.