Open Access
ARTICLE
An Effective Feature Generation and Selection Approach for Lymph Disease Recognition
1 School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, 210044, China
2 Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, China
* Corresponding Author: Sunil Kr. Jha. Email:
Computer Modeling in Engineering & Sciences 2021, 129(2), 567-594. https://doi.org/10.32604/cmes.2021.016817
Received 29 March 2021; Accepted 15 July 2021; Issue published 08 October 2021
Abstract
Health care data mining plays an important role in disease diagnosis and recognition. There is still room to improve the performance of machine learning-based classification methods in health care data analysis, and selecting an informative subset of features is one feasible way to improve their recognition results in disease diagnosis prediction. In the present study, a novel combined approach of feature generation using latent semantic analysis (LSA) and feature selection using ranker search (RAS) is proposed to improve the performance of classification methods in lymph disease diagnosis prediction. The proposed combined approach (LSA-RAS) is validated using three function-based and two tree-based classification methods. On a widely used, open-access benchmark lymph disease dataset, the LSA-RAS selected features are compared with the original attributes and with subsets of attributes and features chosen by nine different attribute and feature selection approaches. The LSA-RAS selected features significantly improve the recognition accuracy of the classification methods in lymph disease diagnosis prediction. The tree-based classification methods achieve better recognition accuracy than the function-based classification methods. The best performance (recognition accuracy of 93.91%) is achieved by the logistic model tree (LMT) classification method using the feature subset generated by the proposed combined approach (LSA-RAS).
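The two-stage idea described in the abstract (generate latent features, then rank them and keep the best) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`lsa_features`, `rank_features`, `lsa_ras`), the mean-centring step, the between-class to within-class variance ratio used as the ranking score, and the synthetic data are all assumptions made for illustration; the abstract does not specify the ranking criterion or the software used.

```python
import numpy as np

def lsa_features(X, n_components):
    """Generate latent features from a truncated SVD of the data matrix."""
    Xc = X - X.mean(axis=0)  # centring is an assumption of this sketch
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    # Sample coordinates along the top singular directions.
    return U[:, :n_components] * s[:n_components]

def rank_features(Z, y):
    """Rank columns of Z by a between/within class variance ratio (assumed score)."""
    overall = Z.mean(axis=0)
    between = np.zeros(Z.shape[1])
    within = np.zeros(Z.shape[1])
    for c in np.unique(y):
        Zc = Z[y == c]
        between += len(Zc) * (Zc.mean(axis=0) - overall) ** 2
        within += ((Zc - Zc.mean(axis=0)) ** 2).sum(axis=0)
    scores = between / (within + 1e-12)  # small constant avoids division by zero
    order = np.argsort(scores)[::-1]     # best-scoring feature first
    return order, scores

def lsa_ras(X, y, n_components, n_keep):
    """Generate latent features, rank them, and keep the n_keep best."""
    Z = lsa_features(X, n_components)
    order, _ = rank_features(Z, y)
    keep = order[:n_keep]
    return Z[:, keep], keep

# Synthetic stand-in data (not the lymph disease dataset): 60 samples, 8 attributes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 8))
X[:, 0] += 3.0 * y  # make one attribute depend on the class label

Z_sel, kept = lsa_ras(X, y, n_components=5, n_keep=2)
print(Z_sel.shape)  # (60, 2)
```

The selected columns in `Z_sel` would then be passed to a classifier. When estimating accuracy, the SVD and the ranking should be fitted on the training portion only, since ranking with labels from the test samples would bias the result.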
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.