Computers, Materials & Continua DOI:10.32604/cmc.2021.014604
Review
Medical Diagnosis Using Machine Learning: A Statistical Review
1Lovely Professional University, Jalandhar, 144411, India
2Department of Information Systems, Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
3Software Department, Sejong University, Seoul, 05006, Korea
4Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, 38541, Korea
5Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, M15 6BH, UK
*Corresponding Author: Oh-Young Song. Email: oysong@sejong.edu
Received: 02 October 2020; Accepted: 20 October 2020
Abstract: Decision making in medical diagnosis is a complicated process. A large number of overlapping structures and cases, together with distractions, tiredness, and the limitations of the human visual system, can lead to inappropriate diagnosis. Machine learning (ML) methods have been employed to assist clinicians in overcoming these limitations and in making informed and correct decisions in disease diagnosis. Academic papers on the use of ML for disease diagnosis are being published in increasing numbers. Hence, to determine how ML is used to improve diagnosis across medical disciplines, a systematic review is conducted in this study. To carry out the review, six databases are selected. Inclusion and exclusion criteria are employed to limit the scope of the research. The eligible articles are then classified by publication year, authors, type of article, research objective, inputs and outputs, problem and research gaps, and findings and results. The selected articles are then analyzed to show the impact of ML methods in improving disease diagnosis. The findings of this study show the most used ML methods and the diseases most commonly studied by researchers. They also show the increase in the use of ML for disease diagnosis over the years. These results will help direct attention to neglected areas and identify ways in which ML methods could be employed to achieve desirable results.
Keywords: Decision making; disease diagnosis; machine learning; medical disciplines
Diagnosis is the classification tool of medicine and is fundamental to how medicine performs its role in society. It is central to the medical system. It organizes disease by defining care choices, forecasting results, and offering an informative mechanism [1]. Appropriate and effective treatment usually depends on a thorough diagnosis. Improvements in diagnostic testing and imaging have certainly improved the entire process of diagnosis. But human clinical judgment leading to a correct diagnosis remains key to high-quality medical services even in this era of rapid technical transition [2]. However, diagnostic errors that harm patients occur frequently. Multiple factors generally give rise to diagnostic errors, usually including both perceptual and system-related causes. Common factors include misjudging the significance of observations, misinterpretation, errors originating from the use of heuristics, and errors in judgment, particularly when diagnostic hypotheses are developed and assessed [3–5]. As treatment options become more effective and more expensive, the health and financial cost of misdiagnosing an easily curable illness grows considerably, and the opportunity for improved patient care is lost [6].
These diagnostic errors could be minimized using techniques such as fuzzy logic [7] or machine learning (ML), thereby improving healthcare services. The analytics that ML offers a clinician at the time of patient treatment can provide more knowledge and, thus, better care [8]. ML addresses the question of how to design systems that improve continuously with experience. It is one of today's fastest-growing technical disciplines, standing at the junction of computing and analytics and at the heart of artificial intelligence (AI) and data science [9]. To date, the primary winners of the 21st-century boom in big data, ML, and data science have been markets able to obtain such data and employ the workers needed to transform their products. The algorithms built in and around these markets offer considerable potential for improving medical research and clinical care, particularly now that clinicians widely use electronic health records (EHR). Diagnosis and outcome estimation are two areas of the healthcare sector that benefit from ML techniques [10]. ML can not only handle varying combinations of raw data and apply context weighting but also measure the predictive capacity of any possible combination of factors for determining diagnostic and prognostic components [11]. For example, ML models built on clinical data can assist clinicians with a ‘second opinion’ by diagnosing aphasia speech type [12] or urinary tract infection [13], or by predicting breast cancer [14], among others. ML can process data sets far beyond the limits of human ability and convert that data into clinical knowledge that helps doctors prepare and deliver treatment, eventually leading to improved results, lower medical costs, and enhanced patient satisfaction.
ML already underpins the creation of guidelines for precision medicine, treatment counsel, and disease diagnosis [8]. These capabilities are also used in the healthcare internet of things (H-IoT) to analyze and process the massive amount of healthcare data generated by sensors [15]. Consequently, extensive research has examined the usefulness of ML in the treatment of specific diseases. The main aim of this paper is therefore to analyze, through a systematic review, the studies in which ML approaches are applied across different medical fields and diseases, in order to determine their patterns of use and their usefulness in disease diagnosis. Tab. 1 highlights the uniqueness of our paper through a comparative analysis with other published review papers in the medical domain. To the best of our knowledge, this paper provides an in-depth analysis of the use of ML in disease diagnosis that covers all the major medical domains.
1.1 Comparison with Related Survey Articles
Yanase et al. [16] presented a survey of medicine from a computer-aided diagnostic (CAD) system perspective. The article covers the workflow of CAD systems in depth, along with their history. It also presents applications in the medical domain by data type, including tabular, imaging, sound, and signal data. Caballe et al. [17] detailed the benefits and limitations of using different ML methods in disease diagnosis. The paper covers classification, regression, and clustering techniques. However, it does not summarize the literature or analyze the reviewed articles in depth. Jiang et al. [18] surveyed research articles in healthcare from an AI perspective. In addition to ML, the paper covers natural language processing techniques applied in healthcare. It covers only three medical domains: cancer, neurology, and cardiology. Schaefer [19] presented an overview of the application of ML in rare diseases, reviewing healthcare articles that cover diagnosis, prognosis, and treatment. None of these articles summarizes the existing work. They also cover only a few medical domains and do not provide an in-depth analysis of the reviewed articles.
1.2 The Process of Applying Machine Learning in Disease Diagnosis
Medical diagnosis is a complex task that is largely treated as an empirical task but poorly understood as a cognitive one [20]. Complex as it is, diagnosis using a computer, i.e., using ML in our case, can be divided into multiple steps. The first step of disease diagnosis is data acquisition. The data can take varied forms, including but not limited to medical interviews, clinical, demographic, imaging, and speech data, patient history, or even heart sounds [21–23]. The next step is preprocessing, in which the data is prepared: missing values are handled, dimensionality is reduced, noisy data is dealt with, and so on [24,25]. Next, the target variable and the predictors are identified. The data is then fed to a model for training. Once the model is trained, it is used for diagnosis.
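The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from any reviewed article: the toy patient records, the two features (age and resting heart rate), and the nearest-centroid classifier are all hypothetical choices made to keep the example free of external libraries.

```python
from statistics import mean, pstdev

# Step 1: data acquisition (hypothetical toy records; None marks a missing value).
# Each row is (age, resting heart rate); label 1 = disease, 0 = healthy.
rows = [(63, 95.0), (58, None), (70, 102.0), (35, 68.0), (29, 72.0), (41, None)]
labels = [1, 1, 1, 0, 0, 0]

def impute(rows):
    """Step 2a: replace missing values with the mean of the observed column values."""
    cols = list(zip(*rows))
    means = [mean(v for v in col if v is not None) for col in cols]
    filled = [tuple(m if v is None else v for v, m in zip(row, means)) for row in rows]
    return filled, means

def standardize(rows):
    """Step 2b: rescale every column to zero mean and unit variance."""
    cols = list(zip(*rows))
    stats = [(mean(col), pstdev(col) or 1.0) for col in cols]
    scaled = [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]
    return scaled, stats

def train(rows, labels):
    """Step 4: fit a nearest-centroid model (one mean feature vector per class)."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = tuple(mean(col) for col in zip(*members))
    return centroids

def diagnose(centroids, row):
    """Step 5: predict the class whose centroid is closest to the patient."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda cls: squared_distance(centroids[cls]))

filled, col_means = impute(rows)
scaled, col_stats = standardize(filled)
# Step 3: `labels` is the target variable; the columns of `scaled` are the predictors.
model = train(scaled, labels)

def preprocess(row):
    """Apply the training-time imputation and scaling to a new patient record."""
    full = tuple(m if v is None else v for v, m in zip(row, col_means))
    return tuple((v - m) / s for v, (m, s) in zip(full, col_stats))

prediction = diagnose(model, preprocess((66, 98.0)))
```

A real study would replace the nearest-centroid model with one of the methods surveyed later (for example SVM or RF) and would evaluate it on held-out data; the point here is only the order of the steps and the need to reuse the training-time preprocessing on new patients.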
1.3 The Benefits of Using Machine Learning for Disease Diagnosis
A ‘second opinion’ can come in handy given the limitations posed by a large number of overlapping structures and cases, distractions, tiredness, and the human visual system [26]. This has encouraged the use of CAD systems in diagnostic processes. CAD is a concept that gives equal roles to physicians and computers, i.e., the computer assists physicians in making the best clinical decisions [27]. Moreover, owing to increasing complexity among patients, high rates of diagnostic error, and the availability of large amounts of data, EHR systems are being used to assist clinical decision making [28].
With the availability of intelligent tools for data analysis, ML methods help uncover interesting relationships in the data [29]. As a second opinion, ML could corroborate clinicians’ decisions or refute them [30]. Integrating ML-based tools that monitor the continuously increasing volume of data streams for patterns, assist clinicians in decision making, or automatically adjust the settings of bedside devices has improved patient outcomes and substantially reduced the overall cost of treatment [31,32]. On the downside, although ML promises the best clinical assistance, it has not yet proven useful in practice according to [33,34], probably because of the opacity of ML algorithms and analytics. Data quality and the generalizability of ML models also remain problems [35,36].
The paper is organized as follows. Section 2 describes the methodology employed to carry out this study, including the databases chosen and the eligibility criteria for selecting papers. Section 3 presents the analysis and synthesis of the eligible papers. Section 4 discusses the analysis. Finally, we draw a conclusion. Fig. 1 presents the taxonomy of this article. Abbreviations used in this article and their full forms are listed in Tab. 2.
A systematic review is a methodology in which the author finds relevant studies, selects and investigates them, analyzes the data, and summarizes the findings to reach precise conclusions [37,38]. Using evidence from dependable research to make healthcare decisions promotes best practices and reduces mistakes in clinical decision making. Hence, systematic reviews, as well as clinical practice, are considered the finest source of evidence [39]. This section covers the literature search, study selection and eligible papers, and the extraction and analysis of data.
To select relevant and eligible papers for systematic review, six databases were selected in this step.
These databases were IEEE, PubMed, Science Direct, SciPub, Springer Link, and Web of Science. The search covered articles published from 2015 onward. Phrases and keywords such as “disease diagnosis,” “disease diagnosis using machine learning,” “Chronic kidney diagnosis using machine learning,” “Parkinson diagnosis using machine learning,” etc. were used to find relevant articles. The articles were filtered by relevance and publication date. Tab. 3 shows the number and share of eligible articles by publisher. Elsevier had the highest share of publications, at 22.73%. BMC, Hindawi, IEEE, Public Library of Science, and Springer were joint second with 6.82% each. Nature was third with 4.55%. Each of the remaining publishers accounted for 2.27%.
2.2 Study Selection and Eligible Papers
Inclusion and exclusion criteria were used to select appropriate and relevant articles. The research concentrates only on disease diagnosis using ML and excludes papers using fuzzy logic or image processing. Articles were screened for selection based on their title and abstract. Only journal and conference papers were considered; books, book chapters, theses, reports, review articles, and letters to editors were excluded.
Language, publication date, and article quality were also considered. We selected only papers written in English and published from 2015 onward. Our research aimed to include all medical disciplines, but diseases of animals and plants were excluded. In line with our inclusion criteria, articles using methods and techniques that improved the accuracy of disease diagnosis were included.
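The machine-checkable part of these criteria can be expressed as a simple filter. The sketch below is an assumption-laden illustration, not the screening tool used in this study: the `Article` record, its field names, and the sample candidates are hypothetical, and the upper year bound of 2020 is taken from the review period stated in the discussion. The quality judgment and the title/abstract screening remain manual steps and are not modeled.

```python
from dataclasses import dataclass

@dataclass
class Article:
    """Hypothetical bibliographic record; the field names are illustrative only."""
    title: str
    year: int
    language: str
    kind: str       # e.g. "journal", "conference", "book chapter", "review"
    technique: str  # e.g. "machine learning", "fuzzy logic", "image processing"
    subject: str    # "human", "animal", or "plant"

ALLOWED_KINDS = {"journal", "conference"}
EXCLUDED_TECHNIQUES = {"fuzzy logic", "image processing"}

def is_eligible(article: Article) -> bool:
    """Return True when the record passes every automatable criterion."""
    return (
        article.language == "English"
        and 2015 <= article.year <= 2020
        and article.kind in ALLOWED_KINDS
        and article.technique not in EXCLUDED_TECHNIQUES
        and article.subject == "human"
    )

# Made-up candidates used only to exercise the filter.
candidates = [
    Article("SVM for kidney disease", 2018, "English", "journal", "machine learning", "human"),
    Article("Fuzzy diagnosis of diabetes", 2019, "English", "journal", "fuzzy logic", "human"),
    Article("CNN for leaf blight", 2017, "English", "conference", "machine learning", "plant"),
    Article("Early ML diagnosis survey", 2016, "English", "review", "machine learning", "human"),
    Article("RF for heart disease", 2014, "English", "journal", "machine learning", "human"),
]

eligible = [a for a in candidates if is_eligible(a)]
```

Encoding the criteria this way makes each exclusion reason explicit and repeatable, which helps when the search is rerun on an updated database export.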
2.3 Extraction and Analyzation of Data
The included articles were examined to extract and analyze the data with respect to our research objectives. To meet these objectives, we analyzed the articles by frequency of publication over the past years, type of academic paper, database provider, and the ML model employed.
This section presents the findings and results of the analysis and synthesis of the included articles. These results, the outcome of a systematic study of the papers, show the efficiency of applying ML in disease diagnosis. The impact of ML and its use in different medical disciplines are also studied.
3.1 The Frequency of Published Articles over the Past Years
Our research includes 44 academic papers that met our inclusion criteria, comprising both journal articles and conference papers. The frequency of published articles is shown in Fig. 2. The included articles date from 2015 onward. The graph indicates a significant rise in published articles since 2015, which shows that research on disease diagnosis using ML has been increasing. In fact, about 41% of the included articles were published in 2019. Hence, it is evident that researchers are showing growing interest in applying ML techniques to disease diagnosis.
3.2 Distribution of Academic Papers by Journal and Conference Type
This systematic review includes 44 articles in total: 36 from journals and 8 from conferences. Fig. 3 shows the distribution of papers by publication year and type. As the figure shows, more journal articles than conference papers were published. No conference papers from 2015 or 2016 appear in our chart, which suggests that fewer articles overall were published in conferences than in journals. In 2017, however, there is a considerable increase in articles published in conferences.
Eligible articles have been categorized by journal and conference. The distribution of papers by journal is given in Tab. 4 and by conference in Tab. 5. Of the reviewed articles, 81.82% are from journals and 18.18% are from conferences. ‘Computer Methods and Programs in Biomedicine,’ ‘Computers in Biology and Medicine,’ ‘IEEE Access,’ and ‘PLoS ONE’ published 3 articles each, the highest count, accounting for 6.82% each. On average, 2.27% of articles were published by each journal.
3.3 Distribution of Papers by Database Providers
Papers were searched and selected through six databases. These databases and their contributions are shown in Tab. 6. PubMed ranked first with 40.91%, accounting for 18 papers. IEEE ranked second with 25.00%, and Science Direct ranked third with 22.73%. Springer Link and Web of Science were joint fourth with 4.55% each, and SciPub ranked fifth with 2.27%.
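Because every share in this review is a fraction of the 44 included papers, the percentages can be converted back to article counts as a consistency check. The short sketch below does this for the database shares quoted above; the function name `share_to_count` is our own and the rounding step is an assumption that each share was rounded to two decimals.

```python
TOTAL_ARTICLES = 44  # number of included papers reported in Section 3.1

# Percentage share per database, as stated in the text.
share_by_database = {
    "PubMed": 40.91,
    "IEEE": 25.00,
    "Science Direct": 22.73,
    "Springer Link": 4.55,
    "Web of Science": 4.55,
    "SciPub": 2.27,
}

def share_to_count(share_percent: float, total: int = TOTAL_ARTICLES) -> int:
    """Convert a rounded percentage share back to a whole number of articles."""
    return round(share_percent / 100 * total)

count_by_database = {
    name: share_to_count(share) for name, share in share_by_database.items()
}
total_recovered = sum(count_by_database.values())
```

Under these assumptions the recovered counts add up to the 44 included papers, and PubMed maps to the 18 papers stated in the text. The same check can be applied to the publisher, method, and discipline shares.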
3.4 The Distribution of Machine Learning Methods Applied in Published Articles
The objective of this study is to carry out a systematic review of the use of ML methods in disease diagnosis. The distribution of the various ML methods used for diagnosis is also analyzed. From the eligible articles, we observe that some incorporated ML to improve disease diagnosis. Hence, we categorized the selected papers into 12 different ML methods, as shown in Tab. 7. Support vector machine (SVM) ranked first among researchers with 22.73%, which suggests that the SVM method is considered effective in improving the diagnostic process. The convolutional neural network (CNN) method ranked second with 15.91%. Other methods, which included proprietary algorithms using ML methods and combinations of various ML methods, ranked third with 13.64%. Random forest (RF) ranked fourth with 11.36%. Artificial neural network (ANN), deep ANN, and eXtreme gradient boosting (XGBoost) ranked fifth with 6.82% each. The classification and regression trees (CART) method ranked sixth with 4.55%. Bayesian classifier (BC), decision tree (DT), and gradient boosting (GB) ranked last at 2.27% each, which shows that these three ML methods were the least preferred for improving the disease diagnosis process.
We have summarized the distribution of ML methods by year in Fig. 4. The use of hybrid methods to improve the accuracy of ML models has increased over the years (1 article in 2015, 2 in 2017, and 4 in 2019). SVM has been used in every year since 2016, peaking at 3 articles each in 2017 and 2019, which shows its sustained popularity among researchers. XGBoost has been used consistently, with one article each in 2018, 2019, and 2020. We also observe an increase in the use of the CNN method from 2017.
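The ranking used here, in which tied methods share a rank and the next rank follows without a gap, is commonly called dense ranking. The sketch below reproduces it from the shares quoted above. It includes only the categories named in the text, and the dictionary keys are our own shorthand.

```python
# Percentage share per ML method, as quoted in the text.
share_by_method = {
    "SVM": 22.73,
    "CNN": 15.91,
    "Other/hybrid": 13.64,
    "RF": 11.36,
    "ANN": 6.82,
    "Deep ANN": 6.82,
    "XGBoost": 6.82,
    "CART": 4.55,
    "BC": 2.27,
    "DT": 2.27,
    "GB": 2.27,
}

def dense_rank(shares: dict) -> dict:
    """Rank categories by share; ties share a rank and no rank is skipped."""
    distinct = sorted(set(shares.values()), reverse=True)
    rank_of_share = {value: position + 1 for position, value in enumerate(distinct)}
    return {name: rank_of_share[value] for name, value in shares.items()}

rank_by_method = dense_rank(share_by_method)
```

With these inputs, ANN, deep ANN, and XGBoost all receive rank 5, CART rank 6, and BC, DT, and GB share the last rank, matching the ordering described in the text.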
3.5 Distribution of ML Methods Applied in Published Articles Based on Clinical Aspects
In the context of disease diagnosis using ML, we wanted to know which diseases were studied most. Identifying the medical disciplines that interested researchers most was also one of the objectives of this research. For this reason, the eligible articles were classified by disease and by the ML methods implemented. To better understand the distribution of ML for disease diagnosis, we analyzed the articles by medical discipline. Fig. 5 shows a pie chart of the frequency of medical disciplines. Based on the diseases, 18 medical disciplines were identified. From Fig. 5, cardiology and endocrinology each accounted for 13.64% of studies, probably because of the large number of diseases associated with them. Infectious disease, oncology, and pulmonology ranked second with 9.09% each. Dermatology and nephrology ranked third with 6.82% each. Neurology, rheumatology, and urology ranked fourth with 4.55% each. Critical care, gastroenterology, hepatology, ophthalmology, pediatrics, periodontology, vascular surgery, and virology ranked last with 2.27% each.
We conducted this study to review the impact of ML in disease diagnosis. To our knowledge, few articles have systematically analyzed academic articles that use ML for disease diagnosis. Hence, the results and analysis of this study can be used to assess the impact of ML in the medical domain and its efficiency in improving disease diagnosis. This study considered articles from 2015 to 2020. We identified 44 articles applying ML methods to improve disease diagnosis over this period. One of the objectives of this study was to determine which ML methods researchers used most for diagnosis, as the answer to this question indicates the efficiency of the methods. Hence, the articles were classified accordingly. Articles were also classified by the number published each year. According to this classification, the number of publications using ML for disease diagnosis has been rising over the years. We find that 4.55% of the articles were published in 2015, compared with 40.91% in 2019. Because this article was written in mid-2020, we could retrieve only a few articles from that year. This increase in the use of ML methods is likely due to their efficiency in improving the accuracy and sensitivity of models.
We identified 12 different ML methods that were applied in our eligible papers. Although we describe these 12 methods as the most used ML methods for disease diagnosis, we limit our findings to medical diagnosis and do not generalize them. From our analysis, as presented in Fig. 6, we find that researchers prefer SVM, CNN, and RF over other ML methods.
However, there is also an increase in the use of hybrid/other methods, mainly because combining various methods augments the efficiency of the model. Our study also examined the articles from a medical discipline point of view, i.e., we classified the eligible articles by medical discipline. This classification helped us understand which medical disciplines were chosen most often. Cardiology and endocrinology had the highest number of publications. This is probably because many diseases fall under these two disciplines and because large amounts of data are easily available for research. Moreover, looking only at the diseases explored, we find that a variety of ML methods has been applied to a variety of diseases. This shows the effectiveness of ML in improving the accuracy of disease diagnosis and suggests that ML could be applied productively across medical disciplines.
The findings of this investigation show which diseases and medical disciplines researchers target most and which are neglected. They also indicate the efficiency of ML methods in disease diagnosis. Therefore, this study could assist researchers in carrying out further work in the medical domain.
The main goal of this systematic study was to review articles using ML for disease diagnosis and, thus, to assess the competence of ML in improving the diagnosis of disease. To that end, we retrieved articles from 2015 to 2020. We used six databases: IEEE, PubMed, Science Direct, SciPub, Springer Link, and Web of Science. We then classified the articles by publisher and database. Through this study, we found which databases and publishers publish the greatest number of articles on ML in disease diagnosis. We also investigated the most used ML methods and their impact on disease diagnosis, and we find that all the studies reported improvement in their results. We find that using ML not only reduces the overall cost of treatment and assists clinicians as a ‘second opinion,’ but also helps in the early detection of diseases with complex structures and patterns. We also identified the 12 most used ML methods in disease diagnosis and their effectiveness in improving results, and we investigated which medical disciplines use ML to a large extent.
However, this study has certain limitations. The first limitation is that this systematic review covered a fixed period, from 2015 to 2020. It should also be noted that the study was carried out up to mid-2020. Still, our results show growing acceptance and adoption of ML in disease diagnosis over the years. The second limitation is that we excluded articles using fuzzy logic or image processing entirely. In the future, these techniques can be included to obtain a generalized view of the impact of each on disease diagnosis. The third limitation is that our investigation focused solely on the diagnosis of diseases. We did not include articles relating to prognosis or treatment path. In the future, researchers can study the impact of ML on prognosis as well as on treatment path.
This study could provide basic knowledge for future studies. We excluded articles written in other languages and publications other than journal and conference papers. In the future, these neglected resources could be investigated, as studies of them could be valuable. Moreover, future work could identify relationships among multiple diseases and diagnose them simultaneously to benefit patients suffering from multiple diseases; investigate more parameters when building ML models; select models appropriately to decrease implementation time (e.g., CNN works better for image data); standardize data for unbiased results; and use deep learning and ensemble models for better results.
Acknowledgement: The authors wish to thank Dr. Dinesh Grover, Retired Professor, Punjab Agricultural University, Punjab, India for his guidance in writing this review paper.
Funding Statement: This research was supported in part by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2016-0-00312) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and in part by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the National Program for Excellence in SW (2015-0-00938) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
References
1. A. Jutel. (2009). “Sociology of diagnosis: A preliminary review,” Sociology of Health & Illness, vol. 31, no. 2, pp. 278–299.
2. E. S. Holmboe and S. J. Durning. (2014). “Assessing clinical reasoning: Moving from in vitro to in vivo,” Diagnosis, vol. 1, no. 1, pp. 111–117.
3. A. Bhasale. (1998). “The wrong diagnosis: Identifying causes of potentially adverse events in general practice using incident monitoring,” Family Practice, vol. 15, no. 4, pp. 308–318.
4. M. L. Graber, N. Franklin and R. Gordon. (2005). “Diagnostic error in internal medicine,” Archives of Internal Medicine, vol. 165, no. 13, pp. 1493–1499.
5. T. K. Gandhi, A. Kachalia, E. J. Thomas, A. L. Puopolo, C. Yoon et al. (2006). “Missed and delayed diagnoses in the ambulatory setting: A study of closed malpractice claims,” Annals of Internal Medicine, vol. 145, no. 7, pp. 488–496.
6. D. Khullar, A. K. Jha and A. B. Jena. (2015). “Reducing diagnostic errors – why now?,” The New England Journal of Medicine, vol. 373, no. 26, pp. 2491–2493.
7. H. Ahmadi, M. Gholamzadeh, L. Shahmoradi, M. Nilashi and P. Rashvand. (2018). “Diseases diagnosis using fuzzy logic methods: A systematic and meta-analysis review,” Computer Methods and Programs in Biomedicine, vol. 161, pp. 145–172.
8. T. Davenport and R. Kalakota. (2019). “The potential for artificial intelligence in healthcare,” Future Healthcare Journal, vol. 6, no. 2, pp. 94–102.
9. M. I. Jordan and T. M. Mitchell. (2015). “Machine learning: Trends, perspectives, and prospects,” Science, vol. 349, no. 6245, pp. 255–260.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.