Open Access


Applying Apache Spark on Streaming Big Data for Health Status Prediction

Ahmed Ismail Ebada1, Ibrahim Elhenawy2, Chang-Won Jeong3, Yunyoung Nam4,*, Hazem Elbakry1, Samir Abdelrazek1
1 Information Systems Department, Faculty of Computers and Information, Mansoura University, Mansoura, 35516, Egypt
2 Department of Computer Science, Faculty of Computers and Information, El-Zagazig University, Zagazig, Sharqiyah, 44519, Egypt
3 Medical Convergence Research Center, Wonkwang University, Iksan, Korea
4 Department of Computer Science and Engineering, Soonchunhyang University, Asan, Korea
* Corresponding Author: Yunyoung Nam. Email:

Computers, Materials & Continua 2022, 70(2), 3511-3527.

Received 14 April 2021; Accepted 31 May 2021; Issue published 27 September 2021


Big data applications in healthcare have provided a variety of solutions to reduce costs, errors, and waste. This work aims to develop a real-time system for predicting health issues, based on processing big medical data in the cloud. In the proposed scalable system, medical parameters are sent to Apache Spark, which extracts attributes from the data and applies the proposed machine learning algorithm. Healthcare risks can thus be predicted and sent as alerts and recommendations to users and healthcare providers. The proposed work also aims to provide an effective recommendation system that combines streaming medical data, historical data from the user’s profile, and a knowledge database to make the most appropriate real-time recommendations and alerts based on the sensor measurements. The system works by tweeting the health status attributes of users. Each user’s cloud profile receives the streaming healthcare data in real time, and a machine learning prediction algorithm extracts the health attributes to predict the user’s health status. The status can subsequently be sent on demand to healthcare providers. Machine learning algorithms can therefore be applied to streaming healthcare data from wearables to provide users with insights into their health status. These algorithms can help healthcare providers and individuals focus on health risks and health status changes and consequently improve quality of life.
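
The flow summarized in the abstract (receive streamed messages, extract health attributes, predict a status, emit alerts) can be illustrated with a minimal plain-Python sketch. It is not the authors' Spark implementation: the message format, the names `Reading`, `parse_message`, `predict_status`, and `alerts`, and the threshold rule standing in for a trained model are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class Reading:
    """Health attributes extracted from one streamed message (hypothetical schema)."""
    user_id: str
    heart_rate: float    # beats per minute
    systolic_bp: float   # mmHg


def parse_message(text: str) -> Optional[Reading]:
    """Extract attributes from a 'user=u1;hr=72;sbp=118' message (assumed format)."""
    try:
        fields = dict(part.split("=", 1) for part in text.split(";"))
        return Reading(fields["user"], float(fields["hr"]), float(fields["sbp"]))
    except (KeyError, ValueError):
        return None  # malformed or incomplete messages are skipped


def predict_status(reading: Reading) -> str:
    """Stand-in for a trained classifier; thresholds are illustrative, not clinical."""
    if reading.heart_rate > 120 or reading.systolic_bp > 160:
        return "at_risk"
    return "normal"


def alerts(stream: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (user_id, status) for every message whose predicted status is not normal."""
    for text in stream:
        reading = parse_message(text)
        if reading is None:
            continue
        status = predict_status(reading)
        if status != "normal":
            yield reading.user_id, status
```

In a streaming deployment, the generator `alerts` would be replaced by the streaming engine's own map and filter stages, and `predict_status` by a model fitted on historical data.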


Big data; streaming processing; healthcare data; machine learning; IoT data processing; Apache Spark

Cite This Article

A. Ismail Ebada, I. Elhenawy, C. Jeong, Y. Nam, H. Elbakry et al., "Applying apache spark on streaming big data for health status prediction," Computers, Materials & Continua, vol. 70, no. 2, pp. 3511–3527, 2022.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.