Open Access
ARTICLE
Human Activity Recognition Based on Parallel Approximation Kernel K-Means Algorithm
1 Erciyes University, Institute of Natural and Applied Sciences, Department of Computer Engineering, 38039, Melikgazi, Kayseri, Turkey
2 Erciyes University, Engineering Faculty, Department of Computer Engineering, 38039, Melikgazi, Kayseri, Turkey
† email: bahriye@erciyes.edu.tr
* Corresponding Author: Ahmed A. M. Jamel,
Computer Systems Science and Engineering 2020, 35(6), 441-456. https://doi.org/10.32604/csse.2020.35.441
Abstract
Recently, owing to the capability of mobile and wearable devices to sense daily human activity, human activity recognition (HAR) datasets have become a large-scale data resource. Due to the heterogeneity and nonlinearly separable nature of the data recorded by these sensors, the datasets generated require special techniques to accurately predict human activity and mitigate the considerable heterogeneity. Consequently, classic clustering algorithms do not work well with these data. Hence, kernelization, which converts the data into a new feature vector representation, is performed on nonlinearly separable data. This study aims to present a robust method to perform HAR data clustering to mitigate heterogeneity in data with minimal resource consumption. Therefore, we propose a parallel approximated clustering approach to handle the computational cost of big data by addressing noise, heterogeneity, and nonlinearity in data using data reduction, filtering, and approximated clustering methods on parallel computing environments that have not been previously addressed. Our key contribution is to treat HAR as big data implemented by approximation kernel K-means approaches and fill the gap between the HAR clustering cost and parallel computing fields. We implemented our approach on Google cloud on a parallel spark cluster, which helped us to process large-scale HAR data across multiple machines of clusters. The normalized mutual information is used as validation metric to assess the quality of the clustering algorithm. Additionally, the precision, recall, f-score metrics values are obtained somehow to compare the results with a classification technique. The experimental results of our clustering approach prove its effectiveness compared with a classification technique and can efficiently detect physical activity and mitigate the heterogeneity of the datasets.Keywords
Cite This Article
Citations
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.