The Internet of Things (IoT) plays an instrumental role in the technological advancement of the healthcare industry. Progress at both the hardware level and the core software-platform level has accompanied Medicine 4.0, and healthcare IoT systems are the realization of this vision. The major enabling technologies are the communication systems between the sensing nodes and the processors, and the processing algorithms that produce output from the data collected by the sensors. At present, many new technologies supplement these enabling technologies. In this research work, a practical feature extraction and classification technique is therefore suggested for handling data acquisition alongside data fusion to enhance treatment-related data. In the initial stage, data from IoT devices are gathered and pre-processed for fusion processing. A Dynamic Bayesian Network is adopted as an improved trade-off for tractability and as a tool for CDF operations. Improved Principal Component Analysis is deployed for feature extraction along with dimension reduction. Lastly, data learning is attained through a Hybrid Learning Classifier Model to examine data fusion performance. In this research, a Deep Belief Neural Network and a Support Vector Machine are hybridized for healthcare data prediction. Thus, the suggested system can serve as a beneficial decision support tool for prediction from multiple data sources and for enhancing predictive ability.
Many healthcare applications, such as healthcare facilities management, disaster relief management, sports health management, and home-based care, have recently attracted considerable research interest [
Technological progress in sensors and communications enables their unobtrusive incorporation into remote objects, both in the home and on the person [
Further, IoT sensors enhance communication without necessitating human-computer or human-human interaction. Wearable sensors have become popular, and they can also be knitted into apparel and accessories to monitor the wearer’s vitals [
Enhancing wearable-sensor data to obtain precise and appropriate outputs poses many challenges. The purpose of data mining, whether used in healthcare or business, is to identify valuable and understandable patterns by analyzing large data sets. These patterns help to predict industry or information trends and then to determine their application. Healthcare monitoring is achieved by fully exploiting these data and employing data fusion techniques to enhance output accuracy [
Occasionally, data from a single source may not be enough to characterize the real-time environment [
The paper is structured as follows. Related works are reviewed in Section 2. Context-aware data fusion methods for healthcare management systems, together with the data management steps, are illustrated in Section 3. The investigation’s outcomes are presented in Section 4, followed by the conclusion in Section 5.
A review of traditional Multi-Source Data Fusion (MSDF) solutions is outlined in this section. MSDF methods are extensively employed to merge data acquired by sensors positioned in the environment, driving the knowledge-abstraction process from raw data and creating high-level conceptions.
Baloch et al. [
Gite et al. [
Rashid et al. [
Shivashankarappa et al. [
Gao et al. [
Durrant-Whyte et al. [
Zebin et al. [
Liu et al. [
Raheja et al. [
Kusakabe et al. [
The prevailing literature addresses the data imperfection concern by suggesting diverse methodologies based on numerous theoretical foundations. Detecting sensory measurements that deviate from the expected pattern of the observed data has been considered in depth in the above literature, with the intention of eliminating outliers from the fusion process. Intelligent data fusion techniques primarily exploit contextual information in an innovative way to enhance reasoning accuracy and system adaptiveness.
Heterogeneous IoT data are managed by suggesting a feature selection and classification technique for data acquisition and data fusion management. When a statistical method is applied to a data set to convert it from a cluster of insignificant numbers into significant output, this is known as statistical data treatment, and the suggested technique aims to enhance it. The improvement in data treatment is revealed in
Data are first collected. Then, essential cleaning and filtering are applied at the data pre-processing layer to eliminate outliers (irregularities or unexpected values). Context-aware data fusion is performed in the next step. This layer comprises two data blocks: one contains vital signs such as body temperature, Electrocardiogram (ECG), blood pressure, and pulse rate, and the other contains context data, which encompasses supplementary information such as environmental temperature or object location. The DBN is adopted as an improved trade-off for tractability and as a tool for Cumulative Distribution Function (CDF) operations, and the context sources vary with the circumstances.
Data fusion is the process of incorporating multiple data sources to generate more consistent, precise, and valuable information than that provided by any individual data source. These processes are frequently classified as low, intermediate, or high level, based on the processing stage at which the fusion occurs. Improved Principal Component Analysis is deployed for feature extraction as well as dimension reduction. Lastly, a suitable data fusion algorithm is chosen for combining the data, based on the application. Data learning is attained through the Hybrid Learning Classifier Model for examining data fusion performance, which supports tracking historical data for current data validation and future situation prediction. In this research, DBNN and SVM are hybridized for healthcare data prediction.
Consequently, the suggested intelligent data fusion methods merge heterogeneous data to extract context information, enabling healthcare applications to react accordingly (
Data collection from physical devices is the primary part of context acquisition, followed by pre-processing to eliminate noise and other measurement outliers through filtering and estimation techniques. Erroneously labelled data items are then corrected by classification models, either single classifiers or ensemble models, which operate on the clean data sets. The cleaned data sets refer to the data items that persist after cleaning or filtering. Preliminary filtering is performed before noise in the data instances is removed through filtering. A dual filtering technique is presented for data processing: the Kalman Filter (KF) is a statistical state estimation technique, whereas the particle filter is a stochastic technique for moment estimation. In this dual filtering approach, the pre-processing stage removes noise with higher efficiency than a single filtering approach.
The Kalman filtering algorithm estimates unknown variables from the measurements observed over time. The usefulness of Kalman filters has been validated in many applications. Moreover, Kalman filters feature a simple structure and require little computational power. However, for people with no background in estimation theory, the implementation of the Kalman filter can be complicated.
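As an illustration of this filtering step, the following is a minimal sketch of a one-dimensional Kalman filter that smooths a noisy scalar vital-sign stream; the identity state transition and the process/measurement noise values (q, r) are illustrative assumptions, not parameters taken from the proposed system.

```python
import numpy as np

def kalman_smooth(measurements, q=1e-3, r=0.5):
    """Minimal 1-D Kalman filter: estimates a slowly varying signal
    from noisy scalar measurements.
    q: assumed process-noise variance, r: assumed measurement-noise variance."""
    x, p = measurements[0], 1.0          # initial state estimate and covariance
    estimates = []
    for z in measurements:
        # predict step (identity state transition for a slowly varying signal)
        p = p + q
        # update step
        k = p / (p + r)                  # Kalman gain
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return np.array(estimates)

# usage: smooth a synthetic noisy pulse-rate trace
noisy_pulse = 72 + np.random.randn(100) * 2.0
smoothed = kalman_smooth(noisy_pulse)
```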
When analytic computation is not possible, particle filtering is a stochastic technique for estimating the moments of a target probability density. Its main principle is the generation of random samples, called particles, from an “importance” distribution that is easy to sample from. Each particle then carries a weight that corrects the discrepancy between the target and importance probabilities. Particle filters are frequently used to estimate the posterior density mean and offer the benefit of estimating the full target distribution without any assumptions in a Bayesian context, which is predominantly beneficial for non-linear/non-Gaussian systems. The PF may, for instance, be employed to estimate a biomechanical state from gyroscope and accelerometer data.
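Correspondingly, a minimal bootstrap particle-filter sketch is given below, under the simplifying assumptions of random-walk dynamics and a Gaussian measurement likelihood; it returns the posterior mean at each step, which is the typical PF output described above.

```python
import numpy as np

def particle_filter(measurements, n_particles=500, proc_std=0.5, meas_std=2.0):
    """Minimal bootstrap particle filter for a scalar signal.
    Assumes random-walk dynamics and a Gaussian measurement model."""
    particles = np.full(n_particles, measurements[0], dtype=float)
    posterior_means = []
    for z in measurements:
        # propagate particles through the (assumed) random-walk dynamics
        particles += np.random.randn(n_particles) * proc_std
        # weight particles by the Gaussian measurement likelihood
        weights = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
        weights /= weights.sum()
        # posterior mean estimate (the usual PF output, as noted above)
        posterior_means.append(np.sum(weights * particles))
        # resample to avoid weight degeneracy
        idx = np.random.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return np.array(posterior_means)
```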
These dual filtering steps support achieving noiseless, properly labelled clean data for further processing. Supposing every data instance x_i is associated with a label set l_i, in which C(+) and C(-) denote the numbers of positive and negative labels, the label probabilities can be estimated as pr(+) = (C(+) + 1) / (C(+) + C(-) + 2) and pr(-) = (C(-) + 1) / (C(+) + C(-) + 2), for a Laplace correction that is applied.
If C(+) is very close to C(-), then the margin between classes, |pr(+) - pr(-)|, is small. This occurs in two circumstances: the first is when items are labelled without appropriate knowledge, and the second is when complex instances are labelled. Hence, for an instance x
—the multiple label sets of
1: A—an empty set
2: For i = 1 to N Do
3: Count the numbers of positive labels and negative labels in l_i
4: Calculate p(+) and p(-)
5: If |p(+) - p(-)|
6: The instance x_i is added to the set A
7: End for
8: A filter is applied to the set
9:
10: Build a classification model f on the set
11: For i = 1 to the size of (A + B) Do
12: Use the classifier f to relabel the instance i the set A + B
13: End For
14: Update the set A+B to
15:
16: Return
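The following Python sketch illustrates how the relabelling procedure above could be realized; the margin threshold, the Laplace-corrected probabilities, and the use of a random forest as the classification model f are assumptions for illustration, since the filter applied to the retained set and the exact threshold are not specified in this excerpt.

```python
from sklearn.ensemble import RandomForestClassifier

def relabel_ambiguous(X, label_sets, margin=0.2):
    """Hedged sketch of the relabelling procedure above.
    X: (N, d) NumPy feature matrix; label_sets: list of N lists of 0/1 labels
    collected for each instance; margin: assumed threshold on |p(+) - p(-)|."""
    ambiguous, confident, y_confident = [], [], []
    for i, labels in enumerate(label_sets):
        pos, neg = labels.count(1), labels.count(0)
        # Laplace-corrected label probabilities
        p_pos = (pos + 1) / (pos + neg + 2)
        p_neg = (neg + 1) / (pos + neg + 2)
        if abs(p_pos - p_neg) < margin:
            ambiguous.append(i)              # small margin -> set A
        else:
            confident.append(i)              # large margin -> set B
            y_confident.append(1 if p_pos > p_neg else 0)
    # build a classification model f on the confidently labelled set (stand-in classifier)
    f = RandomForestClassifier(n_estimators=100).fit(X[confident], y_confident)
    # use f to relabel every instance (sets A and B together)
    return f.predict(X)
```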
The DBN serves as an optimal trade-off for tractability throughout this study and is a suitable model for DF operations. Besides, DBNs efficiently identify the impacts of context variables, expressed through probability distributions. DBNs divide the data into time slices, through which the states of an instance are represented; during this, its observable symptoms are identified through HMMs. The states of a given feature of interest are inferred using the DBN and are represented by the hidden variable V
Defining the sensor and state-transition models is highly necessary for the DBN. The impact of the system’s current state on the sensor information (the sensor model) is represented by the probability distribution Pb(S
For a practical formulation of
Here, the normalizing constant is signified by η. According to the Markov hypothesis, the sensor nodes in S
In which, in a time slice
Here, the normalizing constant is represented by
Then,
Here,
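The equations referenced in this formulation are not reproduced in this excerpt; as a hedged sketch, the standard recursive DBN filtering update that the surrounding text describes can be written as follows, with η the normalizing constant, S_t the sensor readings in time slice t, and V_t the hidden state variable (the notation is assumed rather than taken verbatim from the original):

```latex
P(V_t \mid S_{1:t}) \;=\; \eta \, Pb(S_t \mid V_t) \sum_{v_{t-1}} P(V_t \mid v_{t-1}) \, P(v_{t-1} \mid S_{1:t-1})
```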
Feature extraction is the process of reducing dimensionality by compressing a primary set of raw data into more manageable groups. A significant feature of these large data sets is that their massive number of variables requires many computing resources to process.
In the second stage, the parameters of every measurement are extracted from the information associated with the classification concern. The feature extraction process using the suggested algorithm is presented in
Principal Component Analysis (PCA) works as a tool for building predictive systems in exploratory data analysis. Most often, genetic distance and relatedness within populations are visualized through PCA. PCA can be carried out through eigenvalue decomposition of a data correlation/covariance matrix or singular value decomposition of a data matrix, generally after normalization of the initial data. Usually, normalization mean-centres each attribute, subtracting the measured mean of each variable from every data value, so that the corresponding empirical mean (i.e., average) becomes ‘0’.
Consequently, the variance of each variable may also be normalized to make it equal to ‘1’; see Z-scores. The PCA outcomes are discussed in terms of component scores, also termed factor scores (the transformed variable values corresponding to a specific data point), and loadings (the weight by which each original standardized variable is multiplied to derive the component score). If the component scores are standardized to unit variance, the loadings contain the data variance and their magnitude corresponds to the eigenvalues; whereas non-standardized component scores carry the data variance themselves, and the loadings are then unit-scaled (“normalized”). These weight vectors are termed eigenvectors, which are the cosines of the orthogonal rotation of variables into principal components and back.
Generally, this model can depict the internal structure of the data in a way that best explains the variance in the data. When a multivariate dataset is visualized as a set of coordinates in a high-dimensional data space, PCA can offer the user a lower-dimensional picture in which the object is projected from its most informative viewpoint. This is achieved by using only the first few principal components; thus, dimensionality reduction is carried out in the transformed data. Nevertheless, if outliers exist in the data set, PCA’s analysis result will be highly hampered, and with a huge data volume it becomes a challenging task to separate the outliers. To overcome this issue, the Adaptive Gaussian kernel matrix is built.
In a distributed setting, assume the set of s nodes is V = {v
A local data matrix P_i ∈ R^(n_i × d) on each node ‘v_i’ possesses ‘
Assume projected new features having zero mean,
The projected features covariance matrix is M × M, which can be projected as,
Its eigenvectors and eigenvalues are as follows,
The above expression can be rewritten as follows,
Here, by substituting v
The following equation expresses the kernel function,
Then, multiply both sides of
Here, the matrix notation is used as,
In which,
and a
a
Then, the ensuing kernel principal components might be estimated as follows,
The power of kernel approaches is that they do not calculate
In which c > 0 is a constant, and the Gaussian kernel is,
When k = 1, this reduces to the special case of PCA, and the centre is an r-dimensional subspace. This optimal r-dimensional subspace is spanned by the top r right singular vectors of P (the principal components), which can be identified through Singular Value Decomposition (SVD).
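As a brief illustration of Gaussian-kernel PCA for feature extraction and dimension reduction, the following sketch uses scikit-learn’s KernelPCA as a stand-in for the Improved PCA step; the synthetic feature matrix, the number of components r, and the gamma value are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# synthetic stand-in for pre-processed sensor features (N instances, d features)
X = np.random.randn(200, 12)

# Gaussian (RBF) kernel PCA: project onto the top r kernel principal components
r = 3
kpca = KernelPCA(n_components=r, kernel="rbf", gamma=0.1)
X_reduced = kpca.fit_transform(X)   # shape (200, r)
```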
The inputs from the IoT sensor network reach each neuron layer, where processes such as aggregation, pre-processing, and feature extraction are carried out. Healthcare data are then predicted using the Hybrid Learning Classifier Model (HLCM), in which a Deep Belief Neural Network (DBNN) is hybridized with an SVM to predict the healthcare data.
Being a feed-forward method, the DBNN connects the layers so that every node behaves as in a multilayer perceptron. Since there are many parameters in the neural network, the output generated when an input signal is applied contains noise and error. Hence, to reduce the error rate, the partial derivatives with respect to each variable are employed in this study. In the feed-forward pass, the input from the input layer to the hidden layer is multiplied by specific weights, and the weighted inputs received by each hidden node are summed up. The value is then passed through the activation function. Likewise, the quantities from the hidden layer to the output layer are multiplied by different weights, and the input received by each output node is summed up. Again, the sum is passed through the activation function, and the output is generated. In the neural network, this output from the output layer is compared against the target output.
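A minimal NumPy sketch of the feed-forward pass just described is given below; the layer sizes and the sigmoid activation are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_pass(x, w_hidden, b_hidden, w_out, b_out):
    """Feed-forward step: input layer -> hidden layer -> output layer."""
    h = sigmoid(w_hidden @ x + b_hidden)   # weighted sums at hidden nodes + activation
    y = sigmoid(w_out @ h + b_out)         # weighted sums at output nodes + activation
    return y

# illustrative dimensions: 12 input features, 8 hidden nodes, 2 output classes
rng = np.random.default_rng(0)
x = rng.standard_normal(12)
y = forward_pass(x,
                 rng.standard_normal((8, 12)), np.zeros(8),
                 rng.standard_normal((2, 8)), np.zeros(2))
```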
Using the learning algorithm, the DBNN computation searches for the minimum of the error function in weight space. The combination of weights that minimizes the error function is considered the optimal solution to the learning problem. According to this procedure, the continuity and differentiability of the error function must be ensured, since the gradient used by the learning algorithm needs to be estimated at each iteration.
In other words, this kind of activation function needs to be used instead of the step function employed in perceptrons. Besides, the error function with respect to the output layer of the network needs to be computed. The network’s efficient learning process enhances the recognition process, and a suitable activation function is present at each node.
Meanwhile, the estimated output values are compared with the output of the pre-trained network. The comparison verifies whether the output is accurate, i.e., whether both outputs are equal or whether the error must be propagated back. In this process, the recognition rate can be improved by updating the values of the weights and biases. In this study, the selection of optimized weights from the list of candidate values is carried out with the help of the SVM, which is recognized as a global classification algorithm.
Besides, the optimized extremum (minimum/maximum) values of the weights can be predicted through this algorithm, even when error exists in the network. Subsequently, the current value is replaced by the selected extremum points, and this selection is repeated iteratively to enhance the recognition rate.
The SVM is also called a Maximum Margin Classifier since it minimizes the empirical classification error and maximizes the geometric margin concurrently. Moreover, the SVM implicitly maps its inputs into a high-dimensional feature space by employing the kernel trick; hence it is recognized as an effective model for non-linear classification. The kernel trick enables the classifier to be constructed without an explicit representation of the feature space. In the SVM, the examples are mapped and represented as points in space such that examples belonging to different classes are separated by as wide a gap as possible. Given a set of points from either of two classes, the SVM identifies a hyperplane with the largest possible fraction of points of the same class on the same side. This separating hyperplane is termed the Optimal Separating Hyperplane; it maximizes the distance between two parallel hyperplanes and thereby diminishes the risk of misclassifying examples of the test dataset. The labelled training data, as data points of the form specified in
A ‘real vector with
The ‘
The RBF kernel function is efficient for higher-dimensional data; hence, an SVM with the Radial Basis Function (RBF) kernel is deployed as the classifier. The kernel output depends on the Euclidean distance between the testing data point and the support vector. The support vector is the centre of the RBF, and it determines the area of influence across the data space. The following equation expresses the RBF kernel function,
In the above equation, the kernel parameter is denoted by k and is defined over the training vectors. Since a larger kernel width allows a support vector to influence a larger area of the data space, a smoother decision surface and a more regular decision boundary are obtained. Besides, applying the optimal parameters to the training dataset eases obtaining the classifier. Thus, efficient healthcare data classification can be performed through the developed hybrid learning classifier model, where ‘
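A short sketch of the RBF-kernel SVM classification stage, using scikit-learn’s SVC; the synthetic arrays stand in for the fused healthcare features and labels, and the gamma and C values are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# synthetic stand-ins for fused healthcare features and binary labels
rng = np.random.default_rng(1)
X_train, y_train = rng.standard_normal((300, 10)), rng.integers(0, 2, 300)
X_test = rng.standard_normal((50, 10))

# RBF kernel: K(x, x') = exp(-gamma * ||x - x'||^2); a smaller gamma gives a smoother boundary
clf = SVC(kernel="rbf", gamma=0.05, C=1.0)
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
```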
In this section, the proposed context-aware data fusion approaches are discussed, specifically for healthcare applications. In this work, the Mobile HEALTH (MHEALTH) dataset is used to assess the performance of the suggested approach.
The MHEALTH dataset contains recordings of body motion and vital signs for ten volunteers of diverse profiles while performing numerous physical activities. To measure the motion of different body parts (e.g., magnetic field orientation, acceleration, and rate of turn), sensors are placed on the subject’s chest, left ankle, and right wrist. In addition, 2-lead ECG measurements are obtained by the sensor placed on the chest, which can be used for basic heart monitoring, checking for various arrhythmias, or monitoring the effects of exercise on the ECG.
In order to evaluate different performance parameters, False Positive (FP), False Negative (FN), True Positive (TP) and True Negative (TN) values are estimated initially. The parameters, namely Accuracy, F-measure, Recall and Precision, are predominantly considered for measuring the performance.
Precision refers to the ratio of the number of correctly identified positive instances to the total number of predicted positive instances,
Recall, or Sensitivity, is defined as the ratio of the number of correctly identified positive instances to the total number of actual positive instances,
The F-measure refers to the weighted average of Recall and Precision; thus, the F-measure takes both false positives and false negatives into account,
Based on the positives and negatives, the Accuracy is measured as expressed below,
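Since the metric equations themselves are not reproduced in this excerpt, the conventional definitions in terms of TP, TN, FP, and FN are:

```latex
\mathrm{Precision} = \frac{TP}{TP+FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP+FN}, \qquad
\text{F-measure} = \frac{2\cdot\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}, \qquad
\mathrm{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN}
```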
| Metrics (%) | DFA | CDFT | CDFT-HLCM |
|---|---|---|---|
| Accuracy | 86.059 | 90.950 | 94.400 |
| Precision | 83.290 | 88.920 | 93.018 |
| Recall | 88.190 | 90.860 | 94.672 |
| F-measure | 85.670 | 89.880 | 93.838 |
| Error rate | 13.940 | 9.042 | 5.600 |
In
In
This study proposes and assesses a Context-aware Data Fusion technique based on a Hybrid Learning Classifier Model (HLCM) for healthcare applications. The data management phases for healthcare systems, such as pre-processing, feature extraction, data processing and storage, and context-aware data fusion, have also been described in this research study. In addition, the data fusion process is carried out accurately with the help of a dual filtering technique that labels the unlabelled attributes in the accumulated data. Subsequently, feature extraction and dimension reduction take place through Improved Principal Component Analysis (IPCA). The HLCM then learns from these data, through which the efficiency of the data fusion can be validated. At this point, hybridization of the DBNN and SVM is accomplished for healthcare data prediction. The proposed HLCM achieves 94.40% accuracy, which is higher than the other models. Empirical findings prove that the proposed context-aware data fusion technique outperforms static systems, since static systems utilize each available sensor statically. In the future, this research work can be further extended by scrutinizing the proposed technique’s generalization capability, for which the training and test data can be taken from various scenarios. Besides, enhancement of the security of the predicted healthcare data can be considered.