This study proposes a convolutional neural network (CNN)-based identity recognition scheme using electrocardiograms (ECGs) recorded at different water temperatures (WTs) during bathing, and explores the impact of ECG length on the recognition rate. ECG data was collected using non-contact electrodes at five different WTs during bathing. Ten young student subjects (seven men and three women) participated in data collection. Three ECG recordings were collected at each preset bathtub WT for each subject. Each recording is 18 min long, with a sampling rate of 200 Hz. In total, 150 ECG recordings and 150 WT recordings were collected. The R peaks were detected in the processed ECG (after baseline wander removal, 50-Hz hum removal, smoothing, and normalization), and the QRS complexes were segmented. These segmented waves were then transformed into binary images, which served as the datasets. For each subject, the training, validation, and test data were taken from the first, second, and third ECG recordings, respectively. The numbers of training and validation images were 84297 and 83734, respectively. In the test stage, preliminary classification results were obtained using the trained CNN model, and refined classification results were determined by majority vote over the preliminary results. The validation rate was 98.71%. The recognition rates were 95.00% and 98.00% when the number of test heartbeats was 7 and 17, respectively, for each subject.
With improvements in living standards, people have started to pay more attention to personal hygiene and health in daily life, and bathing has become increasingly popular. However, the number of drowning accidents while bathing has increased in recent years, with survey data showing more than 5,398 such accidents in Japan in 2018 [
As an emerging biometric modality, ECG has seen about 20 years of academic development since Biel et al. [
Traditional ECG collection mainly consists of attaching electrodes to the skin surface of the body. However, this method can cause many inconveniences to people during data collection. Kwatra et al. [
Although the above studies explored the relationship between ECG length and the recognition rate, none were based on ECG during bathing. Owing to differences in physical conditions, everyone has a different sensitivity to WTs. What is more, the water pressure on the chest and thermal stimulus on hemodynamics induce varying stress levels to subjects during bathing. As confirmed in a previous study of ours [
This study's major contributions are as follows:
We formulate identity recognition as an image classification problem to exploit the strong image classification capability of CNNs. Specifically, the 1-D ECG signal is converted into a 2-D binary image. We design a majority vote-based algorithm for the secondary classification stage to make the recognition system accurate and robust; this algorithm can increase the final recognition rate by 10%. To address the low robustness problem, our training, validation, and test data are taken from different ECG recordings. All the experimental data is collected within the commonly used WT range (36.5–41.5°C). Additionally, to reflect practical application, all the test data used in the secondary classification stage consists of consecutive heartbeats. The final recognition rate is the average of 1000 random test results, which supports the robustness of the trained model.
The rest of this paper is structured as follows. Section 2 introduces the data collection system, data processing, feature extraction, and recognition. Section 3 focuses on the performance evaluation of the validation and test processes, and a discussion is provided in Section 4. Finally, in Section 5 we draw some conclusions and outline directions for future research.
This section introduces the ECG collection system, subjects and ECG recordings, data processing, feature extraction, and identity recognition.
The ECG collection system in this study includes four rectangular stainless steel non-contact electrodes, all placed on the bathtub wall. When the subject is in the bathtub, the four non-contact electrodes are near the right foot, right arm, left foot, and left arm, respectively. The electrical potential on the skin surface, which is produced by the heart's electrical activity, reaches the four non-contact electrodes through the water, and the limb leads are recorded. Lead I is the potential difference between the left arm (positive) and right arm (negative); lead II is that between the left foot (positive) and right arm (negative); and lead III is that between the left foot (positive) and left arm (negative). Four shielded wires are welded onto the four non-contact electrodes, one per electrode. The electrode signals reach the ECG collection monitor (Open Brain Computer Interface Biosensing Ganglion Board–OpenBCI Ganglion; OpenBCI, USA) through the shielded wires. The ECG monitor and the laptop (MacBook Pro) are connected using standard Bluetooth 4.0, and all the collected ECG recordings are stored on the laptop. The designed ECG collection system in this study is shown in
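The lead definitions above can be illustrated with a minimal sketch. The function name `limb_leads` and the electrode potentials are hypothetical values chosen for illustration; they are not measurements from this study, and the right-foot electrode does not enter these three differences.

```python
def limb_leads(ra, la, ll):
    """Return (lead I, lead II, lead III) from the right-arm, left-arm,
    and left-foot electrode potentials, following the definitions in the text."""
    lead_i = la - ra     # left arm (positive) minus right arm (negative)
    lead_ii = ll - ra    # left foot (positive) minus right arm (negative)
    lead_iii = ll - la   # left foot (positive) minus left arm (negative)
    return lead_i, lead_ii, lead_iii


# Hypothetical electrode potentials in millivolts (illustrative only).
lead_i, lead_ii, lead_iii = limb_leads(ra=-0.2, la=0.3, ll=0.9)
print(lead_i, lead_ii, lead_iii)
```

By construction, lead II equals the sum of lead I and lead III, which gives a quick consistency check on recorded leads.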
The ECG recording procedures were approved by the Public University Corporation, the University of Aizu Research Ethics Committee. Written informed consent was obtained from each participant before the experiment.
Ten subjects (seven men and three women) aged 23 to 40 years old (mean ± SD: 28.5 ± 4.78 years) who are students at the University of Aizu participated in the data collection. The blood pressure, body temperature, and body weight were recorded before and after the ECG collection, and the temperature profile for WT change and room temperature were recorded every second during the ECG collection using a temperature monitor (TR-71 wb/nw; T&D Corporation, Japan).
The ECG data was collected using non-contact electrodes at five different WTs during bathing: 36.5–37.5°C, 37.5–38.5°C, 38.5–39.5°C, 39.5–40.5°C, and 40.5–41.5°C. Three ECG recordings were collected from each subject at each preset bathtub WT condition (10 subjects × 5 WTs × 3 recordings = 150 recordings), and each recording is 18 min long with a sampling rate of 200 Hz, as shown in
The flowchart of ECG processing, feature extraction, and recognition is shown in
where
No. of subjects | Age (years) | Health condition | WT range (°C) | ECG length (min) | Sampling rate (Hz) |
---|---|---|---|---|---|
10 | 28.5 ± 4.78 | Healthy | [36.5, 41.5] | 18 | 200 |
The Daubechies wavelet at level 10 is used to decompose the raw ECG signal, and the baseline wandering approximation coefficient is subtracted from the raw ECG signal after reconstructing at level 8. An obvious hum noise remains after the baseline is removed. Spectrum analysis was performed using the fast Fourier transform (FFT); the FFT is defined in
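As an illustration of the spectrum analysis step, the sketch below evaluates the standard discrete Fourier transform directly and reports the strongest non-DC frequency. The synthetic signal (a 5 Hz component plus a stronger 50 Hz hum) and the helper names are assumptions for demonstration only; an FFT computes the same quantity more efficiently.

```python
import cmath
import math


def dft_magnitude(x):
    """Magnitude of the DFT for bins 0..N/2, evaluated directly from the definition."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    return mags


def dominant_frequency(x, fs):
    """Frequency (Hz) of the largest spectral peak, ignoring the DC bin."""
    mags = dft_magnitude(x)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * fs / len(x)


FS = 200  # sampling rate in Hz, as in the recordings
N = 200   # one second of synthetic data (assumption)
signal = [0.2 * math.sin(2 * math.pi * 5 * t / FS)
          + 0.5 * math.sin(2 * math.pi * 50 * t / FS) for t in range(N)]
hum_freq = dominant_frequency(signal, FS)
print(hum_freq)
```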
The spectrum analysis results show that the main frequency component of the hum noise was 50 Hz, which was mainly produced by the electromagnetic interference between the power supply network and equipment. A second-order infinite impulse response digital notch filter was used to remove the 50-Hz hum noise. The numerator and denominator coefficients of the digital notch filter with the notch located at
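A minimal sketch of a second-order IIR notch filter follows. It uses the textbook pole-zero form with zeros on the unit circle at the notch frequency and poles at radius `r` just inside it; the value `r = 0.95` and the function names are assumptions, so the coefficients are not necessarily those used in this study.

```python
import cmath
import math


def notch_coefficients(f0, fs, r=0.95):
    """Numerator b and denominator a of a second-order notch at f0 Hz.
    r (pole radius, assumed) sets the notch width: closer to 1 is narrower."""
    w0 = 2 * math.pi * f0 / fs
    b = [1.0, -2.0 * math.cos(w0), 1.0]
    a = [1.0, -2.0 * r * math.cos(w0), r * r]
    return b, a


def gain(b, a, f, fs):
    """Magnitude response |H| of the filter at frequency f Hz."""
    z_inv = cmath.exp(-1j * 2 * math.pi * f / fs)
    num = sum(bk * z_inv ** k for k, bk in enumerate(b))
    den = sum(ak * z_inv ** k for k, ak in enumerate(a))
    return abs(num / den)


b, a = notch_coefficients(f0=50.0, fs=200.0)
print(gain(b, a, 50.0, 200.0))  # essentially zero at the notch
print(gain(b, a, 5.0, 200.0))   # close to one in the ECG band
```

With a 200 Hz sampling rate the notch sits at a normalized angular frequency of π/2, so the middle coefficient of both polynomials is numerically zero.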
Next, the 5-point moving average method was used to smooth the ECG signal. The mathematical formula for the moving average is shown in
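The smoothing step can be sketched as below. A centered window that shrinks at the signal edges is assumed here; the study's formula may instead use a trailing window or a different edge treatment.

```python
def moving_average(x, window=5):
    """Centered moving average; the window is truncated at the edges (assumption)."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


# A single spike is spread over its five neighbours.
smoothed = moving_average([0, 0, 5, 0, 0, 0, 0])
print(smoothed)
```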
The ECG was then normalized into the 0–1 range using the ‘mapminmax’ function, the R peaks were detected using the ‘findpeaks’ function, and the outliers were removed using the 1-D 11th-order median filter because of its outstanding capability to suppress isolated outlier noise without blurring sharp changes in the original signal. The mathematical formula of the 1-D 11th-order median filter is shown in
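The normalization and outlier-removal steps can be sketched in plain Python as follows. This is not the MATLAB implementation: edge windows are truncated rather than padded, and the example sequence (hypothetical R-R intervals in samples with one isolated outlier) is an assumption about what the median filter is applied to.

```python
from statistics import median


def min_max_normalize(x):
    """Map the values of x linearly into the 0-1 range."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0] * len(x)
    return [(v - lo) / (hi - lo) for v in x]


def median_filter(x, order=11):
    """1-D median filter of the given order; windows are truncated at the edges."""
    half = order // 2
    return [median(x[max(0, i - half): i + half + 1]) for i in range(len(x))]


normalized = min_max_normalize([2, 4, 6])

# Hypothetical R-R intervals (in samples) containing one isolated outlier.
rr = [160] * 10 + [40] + [160] * 10
cleaned = median_filter(rr)
print(normalized, cleaned[10])
```

The isolated value 40 is replaced by the local median, while a constant run of values passes through unchanged, which is the property the text relies on.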
As the sampling rate is 200 Hz, the sampling interval is 5 ms. The length of a complete QRS complex wave is about 80–120 ms. Therefore, to capture a complete QRS complex wave, 30 sampling points (150 ms) centered on the detected R peak are segmented. Then, each 1-D segmented QRS complex wave is transformed into a 2-D binary image, as shown in
The training data is a 30 × 30 × 84297 3-D matrix, the validation data is a 30 × 30 × 83734 3-D matrix, and the test data is a 30 × 30 × 81867 3-D matrix. The training, validation, and test data were taken from the first, second, and third ECG recordings, respectively. The classification process mainly consists of two stages. A simple 2-D CNN is used during the first stage, as shown in
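One way to realize the segmentation and the 1-D to 2-D conversion is sketched below. The amplitude scaling, the rounding up to a row index, and the convention that row 0 is the top of the image are assumptions; the exact mapping used in this study may differ.

```python
import math


def segment_qrs(ecg, r_index, width=30):
    """Return `width` samples centered on the R peak, or None near the signal edges."""
    start = r_index - width // 2
    if start < 0 or start + width > len(ecg):
        return None
    return ecg[start:start + width]


def to_binary_image(segment, size=30):
    """Draw each sample as a single 1-pixel in its column of a size x size image."""
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0
    image = [[0] * size for _ in range(size)]
    for col, value in enumerate(segment):
        level = math.ceil((value - lo) / span * size)  # round up to a pixel level
        level = min(max(level, 1), size)               # keep within 1..size
        image[size - level][col] = 1                   # row 0 is the top row
    return image


# Synthetic signal with a triangular "R peak" at index 50 (illustrative only).
ecg = [0.0] * 100
ecg[49], ecg[50], ecg[51] = 0.5, 1.0, 0.5
seg = segment_qrs(ecg, r_index=50)
image = to_binary_image(seg)
print(sum(map(sum, image)))
```

Each column holds exactly one foreground pixel, so a 30-sample segment yields an image with 30 ones.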
In the input layer, the input data is a 30 × 30 binary image of the QRS complex. In the convolution layer, there are 20 filters with a size of 9 × 9, so the output of the convolution layer is a 22 × 22 × 20 3-D matrix, and the size is unchanged after the ‘ReLU’ operation. In the pooling layer, the down-sampling result is an 11 × 11 × 20 3-D matrix. After the ‘reshape’ operation, two fully connected operations are performed. In the output layer, the ‘Softmax’ function is used to calculate the output probabilities. There are 10 values in the output of the ‘Softmax’ function, each indicating the probability that the input belongs to the corresponding subject. If the index of the maximum of these 10 values matches the label of the input data, the count of correct classifications is increased by 1.
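The layer sizes quoted above follow from simple arithmetic, sketched here together with the softmax decision rule. A stride of 1 without padding and a 2 × 2 pooling window are inferred from the reported sizes, and the example scores are hypothetical.

```python
import math


def conv_output(size, kernel):
    """Output width of a convolution with stride 1 and no padding (inferred)."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Output width after non-overlapping pooling (2 x 2 inferred from 22 -> 11)."""
    return size // pool


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


conv = conv_output(30, 9)   # width after the 9 x 9 convolution
pooled = pool_output(conv)  # width after pooling
flat = pooled * pooled * 20  # length of the vector after 'reshape'

# Hypothetical scores for the 10 subjects from the last fully connected layer.
scores = [0.1, -1.2, 0.3, 4.0, 0.0, 1.1, -0.5, 0.2, 0.9, -2.0]
probs = softmax(scores)
predicted = max(range(len(probs)), key=lambda i: probs[i])
print(conv, pooled, flat, predicted)
```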
To explore the impact of ECG length on the recognition rate, the majority vote method is used in the second stage.
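A minimal sketch of the second-stage majority vote is given below. The per-heartbeat predictions are hypothetical, and the tie-breaking rule (the label seen first among the tied ones wins) is an assumption, as the text does not specify one.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent predicted subject label in a run of heartbeats.
    Ties go to the label that appears first (Counter ordering; assumption)."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical first-stage CNN outputs for 7 consecutive heartbeats of one subject.
beat_predictions = [3, 3, 7, 3, 1, 3, 3]
final_label = majority_vote(beat_predictions)
print(final_label)
```

Two of the seven heartbeats are misclassified in this example, yet the voted label is still correct, which is why longer ECG segments can raise the recognition rate.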
To evaluate the performance of the trained model, we define some performance evaluation parameters: true positive (TP), false positive (FP), true negative (TN), false negative (FN), precision, recall, F-score, TP rate (TPR), FP rate (FPR). Specifically,
During the training stage, different combinations of important training parameters were tested. Finally, the learning rate was set to 0.01, the batch size to 256, and the number of epochs to 40. When the training process was finished, the validation data was used to test the trained model and the validation result is shown in
Two conditions were evaluated on the test data. First, before the majority vote process, the trained model was used directly to explore the impact of ECG length on the recognition rate. Second, the majority vote over the classification results of the trained model was used to explore the same relationship. To verify robustness and reduce random errors, 1000 runs were conducted under each of the two conditions. The relationship between ECG length and recognition rate over the 1000 runs, before and after the majority vote, is shown in
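The repeated evaluation can be sketched as follows, under stated assumptions: each trial draws a random starting position per subject, takes consecutive heartbeats from there, and applies the majority vote. The sampling scheme, the function name `recognition_rate`, and the synthetic per-beat predictions (every fifth beat misclassified) are all hypothetical.

```python
import random
from collections import Counter


def recognition_rate(per_subject_preds, n_beats, trials=1000, seed=0):
    """Average recognition rate over `trials` random windows of consecutive beats."""
    rng = random.Random(seed)
    correct = 0
    total = 0
    for _ in range(trials):
        for subject, preds in per_subject_preds.items():
            start = rng.randrange(len(preds) - n_beats + 1)
            window = preds[start:start + n_beats]          # consecutive heartbeats
            voted = Counter(window).most_common(1)[0][0]   # majority vote
            correct += (voted == subject)
            total += 1
    return correct / total


# Synthetic first-stage predictions: every fifth beat is assigned to the wrong subject.
preds = {
    0: [1 if i % 5 == 0 else 0 for i in range(200)],
    1: [0 if i % 5 == 0 else 1 for i in range(200)],
}
rate_single = recognition_rate(preds, n_beats=1)
rate_voted = recognition_rate(preds, n_beats=7)
print(rate_single, rate_voted)
```

In this synthetic case any 7 consecutive beats contain at most two errors, so voting recovers the correct subject every time, whereas single-beat decisions do not.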
Subject | TP | FP | FN | TN | Precision (%) | F-score (%) | TPR (%) | FPR (%) |
---|---|---|---|---|---|---|---|---|
1 | 8934 | 6 | 7 | 74787 | 99.93 | 99.93 | 99.92 | 0.01 |
2 | 7440 | 359 | 25 | 75910 | 95.40 | 97.48 | 99.67 | 0.47 |
3 | 8589 | 2 | 107 | 75036 | 99.98 | 99.37 | 98.77 | 0.00 |
4 | 6918 | 65 | 44 | 76707 | 99.07 | 99.22 | 99.37 | 0.08 |
5 | 8519 | 5 | 16 | 75194 | 99.94 | 99.88 | 99.81 | 0.01 |
6 | 7817 | 432 | 205 | 75280 | 94.76 | 96.09 | 97.44 | 0.57 |
7 | 8175 | 47 | 23 | 75489 | 99.43 | 99.57 | 99.72 | 0.06 |
8 | 8142 | 29 | 513 | 75050 | 99.65 | 96.78 | 94.07 | 0.04 |
9 | 9255 | 102 | 130 | 74247 | 98.91 | 98.76 | 98.61 | 0.14 |
10 | 8868 | 30 | 7 | 74829 | 99.66 | 99.79 | 99.92 | 0.04 |
Total | 82657 | 1077 | 1077 | 752529 | 98.71 | 98.71 | 98.71 | 0.14 |
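The table entries can be reproduced from the confusion counts using the standard definitions of precision, recall (TPR), F-score, and FPR, as in this short sketch. The function name `metrics` is illustrative; the counts in the example are those listed for subject 1.

```python
def metrics(tp, fp, fn, tn):
    """Precision, TPR (recall), F-score, and FPR in percent, rounded to 2 decimals."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return {
        "precision": round(100 * precision, 2),
        "tpr": round(100 * recall, 2),
        "f_score": round(100 * f_score, 2),
        "fpr": round(100 * fpr, 2),
    }


subject_1 = metrics(tp=8934, fp=6, fn=7, tn=74787)
print(subject_1)
```

In the Total row, FP and FN are both 1077 because every misclassified image is a false negative for its true subject and a false positive for the predicted one, so overall precision, TPR, and F-score coincide at 98.71%.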
This study explores the impact of ECG length on the recognition rate at different WTs during bathing. Because of differences in sex, age, height, weight, heart shape and size, etc., the ECG pattern differs among individuals. As the most discriminative feature, the QRS complex is often taken as the biometric marker. The proposed method thus transforms the 1-D QRS complex into a 2-D binary image and takes this binary image as the recognition marker. The binary image contains a complete QRS complex in 30 × 30 pixels. Unlike RGB and grayscale images, the binary image contains only 0 and 1, which can greatly reduce the computational complexity. Because the CNN has a powerful image classification ability, we transform the identity recognition problem into an image classification problem. To keep the computational complexity low, the designed network includes only an input layer, a convolutional layer, a ReLU layer, a pooling layer, and fully connected layers. This model has very low computational complexity and high classification ability, so the classification curve converges quickly and achieves high accuracy.
In the experiment, the training, validation, and test data are from different ECG recordings. Such a data structure ensures that the experimental results are robust and practical. When the training process is finished, the final validation precision is 98.71%, as shown in
The entire signal acquisition process can be divided into three stages. The first stage is the adaptation period, which lasts about 2 min. Because of the water pressure on the chest and thermal stimulus on hemodynamics, the heart rate changes considerably after the subject enters the bathtub, and the ECG signal fluctuates greatly during this period. The second stage is the stable period, which lasts about 10 min. In this stage, the subject has adapted to the WT environment, the heart rate is relatively stable, and the ECG fluctuates less than in the first stage. The third stage is the pressure rise period, which lasts about 6 min. In this stage, the subject has been in the bathtub for about 12 min and has become very tired because of the water pressure on the chest and thermal stimulus on hemodynamics. Mental and physical stress greatly increases in this stage. For one subject, the ECG signals of the three periods vary greatly. Across subjects, due to individual differences, especially in sensitivity to water pressure and WT, the ECG signals differ even more. In this study, R peak detection is necessary before the QRS complex is segmented, so the biggest challenge is accurately detecting the locations of the R peaks. Bathtub ECG is unlike ECG in a resting state. A single algorithm struggles to detect all the R peaks of one subject's bathtub ECG accurately over the full 18 min, and it struggles even more across different subjects. The detected R peaks therefore include many outliers, which must be removed before the QRS complex is segmented.
During the feature extraction stage, the 1-D QRS complex is transformed into a 2-D binary image. The positions of the sampling points are slightly changed after the transformation, where they are resaved in the matrix of the new binary image after being rounded up. Although such a transformation method could greatly reduce the computational complexity, it will also lose much useful QRS information. The designed algorithm cannot ensure that all the inherent features are fully retained. What's more, the segmented QRS complex only includes 30 sampling points (150 ms). The duration of a normal and complete QRS complex is about 80 to 120 ms. In terms of QRS duration, this segmentation is reasonable. However, from the perspective of converting it into a binary image, the number of sampling points is insufficient. The conventional assumption is that the more sampling points, the more the converted image can retain the inherent characteristics of the original signal. Of course, further experiments are needed to verify whether the sampling points will also affect the recognition rate.
To the best of our knowledge, this is the first study of the impact of ECG length on the identity recognition rate at different WTs during bathing. Our experimental results show that the recognition rate exceeds 98% when the number of QRS is 20 and then almost becomes stable. The designed recognition system has high precision and robustness, which could make it suitable for practical application.
Although some discoveries were made in this study, there were also some limitations. First, during the ECG collection stage, all the subjects were told to keep as still as possible, which caused them some inconvenience. This measure greatly reduces the noise in the ECG signal and simplifies the later signal processing and experiments. However, it is quite different from an actual application scenario, where a person cannot always stay still while in a bathtub. The ECG signal is very weak, on the order of millivolts. During the ECG collection process, non-contact electrodes are used, and the weak electrical potential on the skin surface is conducted to the electrodes through the water. Strenuous movement therefore induces considerable motion artifacts in the ECG signal. Second, the number of subjects is small. Third, the controlled WT conditions are not uniform across subjects. When the WT exceeds 40°C, some subjects feel that it is uncomfortably hot, so for their safety we had to temporarily lower the WT. Fourth, the water pressure on the chest and thermal stimulus on hemodynamics caused additional stress to the subjects during the ECG collection process, which may bias the results.
In future research, we will explore the factors behind false recognition in the designed system and increase its recognition rate and robustness. We will also explore other automatic identity recognition methods using ECG during bathing. In addition, we will transform the time-domain ECG into the frequency domain and extract frequency-domain features as biometric markers. Furthermore, for long-term health care applications, the security and privacy of physiological signals have attracted wide concern and need to be addressed. Therefore, we will explore how to use recent encryption technology to strengthen the security of physiological signals [
The authors thank all participants for their cooperation during the data collection.