Open Access

ARTICLE


HgaNets: Fusion of Visual Data and Skeletal Heatmap for Human Gesture Action Recognition

Wuyan Liang1, Xiaolong Xu2,*

1 School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
2 School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China

* Corresponding Author: Xiaolong Xu.

(This article belongs to the Special Issue: Machine Vision Detection and Intelligent Recognition)

Computers, Materials & Continua 2024, 79(1), 1089-1103. https://doi.org/10.32604/cmc.2024.047861

Abstract

Recognition of human gesture actions is a challenging issue due to the complex patterns in both visual and skeletal features. Existing gesture action recognition (GAR) methods typically analyze either visual or skeletal data, failing to meet the demands of various scenarios. Furthermore, multi-modal approaches lack the versatility to efficiently process both uniform and disparate input patterns. Thus, in this paper, an attention-enhanced pseudo-3D residual model, called HgaNets, is proposed to address the GAR problem. This model comprises two independent components designed for modeling visual RGB (red, green and blue) images and 3D skeletal heatmaps, respectively. More specifically, each component consists of two main parts: 1) a multi-dimensional attention module for capturing important spatial, temporal and feature information in human gestures; 2) a spatiotemporal convolution module that utilizes pseudo-3D residual convolution to characterize spatiotemporal features of gestures. Then, the output weights of the two components are fused to generate the recognition results. Finally, we conducted experiments on four datasets to assess the efficiency of the proposed model. The results show that the accuracy on the four datasets reaches 85.40%, 91.91%, 94.70%, and 95.30%, respectively, while the inference time is 0.54 s and the model has 2.74 M parameters. These findings highlight that the proposed model outperforms other existing approaches in terms of recognition accuracy.
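Two ideas from the abstract can be illustrated concretely: pseudo-3D residual convolution factorizes a full 3D kernel into a 2D spatial plus a 1D temporal kernel to cut parameters, and the two streams' output weights are fused to produce the final prediction. The sketch below is illustrative only, not the authors' implementation; the channel count (64), kernel sizes, and the equal fusion weight `alpha=0.5` are assumptions for the example.

```python
import numpy as np

def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weight count of a 3D convolution kernel (bias ignored)."""
    return c_in * c_out * kt * kh * kw

# Pseudo-3D factorization: a kt x kh x kw kernel becomes a
# 1 x kh x kw spatial conv followed by a kt x 1 x 1 temporal conv.
full = conv3d_params(64, 64, 3, 3, 3)                                  # 110592
pseudo = conv3d_params(64, 64, 1, 3, 3) + conv3d_params(64, 64, 3, 1, 1)  # 49152
print(full, pseudo)  # the factorized form needs well under half the weights

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fuse(rgb_logits, heatmap_logits, alpha=0.5):
    """Late fusion: weighted sum of the two streams' class scores."""
    scores = alpha * softmax(rgb_logits) + (1 - alpha) * softmax(heatmap_logits)
    return scores.argmax(axis=-1)

rgb = np.array([[3.0, 0.5, 0.1]])  # RGB stream strongly favors class 0
hm = np.array([[0.2, 1.8, 0.3]])   # heatmap stream mildly favors class 1
print(fuse(rgb, hm))               # confident stream dominates: class 0
```

This also motivates the reported 2.74 M parameter budget: factorizing every 3D kernel this way roughly halves the cost of each convolution relative to a full 3D ResNet of the same depth.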

Cite This Article

APA Style
Liang, W., Xu, X. (2024). HgaNets: fusion of visual data and skeletal heatmap for human gesture action recognition. Computers, Materials & Continua, 79(1), 1089-1103. https://doi.org/10.32604/cmc.2024.047861
Vancouver Style
Liang W, Xu X. HgaNets: fusion of visual data and skeletal heatmap for human gesture action recognition. Comput Mater Contin. 2024;79(1):1089-1103. https://doi.org/10.32604/cmc.2024.047861
IEEE Style
W. Liang and X. Xu, “HgaNets: Fusion of Visual Data and Skeletal Heatmap for Human Gesture Action Recognition,” Comput. Mater. Contin., vol. 79, no. 1, pp. 1089-1103, 2024. https://doi.org/10.32604/cmc.2024.047861



Copyright © 2024 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.