Dynamic Hand Gesture Recognition Using 3D-CNN and LSTM Networks

Muneeb Rehman; Fawad Ahmed; Muhammad Khan; Usman Tariq; Faisal Alfouzan; Nouf Alzahrani; Jawad Ahmad

doi:10.32604/cmc.2022.019586

Open Access

ARTICLE

Muneeb Ur Rehman¹, Fawad Ahmed¹, Muhammad Attique Khan², Usman Tariq³, Faisal Abdulaziz Alfouzan⁴, Nouf M. Alzahrani⁵, Jawad Ahmad⁶,*

1 Department of Electrical Engineering, HITEC University Taxila, Pakistan
2 Department of Computer Science, HITEC University Taxila, Pakistan
3 College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
4 Department of Forensic Sciences, College of Criminal Justice, Naif Arab University for Security Sciences, Riyadh, Saudi Arabia
5 Department of Information Technology, Albaha University, Albaha, Saudi Arabia
6 School of Computing, Edinburgh Napier University, UK

* Corresponding Author: Jawad Ahmad. Email: email

Computers, Materials & Continua 2022, 70(3), 4675-4690. https://doi.org/10.32604/cmc.2022.019586

Received 18 April 2021; Accepted 27 July 2021; Issue published 11 October 2021

Abstract

Recognizing dynamic hand gestures in real time is difficult because the system does not know in advance when or where a gesture starts and ends in a video stream. Vision-based gesture recognition has attracted considerable research interest because of its many applications. This paper proposes a deep learning architecture that combines a 3D Convolutional Neural Network (3D-CNN) with a Long Short-Term Memory (LSTM) network. The proposed architecture extracts spatio-temporal information from input video sequences while avoiding extensive computation. The 3D-CNN extracts spectral and spatial features, which are then passed to the LSTM network for classification. The proposed model is a lightweight architecture with only 3.7 million trainable parameters. The model was evaluated on 15 classes from the publicly available 20BN-jester dataset. It was trained on 2000 video clips per class, split into 80% training and 20% validation sets. Accuracies of 99% and 97% were achieved on the training and testing data, respectively. We further show that combining a 3D-CNN with an LSTM gives better results than MobileNetv2 + LSTM.
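To make the numbers in the abstract more concrete, the following is a minimal sketch of how the parameter count of a 3D-CNN + LSTM stack and the 80%/20% split sizes can be worked out by hand. The layer widths, kernel sizes, and LSTM hidden size below are hypothetical values chosen for illustration only; they are not the authors' architecture and are not meant to reproduce the reported 3.7 million parameters. The LSTM formula assumes one bias vector per gate (the Keras convention); some frameworks use two.

```python
from dataclasses import dataclass


@dataclass
class Conv3D:
    """A 3D convolution described only by its shape (no weights are stored)."""
    in_ch: int
    out_ch: int
    kernel: tuple = (3, 3, 3)  # (time, height, width)

    def params(self) -> int:
        kt, kh, kw = self.kernel
        # one kernel of size kt*kh*kw*in_ch plus one bias, per output channel
        return (kt * kh * kw * self.in_ch + 1) * self.out_ch


def lstm_params(input_size: int, hidden_size: int) -> int:
    # four gates, each with input weights, recurrent weights, and one bias vector
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def dense_params(in_features: int, out_features: int) -> int:
    # weight matrix plus one bias per output unit
    return (in_features + 1) * out_features


def split_counts(classes: int, clips_per_class: int, train_percent: int):
    """Return (training clips, validation clips) for a percentage split."""
    total = classes * clips_per_class
    train = total * train_percent // 100
    return train, total - train


# Hypothetical stack: three 3D convolutions on RGB clips, per-frame features
# pooled to a 128-dimensional vector, an LSTM, and a 15-way classifier.
convs = [Conv3D(3, 32), Conv3D(32, 64), Conv3D(64, 128)]
total_params = (
    sum(layer.params() for layer in convs)
    + lstm_params(input_size=128, hidden_size=256)
    + dense_params(256, 15)
)

# Figures from the abstract: 15 classes, 2000 clips per class, 80% for training.
train_clips, val_clips = split_counts(15, 2000, 80)

print(f"hypothetical parameter count: {total_params:,}")
print(f"training clips: {train_clips}, validation clips: {val_clips}")
```

With the figures stated in the abstract, the split works out to 24,000 training clips and 6,000 validation clips (1,600 and 400 per class). In this illustrative stack the LSTM holds more parameters than the three convolutions combined, so the size of the feature vector handed from the 3D-CNN to the LSTM matters when aiming for a lightweight model.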

Keywords

Convolutional neural networks; 3D-CNN; LSTM; spatio-temporal; jester; real-time hand gesture recognition

Cite This Article

APA Style

Rehman, M.U., Ahmed, F., Khan, M.A., Tariq, U., Alfouzan, F.A. et al. (2022). Dynamic Hand Gesture Recognition Using 3D-CNN and LSTM Networks. Computers, Materials & Continua, 70(3), 4675–4690. https://doi.org/10.32604/cmc.2022.019586

Vancouver Style

Rehman MU, Ahmed F, Khan MA, Tariq U, Alfouzan FA, Alzahrani NM, et al. Dynamic Hand Gesture Recognition Using 3D-CNN and LSTM Networks. Comput Mater Contin. 2022;70(3):4675–4690. https://doi.org/10.32604/cmc.2022.019586

IEEE Style

M. U. Rehman et al., “Dynamic Hand Gesture Recognition Using 3D-CNN and LSTM Networks,” Comput. Mater. Contin., vol. 70, no. 3, pp. 4675–4690, 2022. https://doi.org/10.32604/cmc.2022.019586

Copyright © 2022 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
