Open Access

ARTICLE

Video Action Recognition Method Based on Personalized Federated Learning and Spatiotemporal Features

Rongsen Wu1, Jie Xu1, Yuhang Zhang1, Changming Zhao2,*, Yiweng Xie3, Zelei Wu1, Yunji Li2, Jinhong Guo4, Shiyang Tang5,6
1 School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
2 School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
3 Shanghai Key Lab of Intelligent Information Processing, School of CS, Fudan University, Shanghai, 200433, China
4 School of Sensing Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
5 School of Mechanical and Manufacturing Engineering, University of New South Wales, Sydney, 2052, Australia
6 School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, UK
* Corresponding Author: Changming Zhao

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.061396

Received 29 December 2024; Accepted 07 March 2025; Published online 31 March 2025

Abstract

With the rapid development of artificial intelligence and Internet of Things technologies, video action recognition is widely applied in scenarios ranging from personal life to industrial production. However, alongside the convenience this technology brings, it is crucial to protect the privacy of users’ video data. This paper therefore proposes a video action recognition method, designed within a federated learning framework, that is based on personalized federated learning and spatiotemporal features. For the local spatiotemporal features of a video, a new differential information extraction scheme extracts differential features centered on a single RGB frame, and a spatial-temporal module based on local information is designed to improve the effectiveness of local feature extraction. For the global temporal features, a method that extracts action rhythm features by differencing is proposed, and a temporal module based on global information is designed; this module uses different translational strides to obtain bidirectional differential features under different action rhythms. To address user data privacy, the method divides the model parameters into local private parameters and public parameters according to the structure of the video action recognition model. This approach enhances model training performance and ensures the security of video data. Experimental results show that, under personalized federated learning, the method achieves an average accuracy of 97.792% on the UCF-101 dataset in a non-independent and identically distributed (non-IID) setting. This research provides technical support for privacy protection in video action recognition.
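
The idea of bidirectional differential features at several translational strides can be illustrated with a minimal NumPy sketch. It is not the paper's module: the function name `bidirectional_differences`, the zero padding at the clip boundaries, and the example strides are all assumptions made for illustration, and the abstract does not specify how the differences are computed or fused.

```python
import numpy as np

def bidirectional_differences(frames, strides=(1, 2)):
    """Compute forward and backward frame differences at several strides.

    frames: array of shape (T, H, W, C) holding one video clip.
    Returns a dict mapping each stride s to a (forward, backward) pair,
    each with the same shape as `frames`. Positions without a neighbor
    at distance s are left as zeros (an assumed padding choice).
    """
    frames = np.asarray(frames, dtype=np.float32)
    result = {}
    for s in strides:
        if s < 1:
            raise ValueError("stride must be a positive integer")
        forward = np.zeros_like(frames)
        backward = np.zeros_like(frames)
        # forward[t] = frames[t + s] - frames[t]
        forward[:-s] = frames[s:] - frames[:-s]
        # backward[t] = frames[t - s] - frames[t]
        backward[s:] = frames[:-s] - frames[s:]
        result[s] = (forward, backward)
    return result
```

A small stride responds to fast motion between adjacent frames, while a larger stride responds to slower changes, which is one plausible reading of "different action rhythms".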

Keywords

Video action recognition; personalized federated learning; spatiotemporal features; data privacy
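
The split between local private parameters and public parameters described in the abstract can be sketched as follows. This is a hedged illustration only: the abstract does not say which layers are public or how they are aggregated, so the sample-size-weighted average (in the style of FedAvg), the function names `aggregate_public` and `apply_public`, and the parameter keys in the usage note are all hypothetical.

```python
import numpy as np

def aggregate_public(client_params, client_sizes, public_keys):
    """Average only the public parameters across clients.

    client_params: list of dicts mapping parameter name -> np.ndarray.
    client_sizes: number of local samples per client, used as weights
                  (an assumed FedAvg-style weighting).
    public_keys: names of the parameters that are shared with the server.
    """
    total = float(sum(client_sizes))
    return {
        key: sum(params[key] * (n / total)
                 for params, n in zip(client_params, client_sizes))
        for key in public_keys
    }

def apply_public(local_params, global_public):
    """Overwrite a client's public parameters, keeping private ones local."""
    updated = dict(local_params)
    updated.update(global_public)
    return updated
```

For example, with hypothetical keys `"backbone"` (public) and `"head"` (private), each client would call `apply_public(local_params, aggregate_public(all_params, sizes, ["backbone"]))` after a round, so that its `"head"` values never leave the device.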