Shuren Zhou1, *, Le Chen1, Vijayan Sugumaran2
CMC-Computers, Materials & Continua, Vol.63, No.3, pp. 1545-1561, 2020, DOI:10.32604/cmc.2020.09867
30 April 2020
Abstract The two-stream convolutional neural network exhibits excellent performance
in video action recognition. The approach trains one model on frames clipped from
the videos and another on optical flow images pre-extracted from those frames, and
finally fuses the outputs of the two models. Nevertheless, the reliance on
pre-extracted optical flow impedes the efficiency of action recognition, and the
temporal and spatial streams are fused only at the very end, so that when one
stream fails while the other succeeds, the late fusion cannot compensate.
We propose a novel …
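The late-fusion scheme the abstract critiques can be sketched minimally as a weighted average of the per-class scores produced by the two independently trained streams. The function name, the weight value, and the example score vectors below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def late_fusion(spatial_scores, temporal_scores, w_spatial=0.5):
    """Fuse per-class scores from the spatial (RGB) and temporal
    (optical-flow) streams by weighted averaging (a common baseline;
    the weight 0.5 is an assumption, not the paper's setting)."""
    spatial = np.asarray(spatial_scores, dtype=float)
    temporal = np.asarray(temporal_scores, dtype=float)
    fused = w_spatial * spatial + (1.0 - w_spatial) * temporal
    # Predicted action class is the argmax over fused class scores
    return int(np.argmax(fused)), fused

# Hypothetical softmax outputs over 3 action classes from each stream
pred, fused = late_fusion([0.7, 0.2, 0.1], [0.1, 0.3, 0.6])
```

Because fusion happens only at this final step, a confidently wrong prediction from one stream can dominate the average, which is the weakness the authors highlight.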