Authors: Yun Yi, Hanli Wang, Bowen Zhang
Publish Date: 2017/02/10
Volume: 76, Issue: 18, Pages: 18891-18913
Abstract
Human action recognition in realistic videos is an important and challenging task. Recent studies demonstrate that multi-feature fusion can significantly improve the classification performance for human action recognition. Therefore, a number of studies utilize fusion strategies to combine multiple features and achieve promising results. Nevertheless, previous fusion strategies ignore the correlations of different action categories. To address this issue, we propose a novel multi-feature fusion framework which utilizes the correlations of different action categories and multiple features. To describe human actions, this framework combines several classical features which are extracted with deep convolutional neural networks and improved dense trajectories. Moreover, massive experiments are conducted on two challenging datasets to evaluate the effectiveness of our approach, and the proposed approach obtains state-of-the-art classification accuracies of 68.1% and 93.3% on the HMDB51 and UCF101 datasets, respectively. Furthermore, the proposed approach achieves better performance than five classical fusion schemes, as the correlations are used to combine multiple features in this framework. To the best of our knowledge, this work is the first attempt to learn the correlations of different action categories for multi-feature fusion.

This work was supported in part by the National Natural Science Foundation of China under Grant 61622115 and Grant 61472281, the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (No. GZ2015005), and the Science and Technology Projects of the Education Bureau of Jiangxi Province of China (No. GJJ151001).
Keywords:
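The abstract describes combining classification scores from multiple feature channels (e.g. deep CNN features and improved dense trajectories). As a rough illustration of score-level (late) fusion, the minimal sketch below combines per-feature score matrices with fixed scalar weights; the paper's actual framework instead learns weights from the correlations of action categories, and all names and values here are hypothetical.

```python
import numpy as np

def fuse_scores(score_list, weights):
    """Weighted-sum late fusion of per-feature classification scores.

    score_list: list of (n_samples, n_classes) score arrays, one per feature.
    weights: one scalar weight per feature. NOTE: fixed scalars are an
    illustrative simplification; the paper learns category correlations.
    """
    fused = np.zeros_like(score_list[0], dtype=float)
    for scores, w in zip(score_list, weights):
        fused += w * scores
    return fused

# Two toy feature channels for 2 samples x 2 action classes
# (stand-ins for, e.g., CNN scores and improved-dense-trajectory scores).
cnn_scores = np.array([[0.6, 0.4], [0.2, 0.8]])
idt_scores = np.array([[0.7, 0.3], [0.4, 0.6]])

fused = fuse_scores([cnn_scores, idt_scores], weights=[0.5, 0.5])
predictions = fused.argmax(axis=1)  # predicted class index per sample
```

With equal weights this reduces to score averaging, one of the classical fusion baselines the paper compares against.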