Authors: Meng Chen, Liyu Gong, Tianjiang Wang, Qi Feng
Publish Date: 2013/11/08
Volume: 74, Issue: 6, Pages: 2127-2142
Abstract
This paper presents a novel framework for human action recognition based on a newly proposed mid-level feature representation method named Lie Algebrized Gaussians (LAG). As an action sequence can be treated as a 3D object in space-time, we address the action recognition problem by recognizing 3D objects, and we characterize 3D objects by the probability distributions of local spatio-temporal features. First, for each video, we densely sample local spatio-temporal features (e.g., HOG3D) at multiple scales, confined to the bounding boxes of the human body. Moreover, normalized spatial coordinates are appended to each local descriptor in order to capture spatial position information. Then, the distribution of local features in each video is modeled by a Gaussian Mixture Model (GMM). To estimate the parameters of the video-specific GMMs, a global GMM is trained using all training data, and the video-specific GMMs are adapted from the global GMM. Then, LAG is adopted to vectorize those video-specific GMMs. Finally, a linear SVM is employed for classification. Experimental results on the KTH and UCF Sports datasets show that our method achieves state-of-the-art performance.
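Two steps of the pipeline described above, appending normalized coordinates to a descriptor and adapting a global GMM to a single video, can be illustrated with a minimal sketch. The abstract does not state the adaptation rule, so the code assumes mean-only MAP adaptation with a relevance factor, a common recipe for adapting a global mixture model; the diagonal covariances, the default `relevance` value, and all function names are hypothetical choices for illustration. The LAG vectorization itself is not specified in the abstract and is not shown.

```python
import math

def append_position(descriptor, x, y, box):
    """Append box-normalized (x, y) coordinates to a local descriptor."""
    x0, y0, width, height = box  # bounding box of the human body
    return list(descriptor) + [(x - x0) / width, (y - y0) / height]

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given x."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def adapt_means(features, weights, means, variances, relevance=16.0):
    """One MAP pass that pulls each global mean toward one video's features.

    Assumption: only the means are adapted; weights and variances are
    kept from the global GMM.
    """
    k, d = len(means), len(means[0])
    counts = [0.0] * k
    sums = [[0.0] * d for _ in range(k)]
    for x in features:
        post = responsibilities(x, weights, means, variances)
        for j, r in enumerate(post):
            counts[j] += r
            for i in range(d):
                sums[j][i] += r * x[i]
    adapted = []
    for j in range(k):
        # alpha grows with the soft count of features assigned to component j
        alpha = counts[j] / (counts[j] + relevance)
        if counts[j] > 0:
            data_mean = [s / counts[j] for s in sums[j]]
        else:
            data_mean = means[j]
        adapted.append([alpha * dm + (1 - alpha) * m
                        for dm, m in zip(data_mean, means[j])])
    return adapted
```

With this rule, a component that explains many of a video's features moves toward their mean, while a component with little support stays close to the global model, so every video is described by a GMM with the same number of components in the same order.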