Journal Title
Title of Journal: International Journal of Computer Vision
Abbreviation: Int J Comput Vis
Authors: Michael Sapienza, Fabio Cuzzolin, Philip H. S. Torr
Publish Date: 2013/10/13
Volume: 110, Issue: 1, Pages: 30-47
Abstract
Current state-of-the-art action classification methods aggregate space–time features globally, from the entire video clip under consideration. However, the features extracted may in part be due to irrelevant scene context, or to movements shared amongst multiple action classes. This motivates learning with local discriminative parts, which can help localise which parts of the video are significant. Exploiting spatio-temporal structure in the video should also improve results, just as deformable part models have proven highly successful in object recognition. However, whereas objects have clear boundaries, which means we can easily define a ground truth for initialisation, 3D space–time actions are inherently ambiguous and expensive to annotate in large datasets. Thus, it is desirable to adapt pictorial star models to action datasets without location annotation, and to features invariant to changes in pose, such as bag-of-features and Fisher vectors, rather than low-level HoG. We therefore propose local deformable spatial bag-of-features, in which local discriminative regions are split into a fixed grid of parts that are allowed to deform in both space and time at test time. In our experimental evaluation we demonstrate that by using local space–time action parts in a weakly supervised setting, we are able to achieve state-of-the-art classification performance, whilst being able to localise actions even in the most challenging video datasets.