Context Sensitivity of Spatio-Temporal Activity Detection using Hierarchical Deep Neural Networks in Extended Videos

Hertlein, Felix; Münch, David; Arens, Michael

doi:10.1109/WACVW50321.2020.9096934

2020

Conference Paper

Abstract

The amount of available surveillance video data is increasing rapidly, which makes manual inspection impractical. The goal of activity detection is to automatically localize activities spatially and temporally in a large collection of video data. In this work, we investigate to what extent context plays a role in spatio-temporal activity detection in extended videos. To this end, we propose a hierarchical pipeline for activity detection that first localizes objects spatially and subsequently generates spatio-temporal action tubes. Additionally, we enhance a suitable metric for performance evaluation. We evaluate our system on the TRECVID 2019 ActEV challenge dataset. We investigate the context sensitivity by detecting activities multiple times with various spatial margins around the performing actor. The results show that our pipeline and metric are suited for detecting activities in extended videos.
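The idea of varying the spatial margin around the performing actor can be illustrated with a minimal sketch. This is not the authors' implementation: the box format, the relative-margin definition, and the names `expand_box` and `context_crops` are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def expand_box(box: Box, margin: float, frame_w: int, frame_h: int) -> Box:
    """Grow a box on every side by `margin` times its width/height.

    The result is clipped to the frame. A margin of 0.0 keeps the box
    unchanged; larger margins include more surrounding context.
    """
    x0, y0, x1, y1 = box
    dx = (x1 - x0) * margin  # horizontal padding per side
    dy = (y1 - y0) * margin  # vertical padding per side
    return (
        max(0.0, x0 - dx),
        max(0.0, y0 - dy),
        min(float(frame_w), x1 + dx),
        min(float(frame_h), y1 + dy),
    )


def context_crops(
    box: Box, margins: Iterable[float], frame_w: int, frame_h: int
) -> Dict[float, Box]:
    """Return one expanded box per margin, keyed by the margin value."""
    return {m: expand_box(box, m, frame_w, frame_h) for m in margins}
```

Under these assumptions, running a detector once per entry of `context_crops(actor_box, [0.0, 0.25, 0.5], frame_w, frame_h)` and comparing the scores would give a rough picture of how much the surrounding context contributes.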

Author(s)

Hertlein, Felix

Münch, David

Arens, Michael

Mainwork

IEEE Winter Conference on Applications of Computer Vision Workshops, WACVW 2020. Proceedings

Conference

Winter Conference on Applications of Computer Vision (WACV) 2020
