Dynamic Switching State Systems for Visual Tracking
Estimating the motion state of objects is a central component of most visual tracking pipelines. To this end, object observations provided by an appearance model, which represents the object in image space, serve as input for the actual filtering and for prediction into future frames. Under real-life conditions, the dynamics of tracked objects change over time. Especially in such maneuver scenarios, current methods struggle to deal with the model mismatch caused by varying system characteristics. This thesis addresses the problem of how to capture the dynamics of maneuvering objects in an efficient and reactive way. To this end, the perspectives of recursive Bayesian filters and of deep learning approaches on state estimation are considered, and their functional viewpoints are brought together. The starting point of this thesis is the interacting multiple-model (IMM) filter, the most common Bayesian formulation for dealing with model mismatch or, more specifically, with maneuvering objects. For a model mismatch scenario in which tracking is done directly in image space, a state de-coupling scheme and a re-coupling scheme are introduced as modifications that improve on the design of the standard IMM filter. To deal with two maneuver types, switching noise levels and switching dynamics, recurrent neural network (RNN)-based approaches are proposed as alternatives to IMM filtering. These approaches maintain the functionality of an IMM filter while reducing the amount of filter tuning required. With a focus on applications in the surveillance and intelligent vehicle domains, the effectiveness of the RNN-based solutions is demonstrated on the exemplary tasks of path prediction and intention prediction, which reflect the most common prototypical maneuver types. The presented RNN-based network achieves performance comparable to other relevant existing methods on a public benchmark.
The suggested modifications help to achieve robust prediction performance with regard to switching noise levels. For sudden motion changes, a proposed RNN-based IMM surrogate can capture the change in dynamical behavior more reliably than its Bayesian filter counterparts. The abilities of the RNN-IMM are evaluated in extensive experiments on real-world and synthetic datasets, reflecting prototypical maneuver situations of pedestrians in the application domain of intelligent vehicles.
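As background on the IMM filter that the abstract takes as its starting point, the following is a minimal sketch of the textbook mode-probability step of a generic IMM filter. It is not the thesis's modified design and not its RNN-based surrogate. The function name `imm_mode_update`, the two-model setup, and all numeric values are illustrative assumptions; the per-model Kalman filters that would normally supply the measurement likelihoods are omitted.

```python
def imm_mode_update(mu, trans, likelihoods):
    """One mode-probability step of a generic IMM filter (illustrative sketch).

    mu          -- prior mode probabilities, one per motion model
    trans       -- trans[i][j] = P(model j now | model i at the previous step)
    likelihoods -- measurement likelihood under each model (assumed given;
                   in a full IMM these come from the per-model filters)
    Assumes every predicted mode probability is strictly positive.
    """
    n = len(mu)
    # Predicted mode probabilities after applying the Markov switching model.
    predicted = [sum(trans[i][j] * mu[i] for i in range(n)) for j in range(n)]
    # Mixing weights: mixing[i][j] = P(model i before | model j now).
    mixing = [[trans[i][j] * mu[i] / predicted[j] for j in range(n)]
              for i in range(n)]
    # Bayes update of the mode probabilities with the measurement likelihoods.
    unnormalized = [likelihoods[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return predicted, mixing, posterior


# Hypothetical example: model 0 = "steady motion", model 1 = "maneuver".
predicted, mixing, posterior = imm_mode_update(
    mu=[0.9, 0.1],
    trans=[[0.95, 0.05], [0.05, 0.95]],
    likelihoods=[0.1, 1.0],  # the measurement fits the maneuver model better
)
```

In this made-up example the measurement is far more likely under the maneuver model, so the posterior mode probability shifts toward it even though the prior favored steady motion. The transition matrix and the noise levels of the individual models are the kind of hand-tuned parameters that the abstract says the RNN-based alternatives aim to reduce.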
Karlsruhe, Inst. für Technologie (KIT), Diss., 2020
Morris, Brendan T.