Deep learning-based video analysis pipeline for person detection and re-identification in aerial imagery
In addition to conventional camera networks, the deployment of drones allows for increased flexibility in surveillance tasks. Key components of modern analysis systems, which must quickly assess large amounts of recorded aerial video data, are the detection, tracking, and re-identification of persons. Each component is influenced by the characteristics of aerial data and must be robust against challenges such as different flight altitudes and varying acquisition views and angles. In this work, we introduce a fast and efficient framework for person search that is specifically tailored to the characteristics of aerial data recorded by drones. In contrast to most work on person search and re-identification, we incorporate a tracking technique to add relevant context information about persons' movements to the retrieval results. We focus on the three pipeline stages of person detection, tracking, and re-identification, both individually and in terms of the interplay between them. To this end, we adapt current state-of-the-art detection approaches to the specific characteristics of aerial data and reduce inference time through several modifications. Next, we apply a deep learning-based tracking approach, namely Deep SORT, to generate person tracks from the detections. For the re-identification stage, we employ a lightweight re-identification model whose features are used for both tracking and re-identification. To demonstrate the suitability of the proposed video analysis pipeline, we evaluate each component as well as their interplay on the P-DESTRE dataset.
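To illustrate how one set of appearance features can serve both tracking and retrieval, the following is a minimal sketch and not the implementation evaluated in this work. It assumes that an embedding has already been computed for each detection and that a tracker has assigned each detection a track ID. The function names (`build_track_gallery`, `rank_tracks`), the mean-pooling of embeddings per track, and the toy three-dimensional vectors are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def build_track_gallery(observations):
    """Pool per-detection embeddings into one descriptor per track.

    observations: iterable of (track_id, embedding) pairs, where the
    track IDs are assumed to come from the tracking stage.
    Mean-pooling is an assumption of this sketch.
    """
    grouped = defaultdict(list)
    for track_id, embedding in observations:
        grouped[track_id].append(l2_normalize(embedding))

    gallery = {}
    for track_id, embeddings in grouped.items():
        dim = len(embeddings[0])
        mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
        gallery[track_id] = l2_normalize(mean)
    return gallery


def rank_tracks(query_embedding, gallery):
    """Return (track_id, similarity) pairs, most similar track first."""
    scores = [
        (track_id, cosine_similarity(query_embedding, descriptor))
        for track_id, descriptor in gallery.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data: two tracks with two detections each (hypothetical embeddings).
observations = [
    (1, [1.0, 0.1, 0.0]),
    (1, [0.9, 0.2, 0.0]),
    (2, [0.0, 1.0, 0.1]),
    (2, [0.1, 0.9, 0.0]),
]
gallery = build_track_gallery(observations)
ranking = rank_tracks([1.0, 0.0, 0.0], gallery)
```

Because retrieval returns whole tracks instead of isolated detections, each match carries the movement context that the tracking stage provides, and the embeddings need to be computed only once per detection.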