  • Publication
    Change detection in street environments based on mobile laser scanning
    Automated change detection based on urban mobile laser scanning data is the foundation for a whole range of applications such as building model updates, map generation for autonomous driving, and natural disaster assessment. The challenge with mobile LiDAR data is that various sources of error, such as localization errors, lead to uncertainties and contradictions in the derived information. This paper presents an approach to automatic change detection using a new category of generic evidence grids that addresses these problems. The technique, referred to as fuzzy spatial reasoning, solves common problems of state-of-the-art evidence grids and also provides a method of inference utilizing fuzzy Boolean reasoning. Based on this, logical operations are used to determine changes and combine them with semantic information. A quantitative evaluation based on a hand-annotated version of the TUM-MLS data set shows that the proposed method is able to identify confirmed and changed elements of the environment with F1-scores of 0.93 and 0.89, respectively.
  • Publication
    Quantifying the Complexity of Standard Benchmarking Datasets for Long-Term Human Trajectory Prediction
    Methods to quantify the complexity of trajectory datasets are still a missing piece in benchmarking human trajectory prediction models. To gain a better understanding of the complexity of trajectory prediction tasks, and following the intuition that more complex datasets contain more information, an approach for quantifying the amount of information contained in a dataset from a prototype-based dataset representation is proposed. The dataset representation is obtained by first employing a non-trivial spatial sequence alignment, which enables a subsequent learning vector quantization (LVQ) stage. A large-scale complexity analysis is conducted on several human trajectory prediction benchmarking datasets, followed by a brief discussion of the implications for human trajectory prediction and benchmarking.
  • Publication
    Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey
    Deep Learning is a state-of-the-art technique for making inferences on extensive or complex data. Because of their multilayer nonlinear structure, Deep Neural Networks act as black box models and are often criticized as non-transparent, with predictions that humans cannot trace. Furthermore, the models learn from artificially generated datasets, which often do not reflect reality. By basing decision-making algorithms on Deep Neural Networks, prejudice and unfairness may be promoted unknowingly due to a lack of transparency. Hence, several so-called explanators, or explainers, have been developed. Explainers try to give insight into the inner structure of machine learning black boxes by analyzing the connection between the input and output. In this survey, we present the mechanisms and properties of explaining systems for Deep Neural Networks for Computer Vision tasks. We give a comprehensive overview of the taxonomy of related studies and compare several survey papers that deal with explainability in general. We identify drawbacks and gaps and summarize further research ideas.
  • Publication
    MissFormer: (In-)Attention-Based Handling of Missing Observations for Trajectory Filtering and Prediction
    In applications such as object tracking, time-series data inevitably carry missing observations. Following the success of deep learning-based models for various sequence learning tasks, these models increasingly replace classic approaches in object tracking applications for inferring the objects' motion states. While traditional tracking approaches can deal with missing observations, most of their deep counterparts are, by default, not suited for this. To this end, this paper introduces a transformer-based approach for handling missing observations in variable input length trajectory data. The model is formed indirectly by successively increasing the complexity of the demanded inference tasks. Starting from reproducing noise-free trajectories, the model then learns to infer trajectories from noisy inputs. By providing missing tokens (binary-encoded missing events), the model learns to in-attend to missing data and infers a complete trajectory conditioned on the remaining inputs. In the case of a sequence of successive missing events, the model then acts as a pure prediction model. The abilities of the approach are demonstrated on synthetic data and real-world data reflecting prototypical object tracking scenarios.
  • Publication
    Change Detection and Deformation Analysis based on Mobile Laser Scanning Data of Urban Areas
    Change detection is an important tool for processing multiple epochs of mobile LiDAR data efficiently, since it makes an otherwise time-consuming operation manageable by focusing on regions of interest. State-of-the-art approaches usually either do not handle the case of incomplete observations or are computationally expensive. We present a novel method based on a combination of point clouds and voxels that is able to handle incomplete observations while being computationally less expensive than comparable approaches. Furthermore, our method is able to identify special classes of changes such as partially moved, fully moved, and deformed objects in addition to the appeared and disappeared objects recognized by conventional approaches. The performance of our method is evaluated using the publicly available TUM City Campus datasets, showing an overall accuracy of 88%.
  • Publication
    Context Sensitivity of Spatio-Temporal Activity Detection using Hierarchical Deep Neural Networks in Extended Videos
    (2020) Hertlein, Felix
    The amount of available surveillance video data is increasing rapidly, which makes manual inspection impractical. The goal of activity detection is to automatically localize activities spatially and temporally in a large collection of video data. In this work, we examine to what extent context plays a role in spatio-temporal activity detection in extended videos. To this end, we propose a hierarchical pipeline for activity detection which first localizes objects spatially and subsequently generates spatio-temporal action tubes. Additionally, we enhance a suitable metric for performance evaluation and evaluate our system using the TRECVID 2019 ActEV challenge dataset. We investigated the sensitivity by detecting activities multiple times with various spatial margins around the performing actor. The results showed that our pipeline and metric are suited for detecting activities in extended videos.
  • Publication
    Self-Calibration for the Time-of-Arrival Positioning
    Self-calibration of time-of-arrival positioning systems is made difficult by the non-linearity of the relevant set of equations. This work applies dimension lifting to this problem. The objective function is extended by an additional dimension to allow the dynamics of the optimization to avoid local minima. In addition to the usual numerical optimization, a partially analytical method is suggested, which makes the system of equations overdetermined proportionally to the number of measurements. Results with the lifted objective function are compared to those with the unmodified objective function. For evaluation purposes, the fractions of convergence to local minima are determined for both synthetic data with random geometrical constellations and real measurements with a reasonable constellation of base stations. It is shown that the lifted objective function provides improved convergence in all cases, often significantly so.
  • Publication
    A Short Note on Analyzing Sequence Complexity in Trajectory Prediction Benchmarks
    The analysis and quantification of sequence complexity is an open problem frequently encountered when defining trajectory prediction benchmarks. In order to enable a more informative assembly of a data basis, an approach for determining a dataset representation in terms of a small set of distinguishable prototypical sub-sequences is proposed. The approach employs a sequence alignment followed by a learning vector quantization (LVQ) stage. A first proof of concept on synthetically generated and real-world datasets shows the viability of the approach.
  • Publication
    A complementary trajectory prediction benchmark
    Existing benchmarks targeting the overall performance of trajectory prediction models lack the possibility of gaining insight into a model's behavior under specific conditions. Towards this end, a new benchmark aiming to take on a complementary role compared to existing benchmarks is proposed. It consists of synthetically generated and modified real-world trajectories from established datasets with scenario-dependent test and training splits. The benchmark provides a hierarchy of three inference tasks, representation learning, de-noising, and prediction, comprised of several test cases targeting specific aspects of a given machine learning model. This allows a differentiated evaluation of the model's behavior and generalization capabilities. As a result, a sanity check for single trajectory models is provided aiming to prevent failure cases and highlighting requirements for improving modeling capabilities.
  • Publication
    Computer Vision for Medical Infant Motion Analysis: State of the Art and RGB-D Data Set
    (2019) Hofmann, Ulrich G.; Weinberger, Raphael; Schroeder, Sebastian A.
    Assessment of spontaneous movements of infants lets trained experts predict neurodevelopmental disorders like cerebral palsy at a very young age, allowing early intervention for affected infants. An automated motion analysis system must accurately capture body movements, ideally without markers or attached sensors so as not to affect the movements of infants. A vast majority of recent approaches for human pose estimation focus on adults, leading to a degradation of accuracy if applied to infants. Hence, multiple systems for infant pose estimation have been developed. Due to the lack of publicly available benchmark data sets, a standardized evaluation, let alone a comparison of different approaches, is impossible. We fill this gap by releasing the Moving INfants In RGB-D (MINI-RGBD) data set (available for research purposes at http://s.fhg.de/mini-rgbd), created using the recently introduced Skinned Multi-Infant Linear body model (SMIL). We map real infant movements to the SMIL model with realistic shapes and textures, and generate RGB and depth images with precise ground truth 2D and 3D joint positions. We evaluate our data set with state-of-the-art methods for 2D pose estimation in RGB images and for 3D pose estimation in depth images. Evaluation of 2D pose estimation results in PCKh rates of 88.1% and 94.5% (depending on the correctness threshold), and evaluation of 3D pose estimation results in PCKh rates of 64.2% and 90.4%. We hope to foster research in medical infant motion analysis to get closer to an automated system for early detection of neurodevelopmental disorders.
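    The PCKh rates quoted in the last abstract depend on a correctness threshold. As a rough illustration only, the following sketch assumes the common definition of PCKh: a predicted joint counts as correct if it lies within `alpha` times the head segment length of the ground truth position. The function name `pckh`, its parameters, and the sample coordinates are hypothetical and are not taken from the paper, whose exact thresholds are not stated in the abstract.

```python
import math

def pckh(pred, gt, head_size, alpha=0.5):
    """Fraction of joints within alpha * head_size of the ground truth.

    pred, gt: equal-length sequences of 2D or 3D joint coordinates.
    head_size: head segment length used for normalization (assumed given).
    alpha: correctness threshold as a fraction of head_size (assumption).
    """
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    threshold = alpha * head_size
    # math.dist computes the Euclidean distance in any dimension
    correct = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return correct / len(gt)

# Made-up example: four joints with errors of 1, 4, 6, and 0 units
gt = [(0, 0), (10, 0), (20, 0), (30, 0)]
pred = [(1, 0), (10, 4), (20, 6), (30, 0)]
strict = pckh(pred, gt, head_size=10, alpha=0.5)   # threshold of 5 units
lenient = pckh(pred, gt, head_size=10, alpha=1.0)  # threshold of 10 units
```

    A looser threshold can only accept more joints, which is consistent with the abstract reporting two rates per task, one per threshold.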