Now showing 1 - 10 of 131
  • Publication
    Improving Semantic Image Segmentation via Label Fusion in Semantically Textured Meshes
    Models for semantic segmentation require a large amount of hand-labeled training data which is costly and time-consuming to produce. For this purpose, we present a label fusion framework that is capable of improving semantic pixel labels of video sequences in an unsupervised manner. We make use of a 3D mesh representation of the environment and fuse the predictions of different frames into a consistent representation using semantic mesh textures. Rendering the semantic mesh using the original intrinsic and extrinsic camera parameters yields a set of improved semantic segmentation images. Due to our optimized CUDA implementation, we are able to exploit the entire c-dimensional probability distribution of annotations over c classes in an uncertainty-aware manner. We evaluate our method on the Scannet dataset where we improve annotations produced by the state-of-the-art segmentation network ESANet from 52.05% to 58.25% pixel accuracy. We publish the source code of our framework online to foster future research in this area (https://github.com/fferflo/semantic-meshes). To the best of our knowledge, this is the first publicly available label fusion framework for semantic image segmentation based on meshes with semantic textures.
  • Publication
    Image-Based Out-of-Distribution-Detector Principles on Graph-Based Input Data in Human Action Recognition
    Living in a complex world like ours makes it unacceptable that a practical implementation of a machine learning system assumes a closed world. Therefore, it is necessary for such a learning-based system in a real world environment, to be aware of its own capabilities and limits and to be able to distinguish between confident and unconfident results of the inference, especially if the sample cannot be explained by the underlying distribution. This knowledge is particularly essential in safety-critical environments and tasks e.g. self-driving cars or medical applications. Towards this end, we transfer image-based Out-of-Distribution (OoD)-methods to graph-based data and show the applicability in action recognition. The contribution of this work is (i) the examination of the portability of recent image-based OoD-detectors for graph-based input data, (ii) a Metric Learning-based approach to detect OoD-samples, and (iii) the introduction of a novel semi-synthetic action recognition dataset. The evaluation shows that image-based OoD-methods can be applied to graph-based data. Additionally, there is a gap between the performance on intraclass and intradataset results. First methods as the examined baseline or ODIN provide reasonable results. More sophisticated network architectures - in contrast to their image-based application - were surpassed in the intradataset comparison and even lead to less classification accuracy.
  • Publication
    Quantifying the Complexity of Standard Benchmarking Datasets for Long-Term Human Trajectory Prediction
    Methods to quantify the complexity of trajectory datasets are still a missing piece in benchmarking human trajectory prediction models. In order to gain a better understanding of the complexity of trajectory prediction tasks and following the intuition, that more complex datasets contain more information, an approach for quantifying the amount of information contained in a dataset from a prototype-based dataset representation is proposed. The dataset representation is obtained by first employing a non-trivial spatial sequence alignment, which enables a subsequent learning vector quantization (LVQ) stage. A large-scale complexity analysis is conducted on several human trajectory prediction benchmarking datasets, followed by a brief discussion on indications for human trajectory prediction and benchmarking.
  • Patent
    Verzögerungs-Kalibrierung für Decawave UWB
    Kalibrierwertberechner zum Bestimmen von zumindest einem Kalibrierwert für ein Positionsbestimmungssystem. Der Kalibrierwertberechner ist ausgelegt, um zumindest einen Kalibrierwert, der eine Verzögerung an oder in einer Station des Positionsbestimmungssystems beschreibt, auf der Basis einer Mehrzahl von Laufzeitmessungen oder Distanzmessungen zwischen verschiedenen Paaren von Stationen des Positionsbestimmungssystems zu bestimmen.
  • Publication
    DecaWave ultra-wideband warm-up error correction
    ( 2021)
    Sidorenko, Juri
    ;
    ; ; ;
    Hugentobler, Urs
    In the field of indoor localization, ultra-wideband (UWB) technology is no longer dispensable. The market demands that the UWB hardware has to be cheap, precise and accurate. These requirements lead to the popularity of the DecaWave UWB system. The great majority of the publications about this system deals with the correction of the signal power, hardware delay or clock drift. It has traditionally been assumed that this error only appears at the beginning of the operation and is caused by the warm-up process of the crystal. In this article, we show that the warm-up error is influenced by the same error source as the signal power. To our knowledge, no scientific publication has explicitly examined the warm-up error before. This work aims to close this gap and, moreover, to present a solution which does not require any external measuring equipment and only has to be carried out once. It is shown that the empirically obtained warm-up correction curve increases the accuracy for the twoway- ranging (TWR) significantly.
  • Publication
    Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey
    Deep Learning is a state-of-the-art technique to make inference on extensive or complex data. As a black box model due to their multilayer nonlinear structure, Deep Neural Networks are often criticized as being non-transparent and their predictions not traceable by humans. Furthermore, the models learn from artificially generated datasets, which often do not reflect reality. By basing decision-making algorithms on Deep Neural Networks, prejudice and unfairness may be promoted unknowingly due to a lack of transparency. Hence, several so-called explanators, or explainers, have been developed. Explainers try to give insight into the inner structure of machine learning black boxes by analyzing the connection between the input and output. In this survey, we present the mechanisms and properties of explaining systems for Deep Neural Networks for Computer Vision tasks. We give a comprehensive overview about the taxonomy of related studies and compare several survey papers that deal with explainability in general. We work out the drawbacks and gaps and summarize further research ideas.
  • Publication
    Generating Synthetic Training Data for Deep Learning-Based UAV Trajectory Prediction
    Deep learning-based models, such as recurrent neural networks (RNNs), have been applied to various sequence learning tasks with great success. Following this, these models are increasingly replacing classic approaches in object tracking applications for motion prediction. On the one hand, these models can capture complex object dynamics with less modeling required, but on the other hand, they depend on a large amount of training data for parameter tuning. Towards this end, we present an approach for generating synthetic trajectory data of unmanned-aerial-vehicles (UAVs) in image space. Since UAVs, or rather quadrotors are dynamical systems, they can not follow arbitrary trajectories. With the prerequisite that UAV trajectories fulfill a smoothness criterion corresponding to a minimal change of higher-order motion, methods for planning aggressive quadrotors flights can be utilized to generate optimal trajectories through a sequence of 3D waypoints. By projecting these maneuver trajectories, which are suitable for controlling quadrotors, to image space, a versatile trajectory data set is realized. To demonstrate the applicability of the synthetic trajectory data, we show that an RNN-based prediction model solely trained on the generated data can outperform classic reference models on a real-world UAV tracking dataset. The evaluation is done on the publicly available ANTI-UAV dataset.
  • Publication
    Information Acquisition on Pedestrian Movements in Urban Traffic with a Mobile Multi-Sensor System
    This paper presents an approach which combines LiDAR sensors and cameras of a mobile multi-sensor system to obtain information about pedestrians in the vicinity of the sensor platform. Such information can be used, for example, in the context of driver assistance systems. In the first step, our approach starts by using LiDAR sensor data to detect and track pedestrians, benefiting from LiDAR's capability to directly provide accurate 3D data. After LiDAR-based detection, the approach leverages the typically higher data density provided by 2D cameras to determine the body pose of the detected pedestrians. The approach combines several state-of-the-art machine learning techniques: it uses a neural network and a subsequent voting process to detect pedestrians in LiDAR sensor data. Based on the known geometric constellation of the different sensors and the knowledge of the intrinsic parameters of the cameras, image sections are generated with the respective regions of interest showing only the detected pedestrians. These image sections are then processed with a method for image-based human pose estimation to determine keypoints for different body parts. These keypoints are finally projected from 2D image coordinates to 3D world coordinates using the assignment of the original LiDAR points to a particular pedestrian.
  • Publication
    Efficient Tour Planning for a Measurement Vehicle by Combining Next Best View and Traveling Salesman
    Path planning for a measuring vehicle requires solving two popular problems from computer science, namely the search for the optimal tour and the search for the optimal viewpoint. Combining both problems results in a new variation of the Traveling Salesman Problem, which we refer to as the Explorational Traveling Salesman Problem. The solution to this problem is the optimal tour with a minimum of observations. In this paper, we formulate the basic problem, discuss it in context of the existing literature and present an iterative solution algorithm. We demonstrate how the method can be applied directly to LiDAR data using an occupancy grid. The ability of our algorithm to generate suitably efficient tours is verified based on two synthetic benchmark datasets, utilizing a ground truth determined by an exhaustive search.
  • Publication
    3D Surface Reconstruction from Multi-Date Satellite Images
    The reconstruction of accurate three-dimensional environment models is one of the most fundamental goals in the field of photogrammetry. Since satellite images provide suitable properties for obtaining large-scale environment reconstructions, there exist a variety of Stereo Matching based methods to reconstruct point clouds for satellite image pairs. Recently, a Structure from Motion (SfM) based approach has been proposed, which allows to reconstruct point clouds from multiple satellite images. In this work, we propose an extension of this SfM based pipeline that allows us to reconstruct not only point clouds but watertight meshes including texture information. We provide a detailed description of several steps that are mandatory to exploit state-of-the-art mesh reconstruction algorithms in the context of satellite imagery. This includes a decomposition of finite projective camera calibration matrices, a skew correction of corresponding depth maps and input images as well as the recovery of real-world depth maps from reparameterized depth values. The paper presents an extensive quantitative evaluation on multi-date satellite images demonstrating that the proposed pipeline combined with current meshing algorithms outperforms state-of-the-art point cloud reconstruction algorithms in terms of completeness and median error. We make the source code of our pipeline publicly available.