  • Publication
    Improving Semantic Image Segmentation via Label Fusion in Semantically Textured Meshes
    Models for semantic segmentation require a large amount of hand-labeled training data, which is costly and time-consuming to produce. For this purpose, we present a label fusion framework that is capable of improving semantic pixel labels of video sequences in an unsupervised manner. We make use of a 3D mesh representation of the environment and fuse the predictions of different frames into a consistent representation using semantic mesh textures. Rendering the semantic mesh using the original intrinsic and extrinsic camera parameters yields a set of improved semantic segmentation images. Due to our optimized CUDA implementation, we are able to exploit the entire c-dimensional probability distribution of annotations over c classes in an uncertainty-aware manner. We evaluate our method on the ScanNet dataset, where we improve annotations produced by the state-of-the-art segmentation network ESANet from 52.05% to 58.25% pixel accuracy. We publish the source code of our framework online to foster future research in this area (https://github.com/fferflo/semantic-meshes). To the best of our knowledge, this is the first publicly available label fusion framework for semantic image segmentation based on meshes with semantic textures.
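The fusion idea in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the per-frame class distributions observed for one mesh texel are combined as independent observations (a normalized product computed in log space). The function name `fuse_label_distributions` and the example values are hypothetical and are not taken from the published framework.

```python
import math


def fuse_label_distributions(frame_probs, eps=1e-9):
    """Fuse per-frame class distributions for one texel into one distribution.

    Assumption (not from the paper): frames are independent observations,
    so the fused distribution is the normalized product of the inputs.
    """
    num_classes = len(frame_probs[0])
    log_sum = [0.0] * num_classes
    for probs in frame_probs:
        if len(probs) != num_classes:
            raise ValueError("all distributions must cover the same classes")
        for k, p in enumerate(probs):
            # Clamp to eps so a single zero does not veto a class outright.
            log_sum[k] += math.log(max(p, eps))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_sum)
    unnormalized = [math.exp(v - peak) for v in log_sum]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Three frames observe the same texel with c = 3 classes (made-up values).
observations = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
]
fused = fuse_label_distributions(observations)
fused_label = max(range(len(fused)), key=fused.__getitem__)
```

Keeping the full distribution instead of a per-frame argmax lets one confident frame outweigh two hesitant ones, which is the uncertainty-aware behavior the abstract refers to.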
  • Publication
    A Photogrammetry-based Framework to Facilitate Image-based Modeling and Automatic Camera Tracking
    We propose a framework that extends Blender to exploit Structure from Motion (SfM) and Multi-View Stereo (MVS) techniques for image-based modeling tasks such as sculpting or camera and motion tracking. Applying SfM allows us to determine camera motions without manually defining feature tracks or calibrating the cameras used to capture the image data. With MVS we are able to automatically compute dense scene models, which is not feasible with the built-in tools of Blender. Currently, our framework supports several state-of-the-art SfM and MVS pipelines. The modular system design enables us to integrate further approaches without additional effort. The framework is publicly available as an open source software package.
  • Patent
    DecaWave Ultra-Wideband Reflection Error Correction
    An arrangement is described for determining the path length from a receiver to the reflection point of a reflected electromagnetic wave using an ultra-wideband (UWB) system from which a channel impulse response can be read out. The UWB system comprises a transmitter for emitting an electromagnetic wave and a receiver for receiving the emitted electromagnetic wave, the receiver being configured to receive the emitted wave both directly from the transmitter and as a reflected electromagnetic wave. The UWB system is configured to read out the channel impulse response, to determine from it an angle of incidence of the reflected electromagnetic wave, to determine a phase difference between the wave received directly from the transmitter and the reflected wave, and to determine a time-of-flight difference between the emitted electromagnetic wave and the reflected electromagnetic wave based on the phase difference, in order to determine, based on the phase difference, the path length from the receiver to the reflection point of the reflected electromagnetic wave. A method for determining the path length from a receiver to a reflection point of a reflected electromagnetic wave using an ultra-wideband system and computer program code are also described.
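The conversion chain named in this abstract (phase difference, then time-of-flight difference, then length) can be sketched with the standard relations Δt = Δφ / (2πf) and Δd = c · Δt. This covers only the unit conversion, not the geometry described in the patent. The function names and the example carrier frequency (6489.6 MHz, used here as an illustrative UWB center frequency) are assumptions.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def delay_from_phase(phase_diff_rad, carrier_hz):
    """Time-of-flight difference in seconds implied by a carrier phase difference."""
    return phase_diff_rad / (2.0 * math.pi * carrier_hz)


def path_difference_m(phase_diff_rad, carrier_hz):
    """Extra path length in metres travelled by the reflected wave."""
    return SPEED_OF_LIGHT_M_PER_S * delay_from_phase(phase_diff_rad, carrier_hz)


# Illustrative carrier frequency (assumption, not stated in the abstract).
carrier_hz = 6.4896e9
half_cycle_extra_path = path_difference_m(math.pi, carrier_hz)
wavelength = SPEED_OF_LIGHT_M_PER_S / carrier_hz
```

A phase measurement alone is only unambiguous within one wavelength (about 4.6 cm at this frequency), so a practical system needs a coarse estimate, for example from the channel impulse response, to resolve the whole number of cycles.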
  • Publication
    Efficient Tour Planning for a Measurement Vehicle by Combining Next Best View and Traveling Salesman
    Path planning for a measuring vehicle requires solving two popular problems from computer science, namely the search for the optimal tour and the search for the optimal viewpoint. Combining both problems results in a new variation of the Traveling Salesman Problem, which we refer to as the Explorational Traveling Salesman Problem. The solution to this problem is the optimal tour with a minimum of observations. In this paper, we formulate the basic problem, discuss it in the context of the existing literature, and present an iterative solution algorithm. We demonstrate how the method can be applied directly to LiDAR data using an occupancy grid. The ability of our algorithm to generate suitably efficient tours is verified on two synthetic benchmark datasets, using a ground truth determined by an exhaustive search.
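The abstract mentions a ground truth determined by exhaustive search. For the plain Traveling Salesman Problem on a handful of points, such a reference solution can be sketched as below. This is not the paper's Explorational TSP (it ignores viewpoints and observations), and the names `tour_length` and `exhaustive_shortest_tour` are hypothetical.

```python
import math
from itertools import permutations


def tour_length(order, points):
    """Length of the closed tour that visits the points in the given order."""
    closed = order[1:] + order[:1]
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, closed))


def exhaustive_shortest_tour(points):
    """Try every tour starting at point 0 and return (best_order, best_length)."""
    n = len(points)
    if n < 2:
        return tuple(range(n)), 0.0
    best_order, best_length = None, math.inf
    # Fixing the start point removes rotations of the same tour.
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        length = tour_length(order, points)
        if length < best_length:
            best_order, best_length = order, length
    return best_order, best_length


# Corners of a unit square, listed in a deliberately inefficient order.
square = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
best_order, best_length = exhaustive_shortest_tour(square)
```

The search visits (n - 1)! tours, so it is only practical for small benchmark instances, which is why it serves as ground truth and not as the planner itself.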
  • Publication
    The MODISSA testbed: A multi-purpose platform for the prototypical realization of vehicle-related applications using optical sensors
    We present the current state of development of the sensor-equipped car MODISSA, with which Fraunhofer IOSB realizes a configurable experimental platform for hardware evaluation and software development in the context of mobile mapping and vehicle-related safety and protection. MODISSA is based on a van that has successively been equipped with a variety of optical sensors over the past few years, and contains hardware for complete raw data acquisition, georeferencing, real-time data analysis, and immediate visualization on in-car displays. We demonstrate the capabilities of MODISSA by giving a deeper insight into experiments with its specific configuration in the scope of three different applications. Other research groups can benefit from these experiences when setting up their own mobile sensor system, especially regarding the selection of hardware and software, the knowledge of possible sources of error, and the handling of the acquired sensor data.
  • Publication
    3D Surface Reconstruction from Multi-Date Satellite Images
    The reconstruction of accurate three-dimensional environment models is one of the most fundamental goals in the field of photogrammetry. Since satellite images provide suitable properties for obtaining large-scale environment reconstructions, there exists a variety of stereo-matching-based methods to reconstruct point clouds for satellite image pairs. Recently, a Structure from Motion (SfM) based approach has been proposed, which allows the reconstruction of point clouds from multiple satellite images. In this work, we propose an extension of this SfM-based pipeline that allows us to reconstruct not only point clouds but also watertight meshes including texture information. We provide a detailed description of several steps that are mandatory to exploit state-of-the-art mesh reconstruction algorithms in the context of satellite imagery. This includes a decomposition of finite projective camera calibration matrices, a skew correction of corresponding depth maps and input images, as well as the recovery of real-world depth maps from reparameterized depth values. The paper presents an extensive quantitative evaluation on multi-date satellite images demonstrating that the proposed pipeline combined with current meshing algorithms outperforms state-of-the-art point cloud reconstruction algorithms in terms of completeness and median error. We make the source code of our pipeline publicly available.
  • Publication
    LiDAR-based localization and automatic control of UAVs for mobile remote reconnaissance
    Sensor-based monitoring of the surroundings of civilian vehicles is primarily relevant for driver assistance in road traffic, whereas in military vehicles, far-reaching reconnaissance of the environment is crucial for accomplishing the respective mission. Modern military vehicles are typically equipped with electro-optical sensor systems for such observation or surveillance purposes. However, especially when the line-of-sight to the onward route is obscured or visibility conditions are generally limited, enhanced methods for reconnaissance are needed. The obvious benefit of micro-drones (UAVs) for remote reconnaissance is well known. The spatial mobility of UAVs can provide additional information that cannot be obtained on the vehicle itself. For example, the UAV could keep a fixed position in front of and above the vehicle to gather information about the area ahead, or it could fly above or around obstacles to clear hidden areas. In a military context, this is usually referred to as manned-unmanned teaming (MUM-T). In this paper, we propose the use of vehicle-based electro-optical sensors as an alternative way for automatic control of (cooperative) UAVs in the vehicle's vicinity. In its most automated form, the external control of the UAV only requires a 3D nominal position relative to the vehicle or in absolute geocoordinates. The flight path to this position and the maintenance of the position, including obstacle avoidance, are automatically calculated on board the vehicle and permanently communicated to the UAV as control commands. We show first results of an implementation of this approach using 360° scanning LiDAR sensors mounted on a mobile sensor unit. The control loop of detection, tracking and guidance of a cooperative UAV in the local environment is demonstrated by two experiments. We show the automatic LiDAR-controlled navigation of a UAV from a starting point A to a destination point B, with and without an obstacle between A and B. The obstacle in the direct path is detected and an alternative flight route is calculated and used.
  • Publication
    MissFormer: (In-)Attention-Based Handling of Missing Observations for Trajectory Filtering and Prediction
    In applications such as object tracking, time-series data inevitably carry missing observations. Following the success of deep learning-based models for various sequence learning tasks, these models increasingly replace classic approaches in object tracking applications for inferring the objects' motion states. While traditional tracking approaches can deal with missing observations, most of their deep counterparts are, by default, not suited for this. Towards this end, this paper introduces a transformer-based approach for handling missing observations in variable input length trajectory data. The model is formed indirectly by successively increasing the complexity of the demanded inference tasks. Starting from reproducing noise-free trajectories, the model then learns to infer trajectories from noisy inputs. By providing missing tokens, binary-encoded missing events, the model learns to in-attend to missing data and infers a complete trajectory conditioned on the remaining inputs. In the case of a sequence of successive missing events, the model then acts as a pure prediction model. The abilities of the approach are demonstrated on synthetic data and real-world data reflecting prototypical object tracking scenarios.
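The "binary-encoded missing events" mentioned in this abstract can be pictured with a small input-encoding sketch: each time step becomes the observation plus a flag, and a missing step becomes a placeholder with the flag set. This layout (flag appended as the last feature, zeros as placeholder) and the function name `encode_with_missing_flags` are assumptions for illustration, not the paper's exact tokenization.

```python
def encode_with_missing_flags(observations, dim, fill_value=0.0):
    """Turn a trajectory with gaps into fixed-width tokens.

    Each token is the observation followed by a binary flag:
    0.0 for an observed step, 1.0 for a missing step (marked by None).
    """
    tokens = []
    for obs in observations:
        if obs is None:
            # Placeholder values carry no information; the flag marks the gap.
            tokens.append([fill_value] * dim + [1.0])
        else:
            if len(obs) != dim:
                raise ValueError("observation has unexpected dimension")
            tokens.append(list(obs) + [0.0])
    return tokens


# A 2D trajectory where the third and fourth measurements were dropped.
trajectory = [(0.0, 0.0), (1.0, 0.5), None, None, (4.0, 2.1)]
tokens = encode_with_missing_flags(trajectory, dim=2)
```

With this kind of encoding, a run of flagged steps at the end of the sequence corresponds to the pure prediction case described in the abstract.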
  • Publication
    Image-Based Out-of-Distribution-Detector Principles on Graph-Based Input Data in Human Action Recognition
    Living in a complex world like ours makes it unacceptable for a practical implementation of a machine learning system to assume a closed world. It is therefore necessary for such a learning-based system in a real-world environment to be aware of its own capabilities and limits and to be able to distinguish between confident and unconfident inference results, especially if the sample cannot be explained by the underlying distribution. This knowledge is particularly essential in safety-critical environments and tasks, e.g., self-driving cars or medical applications. Towards this end, we transfer image-based Out-of-Distribution (OoD) methods to graph-based data and show their applicability in action recognition. The contribution of this work is (i) the examination of the portability of recent image-based OoD detectors to graph-based input data, (ii) a Metric Learning-based approach to detect OoD samples, and (iii) the introduction of a novel semi-synthetic action recognition dataset. The evaluation shows that image-based OoD methods can be applied to graph-based data. Additionally, there is a gap between the performance on intraclass and intradataset results. Early methods such as the examined baseline or ODIN provide reasonable results. More sophisticated network architectures - in contrast to their image-based application - were surpassed in the intradataset comparison and even led to lower classification accuracy.
  • Publication
    Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey
    Deep Learning is a state-of-the-art technique for performing inference on extensive or complex data. Because their multilayer nonlinear structure makes them black-box models, Deep Neural Networks are often criticized as being non-transparent and their predictions as not traceable by humans. Furthermore, the models learn from artificially generated datasets, which often do not reflect reality. By basing decision-making algorithms on Deep Neural Networks, prejudice and unfairness may be promoted unknowingly due to a lack of transparency. Hence, several so-called explanators, or explainers, have been developed. Explainers try to give insight into the inner structure of machine learning black boxes by analyzing the connection between the input and output. In this survey, we present the mechanisms and properties of explaining systems for Deep Neural Networks for Computer Vision tasks. We give a comprehensive overview of the taxonomy of related studies and compare several survey papers that deal with explainability in general. We work out the drawbacks and gaps and summarize further research ideas.