Publication

Security Fence Inspection at Airports Using Object Detection

2023-11-18 , Friedrich, Nils , Specker, Andreas , Beyerer, Jürgen

To ensure the security of airports, it is essential to protect the airside from unauthorized access. Security fences are commonly used for this purpose, but they require regular inspection to detect damage. Due to the growing shortage of human specialists and the large manual effort involved, there is a need for automated methods. The aim is to inspect the fence for damage automatically with the help of an autonomous robot. In this work, we explore object detection methods to address the fence inspection task and localize various types of damage. In addition to evaluating four state-of-the-art (SOTA) object detection models, we analyze the impact of several design criteria aimed at adapting to the task-specific challenges, including contrast adjustment, hyperparameter optimization, and the use of modern backbones. The experimental results indicate that our optimized You Only Look Once v5 (YOLOv5) model achieves the highest accuracy of the four methods, with an increase of 6.9 percentage points in Average Precision (AP) over the baseline. Moreover, we show the real-time capability of the model. The trained models are published on GitHub: https://github.com/N-Friederich/airport_fence_inspection.
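To make the design criteria above more concrete, the following sketch shows how one of the published YOLOv5 models could be loaded for inference, with a CLAHE-based contrast adjustment applied beforehand. The weight file name, image path, and OpenCV preprocessing are illustrative assumptions rather than details taken from the paper; the actual trained weights are available in the linked repository.

```python
# Minimal inference sketch (assumptions: a weight file "fence_damage.pt" exported
# from the linked repository and a local test image "fence_sample.jpg").
import cv2
import torch

def enhance_contrast(path: str):
    """Apply CLAHE contrast adjustment, one of the design criteria discussed above."""
    bgr = cv2.imread(path)
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(lab, cv2.COLOR_LAB2RGB)

# Load a YOLOv5 model via torch.hub; replace the weights with the published ones.
model = torch.hub.load("ultralytics/yolov5", "custom", path="fence_damage.pt")
results = model(enhance_contrast("fence_sample.jpg"))
results.print()          # detected damage classes with confidence scores
print(results.xyxy[0])   # bounding boxes as (x1, y1, x2, y2, conf, class)
```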

Publication

Object Detection in 3D Point Clouds via Local Correlation-Aware Point Embedding

2023-01-11 , Wu, Chengzhi , Pfrommer, Julius , Beyerer, Jürgen , Li, Kangning , Neubert, Boris

We present an improved approach for 3D object detection in point cloud data based on the Frustum PointNet (F-PointNet). Compared to the original F-PointNet, our newly proposed method considers the point neighborhood when computing point features. The newly introduced local neighborhood embedding operation mimics the convolutional operations in 2D neural networks. Thus, the features of each point are computed not only from the point itself or the whole point cloud, but also with respect to the features of its neighbors. Experiments show that our proposed method achieves better performance than the F-PointNet baseline on 3D object detection tasks.
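As a rough illustration of neighborhood-aware point features, the sketch below aggregates each point's k nearest neighbors with a shared MLP and max pooling, in the spirit of a 2D convolution. It is a generic sketch; the exact embedding operation used in the paper may differ.

```python
# Generic sketch of a local neighborhood embedding via k-nearest-neighbour
# feature aggregation (not the paper's exact layer).
import torch

def knn_embed(points: torch.Tensor, feats: torch.Tensor, mlp: torch.nn.Module, k: int = 16):
    """points: (N, 3), feats: (N, C). Returns per-point features that also
    depend on each point's k nearest neighbours, mimicking a 2D convolution."""
    dists = torch.cdist(points, points)            # (N, N) pairwise distances
    idx = dists.topk(k, largest=False).indices     # (N, k) neighbour indices
    neigh = feats[idx]                             # (N, k, C) neighbour features
    center = feats.unsqueeze(1).expand_as(neigh)   # (N, k, C) repeated centre features
    edge = torch.cat([center, neigh - center], dim=-1)  # relative encoding, (N, k, 2C)
    return mlp(edge).max(dim=1).values             # shared MLP + max pooling -> (N, C_out)

# Usage with a toy point cloud; the raw coordinates serve as initial features.
mlp = torch.nn.Sequential(torch.nn.Linear(6, 64), torch.nn.ReLU(), torch.nn.Linear(64, 64))
pts = torch.randn(1024, 3)
print(knn_embed(pts, pts, mlp).shape)              # torch.Size([1024, 64])
```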

Publication

Die Zukunft fährt selbst. Anwendungsfälle, Chancen, Herausforderungen und Handlungsempfehlungen für die autonome Mobilität der Zukunft

2023 , Boispéan, Stéphane du , Hartmann, Volker , Kiebel, Marc , Miller, Andrea , Pape, Rüdiger , Schellert, Maximilian , Schulz, Holger , Teer, Nathalie , Wigger, Jonas , Wolfert, Michael , Ziehn, Jens

The technology of automated and connected driving is developing rapidly. On the one hand, it opens up solutions and use cases for the mobility transition that only a few years ago seemed to lie in the distant future. On the other hand, the technology and its rapid development have a massive impact on industry and, against the backdrop of its digital transformation, pose major challenges: the automotive industry is confronted with new, technology-driven players pushing into the market and putting it under pressure, especially in the international context. But it is not only the industry and the technology itself that are developing rapidly; the legal framework has also been advanced decisively and at high speed in recent years: with their legal frameworks for autonomous driving, Germany and the European Union have created a very good basis for bringing autonomous mobility onto the road - the regulatory framework is a strong competitive advantage over other countries in the world. Now, however, it must be implemented quickly and consistently in order to roll out the technology as soon as possible. This whitepaper takes up these diverse opportunities and challenges and presents them through concrete use cases from industry. The aim is, on the one hand, to provide an overview of various solutions and applications and thus make them more tangible. On the other hand, the use cases in particular serve as the basis for deriving recommendations for action for policymakers and industry. The paper answers the following questions: - What opportunities does autonomous and connected driving open up for passenger and freight transport? - How can autonomous driving contribute to the mobility transition? - What challenges do individual actors and the sector as a whole face? - What can policymakers do to drive the roll-out of the technology and bring the use cases onto the road? - And what could stakeholders themselves do to make the process safe and efficient? Finally, this paper also aims to encourage dialogue and closer cooperation between policymakers, public administration, science, and technology and mobility companies.

Publication

DiffAnt: Diffusion Models for Action Anticipation

2023 , Zhong, Zeyun , Wu, Chengzhi , Martin, Manuel , Voit, Michael , Gall, Jürgen , Beyerer, Jürgen

Anticipating future actions is inherently uncertain. Given an observed video segment containing ongoing actions, multiple subsequent actions can plausibly follow, and this uncertainty grows the further into the future we predict. However, the majority of existing action anticipation models adhere to a deterministic approach and neglect future uncertainty. In this work, we rethink action anticipation from a generative view, employing diffusion models to capture different possible future actions. In this framework, future actions are iteratively generated from standard Gaussian noise in the latent space, conditioned on the observed video, and subsequently transitioned into the action space. Extensive experiments on four benchmark datasets, i.e., Breakfast, 50Salads, EpicKitchens, and EGTEA Gaze+, show that the proposed method achieves results superior or comparable to state-of-the-art methods, demonstrating the effectiveness of a generative approach for action anticipation. Our code and trained models will be published on GitHub.
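For readers unfamiliar with diffusion-based generation, the following highly simplified sketch shows how future action latents could be sampled by iteratively denoising standard Gaussian noise conditioned on observed video features, in a DDPM-style reverse process. The denoiser, classifier, noise schedule, and dimensions are placeholders, not the authors' implementation.

```python
# Simplified sketch of the generative idea behind diffusion-based anticipation.
import torch

T = 50                                        # number of diffusion steps (assumption)
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def anticipate(denoiser, classifier, video_feats, num_future=8, dim=256):
    """Sample one plausible future action sequence (DDPM-style reverse process)."""
    x = torch.randn(num_future, dim)          # start from standard Gaussian noise
    for t in reversed(range(T)):
        eps = denoiser(x, t, video_feats)     # predict noise, conditioned on the observation
        x = (x - betas[t] / torch.sqrt(1 - alphas_bar[t]) * eps) / torch.sqrt(1 - betas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)  # re-inject noise except at t=0
    return classifier(x).argmax(dim=-1)       # map latents to discrete action labels
```

Sampling repeatedly with different noise yields different plausible futures, which is how the generative view captures the uncertainty discussed above.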

Publication

A Survey on Deep Learning Techniques for Action Anticipation

2023-09-29 , Zhong, Zeyun , Martin, Manuel , Voit, Michael , Gall, Jürgen , Beyerer, Jürgen

The ability to anticipate possible future human actions is essential for a wide range of applications, including autonomous driving and human-robot interaction. Consequently, numerous methods for action anticipation have been introduced in recent years, with deep learning-based approaches being particularly popular. In this work, we review recent advances in action anticipation algorithms with a particular focus on daily-living scenarios. Additionally, we classify these methods according to their primary contributions and summarize them in tabular form, allowing readers to grasp the details at a glance. Furthermore, we delve into the common evaluation metrics and datasets used for action anticipation and systematically discuss future directions.
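As a small illustration of the kind of evaluation metric such surveys cover, the sketch below computes class-mean top-k recall, a metric commonly reported for action anticipation (for example in EPIC-KITCHENS-style evaluations). It is an illustrative example, not code taken from the survey itself.

```python
# Sketch of class-mean top-k recall for action anticipation (illustrative assumption).
import numpy as np

def class_mean_topk_recall(scores: np.ndarray, labels: np.ndarray, k: int = 5) -> float:
    """scores: (num_samples, num_classes) predicted scores, labels: (num_samples,)."""
    topk = np.argsort(-scores, axis=1)[:, :k]          # indices of the k highest scores
    hit = (topk == labels[:, None]).any(axis=1)        # top-k hit per sample
    recalls = [hit[labels == c].mean() for c in np.unique(labels)]
    return float(np.mean(recalls))                     # average recall over classes

scores = np.random.rand(100, 20)
labels = np.random.randint(0, 20, size=100)
print(class_mean_topk_recall(scores, labels))
```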

Publication

SynMotor: A Benchmark Suite for Object Attribute Regression and Multi-task Learning

2023-01-11 , Wu, Chengzhi , Qiu, Linxi , Zhou, Kanran , Pfrommer, Julius , Beyerer, Jürgen

In this paper, we develop a novel benchmark suite including both a 2D synthetic image dataset and a 3D synthetic point cloud dataset. Our work is a sub-task in the framework of a remanufacturing project, in which small electric motors are used as fundamental objects. Apart from the given detection, classification, and segmentation annotations, the key objects also have multiple learnable attributes with provided ground truth. This benchmark can be used for computer vision tasks including 2D/3D detection, classification, segmentation, and multi-attribute learning. It is worth mentioning that most attributes of the motors are quantified as continuously variable rather than binary, which makes our benchmark well-suited for the less explored regression tasks. In addition, appropriate evaluation metrics are adopted or developed for each task, and promising baseline results are reported. We hope this benchmark can stimulate more research efforts on the sub-domain of object attribute learning and multi-task learning in the future.
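To illustrate the kind of multi-task setup this benchmark targets, the sketch below combines a classification head with a head regressing continuous attributes on top of shared features. The dimensions, loss weighting, and head design are illustrative assumptions rather than the benchmark's reference baselines.

```python
# Sketch of a joint classification + attribute-regression objective (assumed setup).
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    def __init__(self, feat_dim=128, num_classes=5, num_attributes=4):
        super().__init__()
        self.cls_head = nn.Linear(feat_dim, num_classes)      # discrete motor type
        self.reg_head = nn.Linear(feat_dim, num_attributes)   # continuous attributes

    def forward(self, feats):
        return self.cls_head(feats), self.reg_head(feats)

def multi_task_loss(logits, attrs_pred, cls_target, attrs_target, w_reg=1.0):
    return nn.functional.cross_entropy(logits, cls_target) + \
           w_reg * nn.functional.smooth_l1_loss(attrs_pred, attrs_target)

head = MultiTaskHead()
feats = torch.randn(8, 128)                                   # features from any 2D/3D backbone
loss = multi_task_loss(*head(feats), torch.randint(0, 5, (8,)), torch.rand(8, 4))
print(loss.item())
```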

Publication

Attention-based Point Cloud Edge Sampling

2023 , Wu, Chengzhi , Zheng, Junwei , Pfrommer, Julius , Beyerer, Jürgen

Point cloud sampling is a relatively unexplored research topic for this data representation. The most common sampling methods today are still classical random sampling and farthest point sampling. With the development of neural networks, various methods have been proposed to sample point clouds in a task-based learning manner. However, these methods are mostly generative rather than selecting points directly using mathematical statistics. Inspired by the Canny edge detection algorithm for images and with the help of the attention mechanism, this paper proposes a non-generative Attention-based Point cloud Edge Sampling method (APES) that captures the outline of input point clouds. Experimental results show that our sampling method achieves better performance thanks to the important outline information it learns.
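The following sketch illustrates the general idea of selecting, rather than generating, points based on attention scores: each point is scored by how much attention it receives in a self-attention map and the highest-scoring points are kept. The concrete selection criterion used by APES may differ; this is only a conceptual sketch.

```python
# Conceptual sketch of attention-score-based point selection (not APES's exact criterion).
import torch

def attention_sample(feats: torch.Tensor, num_keep: int):
    """feats: (N, C) point features. Keeps the num_keep points that receive the
    most attention from all other points, instead of generating new points."""
    q, k = feats, feats                                   # single-head self-attention, no projections
    attn = torch.softmax(q @ k.T / feats.shape[1] ** 0.5, dim=-1)  # (N, N) attention map
    scores = attn.sum(dim=0)                              # how much attention each point receives
    return scores.topk(num_keep).indices                  # direct, non-generative selection

pts_feats = torch.randn(2048, 64)
print(attention_sample(pts_feats, 512).shape)             # torch.Size([512])
```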

Publication

Effects of Architectures on Continual Semantic Segmentation

2023-02-21 , Kalb, Tobias , Ahuja, Niket , Zhou, Jingxing , Beyerer, Jürgen

Research in the field of Continual Semantic Segmentation mainly investigates novel learning algorithms to overcome catastrophic forgetting in neural networks. Most recent publications have focused on improving learning algorithms without distinguishing the effects caused by the choice of neural architecture. Therefore, we study how the choice of neural network architecture affects catastrophic forgetting in class- and domain-incremental semantic segmentation. Specifically, we compare the well-researched CNNs to recently proposed Transformers and hybrid architectures, as well as the impact of the choice of novel normalization layers and different decoder heads. We find that traditional CNNs like ResNet have high plasticity but low stability, while transformer architectures are much more stable. Combining the inductive biases of CNN architectures with transformers in hybrid architectures leads to both higher plasticity and higher stability. The stability of these models can be explained by their ability to learn general features that are robust against distribution shifts. Experiments with different normalization layers show that Continual Normalization achieves the best trade-off between adaptability and stability of the model. In the class-incremental setting, the choice of the normalization layer has much less impact. Our experiments suggest that the right choice of architecture can significantly reduce forgetting even with naive fine-tuning, and confirm that for real-world applications the architecture is an important factor in designing a continual learning model.
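As an illustration of the normalization layers compared above, the sketch below implements a Continual-Normalization-style layer under the assumption that it applies group normalization followed by batch normalization to the same feature map; refer to the original Continual Normalization paper for the exact formulation.

```python
# Sketch of a Continual-Normalization-style layer (assumed: GroupNorm followed by BatchNorm).
import torch
import torch.nn as nn

class ContinualNorm(nn.Module):
    def __init__(self, num_channels: int, num_groups: int = 32):
        super().__init__()
        self.gn = nn.GroupNorm(num_groups, num_channels, affine=False)  # batch-independent part
        self.bn = nn.BatchNorm2d(num_channels)                          # batch statistics on top

    def forward(self, x):
        return self.bn(self.gn(x))

layer = ContinualNorm(64)
print(layer(torch.randn(4, 64, 32, 32)).shape)    # torch.Size([4, 64, 32, 32])
```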

Publication

Generative-Contrastive Learning for Self-Supervised Latent Representations of 3D Shapes from Multi-Modal Euclidean Input

2023-01-11 , Wu, Chengzhi , Pfrommer, Julius , Zhou, Mingyuan , Beyerer, Jürgen

We propose a combined generative and contrastive neural architecture for learning latent representations of 3D volumetric shapes. The architecture uses two encoder branches for voxel grids and multi-view images of the same underlying shape. The main idea is to combine a contrastive loss between the resulting latent representations with an additional reconstruction loss, which helps to avoid collapsing the latent representations as a trivial solution for minimizing the contrastive loss. A novel switching scheme is used to cross-train the two encoders with a shared decoder; it also enables a stop-gradient operation on a randomly selected branch. Further classification experiments show that the latent representations learned with our self-supervised method implicitly integrate more useful information from the additional input data, leading to better reconstruction and classification performance.
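The following conceptual sketch shows how such a combined objective could look: a contrastive term between the latents of the two encoder branches plus a reconstruction term from a shared decoder, with a stop-gradient applied to one randomly chosen branch. Encoder and decoder architectures, the temperature, and the voxel resolution are illustrative assumptions, not the paper's implementation.

```python
# Conceptual sketch of a generative-contrastive objective with a random stop-gradient branch.
import random
import torch
import torch.nn.functional as F

def combined_loss(z_voxel, z_image, decoder, target_voxels, temperature=0.1):
    # Stop the gradient on one randomly selected branch (switching scheme).
    if random.random() < 0.5:
        z_voxel = z_voxel.detach()
    else:
        z_image = z_image.detach()

    # Contrastive term: matching latents of the same shape should be similar.
    z_v = F.normalize(z_voxel, dim=-1)
    z_i = F.normalize(z_image, dim=-1)
    logits = z_v @ z_i.T / temperature                      # (B, B) similarity matrix
    contrastive = F.cross_entropy(logits, torch.arange(z_v.shape[0]))

    # Reconstruction term helps prevent the latents from collapsing to a trivial solution.
    recon = F.binary_cross_entropy_with_logits(decoder(z_image), target_voxels)
    return contrastive + recon

decoder = torch.nn.Linear(128, 32 ** 3)                     # dummy shared decoder for illustration
loss = combined_loss(torch.randn(8, 128), torch.randn(8, 128), decoder,
                     torch.randint(0, 2, (8, 32 ** 3)).float())
print(loss.item())
```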

Publication

Digital Twin Core Conceptual Models and Services

2023 , Lin, Shi-Wan , Watson, Kym , Shao, Guodong , Stojanovic, Ljiljana , Zarkout, Bassam