  • Publication
    Safe and Efficient Operation with Constrained Hierarchical Reinforcement Learning
    (2023)
    Schmoeller da Roza, Felippe; Günnemann, Stephan
    Hierarchical Reinforcement Learning (HRL) holds the promise of enhancing sample efficiency and generalization capabilities of Reinforcement Learning (RL) agents by leveraging task decomposition and temporal abstraction, which aligns with human reasoning. However, the adoption of HRL (and RL in general) to solve problems in the real world has been limited due to, among other reasons, the lack of effective techniques that make the agents adhere to safety requirements encoded as constraints, a common practice to define the functional safety of safety-critical systems. While some constrained Reinforcement Learning methods exist in the literature, we show that regular flat policies can face performance degradation when dealing with safety constraints. To overcome this limitation, we propose a constrained HRL topology that separates planning and control, with constraint optimization achieved at the lower-level abstraction. Simulation experiments show that our approach is able to keep its performance while adhering to safety constraints, even in scenarios where the flat policy’s performance deteriorates when trying to prioritize safety.
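    A minimal sketch of the general constraint-handling idea the abstract refers to, not the paper's method: a Lagrangian-style dual update applied at the lower level of a hierarchical agent. The cost budget, learning rate and all names below are assumptions for illustration.

```python
# Hypothetical sketch (not the paper's code): Lagrangian-style constraint handling at the
# lower level of a hierarchical agent. The high-level policy picks a subgoal; the low-level
# update trades off task reward against a safety cost whose expectation should stay below
# a budget d. Names (cost_budget, lambda_lr) are assumptions.

def lagrangian_low_level_update(avg_reward, avg_cost, lam, cost_budget=0.1, lambda_lr=0.05):
    """One dual-ascent step on the Lagrange multiplier plus the scalarized objective
    a low-level policy-gradient step would maximize."""
    # Scalarized objective: reward minus penalized constraint violation.
    objective = avg_reward - lam * (avg_cost - cost_budget)
    # Dual ascent: increase lambda when the cost exceeds the budget, decay otherwise.
    lam = max(0.0, lam + lambda_lr * (avg_cost - cost_budget))
    return objective, lam

if __name__ == "__main__":
    lam = 0.0
    for step in range(5):
        # Stand-in statistics that a real rollout buffer would provide.
        avg_reward, avg_cost = 1.0, 0.3
        objective, lam = lagrangian_low_level_update(avg_reward, avg_cost, lam)
        print(f"step {step}: objective={objective:.3f}, lambda={lam:.3f}")
```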
  • Publication
    Out-of-Distribution Detection for Reinforcement Learning Agents with Probabilistic Dynamics Models
    (2023)
    Schmoeller da Roza, Felippe; Günnemann, Stephan
    Reliability of reinforcement learning (RL) agents is a largely unsolved problem. Especially in situations that substantially differ from their training environment, RL agents often exhibit unpredictable behavior, potentially leading to performance loss, safety violations or catastrophic failure. Reliable decision making agents should therefore be able to cast an alert whenever they encounter situations they have never seen before and do not know how to handle. While the problem, also known as out-of-distribution (OOD) detection, has received considerable attention in other domains such as image classification or sensory data analysis, it is less frequently studied in the context of RL. In fact, there is not even a common understanding of what OOD actually means in RL. In this work, we want to bridge this gap and approach the topic of OOD in RL from a general perspective. For this, we formulate OOD in RL as severe perturbations of the Markov decision process (MDP). To detect such perturbations, we introduce a predictive algorithm utilizing probabilistic dynamics models and bootstrapped ensembles. Since existing benchmarks are sparse and limited in their complexity, we also propose a set of evaluation scenarios with OOD occurrences. A detailed analysis of our approach shows superior detection performance compared to existing baselines from related fields.
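    A hedged sketch of the general idea of detecting MDP perturbations with a bootstrapped ensemble of probabilistic dynamics models; the scoring rule and all names are assumptions, not the paper's implementation.

```python
# Hypothetical sketch: flag a transition as OOD when an ensemble of probabilistic dynamics
# models assigns it low likelihood or disagrees strongly. Each model predicts a diagonal
# Gaussian over the next state; all names and the combination rule are assumptions.
import numpy as np

def ood_score(next_state, means, stds):
    """means/stds: (n_models, state_dim) Gaussian predictions for the next state.
    Returns a score that grows with prediction error and ensemble disagreement."""
    means, stds = np.asarray(means), np.asarray(stds)
    # Negative log-likelihood of the observed next state, averaged over the ensemble.
    nll = 0.5 * (((next_state - means) / stds) ** 2 + 2 * np.log(stds) + np.log(2 * np.pi))
    nll = nll.sum(axis=1).mean()
    # Disagreement: spread of the ensemble means.
    disagreement = means.std(axis=0).mean()
    return float(nll + disagreement)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    means = rng.normal(0.0, 0.05, size=(5, 3))   # bootstrapped ensemble of 5 models
    stds = np.full((5, 3), 0.1)
    print("in-distribution score:", ood_score(np.zeros(3), means, stds))
    print("perturbed-MDP score  :", ood_score(np.full(3, 2.0), means, stds))
```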
  • Publication
    Preventing Errors in Person Detection: A Part-Based Self-Monitoring Framework
    (2023)
    Matic-Flierl, Andrea; Günnemann, Stephan
    The ability to detect learned objects regardless of their appearance is crucial for autonomous systems in real-world applications. Especially for detecting humans, which is often a fundamental task in safety-critical applications, it is vital to prevent errors. To address this challenge, we propose a self-monitoring framework that allows the perception system to perform plausibility checks at runtime. We show that by incorporating an additional component for detecting human body parts, we are able to significantly reduce the number of missed human detections by factors of up to 9 when compared to a baseline setup, which was trained only on holistic person objects. Additionally, we found that training a model jointly on humans and their body parts leads to a substantial reduction in false positive detections by up to 50 percent compared to training on humans alone. We performed comprehensive experiments on the publicly available datasets DensePose and Pascal VOC in order to demonstrate the effectiveness of our framework.
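    A toy sketch of what a part-based plausibility check at runtime could look like; the box format, overlap threshold and function names are assumptions, not taken from the paper.

```python
# Hypothetical sketch of a runtime plausibility check: if body-part detections are not
# covered by any person detection, the monitor flags a potentially missed person.

def contains(person_box, part_box, min_overlap=0.7):
    """Fraction of the part box lying inside the person box (boxes: x1, y1, x2, y2)."""
    ix1, iy1 = max(person_box[0], part_box[0]), max(person_box[1], part_box[1])
    ix2, iy2 = min(person_box[2], part_box[2]), min(person_box[3], part_box[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    part_area = (part_box[2] - part_box[0]) * (part_box[3] - part_box[1])
    return part_area > 0 and inter / part_area >= min_overlap

def plausibility_check(person_boxes, part_boxes):
    """Return part detections that no person detection explains (potential missed persons)."""
    return [p for p in part_boxes if not any(contains(q, p) for q in person_boxes)]

if __name__ == "__main__":
    persons = [(100, 50, 200, 300)]
    parts = [(120, 60, 160, 100),   # head inside the detected person: plausible
             (400, 80, 440, 120)]   # isolated head: possible missed person
    print("unexplained parts:", plausibility_check(persons, parts))
```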
  • Publication
    Diffusion Denoised Smoothing for Certified and Adversarial Robust Out-Of-Distribution Detection
    (2023)
    Korth, Daniel; Günnemann, Stephan
    As the use of machine learning continues to expand, the importance of ensuring its safety cannot be overstated. A key concern in this regard is the ability to identify whether a given sample is from the training distribution, or is an "Out-Of-Distribution" (OOD) sample. In addition, adversaries can manipulate OOD samples in ways that lead a classifier to make a confident prediction. In this study, we present a novel approach for certifying the robustness of OOD detection within an ℓ2-norm-bounded neighborhood of the input, regardless of network architecture and without the need for specific components or additional training. Further, we improve current techniques for detecting adversarial attacks on OOD samples, while providing high levels of certified and adversarial robustness on in-distribution samples. The average of all OOD detection metrics on CIFAR10/100 shows an increase of ∼ 13%/5% relative to previous approaches. Code: https://github.com/FraunhoferIKS/distro
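    A hedged sketch of the denoised-smoothing idea behind such ℓ2-certificates (add Gaussian noise, denoise, average the detector's decision); the actual implementation is in the linked DISTRO repository, and the placeholder detector and denoiser below are assumptions.

```python
# Hypothetical sketch of denoised smoothing for an OOD detector (not the DISTRO code):
# the score is averaged over Gaussian perturbations of the input that are first passed
# through a denoiser, which is what makes randomized-smoothing-style l2 certification
# applicable. Detector, denoiser, sigma and sample count are placeholders.
import numpy as np

def smoothed_ood_score(x, detector, denoiser, sigma=0.25, n_samples=100, seed=0):
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_samples):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)   # randomized-smoothing noise
        scores.append(detector(denoiser(noisy)))           # denoise, then score
    return float(np.mean(scores))

if __name__ == "__main__":
    detector = lambda z: float(np.linalg.norm(z) > 5.0)    # toy OOD detector (stand-in)
    denoiser = lambda z: z                                  # a diffusion model would go here
    x_in, x_out = np.zeros(16), np.full(16, 2.0)
    print("in-distribution:", smoothed_ood_score(x_in, detector, denoiser))
    print("OOD            :", smoothed_ood_score(x_out, detector, denoiser))
```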
  • Publication
    Safe Robot Navigation Using Constrained Hierarchical Reinforcement Learning
    (2022)
    Schmoeller da Roza, Felippe; Ning, Xiangyu; Günnemann, Stephan
    Safe navigation is one of the steps necessary for achieving autonomous control of robots. Among different algorithms that focus on robot navigation, Reinforcement Learning (and more specifically Deep Reinforcement Learning) has shown impressive results for controlling robots with complex and high-dimensional state representations. However, when methods that enforce safety requirements through constraint satisfaction are integrated into flat Reinforcement Learning policies, system performance can suffer. In this paper, we propose a constrained Hierarchical Reinforcement Learning framework with a safety layer used to modify the low-level policy to achieve a safer operation of the robot. Results obtained in simulation show that the proposed method is better at retaining performance while keeping the system in a safe region when compared to a constrained flat model.
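    As an illustration only, a minimal safety-layer-style action correction for a single linear cost constraint; the linear cost model and all symbols below are assumptions rather than the paper's formulation.

```python
# Hypothetical sketch: the low-level policy proposes an action and a safety layer corrects
# it when a linear safety model c(s) + g(s)·a predicts a constraint violation. The
# correction is the minimal change along g(s); all symbols are assumptions.
import numpy as np

def safety_layer(action, c, g, limit=0.0):
    """Project `action` so that c + g·a <= limit, changing it as little as possible."""
    violation = c + float(np.dot(g, action)) - limit
    if violation <= 0.0:
        return action                      # proposed action is already safe
    # Closed-form correction for a single linear constraint.
    return action - (violation / (np.dot(g, g) + 1e-8)) * g

if __name__ == "__main__":
    proposed = np.array([1.0, 0.5])        # e.g. forward and angular velocity commands
    c, g = 0.2, np.array([0.8, 0.1])       # predicted cost and its sensitivity to the action
    print("corrected action:", safety_layer(proposed, c, g))
```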
  • Publication
    Is it all a cluster game?
    (2022)
    Koner, Rajat; Günnemann, Stephan
    It is essential for safety-critical applications of deep neural networks to determine when new inputs are significantly different from the training distribution. In this paper, we explore this out-of-distribution (OOD) detection problem for image classification using clusters of semantically similar embeddings of the training data and exploit the differences in distance relationships to these clusters between in- and out-of-distribution data. We study the structure and separation of clusters in the embedding space and find that supervised contrastive learning leads to well-separated clusters, while its self-supervised counterpart fails to do so. In our extensive analysis of different training methods, clustering strategies, distance metrics and thresholding approaches, we observe that there is no clear winner. The optimal approach depends on the model architecture and the selected in- and out-of-distribution datasets. While we could reproduce the outstanding results for contrastive training on CIFAR-10 as in-distribution data, we find that standard cross-entropy paired with cosine similarity outperforms all contrastive training methods when training on CIFAR-100 instead. Cross-entropy provides competitive results compared to expensive contrastive training methods.
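    A small sketch of the kind of cluster-based scoring the abstract describes: class centroids of the training embeddings and a minimum cosine distance used as OOD score; the toy data and the threshold-free scoring are assumptions.

```python
# Hypothetical sketch (not the paper's exact pipeline): one centroid per class from
# training embeddings; a test embedding is scored by its smallest cosine distance to any
# centroid. A threshold would be chosen on held-out in-distribution data.
import numpy as np

def class_centroids(embeddings, labels):
    """One mean embedding (centroid) per training class."""
    return {c: embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}

def min_cosine_distance(z, centroids):
    """Smallest cosine distance from z to any class centroid (higher = more OOD-like)."""
    def cos_dist(a, b):
        return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return min(cos_dist(z, mu) for mu in centroids.values())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu0, mu1, mu_ood = np.zeros(8), np.zeros(8), np.zeros(8)
    mu0[0], mu1[1], mu_ood[2] = 2.0, 2.0, 2.0   # three well-separated directions
    emb = np.vstack([rng.normal(mu0, 0.1, (50, 8)), rng.normal(mu1, 0.1, (50, 8))])
    labels = np.array([0] * 50 + [1] * 50)
    centroids = class_centroids(emb, labels)
    print("ID distance :", min_cosine_distance(rng.normal(mu0, 0.1), centroids))
    print("OOD distance:", min_cosine_distance(rng.normal(mu_ood, 0.1), centroids))
```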
  • Publication
    OODformer: Out-Of-Distribution Detection Transformer
    (2021)
    Koner, Rajat; Günnemann, Stephan; Tresp, Volker
    A serious problem in image classification is that a trained model might perform well for input data that originates from the same distribution as the data available for model training, but performs much worse for out-of-distribution (OOD) samples. In real-world safety-critical applications, in particular, it is important to be aware if a new data point is OOD. To date, OOD detection is typically addressed using either confidence scores, auto-encoder based reconstruction, or contrastive learning. However, the global image context has not yet been explored to discriminate the non-local objectness between in-distribution and OOD samples. This paper proposes a first-of-its-kind OOD detection architecture named OODformer that leverages the contextualization capabilities of the transformer. Incorporating the transformer as the principal feature extractor allows us to exploit the object concepts and their discriminatory attributes along with their co-occurrence via visual attention. Based on contextualised embedding, we demonstrate OOD detection using both class-conditioned latent space similarity and a network confidence score. Our approach shows improved generalizability across various datasets. We have achieved a new state-of-the-art result on CIFAR-10/-100 and ImageNet30.
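    A hedged illustration of the two scoring routes mentioned in the abstract, a network confidence score (maximum softmax probability) and class-conditioned similarity of a transformer's [CLS] embedding to class prototypes; the toy inputs are assumptions, not OODformer's code.

```python
# Hypothetical sketch of the two scoring routes named in the abstract; not OODformer's
# implementation. A real setup would take logits and the [CLS] embedding from the ViT.
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def confidence_score(logits):
    """Higher = more in-distribution under the maximum-softmax-probability criterion."""
    return float(softmax(logits).max())

def prototype_similarity(cls_embedding, prototypes):
    """Largest cosine similarity between the [CLS] embedding and any class prototype."""
    sims = [np.dot(cls_embedding, p) / (np.linalg.norm(cls_embedding) * np.linalg.norm(p))
            for p in prototypes]
    return float(max(sims))

if __name__ == "__main__":
    prototypes = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # toy class prototypes
    print("confidence ID/OOD :", confidence_score(np.array([8.0, 0.5, 0.1])),
          confidence_score(np.array([1.1, 1.0, 0.9])))
    print("similarity ID/OOD :", prototype_similarity(np.array([0.9, 0.1]), prototypes),
          prototype_similarity(np.array([-0.7, -0.7]), prototypes))
```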
  • Publication
    Domain Shifts in Reinforcement Learning: Identifying Disturbances in Environments
    (2021)
    Schmoeller Roza, Felippe; Günnemann, Stephan
    End-to-End Deep Reinforcement Learning (RL) systems return an action no matter what situation they are confronted with, even for situations that differ entirely from those an agent has been trained for. In this work, we propose to formalize the changes in the environment in terms of the Markov Decision Process (MDP), resulting in a more formal framework when dealing with such problems.
  • Publication
    Domain Shifts in Reinforcement Learning: Identifying Disturbances in Environments
    (2021)
    Schmoeller Roza, Felippe; Günnemann, Stephan
    A significant drawback of End-to-End Deep Reinforcement Learning (RL) systems is that they return an action no matter what situation they are confronted with. This is true even for situations that differ entirely from those an agent has been trained for. Although crucial in safety-critical applications, dealing with such situations is inherently difficult. Various approaches have been proposed in this direction, such as robustness, domain adaptation, domain generalization, and out-of-distribution detection. In this work, we provide an overview of approaches towards the more general problem of dealing with disturbances to the environment of RL agents and show how they struggle to provide clear boundaries when mapped to safety-critical problems. To mitigate this, we propose to formalize the changes in the environment in terms of the Markov Decision Process (MDP), resulting in a more formal framework when dealing with such problems. We apply this framework to an example real-world scenario and show how it helps to isolate safety concerns.
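    As a hedged illustration of what such a formalization can look like (the notation below is an assumption, not the paper's), a disturbed environment can be written as an MDP whose transition kernel and reward deviate from those seen during training:

```latex
% Hedged illustration only; the symbols M, \tilde{M} and the distance d are assumptions.
\[
  M = (S, A, P, R, \gamma), \qquad
  \tilde{M} = (S, A, \tilde{P}, \tilde{R}, \gamma),
\]
\[
  \text{disturbance magnitude, e.g. } \
  \sup_{s \in S,\, a \in A} \; d\big(P(\cdot \mid s, a),\, \tilde{P}(\cdot \mid s, a)\big),
\]
% where d is a distance between distributions (e.g. total variation); a monitor would
% raise an alert when this deviation grows large at the states the agent actually visits.
```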
  • Publication
    Assessing Box Merging Strategies and Uncertainty Estimation Methods in Multimodel Object Detection
    (2020)
    Schmoeller Roza, Felippe; Henne, Maximilian; Günnemann, Stephan
    This paper examines the impact of different box merging strategies for sampling-based uncertainty estimation methods in object detection. We also compare the almost exclusively used softmax confidence scores with the predicted variances in terms of the quality of the final prediction estimates. The results suggest that estimated variances are a stronger predictor for the detection quality. However, variance-based merging strategies do not improve significantly over the confidence-based alternative for the given setup. In contrast, we show that different methods to estimate the uncertainty of the predictions have a significant influence on the quality of the ensembling outcome. Since mAP does not reward uncertainty estimates, such improvements were only noticeable on the resulting PDQ scores.
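    A minimal sketch contrasting confidence-weighted and variance-weighted merging of boxes predicted by an ensemble of detectors for the same object; the box format, weights and example values are assumptions, not the paper's setup.

```python
# Hypothetical sketch of two box-merging strategies for a multimodel detector ensemble:
# coordinates are averaged with weights taken either from softmax confidences or from
# inverse predicted variances. Box format is (x1, y1, x2, y2); values are made up.
import numpy as np

def merge_boxes(boxes, weights):
    """Weighted average of box coordinates; weights are normalized to sum to one."""
    boxes, weights = np.asarray(boxes, float), np.asarray(weights, float)
    weights = weights / weights.sum()
    return (weights[:, None] * boxes).sum(axis=0)

if __name__ == "__main__":
    boxes = [(100, 50, 200, 300), (104, 48, 206, 310), (96, 55, 198, 295)]
    confidences = [0.9, 0.6, 0.8]          # softmax scores per ensemble member
    variances = [4.0, 25.0, 9.0]           # predicted coordinate variance per member
    print("confidence-weighted:", merge_boxes(boxes, confidences))
    print("variance-weighted  :", merge_boxes(boxes, 1.0 / np.asarray(variances)))
```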