Publication

Concept-Guided LLM Agents for Human-AI Safety Codesign

2024 , Geissler, Florian , Roscher, Karsten , Trapp, Mario

Generative AI is increasingly important in software engineering, including safety engineering, which ensures that software does not cause harm to people. This places high quality requirements on generative AI, and the simplistic use of Large Language Models (LLMs) alone will not meet these demands. It is crucial to develop more advanced and sophisticated approaches that can effectively address the complexities and safety concerns of software systems. Ultimately, humans must understand and take responsibility for the suggestions provided by generative AI to ensure system safety. To this end, we present an efficient, hybrid strategy to leverage LLMs for safety analysis and Human-AI codesign. In particular, we develop a customized LLM agent that uses elements of prompt engineering, heuristic reasoning, and retrieval-augmented generation to solve tasks associated with predefined safety concepts, in interaction with a system model graph. The reasoning is guided by a cascade of micro-decisions that help preserve structured information. We further suggest a graph verbalization that acts as an intermediate representation of the system model to facilitate LLM-graph interactions. Selected pairs of prompts and responses relevant for safety analytics illustrate our method for the use case of a simplified automated driving system.
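
As a rough illustration of the graph-verbalization idea, the following sketch (hypothetical component and signal names, not the authors' implementation) turns a small system-model graph into plain sentences that can be placed next to a safety-concept prompt:

```python
# Hypothetical sketch: verbalize a system-model graph into plain text so that
# an LLM agent can reason over it together with a safety-concept prompt.
# Component and signal names are illustrative only.

system_model = {
    "components": ["Camera", "Perception", "Planner", "Brake Actuator"],
    "flows": [
        ("Camera", "Perception", "image stream"),
        ("Perception", "Planner", "object list"),
        ("Planner", "Brake Actuator", "brake command"),
    ],
}

def verbalize_graph(model: dict) -> str:
    """Turn nodes and edges into short sentences, serving as an intermediate
    text representation of the system model."""
    lines = [f"The system consists of: {', '.join(model['components'])}."]
    for src, dst, signal in model["flows"]:
        lines.append(f"'{src}' sends '{signal}' to '{dst}'.")
    return "\n".join(lines)

prompt = (
    "Safety concept: identify single points of failure for the braking function.\n"
    + verbalize_graph(system_model)
)
print(prompt)
```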

Publication

Concept Correlation and its Effects on Concept-Based Models

2023 , Heidemann, Lena , Monnet, Maureen , Roscher, Karsten

Concept-based learning approaches for image classification, such as Concept Bottleneck Models, aim to enable interpretation and increase robustness by directly learning high-level concepts which are then used for predicting the main class. They achieve competitive test accuracies compared to standard end-to-end models. However, with multiple concepts per image and binary concept annotations (without concept localization), it is not evident whether the output of the concept model is truly based on the predicted concepts or on other features in the image. Additionally, high correlations between concepts would allow a model to predict a concept with high test accuracy by simply using a correlated concept as a proxy. In this paper, we analyze these correlations between concepts in the CUB and GTSRB datasets and propose methods beyond test accuracy for evaluating their effects on the performance of a concept-based model trained on this data. To this end, we also perform a more detailed analysis of the effects of concept correlation using synthetically generated datasets of 3D shapes. We find that high concept correlation increases the risk that a model cannot distinguish the correlated concepts. Yet simple techniques, like loss weighting, show promising initial results for mitigating this issue.
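
To make the correlation analysis concrete, here is a minimal sketch (illustrative data and an assumed weighting rule, not the exact scheme from the paper) that measures pairwise correlation between binary concept annotations and derives simple per-concept loss weights:

```python
import numpy as np

# Illustrative sketch: measure pairwise correlation between binary concept
# annotations and derive per-concept loss weights. Data and weighting rule
# are placeholders, not the paper's exact setup.

rng = np.random.default_rng(0)
annotations = (rng.random((1000, 5)) < 0.3).astype(float)  # (images, concepts)
annotations[:, 1] = annotations[:, 0]   # inject a perfectly correlated pair

corr = np.corrcoef(annotations, rowvar=False)   # (concepts, concepts)
np.fill_diagonal(corr, 0.0)
max_corr = np.abs(corr).max(axis=1)             # strongest partner per concept

# Up-weight highly correlated concepts so the model cannot simply rely on a
# correlated proxy concept.
loss_weights = 1.0 + max_corr
print("max |corr| per concept:", np.round(max_corr, 2))
print("loss weights:          ", np.round(loss_weights, 2))
```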

Publication

Towards Human-Interpretable Prototypes for Visual Assessment of Image Classification Models

2023 , Sinhamahapatra, Poulami , Heidemann, Lena , Monnet, Maureen , Roscher, Karsten

Explaining black-box Artificial Intelligence (AI) models is a cornerstone for trustworthy AI and a prerequisite for its use in safety-critical applications, so that AI models can reliably assist humans in critical decisions. However, instead of trying to explain our models post-hoc, we need models that are interpretable by design, built on a reasoning process similar to that of humans, which exploits meaningful high-level concepts such as shapes, texture, or object parts. Learning such concepts is often hindered by the need for explicit specification and annotation up front. Instead, prototype-based learning approaches such as ProtoPNet claim to discover visually meaningful prototypes in an unsupervised way. In this work, we propose a set of properties that such prototypes have to fulfill to enable human analysis, e.g. as part of a reliable model assessment case, and analyse existing methods in the light of these properties. Using a 'Guess who?' game, we find that these prototypes still have a long way to go towards definitive explanations. We quantitatively validate our findings by conducting a user study indicating that many of the learnt prototypes are not considered useful for human understanding. We discuss the missing links in existing methods and present a potential real-world application motivating the need to progress towards truly human-interpretable prototypes.
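
For readers unfamiliar with prototype-based models, the following simplified sketch (cosine similarity instead of ProtoPNet's exact distance-based activation, random placeholder tensors) shows the core mechanism whose outputs a human assessor would have to interpret:

```python
import numpy as np

# Simplified sketch of prototype-based reasoning in the spirit of ProtoPNet:
# each prototype is compared with every spatial patch of a feature map, and
# the strongest match becomes its activation. Cosine similarity is used here
# for brevity; ProtoPNet itself uses a distance-based activation.

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((64, 7, 7))   # (channels, height, width)
prototypes = rng.standard_normal((10, 64))      # (num_prototypes, channels)

patches = feature_map.reshape(64, -1).T                             # (49, 64)
patches = patches / np.linalg.norm(patches, axis=1, keepdims=True)
protos = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)

similarity = protos @ patches.T            # (num_prototypes, num_patches)
activation = similarity.max(axis=1)        # best-matching patch per prototype
best_patch = similarity.argmax(axis=1)     # where in the image it fired

# A human assessor would inspect the image region behind each strongly
# activated prototype, which is where the proposed interpretability
# properties come into play.
print(np.round(activation, 2), best_patch)
```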

Publication

AI in MedTech Production. Visual Inspection for Quality Assurance

2021 , Roscher, Karsten

Automated visual inspection based on machine learning and computer vision algorithms is a promising approach to ensure the quality of critical medical implants and equipment. However, the limited availability of data and potentially unpredictable deep learning models pose major challenges to bringing such solutions to life and to the market. This talk addresses the open challenges as well as current research directions for dependable visual inspection in quality assurance of medical products.

Publication

Can you trust your Agent? The Effect of Out-of-Distribution Detection on the Safety of Reinforcement Learning Systems

2024 , Haider, Tom , Roscher, Karsten , Herd, Benjamin , Schmoeller da Roza, Felippe , Burton, Simon

Deep Reinforcement Learning (RL) has the potential to revolutionize the automation of complex sequential decision-making problems. Although it has been successfully applied to a wide range of tasks, deployment to real-world settings remains challenging and is often limited. One of the main reasons for this is the lack of safety guarantees for conventional RL algorithms, especially in situations that substantially differ from the learning environment. In such situations, state-of-the-art systems will fail silently, producing action sequences without signaling any uncertainty about the current input. Recent works have suggested Out-of-Distribution (OOD) detection as an additional reliability measure when deploying RL in the real world. How these mechanisms benefit the safety of the entire system, however, is not yet fully understood. In this work, we study how OOD detection contributes to the safety of RL systems by describing the challenges involved in detecting unknown situations. We derive several definitions for unknown events and explore potential avenues for a successful safety argumentation, building on recent work for the safety assurance of Machine Learning components. In a series of experiments, we compare different OOD detectors and show how difficult it is to distinguish harmless from potentially unsafe OOD events in practice, and how standard evaluation schemes can lead to deceptive conclusions, depending on which definition of unknown is applied.
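
A minimal sketch of the kind of runtime gating discussed here (toy policy, detector, and hazard definition, all hypothetical) illustrates why flagged OOD events and actually unsafe events must be evaluated separately:

```python
import random

# Hypothetical sketch: gate an RL policy with an OOD detector and tally how
# flagged states relate to actually harmful ones. Policy, detector, and
# hazard definition are toy stand-ins.

random.seed(0)

def policy(state):            # stand-in for a trained RL policy
    return random.choice([-1, 0, 1])

def safe_fallback(state):     # conservative action, e.g. brake / hold
    return 0

def ood_score(state):         # stand-in for any detector (ensemble, density, ...)
    return abs(state)         # here: distance from the training regime around 0

THRESHOLD = 2.0
counts = {"flagged_unsafe": 0, "flagged_harmless": 0, "missed_unsafe": 0, "ok": 0}

for _ in range(1000):
    state = random.gauss(0.0, 1.5)
    flagged = ood_score(state) > THRESHOLD
    action = safe_fallback(state) if flagged else policy(state)
    unsafe = abs(state) > 3.0              # ground truth, known only offline
    if flagged and unsafe:
        counts["flagged_unsafe"] += 1      # detector caught a hazardous state
    elif flagged:
        counts["flagged_harmless"] += 1    # OOD, but not actually dangerous
    elif unsafe:
        counts["missed_unsafe"] += 1       # silent failure
    else:
        counts["ok"] += 1

print(counts)
```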

Publication

Towards Probabilistic Safety Guarantees for Model-Free Reinforcement Learning

2023 , Schmoeller da Roza, Felippe , Roscher, Karsten , Günnemann, Stephan

Improving safety in model-free Reinforcement Learning is necessary if we expect to deploy such systems in safety-critical scenarios. However, most existing constrained Reinforcement Learning methods have no formal guarantees for their constraint satisfaction properties. In this paper, we present a theoretical formulation of a safety layer that encapsulates epistemic model uncertainty over a distribution of constraint model approximations and can provide probabilistic guarantees of constraint satisfaction.
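
The following sketch conveys the spirit of such a safety layer (a toy ensemble of constraint-cost models and an assumed quantile rule, not the paper's exact formulation): an action is only passed through if a conservative estimate of its constraint cost stays within budget.

```python
import numpy as np

# Illustrative sketch: an ensemble of constraint-cost approximations captures
# epistemic uncertainty; an action passes the safety layer only if a
# conservative quantile of its predicted cost respects the budget. The cost
# model and quantile rule are assumptions for the example.

rng = np.random.default_rng(0)

def constraint_cost_ensemble(state, action, n_models=10):
    """Stand-in for learned constraint models: predicted cost of (s, a)."""
    base = 0.1 * action ** 2 + 0.05 * state
    return base + 0.02 * rng.standard_normal(n_models)   # model disagreement

def safety_layer(state, proposed_action, budget=0.2, confidence=0.95):
    costs = constraint_cost_ensemble(state, proposed_action)
    conservative_cost = np.quantile(costs, confidence)    # upper cost estimate
    if conservative_cost <= budget:
        return proposed_action      # deemed safe with the required confidence
    return 0.0                      # otherwise fall back to a safe default action

print(safety_layer(state=1.0, proposed_action=0.5))   # accepted
print(safety_layer(state=1.0, proposed_action=2.0))   # replaced by fallback
```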

Publication

Is it all a cluster game?

2022 , Sinhamahapatra, Poulami , Koner, Rajat , Roscher, Karsten , Günnemann, Stephan

It is essential for safety-critical applications of deep neural networks to determine when new inputs are significantly different from the training distribution. In this paper, we explore this out-of-distribution (OOD) detection problem for image classification using clusters of semantically similar embeddings of the training data, and exploit the differences in distance relationships to these clusters between in- and out-of-distribution data. We study the structure and separation of clusters in the embedding space and find that supervised contrastive learning leads to well-separated clusters, while its self-supervised counterpart fails to do so. In our extensive analysis of different training methods, clustering strategies, distance metrics, and thresholding approaches, we observe that there is no clear winner. The optimal approach depends on the model architecture and the datasets selected for in- and out-of-distribution. While we could reproduce the outstanding results for contrastive training on CIFAR-10 as in-distribution data, we find that standard cross-entropy paired with cosine similarity outperforms all contrastive training methods when training on CIFAR-100 instead. Cross-entropy provides competitive results compared to expensive contrastive training methods.
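
A minimal sketch of the cluster-based scoring idea (random placeholder embeddings, cosine distance only; the paper compares several metrics and thresholding schemes) looks as follows:

```python
import numpy as np

# Minimal sketch of cluster-based OOD scoring: build one cluster per class
# from training embeddings and score a new input by its distance to the
# nearest cluster centre. Embeddings are random placeholders; only cosine
# distance and a fixed threshold are shown.

rng = np.random.default_rng(0)

# Pretend embeddings: 3 classes, 100 training samples each, 32-dim features.
train_embeddings = {c: rng.standard_normal((100, 32)) + 5.0 * c for c in range(3)}
class_means = np.stack([e.mean(axis=0) for e in train_embeddings.values()])

def distance_to_clusters(z, means):
    z = z / np.linalg.norm(z)
    m = means / np.linalg.norm(means, axis=1, keepdims=True)
    return 1.0 - m @ z              # cosine distance to each class cluster

def is_ood(z, means, threshold=0.5):
    return distance_to_clusters(z, means).min() > threshold

in_dist = train_embeddings[1][0]              # looks like class 1
far_away = rng.standard_normal(32) * 20.0     # nothing like the training data
print(is_ood(in_dist, class_means), is_ood(far_away, class_means))
```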

Publication

Out-of-Distribution Detection for Reinforcement Learning Agents with Probabilistic Dynamics Models

2023 , Haider, Tom , Roscher, Karsten , Schmoeller da Roza, Felippe , Günnemann, Stephan

Reliability of reinforcement learning (RL) agents is a largely unsolved problem. Especially in situations that substantially differ from their training environment, RL agents often exhibit unpredictable behavior, potentially leading to performance loss, safety violations or catastrophic failure. Reliable decision-making agents should therefore be able to raise an alert whenever they encounter situations they have never seen before and do not know how to handle. While the problem, also known as out-of-distribution (OOD) detection, has received considerable attention in other domains such as image classification or sensory data analysis, it is less frequently studied in the context of RL. In fact, there is not even a common understanding of what OOD actually means in RL. In this work, we aim to bridge this gap and approach the topic of OOD in RL from a general perspective. To this end, we formulate OOD in RL as severe perturbations of the Markov decision process (MDP). To detect such perturbations, we introduce a predictive algorithm utilizing probabilistic dynamics models and bootstrapped ensembles. Since existing benchmarks are sparse and limited in their complexity, we also propose a set of evaluation scenarios with OOD occurrences. A detailed analysis of our approach shows superior detection performance compared to existing baselines from related fields.
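
The detection principle can be sketched with a toy example (linear stand-in dynamics and an assumed score combining prediction error and ensemble disagreement; the actual approach uses learned probabilistic dynamics models with bootstrapped ensembles):

```python
import numpy as np

# Toy sketch of the detection principle: an ensemble of dynamics models
# predicts the next state; transitions that the ensemble explains badly or
# disagrees on are flagged as OOD. Linear dynamics and the score are
# assumptions for this example.

rng = np.random.default_rng(0)

true_A = 0.9
ensemble_A = true_A + 0.02 * rng.standard_normal(5)   # 5 bootstrapped models

def ood_score(state, action, next_state):
    preds = ensemble_A * state + action          # each ensemble member's prediction
    error = np.mean((preds - next_state) ** 2)   # how poorly the transition is explained
    disagreement = preds.var()                   # epistemic spread of the ensemble
    return error + disagreement

s, a = 1.0, 0.1
nominal_next = true_A * s + a                    # in-distribution transition
perturbed_next = nominal_next + 3.0              # e.g. an unseen disturbance
print(ood_score(s, a, nominal_next), ood_score(s, a, perturbed_next))
```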

Publication

Beyond Test Accuracy: The Effects of Model Compression on CNNs

2022 , Schwaiger, Adrian , Schwienbacher, Kristian , Roscher, Karsten

Model compression is widely employed to deploy convolutional neural networks on devices with limited computational resources or strict power constraints. For high-stakes applications, such as autonomous driving, it is, however, important that compression techniques do not impair the safety of the system. In this paper, we therefore investigate the changes introduced by three compression methods - post-training quantization, global unstructured pruning, and the combination of both - that go beyond the test accuracy. To this end, we trained three image classifiers on two datasets and compared them with regard to their performance at the class level and their attention to different input regions. Although the deviations in test accuracy were minimal, our results show that the considered compression techniques introduce substantial changes to the models that are reflected in the quality of predictions for individual classes and in the salience of input regions. While we did not observe the introduction of systematic errors or biases towards certain classes, these changes can significantly impact the failure modes of CNNs and are thus highly relevant for safety analyses. We therefore conclude that it is important to be aware of the changes caused by model compression and to consider them already in the early stages of the development process.
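
The evaluation idea, going beyond a single accuracy number, can be sketched as follows (a tiny placeholder model and random data instead of the trained CNN classifiers used in the paper; only pruning is shown, quantization would be compared analogously):

```python
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative sketch of the evaluation idea: compress a model and compare
# accuracy per class rather than only overall test accuracy. The tiny model
# and random data are placeholders.

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU(),
                      nn.Linear(64, 10))
baseline = copy.deepcopy(model)

# Global unstructured pruning: drop the 30% smallest weights across all layers.
to_prune = [(m, "weight") for m in model if isinstance(m, nn.Linear)]
prune.global_unstructured(to_prune, pruning_method=prune.L1Unstructured, amount=0.3)

def per_class_accuracy(net, images, labels, num_classes=10):
    with torch.no_grad():
        preds = net(images).argmax(dim=1)
    return [round((preds[labels == c] == c).float().mean().item(), 3)
            for c in range(num_classes)]

images = torch.randn(256, 3, 32, 32)        # stand-in for a real test set
labels = torch.randint(0, 10, (256,))
print("baseline:", per_class_accuracy(baseline, images, labels))
print("pruned:  ", per_class_accuracy(model, images, labels))
```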