Publication

The why and how of trustworthy AI

2022-09-03 , Schmitz, Anna , Akila, Maram , Hecker, Dirk , Poretschkin, Maximilian , Wrobel, Stefan

Artificial intelligence is increasingly penetrating industrial applications as well as areas that affect our daily lives. As a consequence, there is a need for criteria to validate whether the quality of AI applications is sufficient for their intended use. In both the academic community and the societal debate, a consensus has emerged that the term “trustworthiness” denotes the set of essential quality requirements that should be placed on an AI application. At the same time, the question of how these quality requirements can be operationalized is to a large extent still open. In this paper, we consider trustworthy AI from two perspectives: the product perspective and the organizational perspective. For the former, we present an AI-specific risk analysis and outline how verifiable arguments for the trustworthiness of an AI application can be developed. For the latter, we explore how an AI management system can be employed to assure the trustworthiness of an organization with respect to its handling of AI. Finally, we argue that in order to achieve AI trustworthiness, coordinated measures from both product and organizational perspectives are required.

Publication

Leitfaden zur Gestaltung vertrauenswürdiger Künstlicher Intelligenz (KI-Prüfkatalog)

2021 , Poretschkin, Maximilian , Schmitz, Anna , Akila, Maram , Adilova, Linara , Becker, Daniel , Cremers, Armin B. , Hecker, Dirk , Houben, Sebastian , Mock, Michael , Rosenzweig, Julia , Sicking, Joachim , Schulz, Elena , Voss, Angelika , Wrobel, Stefan

Publication

Maximum Margin Separations in Finite Closure Systems

2021 , Seiffarth, Florian , Horváth, Tamás , Wrobel, Stefan

Monotone linkage functions provide a measure for proximities between elements and subsets of a ground set. Combining this notion with Vapnik's idea of support vector machines, we extend the concepts of maximal closed set and half-space separation in finite closure systems to those with maximum margin. In particular, we define the notion of margin for finite closure systems by means of monotone linkage functions and give a greedy algorithm that efficiently computes a maximum margin closed set separation for two sets. The output closed sets are maximum margin half-spaces, i.e., form a partitioning of the ground set, if the closure system is Kakutani. We have empirically evaluated our approach on different synthetic datasets. In addition to binary classification of finite subsets of the Euclidean space, we also considered the problem of vertex classification in graphs. Our experimental results provide clear evidence that maximal closed set separation with maximum margin results in a much better predictive performance than that with arbitrary maximal closed sets.

Publication

Visual Analytics for Data Scientists

2020 , Andrienko, Natalia , Andrienko, Gennady , Fuchs, Georg , Slingsby, Aidan , Turkay, Cagatay , Wrobel, Stefan

Publication

Data Ecosystems: A New Dimension of Value Creation Using AI and Machine Learning

2022-07-22 , Hecker, Dirk , Voß, Angelika , Wrobel, Stefan

Machine learning and artificial intelligence have become crucial factors for the competitiveness of individual companies and entire economies. Yet their successful deployment requires access to a large volume of training data, which is often not available even to the largest corporations. The rise of trustworthy federated digital ecosystems will significantly improve data availability for all participants and will thus enable a quantum leap in the adoption of artificial intelligence by companies of all sizes and in all sectors of the economy. In this chapter, we explain how AI systems are built with data science and machine learning principles and describe how this leads to AI platforms. We detail the principles of distributed learning, which are a natural match for the principles of distributed data ecosystems, and discuss how trust, as a central value proposition of modern ecosystems, carries over to creating trustworthy AI systems.

Publication

Decision Snippet Features

2021 , Welke, Pascal , Alkhoury, Fouad , Bauckhage, Christian , Wrobel, Stefan

Decision trees excel at interpretability of their prediction results. To achieve the required prediction accuracies, however, large ensembles of decision trees (random forests) are often considered, and their large size reduces interpretability. Additionally, their size slows down inference on modern hardware and restricts their applicability in low-memory embedded devices. We introduce Decision Snippet Features, which are obtained from small subtrees that appear frequently in trained random forests. We subsequently show that linear models on top of these features achieve comparable and sometimes even better predictive performance than the original random forest, while reducing the model size by up to two orders of magnitude.

Publication

A Novel Regression Loss for Non-Parametric Uncertainty Optimization

2021 , Sicking, Joachim , Akila, Maram , Pintz, Maximilian , Wirtz, Tim , Fischer, Asja , Wrobel, Stefan

Quantification of uncertainty is one of the most promising approaches to establish safe machine learning. Despite its importance, it is far from being generally solved, especially for neural networks. One of the most commonly used approaches so far is Monte Carlo dropout, which is computationally cheap and easy to apply in practice. However, it can underestimate the uncertainty. We propose a new objective, referred to as second-moment loss (SML), to address this issue. While the full network is encouraged to model the mean, the dropout networks are explicitly used to optimize the model variance. We extensively study the performance of the new objective on various UCI regression datasets. Compared to state-of-the-art deep ensembles, SML leads to comparable prediction accuracies and uncertainty estimates while only requiring a single model. Under distribution shift, we observe moderate improvements. As a side result, we introduce an intuitive Wasserstein distance-based uncertainty measure that is non-saturating and thus allows resolving quality differences between any two uncertainty estimates.
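As a rough illustration of the Monte Carlo dropout baseline mentioned in this abstract (not of the proposed second-moment loss), the following minimal sketch draws several stochastic forward passes and reports their sample mean and variance. The one-layer linear "network" and the names `forward` and `mc_dropout_predict`, as well as the parameters `keep_prob`, `samples`, and `seed`, are hypothetical and chosen only for illustration.

```python
import random
import statistics


def forward(weights, inputs, keep_mask=None, keep_prob=1.0):
    """Toy linear 'network' with optional inverted dropout on the inputs.

    Hypothetical stand-in for a real neural network; not the model from the paper.
    """
    if keep_mask is None:
        keep_mask = [1] * len(inputs)
    # Inverted dropout: surviving inputs are rescaled by 1 / keep_prob.
    return sum(w * x * m / keep_prob for w, x, m in zip(weights, inputs, keep_mask))


def mc_dropout_predict(weights, inputs, keep_prob=0.8, samples=200, seed=0):
    """Return (mean, variance) over `samples` forward passes with random dropout masks."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(samples):
        # Each input is kept independently with probability keep_prob.
        mask = [1 if rng.random() < keep_prob else 0 for _ in inputs]
        outputs.append(forward(weights, inputs, mask, keep_prob))
    # The spread of the dropout outputs serves as the uncertainty estimate.
    return statistics.fmean(outputs), statistics.pvariance(outputs)


if __name__ == "__main__":
    mean, variance = mc_dropout_predict([1.0, 2.0], [3.0, 4.0], keep_prob=0.5)
    print("predictive mean:", mean, "predictive variance:", variance)
```

The variance here comes entirely from the random masks, which is the quantity the abstract says can underestimate the true uncertainty.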

Publication

A generalized Weisfeiler-Lehman graph kernel

2022-04-27 , Schulz, Till Hendrik , Horvath, Tamas , Welke, Pascal , Wrobel, Stefan

After more than one decade, Weisfeiler-Lehman graph kernels are still among the most prevalent graph kernels due to their remarkable predictive performance and time complexity. They are based on a fast iterative partitioning of vertices, originally designed for deciding graph isomorphism with one-sided error. The Weisfeiler-Lehman graph kernels retain this idea and compare such labels with respect to equality. This binary valued comparison is, however, arguably too rigid for defining suitable graph kernels for certain graph classes. To overcome this limitation, we propose a generalization of Weisfeiler-Lehman graph kernels which takes into account a more natural and finer grade of similarity between Weisfeiler-Lehman labels than equality. We show that the proposed similarity can be calculated efficiently by means of the Wasserstein distance between certain vectors representing Weisfeiler-Lehman labels. This and other facts give rise to the natural choice of partitioning the vertices with the Wasserstein k-means algorithm. We empirically demonstrate on the Weisfeiler-Lehman subtree kernel, which is one of the most prominent Weisfeiler-Lehman graph kernels, that our generalization significantly outperforms this and other state-of-the-art graph kernels in terms of predictive performance on datasets which contain structurally more complex graphs beyond the typically considered molecular graphs.
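For orientation, the classic Weisfeiler-Lehman subtree kernel that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard equality-based label comparison, not of the generalization proposed in the paper. The graph representation (an adjacency dict plus a label dict) and the function names `wl_label_histogram` and `wl_subtree_kernel` are assumptions made for this example.

```python
from collections import Counter


def wl_label_histogram(adjacency, labels, iterations):
    """Count all labels produced by `iterations` rounds of WL refinement.

    adjacency: dict mapping each vertex to a list of its neighbours.
    labels: dict mapping each vertex to an initial (string) label.
    """
    current = dict(labels)
    histogram = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for vertex, neighbours in adjacency.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            neighbour_labels = tuple(sorted(current[n] for n in neighbours))
            refined[vertex] = (current[vertex], neighbour_labels)
        current = refined
        histogram.update(current.values())
    return histogram


def wl_subtree_kernel(graph_a, graph_b, iterations=2):
    """Dot product of the two label histograms; labels match only if equal."""
    hist_a = wl_label_histogram(*graph_a, iterations)
    hist_b = wl_label_histogram(*graph_b, iterations)
    return sum(count * hist_b[label] for label, count in hist_a.items())


if __name__ == "__main__":
    path = ({0: [1], 1: [0, 2], 2: [1]}, {0: "x", 1: "x", 2: "x"})
    triangle = ({0: [1, 2], 1: [0, 2], 2: [0, 1]}, {0: "x", 1: "x", 2: "x"})
    print(wl_subtree_kernel(path, triangle, iterations=1))  # prints 12
```

The nested tuples stand in for the usual label compression step; two refined labels contribute to the kernel only when they are exactly equal, which is the binary comparison the abstract describes as too rigid for some graph classes.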

Publication

Learning Weakly Convex Sets in Metric Spaces

2021 , Stadtländer, Eike , Horvath, Tamas , Wrobel, Stefan

We introduce the notion of weak convexity in metric spaces, a generalization of ordinary convexity commonly used in machine learning. It is shown that weakly convex sets can be characterized by a closure operator and have a unique decomposition into a set of pairwise disjoint connected blocks. We give two generic efficient algorithms, an extensional and an intensional one, for learning weakly convex concepts and study their formal properties. Our experimental results concerning vertex classification clearly demonstrate the excellent predictive performance of the extensional algorithm. Two non-trivial applications of the intensional algorithm to polynomial PAC-learnability are presented. The first one deals with learning k-convex Boolean functions, which are already known to be efficiently PAC-learnable. It is shown how to derive this positive result in a fairly easy way by the generic intensional algorithm. The second one is concerned with the Euclidean space equipped with the Manhattan distance. For this metric space, weakly convex sets form a union of pairwise disjoint axis-aligned hyperrectangles. We show that a weakly convex set that is consistent with a set of examples and contains a minimum number of hyperrectangles can be found in polynomial time. In contrast, this problem is known to be NP-complete if the hyperrectangles may overlap.

Publication

Constructing Spaces and Times for Tactical Analysis in Football

2021 , Andrienko, Gennady , Andrienko, Natalia , Anzer, Gabriel , Bauer, P. , Budziak, G. , Fuchs, G. , Hecker, D. , Weber, H. , Wrobel, Stefan

A possible objective in analyzing trajectories of multiple simultaneously moving objects, such as football players during a game, is to extract and understand the general patterns of coordinated movement in different classes of situations as they develop. To achieve this objective, we propose an approach that includes a combination of query techniques for flexible selection of episodes of situation development, a method for dynamic aggregation of data from selected groups of episodes, and a data structure for representing the aggregates that enables their exploration and use in further analysis. The aggregation, which is meant to abstract general movement patterns, involves construction of new time-homomorphic reference systems through iterative application of aggregation operators to a sequence of data selections. As similar patterns may occur at different spatial locations, we also propose constructing new spatial reference systems for aligning and matching movements irrespective of their absolute locations. The approach was tested on tracking data from two Bundesliga games of the 2018/2019 season. It enabled detection of interesting and meaningful general patterns of team behaviors in three classes of situations defined by football experts. The experts found the approach and the underlying concepts worth implementing in tools for football analysts.