Now showing 1 - 10 of 22
Publication

Machine learning framework to predict nonwoven material properties from fiber graph representations

2022 , Antweiler, Dario , Harmening, Marc , Marheineke, Nicole , Schmeißer, Andre , Wegener, Raimund , Welke, Pascal

Nonwoven fiber materials are omnipresent in diverse applications including insulation, clothing and filtering. Simulation of material properties from production parameters is an industry goal but a challenging task. We developed a machine learning based approach to predict the tensile strength of nonwovens from fiber lay-down settings via a regression model. Here we present an open source framework implementing the following two-step approach: First, a graph generation algorithm constructs stochastic graphs that resemble the adhered fiber structure of the nonwovens for a given parameter space. Second, our regression model, learned from ODE simulation results, predicts the tensile strength for unseen parameter combinations.
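As a rough illustration of the two-step approach, the sketch below generates a random geometric "fiber" graph for a lay-down parameter and fits a one-variable least-squares regression from a graph statistic to a tensile strength. The graph model, the `spread` parameter, and the linear strength relation are all invented for this sketch; the framework uses a proper fiber lay-down model and ODE simulation results instead.

```python
import math
import random

def generate_fiber_graph(n_fibers, spread, seed=0):
    """Step 1 (sketch): scatter fiber midpoints in the unit square and adhere
    fibers whose midpoints lie closer than `spread` (a stand-in for the
    actual lay-down parameters)."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n_fibers)]
    edges = [(i, j) for i in range(n_fibers) for j in range(i + 1, n_fibers)
             if math.dist(pts[i], pts[j]) < spread]
    return pts, edges

def mean_degree(n_nodes, edges):
    return 2 * len(edges) / n_nodes

def fit_linear(xs, ys):
    """Step 2 (sketch): ordinary least squares for y = a + b*x, standing in
    for the learned regression model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Train on several parameter settings; the "tensile strength" here is a
# synthetic linear function of mean degree, purely for demonstration.
spreads = [0.05, 0.10, 0.15, 0.20]
degrees = [mean_degree(200, generate_fiber_graph(200, s)[1]) for s in spreads]
strengths = [0.3 + 1.5 * d for d in degrees]
a, b = fit_linear(degrees, strengths)
```

Since the synthetic data is exactly linear, the fitted coefficients recover the generating relation, which makes the pipeline easy to sanity-check before swapping in real simulation data.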

Publication

Decision Snippet Features

2021-05-05 , Welke, Pascal , Alkhoury, Fouad , Bauckhage, Christian , Wrobel, Stefan

Decision trees excel at interpretability of their prediction results. To achieve the required prediction accuracies, however, large ensembles of decision trees, so-called random forests, are often considered, reducing interpretability due to their large size. Additionally, their size slows down inference on modern hardware and restricts their applicability in low-memory embedded devices. We introduce Decision Snippet Features, which are obtained from small subtrees that appear frequently in trained random forests. We subsequently show that linear models on top of these features achieve comparable, and sometimes even better, predictive performance than the original random forest, while reducing the model size by up to two orders of magnitude.
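The core idea can be sketched, heavily simplified, as: mine split patterns that recur across the trees of a forest, use them as binary features, and fit a linear model on top. In this sketch the "forest" is a hand-made list of depth-1 trees and each snippet is a single split node; the paper mines frequent subtrees from real trained random forests.

```python
from collections import Counter

# A "forest" of depth-1 trees, each given as (feature_index, threshold).
# Hand-made here; in practice these come from a trained random forest.
forest = [(0, 2.5), (0, 2.5), (1, 1.0), (0, 2.5), (1, 1.0), (2, 7.0)]

# Mine snippets: split patterns occurring in at least min_support trees.
min_support = 2
snippets = [s for s, c in Counter(forest).items() if c >= min_support]

def snippet_features(x):
    """Binary feature map: does sample x branch left at each snippet?"""
    return [1 if x[f] <= t else 0 for f, t in snippets]

def train_perceptron(X, y, epochs=20):
    """A linear model on top of the snippet features."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            if pred != yi:
                w = [wj + (yi - pred) * xj for wj, xj in zip(w, xi)]
                b += yi - pred
    return w, b

# Toy data (invented): map samples to snippet features, then fit and predict.
data = [([1.0, 0.5, 3.0], 1), ([3.0, 2.0, 9.0], 0),
        ([2.0, 0.8, 5.0], 1), ([4.0, 1.5, 8.0], 0)]
X = [snippet_features(x) for x, _ in data]
y = [label for _, label in data]
w, b = train_perceptron(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]
```

The model size reduction comes from keeping only the frequent snippets plus one linear weight each, instead of the full forest.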

Publication

Maximum Margin Separations in Finite Closure Systems

2021 , Seiffarth, Florian , Horvath, Tamas , Wrobel, Stefan

Monotone linkage functions provide a measure of proximity between elements and subsets of a ground set. Combining this notion with Vapnik's idea of support vector machines, we extend the concepts of maximal closed set and half-space separation in finite closure systems to those with maximum margin. In particular, we define the notion of margin for finite closure systems by means of monotone linkage functions and give a greedy algorithm that efficiently computes a maximum margin closed set separation for two sets. The output closed sets are maximum margin half-spaces, i.e., they form a partitioning of the ground set if the closure system is Kakutani. We have empirically evaluated our approach on different synthetic datasets. In addition to binary classification of finite subsets of the Euclidean space, we also considered the problem of vertex classification in graphs. Our experimental results provide clear evidence that maximal closed set separation with maximum margin yields much better predictive performance than that with arbitrary maximal closed sets.
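A minimal sketch of the greedy idea, using about the simplest closure system one can write down: points on the real line, whose closed sets are interval hulls, with linkage given by absolute distance. Both choices, and the simplified growth rule, are assumptions of this sketch rather than the general setting of the paper.

```python
def hull(points, ground):
    """Closure operator of this sketch: the interval hull within the ground set."""
    lo, hi = min(points), max(points)
    return {p for p in ground if lo <= p <= hi}

def linkage(p, s):
    """Monotone linkage stand-in: minimum distance from point p to set s."""
    return min(abs(p - q) for q in s)

def greedy_separation(ground, a, b):
    """Greedily grow the two closures, always absorbing the unassigned point
    with the smallest linkage to either side, as long as the closures stay
    disjoint (points that would merge them are left unassigned)."""
    A, B = hull(a, ground), hull(b, ground)
    rest = sorted(set(ground) - A - B)
    while rest:
        p = min(rest, key=lambda q: min(linkage(q, A), linkage(q, B)))
        rest.remove(p)
        side = A if linkage(p, A) <= linkage(p, B) else B
        grown = hull(side | {p}, ground)
        if not grown & (B if side is A else A):
            if side is A:
                A = grown
            else:
                B = grown
    return A, B

# Two clusters on the line; the greedy growth leaves the largest gap between
# the two closed sets, which is the margin of the separation.
ground = [0.0, 0.5, 1.0, 4.0, 4.5, 5.0]
A, B = greedy_separation(ground, {0.0}, {5.0})
margin = min(abs(p - q) for p in A for q in B)
```

Here A and B partition the ground set, so they behave like the half-spaces the abstract describes for Kakutani closure systems.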

Publication

Problem Solving with Hopfield Networks and Adiabatic Quantum Computing

2020-09-28 , Bauckhage, Christian , Sánchez, Ramsés J. , Sifa, Rafet

Our goal with this paper is to elucidate the close connection between Hopfield networks and adiabatic quantum computing. Focusing on their use in problem solving, we point out that the energy functions minimized by Hopfield networks are essentially identical to those minimized by adiabatic quantum computers. To practically illustrate this, we consider a simple textbook problem, namely the k-rooks problem, and discuss how to set it up for solution via a Hopfield network or adiabatic quantum computing.
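The k-rooks setup can be sketched as an energy minimization: the quadratic penalty below vanishes exactly on valid rook placements (one rook per row and per column), and a function of this form could serve either as a Hopfield energy or as a QUBO for an adiabatic quantum computer. Exhaustive search and a greedy bit-flip descent stand in here for the actual Hopfield or quantum dynamics.

```python
from itertools import product

n = 3  # 3 rooks on a 3x3 board, encoded as a flat 0/1 vector of length n*n

def energy(x):
    """Quadratic penalty: zero iff every row and every column sums to one."""
    rows = sum((sum(x[i * n + j] for j in range(n)) - 1) ** 2 for i in range(n))
    cols = sum((sum(x[i * n + j] for i in range(n)) - 1) ** 2 for j in range(n))
    return rows + cols

# Exhaustive search over all 2^(n*n) configurations: the ground states of the
# energy are exactly the valid rook placements (permutation matrices).
ground_states = [x for x in product((0, 1), repeat=n * n) if energy(x) == 0]

def hopfield_descent(x):
    """Asynchronous greedy dynamics: flip any bit that lowers the energy,
    until no single flip helps (a local minimum of the energy landscape)."""
    x = list(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            y = x[:]
            y[i] ^= 1
            if energy(y) < energy(x):
                x, improved = y, True
    return tuple(x)

solution = hopfield_descent((0,) * (n * n))
```

For n = 3 there are 3! = 6 ground states, and from the empty board the greedy descent happens to reach one of them; in general Hopfield-style descent can get stuck in local minima, which is part of the motivation for annealing-based approaches.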

Publication

Auto Encoding Explanatory Examples with Stochastic Paths

2021-05-05 , Ojeda, César , Sánchez, Ramsés J. , Cvejoski, Kostadin , Schücker, Jannis , Bauckhage, Christian , Georgiev, Bogdan

In this paper we ask for the main factors that determine a classifier's decision-making process and uncover such factors by studying latent codes produced by auto-encoding frameworks. To deliver an explanation of a classifier's behaviour, we propose a method that provides a series of examples highlighting semantic differences between the classifier's decisions. These examples are generated through interpolations in latent space. We introduce and formalize the notion of a semantic stochastic path as a suitable stochastic process defined in feature (data) space via latent code interpolations. We then introduce the concept of semantic Lagrangians as a way to incorporate the desired classifier behaviour, and find that the solution of the associated variational problem allows for highlighting differences in the classifier's decisions. Importantly, within our framework the classifier is used as a black box, and only its evaluation is required.
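A toy sketch of the path idea: interpolate between two latent codes, decode along the way, and query the classifier as a black box at each decoded example; where the decision flips is where the explanatory examples are most informative. The one-dimensional "autoencoder" (latent code equals the data) and the threshold classifier are invented stand-ins, and the path here is a plain linear interpolation rather than a semantic stochastic path.

```python
# Stand-ins for this sketch: a trivial 1-D autoencoder and a black-box
# classifier that we may only evaluate, never inspect.
encode = lambda x: x
decode = lambda z: z
classify = lambda x: int(x > 0.35)  # hypothetical decision rule

def explanatory_path(x0, x1, steps=8):
    """Interpolate in latent space and decode, producing a series of examples
    together with the black-box classifier's decision at each of them."""
    z0, z1 = encode(x0), encode(x1)
    path = [decode(z0 + t * (z1 - z0) / steps) for t in range(steps + 1)]
    return [(x, classify(x)) for x in path]

series = explanatory_path(0.0, 1.0)
flip_at = next(x for x, c in series if c == 1)  # first example past the boundary
```

With a real autoencoder the decoded examples stay on the data manifold, which is what makes the series semantically meaningful rather than a pixel-space blend.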

Publication

Learning Deep Generative Models for Queuing Systems

2021 , Ojeda, César , Cvejoski, Kostadin , Georgiev, Bogdan , Bauckhage, Christian , Schücker, Jannis , Sánchez, Ramsés J.

Modern society is heavily dependent on large-scale client-server systems, with applications ranging from Internet and communication services to sophisticated logistics and deployment of goods. To maintain and improve such a system, a careful study of client and server dynamics is needed, e.g. response/service times, average number of clients at given times, etc. To this end, one traditionally relies, within the queuing theory formalism, on parametric analysis and explicit distribution forms. However, parametric forms limit the model's expressiveness and can struggle on extensively large datasets. We propose a novel data-driven approach towards queuing systems: the Deep Generative Service Times. Our methodology delivers a flexible and scalable model for service and response times. We leverage the representation capabilities of Recurrent Marked Point Processes for the temporal dynamics of clients, as well as Wasserstein Generative Adversarial Network techniques, to learn deep generative models which are able to represent complex conditional service time distributions. We provide extensive experimental analysis on both empirical and synthetic datasets, showing the effectiveness of the proposed models.

Publication

Communication efficient distributed learning of neural networks in Big Data environments using Spark

2021 , Alkhoury, Fouad , Wegener, Dennis , Sylla, Karl-Heinz , Mock, Michael

Distributed (or federated) training of neural networks is an important approach to significantly reduce training time. Previous experiments on communication-efficient distributed learning have shown that model averaging, even though it is provably correct only for convex loss functions, also works for training neural networks in some cases; however, these were restricted to simple examples with relatively small standard datasets. In this paper, we investigate to what extent distributed communication-efficient learning scales to huge datasets and complex, deep neural networks. We show how to integrate communication-efficient distributed learning into the big data environment Spark and apply it to a complex real-world scenario, namely image segmentation on a large automotive dataset (A2D2). We present evidence-based results that the distributed approach scales successfully with an increasing number of computing nodes in the case of fully convolutional networks.
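The model-averaging step at the heart of communication-efficient distributed learning can be sketched with a one-parameter model: each worker runs SGD on its local data shard, and only the resulting parameters are exchanged and averaged, instead of communicating gradients at every step. The toy model and data below are invented; real systems average full weight tensors across Spark executors.

```python
def local_sgd(w, data, lr=0.1, epochs=50):
    """One worker: plain SGD on its local shard of (x, y) pairs for the
    one-parameter model y ≈ w * x (squared-error loss)."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x  # gradient of (w*x - y)^2 w.r.t. w
    return w

# Three workers, each holding a different shard; all shards are consistent
# with the true parameter w = 2.
shards = [[(1.0, 2.0)], [(2.0, 4.0)], [(0.5, 1.0)]]
workers = [local_sgd(0.0, shard) for shard in shards]

# Communication round: a single coordinate-wise average of the parameters.
w_avg = sum(workers) / len(workers)
```

With a convex loss, as here, the averaged model provably stays close to the centralized solution; the paper's contribution is showing empirically that the same scheme scales to deep fully convolutional networks on a large dataset.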

Publication

Switching Dynamical Systems with Deep Neural Networks

2021-05-05 , Ojeda, César , Georgiev, Bogdan , Cvejoski, Kostadin , Schücker, Jannis , Bauckhage, Christian , Sánchez, Ramsés J.

The problem of uncovering different dynamical regimes is of pivotal importance in time series analysis. Switching dynamical systems provide a solution for modeling physical phenomena whose time series data exhibit different dynamical modes. In this work we propose a novel variational RNN model for switching dynamics allowing for both non-Markovian and nonlinear dynamical behavior between and within dynamic modes. Attention mechanisms are provided to inform the switching distribution. We evaluate our model on synthetic and empirical datasets of diverse nature and successfully uncover different dynamical regimes and predict the switching dynamics.

Publication

Knowledge Graph Based Question Answering System for Financial Securities

2021 , Bulla, Marius , Hillebrand, Lars Patrick , Lübbering, Max , Sifa, Rafet

Knowledge graphs offer a powerful framework to structure and represent financial information in a flexible way by describing real-world entities, such as financial securities, and their interrelations in the form of a graph. Semantic question answering systems make it possible to retrieve information from a knowledge graph using natural language questions and thus eliminate the need to be proficient in a formal query language. In this work, we present a proof-of-concept design for a financial knowledge graph and, with it, a semantic question answering framework specifically targeted at the finance domain. Our implemented approach uses a span-based joint entity and relation extraction model with BERT embeddings to translate a single-fact natural language question into its corresponding formal query representation. By employing a joint extraction model, we alleviate the concern of error propagation present in standard pipelined approaches to classification-based question answering. The presented framework is tested on a synthetic dataset derived from the instances of the implemented financial knowledge graph. Our empirical findings indicate very promising results, with an F1-score of 84.60% for relation classification and 97.18% for entity detection.
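A miniature sketch of the pipeline: a knowledge graph stored as triples, with entity and relation extraction reduced to keyword matching, and the extracted (subject, relation) pair evaluated as a query against the graph. All securities, relation names, and the matching rule are invented; the paper uses a span-based joint extraction model with BERT embeddings in place of the keyword lookup.

```python
# A toy financial knowledge graph as (subject, relation, object) triples.
triples = [
    ("ACME_Bond_2030", "has_issuer", "ACME Corp"),
    ("ACME_Bond_2030", "has_coupon", "3.5%"),
    ("ACME_Bond_2030", "has_maturity", "2030-06-01"),
]

# Keyword lexicons standing in for the joint entity/relation extraction model.
entities = {"acme bond 2030": "ACME_Bond_2030"}
relations = {"issuer": "has_issuer", "coupon": "has_coupon",
             "maturity": "has_maturity"}

def answer(question):
    """Translate a single-fact question into a (subject, relation) query and
    evaluate it against the graph."""
    q = question.lower()
    subject = next(v for k, v in entities.items() if k in q)
    relation = next(v for k, v in relations.items() if k in q)
    return next(o for s, r, o in triples if s == subject and r == relation)

result = answer("Who is the issuer of the ACME Bond 2030?")
```

Extracting the entity and the relation jointly, as in the paper, avoids the situation where an early wrong entity choice dooms the later relation classification, which is the error propagation the abstract refers to.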

Publication

Toxicity Detection in Online Comments with Limited Data: A Comparative Analysis

2021 , Lübbering, Max , Pielka, Maren , Das, Kajaree , Gebauer, Michael , Ramamurthy, Rajkumar , Bauckhage, Christian , Sifa, Rafet

We present a comparative study on toxicity detection, focusing on the problem of identifying toxicity types of low prevalence and possibly even unobserved at training time. For this purpose, we train our models on a dataset that contains only a weak type of toxicity, and test whether they are able to generalize to more severe toxicity types. We find that representation learning and ensembling exceed the classification performance of simple classifiers on toxicity detection, while also providing significantly better generalization and robustness. All models benefit from a larger training set size, which even extends to the toxicity types unseen during training.