  • Publication
    Tackling Contradiction Detection in German Using Machine Translation and End-to-End Recurrent Neural Networks
    Natural Language Inference, and specifically Contradiction Detection, is still an unexplored topic with respect to German text. In this paper, we apply Recurrent Neural Network (RNN) methods to learn contradiction-specific sentence embeddings. Our data set for evaluation is a machine-translated version of the Stanford Natural Language Inference (SNLI) corpus. The results are compared to a baseline using unsupervised vectorization techniques, namely tf-idf and Flair, as well as state-of-the-art transformer-based (MBERT) methods. We find that the end-to-end models outperform the models trained on unsupervised embeddings, which makes them the better choice in an empirical use case. The RNN methods also outperform MBERT on the translated data set.
  • Publication
    Informed Machine Learning - A Taxonomy and Survey of Integrating Knowledge into Learning Systems
    Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process, which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. First, we provide a definition and propose a concept for informed machine learning, which illustrates its building blocks and distinguishes it from conventional machine learning. Second, we introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Third, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.
  • Publication
    Towards Contradiction Detection in German: A Translation-Driven Approach
    Despite the recent advancements in Machine Learning-based Natural Language Processing (NLP), language dependency remains a limiting factor for the majority of NLP applications. Typically, models are trained for the English language due to the availability of very large labeled and unlabeled datasets, which also allow models to be fine-tuned for that language. Contradiction Detection is one such problem that has found many practical applications in NLP and up to this point has only been studied in the context of the English language. The scope of this paper is to examine a set of baseline methods for the Contradiction Detection task on German text. For this purpose, the well-known Stanford Natural Language Inference (SNLI) data set (110,000 sentence pairs) is machine-translated from English to German. We train and evaluate four classifiers on both the original and the translated data, using state-of-the-art textual data representations. Our main contribution is the first large-scale assessment of this problem in German, and a validation of machine translation as a data generation method. We also present a novel approach to learning sentence embeddings by exploiting the hidden states of an encoder-decoder Sequence-To-Sequence RNN trained for autoencoding or translation.