  • Publication
    Anonymization of German financial documents using neural network-based language models with contextual word representations
    The automation and digitalization of business processes have increased the need for efficient information extraction from business documents. However, financial and legal documents are often not utilized effectively by text processing or machine learning systems, partly because they contain sensitive information, which restricts their usage beyond authorized parties and purposes. To overcome this limitation, we develop an anonymization method for German financial and legal documents using state-of-the-art natural language processing methods based on recurrent neural nets and transformer architectures. We present a web-based application to anonymize financial documents and a large-scale evaluation of different deep learning techniques.
  • Publication
    An Optimization for Convolutional Network Layers Using the Viola-Jones Framework and Ternary Weight Networks
    Neural networks have the potential to be extremely powerful for computer vision tasks, but can be computationally expensive. Classical methods, by comparison, tend to be relatively lightweight, albeit not as powerful. In this paper, we propose a method of combining parts of a classical system, the Viola-Jones Object Detection Framework, with a modern ternary neural network to improve the efficiency of a convolutional neural net by replacing convolutional filters with a set of custom ones inspired by the framework. This reduces the number of operations needed for computing feature values with negligible effects on overall accuracy, allowing for a more optimized network.
  • Publication
    Decoupling Autoencoders for Robust One-vs-Rest Classification
    One-vs-Rest (OVR) classification aims to distinguish a single class of interest from other classes. Novelty detection and robustness to dataset shift become crucial in OVR when the scope of the rest class extends from the classes observed during training to unseen and possibly unrelated classes. In this work, we propose a novel architecture, the Decoupling Autoencoder (DAE), to tackle the common issue of robustness w.r.t. out-of-distribution samples, which is prevalent in classifiers such as multi-layer perceptrons (MLP) and ensemble architectures. Experiments on plain classification, outlier detection, and dataset shift tasks show that DAE achieves robust performance across these tasks compared to the baselines, which tend to fail completely when exposed to dataset shift. While DAE and the baselines yield rather uncalibrated predictions on the outlier detection and dataset shift tasks, we found that DAE calibration is more stable across all tasks. Therefore, calibration measures applied to the classification task could also improve the calibration of the outlier detection and dataset shift scenarios for DAE.
  • Publication
    ALiBERT: Improved automated list inspection (ALI) with BERT
    (2021-08-16) Stenzel, Marc Robin; Khameneh, Tim Dilmaghani; Warning, Ulrich; Kliem, Bernd; Loitz, Rüdiger
    We consider Automated List Inspection (ALI), a content-based text recommendation system that assists auditors in matching relevant text passages from notes in financial statements to specific law regulations. ALI follows a ranking paradigm in which a fixed number of requirements per textual passage are shown to the user. Despite achieving impressive ranking performance, the user experience can still be improved by showing a dynamic number of recommendations. Besides, existing models rely on a feature-based language model that needs to be pre-trained on a large corpus of domain-specific datasets. Moreover, they cannot be trained in an end-to-end fashion by jointly optimizing with language model parameters. In this work, we alleviate these concerns by considering a multi-label classification approach that predicts dynamic requirement sequences. We base our model on pre-trained BERT that allows us to fine-tune the whole model in an end-to-end fashion, thereby avoiding the need for training a language representation model. We conclude by presenting a detailed evaluation of the proposed model on two German financial datasets.
  • Publication
    Decision Snippet Features
    (2021-05-05) Welke, Pascal; Alkhoury, Fouad
    Decision trees excel at interpretability of their prediction results. To achieve the required prediction accuracy, however, large ensembles of decision trees (random forests) are often considered, which reduces interpretability due to their large size. Additionally, their size slows down inference on modern hardware and restricts their applicability in low-memory embedded devices. We introduce Decision Snippet Features, which are obtained from small subtrees that appear frequently in trained random forests. We subsequently show that linear models on top of these features achieve comparable and sometimes even better predictive performance than the original random forest, while reducing the model size by up to two orders of magnitude.
  • Publication
    Switching Dynamical Systems with Deep Neural Networks
    The problem of uncovering different dynamical regimes is of pivotal importance in time series analysis. Switching dynamical systems provide a solution for modeling physical phenomena whose time series data exhibit different dynamical modes. In this work we propose a novel variational RNN model for switching dynamics allowing for both non-Markovian and nonlinear dynamical behavior between and within dynamic modes. Attention mechanisms are provided to inform the switching distribution. We evaluate our model on synthetic and empirical datasets of diverse nature and successfully uncover different dynamical regimes and predict the switching dynamics.
  • Publication
    Auto Encoding Explanatory Examples with Stochastic Paths
    In this paper, we ask what the main factors are that determine a classifier's decision-making process, and we uncover such factors by studying latent codes produced by auto-encoding frameworks. To deliver an explanation of a classifier's behaviour, we propose a method that provides a series of examples highlighting semantic differences between the classifier's decisions. These examples are generated through interpolations in latent space. We introduce and formalize the notion of a semantic stochastic path as a suitable stochastic process defined in feature (data) space via latent code interpolations. We then introduce the concept of semantic Lagrangians as a way to incorporate the desired classifier's behaviour and find that the solution of the associated variational problem allows for highlighting differences in the classifier's decisions. Importantly, within our framework the classifier is used as a black box, and only its evaluation is required.
  • Publication
    Utilizing Representation Learning for Robust Text Classification Under Datasetshift
    Within One-vs-Rest (OVR) classification, a classifier differentiates a single class of interest (COI) from the rest, i.e. any other class. By extending the scope of the rest class to corruptions (dataset shift), aspects of outlier detection gain relevance. In this work, we show that adversarially trained autoencoders (ATA), representative of autoencoder-based outlier detection methods, yield tremendous robustness improvements over traditional neural network methods such as multi-layer perceptrons (MLP) and common ensemble methods, while maintaining competitive classification performance. In contrast, our results also reveal that deep learning methods optimized solely for classification tend to fail completely when exposed to dataset shift.
  • Publication
    Learning Deep Generative Models for Queuing Systems
    Modern society is heavily dependent on large-scale client-server systems with applications ranging from Internet and communication services to sophisticated logistics and deployment of goods. To maintain and improve such a system, a careful study of client and server dynamics is needed, e.g. response/service times, average number of clients at given times, etc. To this end, one traditionally relies, within the queuing theory formalism, on parametric analysis and explicit distribution forms. However, parametric forms limit the model's expressiveness and could struggle on extensively large datasets. We propose a novel data-driven approach towards queuing systems: the Deep Generative Service Times. Our methodology delivers a flexible and scalable model for service and response times. We leverage the representation capabilities of Recurrent Marked Point Processes for the temporal dynamics of clients, as well as Wasserstein Generative Adversarial Network techniques, to learn deep generative models which are able to represent complex conditional service time distributions. We provide extensive experimental analysis on both empirical and synthetic datasets, showing the effectiveness of the proposed models.