  • Publication
    Anonymization of German financial documents using neural network-based language models with contextual word representations
    The automation and digitalization of business processes have increased the need for efficient information extraction from business documents. However, financial and legal documents are often not utilized effectively by text processing or machine learning systems, partly because they contain sensitive information, which restricts their usage beyond authorized parties and purposes. To overcome this limitation, we develop an anonymization method for German financial and legal documents using state-of-the-art natural language processing methods based on recurrent neural networks and transformer architectures. We present a web-based application to anonymize financial documents and a large-scale evaluation of different deep learning techniques.
  • Publication
    Towards Intelligent Food Waste Prevention: An Approach Using Scalable and Flexible Harvest Schedule Optimization with Evolutionary Algorithms
    In times of climate change, a growing world population, and the resulting scarcity of resources, efficient and economical usage of agricultural land is both increasingly important and increasingly challenging. To avoid the disadvantages of monocropping for soil and environment, it is advisable to practice intercropping of various plant species whenever possible. However, intercropping is challenging because it requires a balanced planting schedule that accounts for each crop's individual cultivation time frame. Maintaining a continuous harvest throughout the season is important as it reduces logistical costs and related greenhouse gas emissions, and can also help to reduce food waste. Motivated by the prevention of food waste, this work proposes a flexible optimization method for a full harvest season of large crop ensembles that complies with given economic and environmental constraints. Our approach applies evolutionary algorithms, and we combine our evolution strategy with a sophisticated hierarchical loss function and an adaptive mutation rate. We thus transform the multi-objective problem into a pseudo-single-objective optimization problem, for which we obtain faster and better solutions than those of conventional approaches.
  • Publication
    Detecting and correcting spelling errors in high-quality Dutch Wikipedia text
    (2018)
    Beeksma, M.; Gompel, M. van; Kunneman, F.; Onrust, L.; Regnerus, B.; Vinke, D.; Brito, Eduardo
    For the CLIN28 shared task, we evaluated systems for spelling correction of high-quality text. The task focused on detecting and correcting spelling errors in Dutch Wikipedia pages. Three teams took part in the task. We compared the performance of their systems to that of a baseline system, the Dutch spelling corrector Valkuil. We evaluated the systems' performance in terms of F1 score. Although two of the three participating systems performed well in the task of correcting spelling errors, error detection proved to be a challenging task, and without exception resulted in a high false positive rate. Therefore, the F1 score of the baseline was not improved upon. This paper elaborates on each team's approach to the task, and discusses the overall challenges of correcting high-quality text.
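
    The effect described in the last abstract, where a high false positive rate keeps the F1 score down even when many errors are found, can be illustrated with a minimal sketch. The function name and all counts below are hypothetical and chosen for illustration only; they are not taken from the CLIN28 shared task results.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) computed from raw detection counts."""
    predicted = true_positives + false_positives   # everything the system flagged
    actual = true_positives + false_negatives      # every real error in the text
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical system A: flags few words, misses many errors.
cautious = precision_recall_f1(true_positives=40, false_positives=10, false_negatives=60)

# Hypothetical system B: finds most errors but also flags many correct words.
eager = precision_recall_f1(true_positives=80, false_positives=320, false_negatives=20)

for name, (p, r, f1) in (("cautious", cautious), ("eager", eager)):
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

    In this made-up comparison the second system has twice the recall of the first, yet its F1 score is lower because most of what it flags is correct text. High-quality text contains few real errors, so even a small tendency to over-flag can dominate the score.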