2023
Conference Paper
Title

Current language models’ poor performance on pragmatic aspects of natural language

Abstract
With the following system description, we present our approach for claim detection in tweets. We address both Subtask A, a binary sequence classification task, and Subtask B, a token classification task. For the first of the two subtasks, each input chunk (in this case, each tweet) was given a class label. For the second subtask, a label was assigned to each individual token in an input sequence. To match each utterance with the appropriate class label, we used pre-trained RoBERTa (A Robustly Optimized BERT Pretraining Approach) language models for sequence classification. Using the provided data and annotations as training data, we fine-tuned one model for each of the two classification tasks. Though the resulting models serve as adequate baseline models, the exploratory data analysis suggests fundamental problems in the structure of the training data. We argue that such tasks cannot be fully solved if pragmatic aspects of language are ignored. This type of information, often contextual and thus not explicitly stated in written language, is insufficiently represented in the current models. For this reason, we posit that the provided training data is under-specified and imperfectly suited to these classification tasks.
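The difference between the two label formats described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the example tweet, the 0/1 label encoding, the character-span annotation format, and the names `SequenceExample`, `TokenExample`, `to_sequence_example`, and `spans_to_token_labels` are all assumptions made for illustration, and whitespace splitting stands in for the subword tokenization a RoBERTa model would actually use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SequenceExample:
    """Subtask A style: one binary label for the whole tweet."""
    text: str
    label: int  # assumed encoding: 1 = contains a claim, 0 = no claim


@dataclass
class TokenExample:
    """Subtask B style: one label per token."""
    tokens: List[str]
    labels: List[int]  # assumed encoding: 1 = inside a claim span, 0 = outside


def to_sequence_example(text: str, spans: List[Tuple[int, int]]) -> SequenceExample:
    """A tweet counts as positive if it has at least one annotated claim span."""
    return SequenceExample(text, 1 if spans else 0)


def spans_to_token_labels(text: str, spans: List[Tuple[int, int]]) -> TokenExample:
    """Project character-level spans (start, end-exclusive) onto tokens.

    Whitespace tokenization is a simplification used only for this sketch.
    """
    tokens: List[str] = []
    labels: List[int] = []
    pos = 0
    for tok in text.split():
        start = text.index(tok, pos)  # locate the token at or after the cursor
        end = start + len(tok)
        pos = end
        # A token is labeled 1 if it overlaps any annotated span.
        inside = any(start < s_end and end > s_start for s_start, s_end in spans)
        tokens.append(tok)
        labels.append(1 if inside else 0)
    return TokenExample(tokens, labels)


# Hypothetical tweet and annotation, invented for this sketch.
tweet = "Wow, vitamin C cures colds!"
claim = (tweet.index("vitamin"), tweet.index("colds") + len("colds"))

seq_example = to_sequence_example(tweet, [claim])
tok_example = spans_to_token_labels(tweet, [claim])
print(seq_example.label)   # 1
print(tok_example.labels)  # [0, 1, 1, 1, 1]
```

Under these assumptions, the same annotation yields a single label for Subtask A and a label sequence for Subtask B. Neither format records the context in which the tweet was written, which is the kind of pragmatic information the abstract argues is missing from the training data.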
Author(s)
Pritzkau, Albert  
Fraunhofer-Institut für Kommunikation, Informationsverarbeitung und Ergonomie FKIE  
Waldmüller, Julia
University of the Bundeswehr Munich
Blanc, Olivier
University of the Bundeswehr Munich
Geierhos, Michaela
University of the Bundeswehr Munich
Schade, Ulrich  
Fraunhofer-Institut für Kommunikation, Informationsverarbeitung und Ergonomie FKIE  
Mainwork
Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, FIRE-WN 2023  
Conference
Forum for Information Retrieval Evaluation 2023  
Language
English
Keyword(s)
  • Information Extraction
  • Pragmatics
  • RoBERTa
  • Text Classification