Fraunhofer-Gesellschaft
2020
Conference Paper
Title

NAT: Noise-Aware Training for Robust Neural Sequence Labeling

Abstract
Sequence labeling systems should perform reliably not only under ideal conditions but also on corrupted inputs, because these systems often process user-generated text or follow an error-prone upstream component. To this end, we formulate the noisy sequence labeling problem, in which the input may undergo an unknown noising process, and propose two Noise-Aware Training (NAT) objectives that improve the robustness of sequence labeling performed on perturbed input: our data augmentation method trains a neural model on a mixture of clean and noisy samples, whereas our stability training algorithm encourages the model to create a noise-invariant latent representation. We employ a vanilla noise model at training time. For evaluation, we use both the original data and its variants perturbed with real OCR errors and misspellings. Extensive experiments on English and German named entity recognition benchmarks confirmed that NAT consistently improved the robustness of popular sequence labeling models while preserving accuracy on the original input. We make our code and data publicly available for the research community.
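The two objectives named in the abstract can be illustrated with a minimal sketch. It is not the authors' released code: the function names, the uniform character-edit noise model, the weight `alpha`, and the use of KL divergence between output distributions as the stability term are all assumptions made for illustration.

```python
import math
import random
import string


def perturb_token(token, rng, error_rate=0.1, alphabet=string.ascii_lowercase):
    """Hypothetical 'vanilla' noise model: each character is deleted,
    substituted, or followed by an inserted character, each with
    probability error_rate / 3; otherwise it is kept unchanged."""
    out = []
    for ch in token:
        r = rng.random()
        if r < error_rate / 3:
            continue  # deletion
        elif r < 2 * error_rate / 3:
            out.append(rng.choice(alphabet))  # substitution
        elif r < error_rate:
            out.append(ch)
            out.append(rng.choice(alphabet))  # insertion after ch
        else:
            out.append(ch)
    # Never return an empty token, so tokens stay aligned with their labels.
    return "".join(out) if out else token


def perturb_sentence(tokens, rng, error_rate=0.1):
    """Perturb every token independently; the sentence length is preserved."""
    return [perturb_token(t, rng, error_rate) for t in tokens]


def augmentation_loss(loss_clean, loss_noisy, alpha=1.0):
    """Data augmentation (assumed form): supervised loss on the clean sample
    plus a weighted supervised loss on its noisy copy."""
    return loss_clean + alpha * loss_noisy


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def stability_loss(loss_clean, probs_clean, probs_noisy, alpha=1.0):
    """Stability training (assumed form): supervised loss on the clean sample
    plus a penalty on the divergence between the model's outputs for the
    clean and the noisy input."""
    return loss_clean + alpha * kl_divergence(probs_clean, probs_noisy)


# Usage: create a noisy copy of a sentence with a seeded generator.
rng = random.Random(0)
tokens = ["Berlin", "is", "a", "city"]
noisy_tokens = perturb_sentence(tokens, rng, error_rate=0.3)
```

In this sketch the noisy copy keeps one token per original token, so the clean label sequence can be reused for the perturbed input. In an actual training loop the scalar losses and probability lists would be replaced by the tensors of whichever framework the tagger is built on.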
Author(s)
Namysl, Marcin  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Behnke, Sven  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Köhler, Joachim  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Mainwork
ACL 2020, 58th Annual Meeting of the Association for Computational Linguistics. Proceedings. Online resource  
Conference
Association for Computational Linguistics (ACL Annual Meeting) 2020  
Open Access
DOI
10.24406/publica-fhg-408464
10.18653/v1/2020.acl-main.138
File(s)
N-596978.pdf (1.29 MB)
Rights
CC BY 4.0: Creative Commons Attribution
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Keyword(s)
  • robustness
  • sequence labeling
  • data augmentation
  • stability training
  • Named Entity Recognition (NER)
  • Optical Character Recognition (OCR)
  • information extraction
  • Natural Language Processing (NLP)
