2019
Conference Paper
Title

Noise Reduction in Distant Supervision for Relation Extraction Using Probabilistic Soft Logic

Abstract
The performance of modern relation extraction systems depends to a great degree on the size and quality of the underlying training corpus, and in particular on the labels. Since having human annotators generate these labels is expensive, Distant Supervision has been proposed to automatically align entities in a knowledge base with a text corpus to generate annotations. However, this approach introduces noise, which negatively affects the performance of relation extraction systems. To tackle this problem, we propose a probabilistic graphical model which simultaneously incorporates different sources of knowledge, such as domain experts' knowledge about the context and linguistic knowledge about the sentence structure, in a principled way. The model is defined using the declarative language provided by Probabilistic Soft Logic. Experimental results show that the proposed approach, compared to the original distantly supervised set, improves not only the quality of the generated training data sets but also the performance of the final relation extraction model.
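The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the knowledge base, the sentences, the function names and the example rule are all hypothetical. The first function shows how distant supervision labels every sentence that mentions both entities of a knowledge base fact, which is where the noise comes from. The other two show the Lukasiewicz relaxation that Probabilistic Soft Logic uses to score how far a soft rule is from being satisfied.

```python
# Hypothetical toy data for illustration only; none of it comes from the paper.
KB = {("Barack Obama", "Honolulu"): "born_in"}

SENTENCES = [
    "Barack Obama was born in Honolulu.",        # correct label
    "Barack Obama visited Honolulu last week.",  # noisy label
]


def distant_labels(kb, sentences):
    """Label every sentence that mentions both entities of a KB fact."""
    labeled = []
    for sentence in sentences:
        for (head, tail), relation in kb.items():
            if head in sentence and tail in sentence:
                labeled.append((sentence, head, tail, relation))
    return labeled


def soft_and(a, b):
    """Lukasiewicz conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def distance_to_satisfaction(body, head):
    """How strongly the soft rule `body -> head` is violated (0 = satisfied)."""
    return max(0.0, body - head)


# Both sentences receive the label "born_in", although only the first expresses it.
labels = distant_labels(KB, SENTENCES)

# Hypothetical rule: DistantLabel(s, r) & HasTrigger(s, r) -> TrueLabel(s, r).
# A candidate truth value of 0.25 for TrueLabel violates the rule when the
# distant label (1.0) and the trigger evidence (1.0) are both fully true.
violation = distance_to_satisfaction(soft_and(1.0, 1.0), 0.25)
```

In Probabilistic Soft Logic, inference searches for truth values that minimize the weighted sum of such distances over all grounded rules, so a label supported by several knowledge sources ends up with a higher truth value than one supported only by the knowledge base alignment.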
Author(s)
Kirsch, Birgit  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Niyazova, Zamira
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Mock, Michael  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Rüping, Stefan  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Mainwork
Machine Learning and Knowledge Discovery in Databases. Proceedings. Pt.II  
Project(s)
ML2R
Funder
Bundesministerium für Bildung und Forschung BMBF (Deutschland)  
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2019  
DOI
10.1007/978-3-030-43887-6_6
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Keyword(s)
  • Probabilistic Soft Logic
  • Statistical Relational Learning
  • Distant Supervision
  • Relation Extraction
  • Natural Language Processing
