2025
Conference Paper
Title

MESD: Metadata Extraction from Scholarly Documents - A Shared Task Overview

Abstract
This paper presents an overview of the Metadata Extraction from Scholarly Documents (MESD) shared task, which was designed to address the challenge of extracting structured metadata (e.g., Title, Author, Abstract) from scientific publications. The task aimed to promote the development of techniques for making scholarly data more Findable, Accessible, Interoperable, and Reusable (FAIR) by improving metadata extraction from PDF documents. We describe the task design and the creation of two complementary datasets: (1) the S2ORC_Exp500v1 dataset, consisting of 500 training samples, 100 validation samples, and 100 test samples with text-based annotations, and (2) the SSOAR Multidisciplinary Vision Dataset (SSOARGMVD), containing more than 8000 documents with bounding box annotations suitable for computer vision approaches. We discuss potential directions for future research in metadata extraction from scholarly documents, highlighting the opportunities presented by these new resources.
Author(s)
Boukhers, Zeyd  
Fraunhofer-Institut für Angewandte Informationstechnik FIT  
Yang, Cong
Soochow University
Mainwork
Joint Proceedings of the ESWC 2025 Workshops and Tutorials  
Conference
Extended Semantic Web Conference 2025  
International Workshop on Natural Scientific Language Processing and Research Knowledge Graphs 2025  
Language
English
Keyword(s)
  • document processing
  • metadata extraction
  • natural language processing
  • scholarly documents
