Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

LC-QuAD: A corpus for complex question answering over knowledge graphs

 
Authors: Trivedi, Priyansh; Maheshwari, Gaurav; Dubey, Mohnish; Lehmann, Jens

In:

D'Amato, C.:
The Semantic Web - ISWC 2017 : 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017. Proceedings, Part II
Cham: Springer International Publishing, 2017 (Lecture Notes in Computer Science 10588)
ISBN: 978-3-319-68204-4 (electronic)
ISBN: 978-3-319-68203-7 (print)
pp. 210-218
International Semantic Web Conference (ISWC) <16, 2017, Vienna>
English
Conference paper
Fraunhofer IAIS

Abstract
Being able to access knowledge bases in an intuitive way has been an active area of research in recent years. In particular, several question answering (QA) approaches that allow users to query RDF datasets in natural language have been developed, since they let end users access knowledge without having to learn the schema of a knowledge base or a formal query language. To foster this research area, several training datasets have been created, e.g. in the QALD (Question Answering over Linked Data) initiative. However, existing datasets are insufficient in size, variety, or complexity to apply and evaluate a range of machine-learning-based QA approaches for learning complex SPARQL queries. With the Large-Scale Complex Question Answering Dataset (LC-QuAD), we close this gap by providing a dataset of 5000 questions and their corresponding SPARQL queries over the DBpedia dataset. In this article, we describe the dataset creation process and how we ensure a high variety of questions, which should make it possible to assess the robustness and accuracy of the next generation of QA systems for knowledge graphs.

URL: http://publica.fraunhofer.de/dokumente/N-473837.html