Entity recognition in information extraction

Hanafiah, Novita; Quix, Christoph

doi:10.1007/978-3-319-05476-6_12

2014

Conference Paper

Abstract

Detecting and resolving entities is an important step in information retrieval applications. Humans recognize entities from context, but information extraction systems (IES) need sophisticated algorithms to do so. This paper describes the development and implementation of an entity recognition algorithm. The implemented system is integrated with an IES that derives triples from unstructured text, which makes the triples more valuable in query answering because they refer to identified entities. A dictionary of entities and their contexts is built from information extracted from Wikipedia. The entity recognition computes a score based on context similarity, measured as cosine similarity with a tf-idf weighting scheme, and on string similarity. The implemented system shows good accuracy on Wikipedia articles, is domain independent, and recognizes entities of arbitrary types.
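
The scoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two similarities are combined, so the linear weight alpha, the smoothed idf formula, the use of difflib.SequenceMatcher for string similarity, and the toy dictionary entries are all assumptions made for illustration.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would do more).
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf (assumption) so that no term gets a zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(mention, mention_context, dictionary, alpha=0.5):
    """Rank dictionary entities for a mention, best candidate first.

    dictionary maps an entity name to a context string. alpha is a
    hypothetical weight between context and string similarity.
    """
    names = list(dictionary)
    docs = [tokenize(mention_context)] + [tokenize(dictionary[n]) for n in names]
    vectors = tfidf_vectors(docs)
    query = vectors[0]
    scored = []
    for name, vec in zip(names, vectors[1:]):
        context_sim = cosine(query, vec)
        string_sim = SequenceMatcher(None, mention.lower(), name.lower()).ratio()
        scored.append((alpha * context_sim + (1 - alpha) * string_sim, name))
    return sorted(scored, reverse=True)


# Toy dictionary (invented entries, not taken from the paper).
toy_dictionary = {
    "Jaguar (animal)": "large cat species native to the americas rainforest predator",
    "Jaguar Cars": "british luxury car manufacturer vehicles engine",
}
ranking = rank_candidates(
    "Jaguar",
    "the jaguar is a predator that hunts in the rainforest",
    toy_dictionary,
)
for score, name in ranking:
    print(f"{score:.3f}  {name}")
```

In this toy case the string similarity alone slightly favors the shorter name "Jaguar Cars", while the shared context terms move the animal sense to the top of the ranking, which is the kind of disambiguation by context that the abstract describes.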

Author(s)

Hanafiah, Novita

Quix, Christoph

Mainwork

Intelligent information and database systems. 6th Asian conference, ACIIDS 2014. Vol.1

Conference

Asian Conference "Intelligent Information and Database Systems" (ACIIDS) 2014
