Publication

Efficient entity resolution for large heterogeneous information spaces

2011 , Papadakis, G. , Ioannou, E. , Niederée, C. , Fankhauser, P.

We have recently witnessed an enormous growth in the volume of structured and semi-structured data sets available on the Web. An important prerequisite for using and combining such data sets is the detection and merging of information that describes the same real-world entities, a task known as Entity Resolution. To make this quadratic task efficient, blocking techniques are typically employed. However, the high dynamics, loose schema binding, and heterogeneity of (semi-)structured data impose new challenges on entity resolution. Existing blocking approaches become inapplicable because they rely on the homogeneity of the considered data and on schemata known a priori. In this paper, we introduce a novel approach for entity resolution, scaling it up for large, noisy, and heterogeneous information spaces. It combines an attribute-agnostic mechanism for building blocks with intelligent block processing techniques that boost blocks with high expected utility, propagate knowledge about identified matches, and preempt the resolution process when it gets too expensive. Our extensive evaluation on real-world, large, heterogeneous data sets verifies that the suggested approach is both effective and efficient. Copyright 2011 ACM.

Publication

Modelle für semantische Web-Anwendungen

2005 , Neuhold, E.J. , Fuchs, M. , Niederée, C.

Publication

Exploiting lexical knowledge in learning user profiles for intelligent information access to digital collections

2005 , Semeraro, G. , Lops, P. , Degemmis, M. , Niederée, C. , Stewart, A.

Algorithms designed to support users in retrieving relevant information base their relevance computations on user profiles, in which representations of the users' interests are maintained. This paper focuses on the use of supervised machine learning techniques to induce user profiles for Intelligent Information Access. The access must be personalized by profiles allowing users to retrieve information on the basis of conceptual content. To address this issue, we propose a method to learn sense-based user profiles based on WordNet, a lexical database.

Publication

The role of context for information mediation in digital libraries

2004 , Neuhold, E.J. , Niederée, C. , Stewart, A. , Frommholz, I. , Mehta, B.

Mediating between available information objects and individual information needs is a central issue within the functionality of a digital library. In the simplest case this is an information request answered by a search engine based on an analysis of information objects within the digital library's information collection. However, neither the information access activity nor the information objects within the collection are isolated entities. They are both equipped with a multifaceted context. The invited talk, which is summarized by this paper, analyzes this context and discusses complementing approaches to make such context explicit and to use it for refining the mediation process within digital libraries.

Publication

Enabling a knowledge supply chain: From content resources to ontologies

2006 , Stecher, R. , Niederée, C. , Bouquet, P. , Jacquin, T. , Aït-Mokhtar, S. , Montemagni, S. , Brunelli, R. , Demetriou, G.

Semantic annotation of content is a crucial building block for making the Semantic Web fly. The (semi-)automatic support of the underlying semantic knowledge supply chain requires contributions from different research disciplines and well-defined pipelines, which create such annotations step by step from raw content objects. This paper presents an annotation pipeline that has been designed and implemented as part of the VIKEF project. A clear structuring of the pipeline and the selection of adequate representation formats for the intermediate results (products) as well as for configuration information have been identified as crucial ingredients for an annotation pipeline that enables the application-specific customization of the pipeline components and the flexible integration of upcoming advanced methods, such as new extraction methods, into the pipeline.

Publication

Understanding and tailoring your scientific information environment: A context-oriented view on e-science support

2005 , Niederée, C. , Stewart, A. , Muscogiuri, C. , Hemmje, M. , Risse, T.

Publication

Supporting information access in next generation digital library architectures

2004 , Frommholz, I. , Knezevic, P. , Mehta, B. , Niederée, C. , Risse, T. , Thiel, U.

Publication

Ontologically-enriched unified user modeling for cross-system personalization

2005 , Mehta, B. , Niederée, C. , Stewart, A. , Degemmis, M. , Lops, P. , Semeraro, G.

Personalization today is in widespread use on many Web sites. Systems and applications store preferences and information about users in order to provide personalized access. However, these systems store user profiles in proprietary formats. Although some of these systems store similar information about the user, exchange or reuse of information is not possible and information is duplicated. Additionally, since user profiles tend to be deeply buried inside such systems, users have little control over them. This paper proposes the use of a common ontology-based user context model as a basis for the exchange of user profiles between multiple systems and, thus, as a foundation for cross-system personalization.

Publication

Supporting information access in next generation digital library architectures

2005 , Frommholz, I. , Knezevic, P. , Mehta, B. , Niederée, C. , Risse, T. , Thiel, U.

Current developments in Service-oriented Architectures, Peer-to-Peer and Grid computing promise more open and flexible architectures for digital libraries. They will open Digital Library (DL) technology to a wider clientele, allow faster adaptability, and enable the use of federative models for content and service provision. These technologies raise new challenges for the realization of DL functionalities, which are rooted in the increased heterogeneity of content, services and metadata, in the higher degree of distribution and dynamics, as well as in the omission of a central control instance. This paper discusses these opportunities and challenges for three central types of DL functionality revolving around information access: metadata management, retrieval functionality, and personalization services.

Publication

Digital libraries in knowledge management: An e-learning case study

2004 , Fuchs, M. , Muscogiuri, C. , Niederée, C. , Hemmje, M.