  • Publication
    Efficient entity resolution for large heterogeneous information spaces
    (2011)
    Papadakis, G.; Ioannou, E.; Niederée, C.; Fankhauser, P.
    We have recently witnessed an enormous growth in the volume of structured and semi-structured data sets available on the Web. An important prerequisite for using and combining such data sets is the detection and merging of information that describes the same real-world entities, a task known as Entity Resolution. To make this quadratic task efficient, blocking techniques are typically employed. However, the high dynamics, loose schema binding, and heterogeneity of (semi-)structured data impose new challenges on entity resolution. Existing blocking approaches become inapplicable because they rely on the homogeneity of the considered data and a priori known schemata. In this paper, we introduce a novel approach for entity resolution, scaling it up for large, noisy, and heterogeneous information spaces. It combines an attribute-agnostic mechanism for building blocks with intelligent block processing techniques that boost blocks with high expected utility, propagate knowledge about identified matches, and preempt the resolution process when it gets too expensive. Our extensive evaluation on real-world, large, heterogeneous data sets verifies that the suggested approach is both effective and efficient. Copyright 2011 ACM.
  • Publication
    Enabling a knowledge supply chain: From content resources to ontologies
    (2006)
    Stecher, R.; Niederée, C.; Bouquet, P.; Jacquin, T.; Aït-Mokhtar, S.; Montemagni, S.; Brunelli, R.; Demetriou, G.
    Semantic annotation of content is a crucial building block for making the Semantic Web fly. The (semi-)automatic support of the underlying semantic knowledge supply chain requires contributions from different research disciplines and well-defined pipelines, which create such annotations step by step from raw content objects. This paper presents an annotation pipeline that has been designed and implemented as part of the VIKEF project. A clear structuring of the pipeline and the selection of adequate representation formats, both for the intermediate results (products) and for configuration information, have been identified as crucial ingredients for an annotation pipeline that enables the application-specific customization of the pipeline components and the flexible integration of upcoming advanced methods, such as new extraction methods, into the pipeline.
  • Publication
    Exploiting lexical knowledge in learning user profiles for intelligent information access to digital collections
    (2005)
    Semeraro, G.; Lops, P.; Degemmis, M.; Niederée, C.; Stewart, A.
    Algorithms designed to support users in retrieving relevant information base their relevance computations on user profiles, in which representations of the users' interests are maintained. This paper focuses on the use of supervised machine learning techniques to induce user profiles for Intelligent Information Access. The access must be personalized by profiles allowing users to retrieve information on the basis of conceptual content. To address this issue, we propose a method to learn sense-based user profiles based on WordNet, a lexical database.
  • Publication
    Modelle für semantische Web-Anwendungen
    (2005)
    Neuhold, E.J.; Fuchs, M.; Niederée, C.
  • Publication
    Ontologically-enriched unified user modeling for cross-system personalization
    (2005)
    Mehta, B.; Niederée, C.; Stewart, A.; Degemmis, M.; Lops, P.; Semeraro, G.
    Personalization is in widespread use on many Web sites today. Systems and applications store preferences and information about users in order to provide personalized access. However, these systems store user profiles in proprietary formats. Although some of these systems store similar information about the user, exchange or reuse of information is not possible and information is duplicated. Additionally, since user profiles tend to be deeply buried inside such systems, users have little control over them. This paper proposes the use of a common ontology-based user context model as a basis for the exchange of user profiles between multiple systems and, thus, as a foundation for cross-system personalization.
  • Publication
    Understanding and tailoring your scientific information environment: A context-oriented view on e-science support
    (2005)
    Niederée, C.; Stewart, A.; Muscogiuri, C.; Hemmje, M.; Risse, T.
  • Publication
    Supporting information access in next generation digital library architectures
    (2005)
    Frommholz, I.; Knezevic, P.; Mehta, B.; Niederée, C.; Risse, T.; Thiel, U.
    Current developments on Service-oriented Architectures, Peer-to-Peer and Grid computing promise more open and flexible architectures for digital libraries. They will open the Digital Library (DL) technology to a wider clientele, allow faster adaptability and enable the usage of federative models on content and service provision. These technologies raise new challenges for the realization of DL functionalities, which are rooted in the increased heterogeneity of content, services and metadata, in the higher degree of distribution and dynamics, as well as in the omission of a central control instance. This paper discusses these opportunities and challenges for three central types of DL functionality revolving around information access: metadata management, retrieval functionality, and personalization services.
  • Publication
    Supporting pattern-based application authoring for the semantic web
    (2004)
    Fuchs, M.; Niederée, C.; Hemmje, M.
  • Publication
    The role of context for information mediation in digital libraries
    (2004)
    Neuhold, E.J.; Niederée, C.; Stewart, A.; Frommholz, I.; Mehta, B.
    Mediating between available information objects and individual information needs is a central issue within the functionality of a digital library. In the simplest case this is an information request answered by a search engine based on an analysis of information objects within the digital library's information collection. However, neither the information access activity nor the information objects within the collection are isolated entities. They are both equipped with a multifaceted context. The invited talk, which is summarized by this paper, analyzes this context and discusses complementing approaches to make such context explicit and to use it for refining the mediation process within digital libraries.
  • Publication
    Extending your neighborhood-relationship-based recommendations using your personal web context
    (2004)
    Stewart, A.; Niederée, C.; Mehta, B.; Hemmje, M.; Neuhold, E.
    The people, documents, and other domain entities that persons know, or are in other ways associated with, influence their decision making and the types of recommendations that serve them best. For example, recommending persons to meet at a conference or a paper to read from a digital library collection depends not only on the task, interests, and skills of a user, but also on the persons and works they are already familiar with. In order for personalization services to reflect this dependency, extended user models are required that consider users' networks of related domain entities in addition to other user characteristics. Based on a unified context model, we present the Personal Web Context approach that models the typed relationships a user is involved in. Based on a Resource Network which can, for example, be built from the information collection and the associated metadata managed by a digital library, domain-specific rules are used to suggest valuable extensions of this "neighborhood" of a user. Such work can form the basis for new types of digital library services.