  • Publication
    Unsupervised duplicate detection using sample non-duplicates
    (2006)
    Lehti, P.; Fankhauser, P.
    The problem of identifying objects in databases that refer to the same real-world entity is known, among other names, as duplicate detection or record linkage. Objects may be duplicates even though they are not identical, due to errors and missing data. Typical current methods require a deep understanding of the application domain or a good representative training set, which entails significant costs. In this paper we present an unsupervised, domain-independent approach to duplicate detection that starts with a broad alignment of potential duplicates and analyses the distribution of observed similarity values among these potential duplicates and among representative sample non-duplicates to improve the initial alignment. Additionally, the presented approach is not only able to align flat records but also makes use of related objects, which may significantly increase the alignment accuracy. Evaluations show that our approach outperforms other unsupervised approaches and reaches almost the same accuracy as even fully supervised, domain-dependent approaches.
  • Publication
    An overview on automatic capacity planning
    (2005)
    Risse, T.
    The performance requirements for the transformation of messages within electronic business processes are our motivation to investigate automatic capacity planning methods. Performance typically means the throughput and response time of a system. Finding a configuration of a distributed system that satisfies performance goals is a complex search problem involving many design parameters, such as hardware selection, job distribution and process configuration. Performance models are a powerful tool for analysing potential system configurations; however, their evaluation is expensive, so only a limited number of possible configurations can be evaluated. In this paper we give an overview of our automatic system design method and discuss the problems that arise in achieving the required performance during the runtime of the systems. Furthermore, we discuss the impact of our strategy on current trends in distributed systems.
  • Publication
    Queries in context: Access to digitized historic documents in a collaboratory for the humanities
    (2005)
    Thiel, U.; Brocks, H.; Dirsch-Weigand, A.; Everts, A.; Frommholz, I.; Stein, A.
    In contrast to standard digital libraries, systems addressing the specific requirements of cultural heritage need to deal with digitized material like scanned documents instead of born-digital items. Such systems aim at providing the means for domain experts, e.g. historians, to work collaboratively with the given material. To support their work, automatic indexing mechanisms for both textual and pictorial digitized documents need to be combined with retrieval methods that exploit the content as well as the context of information items for precise searches. In the COLLATE project we devised several access methods using textual contents, feature extraction from images, metadata, and annotations provided by the users.
  • Publication
    Understanding and tailoring your scientific information environment: A context-oriented view on e-science support
    (2005)
    Niederée, C.; Stewart, A.; Muscogiuri, C.; Hemmje, M.; Risse, T.
  • Publication
    Cooperation in ubiquitous computing
    (2005)
    Tandler, P.; Dietz, L.
    Many ubiquitous computing scenarios deal with cooperative work situations. To successfully support these situations, computer-supported cooperative work (CSCW) concepts and technologies face new challenges. One of the most fundamental concepts for cooperation is sharing. By analyzing applications of sharing in the context of ubiquitous computing it can be shown that ubiquitous computing enables an extended view on sharing. In this paper, we show that this extended view seamlessly integrates the view of "traditional" CSCW and additionally incorporates ubiquitous, heterogeneous, and mobile devices used in a common context.
  • Publication
    Modelling interactive, three-dimensional information visualizations
    (2005)
    Jäschke, G.; Gupta, P.; Hemmje, M.
    Research on information visualization has so far established an outline of the information visualization process and shed light on a broad range of detail aspects involved. However, there is no model in place that describes the nature of information visualization in a coherent, detailed, and well-defined way. We believe that the lack of such a lingua franca hinders communication on and application of information visualization techniques. Our approach is to design a declarative language for describing and defining information visualization techniques. The information visualization modelling language (IVML) provides a means to formally express, note, preserve, and communicate structure, appearance, behaviour, and functionality of information visualization techniques and applications in a standardized way. The anticipated benefits comprise both application and theory.
  • Publication
    Secure production of digital media
    (2005)
    Steinebach, M.C.; Dittmann, J.
    Today more and more media data is produced completely in the digital domain without the need for analogue input. This brings an increase in flexibility and efficiency in media handling, as distributed access, duplication and modification are possible without the need to move or touch physical data carriers. But this also reduces the security of the process: without physical originals to refer to, changes in the material can remain unnoticed, ultimately making the manipulated data the new original. Theft and illegal copying in the digital domain can happen without notice and without loss of quality. We therefore see the need to set up secure media production environments, where access control, integrity and copyright protection as well as traceability of individual copies are enabled. Addressing this need, we design a framework for media production environments, where mechanisms like encryption, digital signatures and digital watermarking help to enable a flexible yet secure handling and processing of the content.
  • Publication
    Enterprise information integration
    (2005)
    Kamps, T.; Stenzel, R.; Chen, L.; Rostek, L.
  • Publication
    From human-computer interaction to human-artefact interaction: Interaction design for smart environments
    (2005)
    Streitz, N.A.
    The introduction of computer technology caused a shift away from real objects as sources of information towards desktop computers as the interfaces to information now (re)presented in a digital format. In this paper, I will argue for returning to the real world as the starting point for designing information and communication environments. Our approach is to design environments that exploit the affordances of real-world objects and at the same time use the potential of computer-based support. Thus, we move from human-computer interaction to human-artefact interaction. Combining the best of both worlds requires an integration of real and virtual worlds, resulting in hybrid worlds. The approach will be demonstrated by sample prototypes we have built, e.g., the Roomware® components and smart artefacts that were developed in the project "Ambient Agoras: Dynamic Information Clouds in a Hybrid World", which was part of the EU-funded proactive initiative "The Disappearing Computer" (DC).