Fraunhofer Institut für Integrierte Publikations- und Informationssysteme IPSI
-
Publication: Unsupervised duplicate detection using sample non-duplicates (2006)
Lehti, P.; Fankhauser, P.
The problem of identifying objects in databases that refer to the same real-world entity is known, among other names, as duplicate detection or record linkage. Objects may be duplicates even though they are not identical, owing to errors and missing data. Typical current methods require a deep understanding of the application domain or a good representative training set, which entails significant costs. In this paper we present an unsupervised, domain-independent approach to duplicate detection that starts with a broad alignment of potential duplicates and analyses the distribution of observed similarity values among these potential duplicates and among representative sample non-duplicates to improve the initial alignment. Additionally, the presented approach is not only able to align flat records but also makes use of related objects, which may significantly increase the alignment accuracy. Evaluations show that our approach outperforms other unsupervised approaches and reaches almost the same accuracy as even fully supervised, domain-dependent approaches.
-
Publication: Queries in context: Access to digitized historic documents in a collaboratory for the humanities (2005)
Thiel, U.; Brocks, H.; Dirsch-Weigand, A.; Everts, A.; Frommholz, I.; Stein, A.
In contrast to standard digital libraries, systems addressing the specific requirements of cultural heritage need to deal with digitized material such as scanned documents instead of born-digital items. Such systems aim to provide the means for domain experts, e.g. historians, to work collaboratively with the given material. To support their work, automatic indexing mechanisms for both textual and pictorial digitized documents need to be combined with retrieval methods that exploit the content as well as the context of information items for precise searches. In the COLLATE project we devised several access methods using textual contents, feature extraction from images, metadata, and annotations provided by the users.
-
Publication: Understanding and tailoring your scientific information environment: A context-oriented view on e-science support (2005)
Niederée, C.; Stewart, A.; Muscogiuri, C.; Hemmje, M.; Risse, T.
-
Publication: Cooperation in ubiquitous computing (2005)
Tandler, P.; Dietz, L.
Many ubiquitous computing scenarios deal with cooperative work situations. To successfully support these situations, computer-supported cooperative work (CSCW) concepts and technologies face new challenges. One of the most fundamental concepts for cooperation is sharing. By analyzing applications of sharing in the context of ubiquitous computing it can be shown that ubiquitous computing enables an extended view on sharing. In this paper, we show that this extended view seamlessly integrates the view of "traditional" CSCW and additionally incorporates ubiquitous, heterogeneous, and mobile devices used in a common context.
-
Publication: Ontologically-enriched unified user modeling for cross-system personalization (2005)
Mehta, B.; Niederée, C.; Stewart, A.; Degemmis, M.; Lops, P.; Semeraro, G.
Personalization today is in widespread use on many Web sites. Systems and applications store preferences and information about users in order to provide personalized access. However, these systems store user profiles in proprietary formats. Although some of these systems store similar information about the user, exchange or reuse of information is not possible and information is duplicated. Additionally, since user profiles tend to be deeply buried inside such systems, users have little control over them. This paper proposes the use of a common ontology-based user context model as a basis for the exchange of user profiles between multiple systems and, thus, as a foundation for cross-system personalization.
-
Publication: An overview on automatic capacity planning (2005)
Risse, T.
The performance requirement for the transformation of messages within electronic business processes is our motivation to investigate automatic capacity planning methods. Performance typically means the throughput and response time of a system. Finding a configuration of a distributed system that satisfies performance goals is a complex search problem involving many design parameters, such as hardware selection, job distribution, and process configuration. Performance models are a powerful tool for analysing potential system configurations; however, their evaluation is expensive, so only a limited number of possible configurations can be evaluated. In this paper we give an overview of our automatic system design method and discuss the problems that arise in meeting the performance goals during the runtime of the systems. Furthermore, we discuss the impact of our strategy on current trends in distributed systems.
-
Publication: Pervasive games: Bringing computer entertainment back to the real world (2005)
Magerkurth, C.; Cheok, A.D.; Mandryk, R.L.; Nilsen, T.
-
Publication: Flexible notifications and task models for cooperative work management (2005)
Rubart, J.; Richter, H.
Knowledge-intensive cooperative work requires emergent workflow management. Participants interact with the workflow engine and jointly redefine and activate workflow structure. To improve the usability of such systems we present reconfigurable notification mechanisms as well as shared task models that can be used from diverse clients at the same time, each focusing on different kinds of visualization and navigation.
-
Publication: Modelling interactive, three-dimensional information visualizations (2005)
Jäschke, G.; Gupta, P.; Hemmje, M.
Research on information visualization has so far established an outline of the information visualization process and shed light on a broad range of detail aspects involved. However, there is no model in place that describes the nature of information visualization in a coherent, detailed, and well-defined way. We believe that the lack of such a lingua franca hinders communication on and application of information visualization techniques. Our approach is to design a declarative language for describing and defining information visualization techniques. The information visualization modelling language (IVML) provides a means to formally express, note, preserve, and communicate structure, appearance, behaviour, and functionality of information visualization techniques and applications in a standardized way. The anticipated benefits comprise both application and theory.