Fraunhofer-Institut für Integrierte Publikations- und Informationssysteme IPSI
Publication: Efficient entity resolution for large heterogeneous information spaces (2011)
Papadakis, G.; Ioannou, E.; Niederée, C.; Fankhauser, P.
We have recently witnessed an enormous growth in the volume of structured and semi-structured data sets available on the Web. An important prerequisite for using and combining such data sets is the detection and merging of information that describes the same real-world entities, a task known as Entity Resolution. To make this quadratic task efficient, blocking techniques are typically employed. However, the high dynamics, loose schema binding, and heterogeneity of (semi-)structured data impose new challenges on entity resolution. Existing blocking approaches become inapplicable because they rely on the homogeneity of the considered data and on a priori known schemata. In this paper, we introduce a novel approach for entity resolution, scaling it up for large, noisy, and heterogeneous information spaces. It combines an attribute-agnostic mechanism for building blocks with intelligent block processing techniques that boost blocks with high expected utility, propagate knowledge about identified matches, and preempt the resolution process when it gets too expensive. Our extensive evaluation on real-world, large, heterogeneous data sets verifies that the suggested approach is both effective and efficient. Copyright 2011 ACM.
Publication: Language models & topic models for personalizing tag recommendation (2010)
Krestel, R.; Fankhauser, P.
More and more content on the Web is generated by users. To organize this information and make it accessible via current search technology, tagging systems have gained tremendous popularity. Especially for multimedia content, they allow users to annotate resources with keywords (tags), which opens the door for classic text-based information retrieval. To support the user in choosing the right keywords, tag recommendation algorithms have emerged. In this setting, not only the content is decisive for recommending relevant tags but also the user's preferences. In this paper we introduce an approach to personalized tag recommendation that combines a probabilistic model of tags from the resource with tags from the user. As models we investigate simple language models as well as Latent Dirichlet Allocation. Extensive experiments on a real-world dataset crawled from a big tagging system show that personalization improves tag recommendation, and our approach significantly outperforms state-of-the-art approaches.
Publication: DivQ: Diversification for keyword search over structured databases (2010)
Demidova, E.; Fankhauser, P.; Zhou, X.; Nejdl, W.
Keyword queries over structured databases are notoriously ambiguous. No single interpretation of a keyword query can satisfy all users, and multiple interpretations may yield overlapping results. This paper proposes a scheme to balance the relevance and novelty of keyword search results over structured databases. Firstly, we present a probabilistic model which effectively ranks the possible interpretations of a keyword query over structured data. Then, we introduce a scheme to diversify the search results by re-ranking query interpretations, taking into account redundancy of query results. Finally, we propose α-nDCG-W and WS-recall, adaptations of the α-nDCG and S-recall metrics that take into account graded relevance of subtopics. Our evaluation on two real-world datasets demonstrates that search results obtained using the proposed diversification algorithms better characterize possible answers available in the database than the results of the initial relevance ranking.
Publication: The missing links: Discovering hidden same-as links among a billion of triples (2010)
Papadakis, G.; Demartini, G.; Fankhauser, P.; Kärger, P.
The Semantic Web is constantly gaining momentum, as more and more Web sites and content providers adopt its principles. At the core of these principles lies the Linked Data movement, which demands that data on the Web be annotated and linked among different sources, instead of being isolated in data silos. In order to materialize this vision of a web of semantics, existing resource identifiers should be reused and shared between different Web sites. This is not always the case with the current state of the Semantic Web, since multiple identifiers are, more often than not, redundantly introduced for the same resources. In this paper we introduce a novel approach to automatically detect redundant identifiers solely by matching the URIs of information resources. The approach, based on a common pattern among Semantic Web URIs, provides a simple and practical method for duplicate detection. We apply this method to a large snapshot of the current Semantic Web comprising 1.15 billion statements and estimate the number of hidden duplicates in it. The outcomes of our experiments confirm the effectiveness as well as the efficiency of our method, and suggest that URI matching can be used as a scalable filter for discovering implicit same-as links.
Publication: It's all in the (ambient) environment: Designing experiences in ubiquitous hybrid worlds (2008)
Streitz, N.A.
The objective of this invited talk is to present selected visions of ubiquitous computing and ambient communication based on the notion of the disappearing computer and to reflect on the resulting challenges for designing experiences in future smart environments. Our approach places the human at the centre of our design considerations and is based on exploiting the affordances of real objects by augmenting their physical properties with the potential of computer-based support. Combining the best of both worlds requires an integration of real and virtual worlds resulting in hybrid worlds. In this approach, the computer "disappears" and is almost "invisible" but its functionality is ubiquitously available and provides new forms of interacting with information. The approach can be summarized by the statement that "the world around us is the interface to information and for conveying experiences".
Publication: Context-oriented communication and the design of computer-supported discursive learning (2008)
Herrmann, T.; Kienle, A.
Computer-supported discursive learning (CSDL) systems for the support of asynchronous discursive learning need to fulfil specific socio-technical conditions. To understand these conditions, we employed design experiments combining aspects of communication theory, empirical findings, and continuous improvement of the investigated prototypes. Our theoretical perspective starts with a context-oriented model of communication which is, as a result of the experiments, extended by including the role of a third party such as a facilitator. The theory-driven initial design requirements lead to the CSCL prototype KOLUMBUS, which emphasizes the role of annotations. In KOLUMBUS, annotations can be immediately embedded in their context of learning material. Practical experience with the prototype in five cases reveals possibilities for implementing improvements and observing their impact. On this basis, we provide guidelines for the design of CSDL systems that focus on the support of asynchronous discursive learning.
Publication: From cognitive compatibility to the disappearing computer: Experience design for smart environments (2008)
Streitz, N.
The objective of this keynote talk is to present selected visions of ambient and ubiquitous computing based on the notion of the disappearing computer and to reflect on the resulting challenges for designing experiences in future smart environments. It is a human-centred approach exploiting the affordances of real objects by augmenting their physical properties with the potential of computer-based support. In this approach, the computer "disappears" and is almost "invisible" but its functionality is ubiquitously available and provides new forms of interaction, communication and collaboration. In summary: the world around us is the interface to information and for conveying experiences.
Publication: Framework for combined video frame synchronization and watermark detection (2007)
Hauer, E.; Bölke, T.; Steinebach, M.
Most MPEG watermarking schemes can only embed the watermark into I-frames; the other frames are not marked. Attacks such as frame rate changes can alter the frame type of the marked I-frames, so the detector may try to read the watermark from the wrong I-frames. Because of these attacks, an important issue for digital watermarking solutions for MPEG video is the temporal synchronization of the video material to the proportions before the attacks, so that the watermark can be detected successfully. The synchronization information can be embedded as part of the information watermark or as a second watermark. The weak point is that once the synchronization information is destroyed, the watermark can no longer be detected. We provide a solution which analyzes the I-frames based on a robust image hash system. The hash solution was developed for JPEG images and can also be used for MPEG I-frames because of their similar structure. The hash values are robust against common manipulations, such as compression, and can be used to detect the marked frames even after manipulations of the video material. We analyze the usability of the image hash system and develop a concept based on video and MPEG properties.
Publication: Smart artefacts as affordances for awareness in distributed teams (2007)
Streitz, N.; Prante, T.; Röcker, C.; Alphen, D. van; Stenzel, R.; Magerkurth, C.; Lahlou, S.; Nosulenko, V.; Jegou, F.; Sonder, F.; Plewe, D.
Publication: The integration of synchronous communication across dual interaction spaces (2007)
Mühlpfordt, M.; Stahl, G.
Dual interaction spaces, which combine text chat with a shared graphical work area, have been developed in recent years as CSCL applications to support the synchronous construction and discussion of shared artifacts by distributed small groups of students. However, the simple juxtaposition of the two spaces raises numerous issues for users: How can objects in the shared workspace be referenced from within the chat? How can users track and comprehend all the various simultaneous activities? How can participants coordinate their multifaceted actions? We present three steps toward integration of activities across separate interaction spaces: support for deictic references, implementation of a history feature and display of social awareness information.