2015
Doctoral Thesis
Title
Exploratory search in time-oriented primary data
Abstract
In a variety of research fields, primary data is obtained that describes scientific phenomena in their original condition. Time-oriented primary data, in particular, is an indispensable data type, derived from complex time-dependent measurements. Today, time-oriented primary data is collected at rates that exceed domain experts' ability to find the valuable information that remains undiscovered in the data. It is widely accepted that the large volumes of uninvestigated data will disclose tremendous knowledge in data-driven research, provided that domain experts are able to gain insight into the data. Domain experts involved in data-driven research urgently require analytical capabilities. In scientific practice, the predominant activities are the generation and validation of hypotheses. In analytical terms, these activities are often expressed as confirmatory and exploratory data analysis. Ideally, analytical support would combine the strengths of both types of activities.
Exploratory Search (ES)
ES is a concept that seamlessly includes information-seeking behaviors ranging from search to exploration. ES supports domain experts both in gaining an understanding of huge and potentially unknown data collections and in drilling down to relevant subsets, e.g., to validate hypotheses. As such, ES combines the predominant tasks of domain experts in data-driven research. For the design of useful and usable ES systems (ESS), data scientists have to incorporate different sources of knowledge and technology. Of particular importance is the state of the art in interactive data visualization and data analysis. Research on these factors is at the heart of Information Visualization (IV) and Visual Analytics (VA). Approaches in IV and VA provide meaningful visualization and interaction designs, allowing domain experts to perform the information-seeking process effectively and efficiently.
Today, best-practice ESS exist almost exclusively for textual data content, e.g., in digital libraries, where they facilitate the reuse of digital documents. For time-oriented primary data, ES remains largely theoretical.
Motivation and Problem Statement
This thesis is motivated by two main assumptions. First, we expect that ES will have a tremendous impact on data-driven research in many research fields. In this thesis, we focus on time-oriented primary data as a complex and important data type for data-driven research. Second, we assume that research conducted in IV and VA will particularly facilitate ES. For time-oriented primary data, however, novel concepts and techniques are required that enhance the design and the application of ESS. In particular, we observe a lack of methodological research in ESS for time-oriented primary data. In addition, the size, the complexity, and the quality of time-oriented primary data hamper content-based access, as well as the design of visual interfaces for gaining an overview of the data content. Furthermore, the question arises of how ESS can incorporate techniques for seeking relations between data content and metadata to foster data-driven research. Overarching challenges for data scientists are to create usable and useful designs, which urgently requires involving the targeted user group, and to provide support techniques for choosing meaningful algorithmic models and model parameters. Throughout this thesis, we address these challenges from conceptual, technical, and systemic perspectives. In turn, domain experts can benefit from novel ESS as powerful analytical support for conducting data-driven research.
Contribution
In essence, our contributions cover the entire time series analysis process, from accessing raw time-oriented primary data, through processing and transforming time series data, to visual-interactive analysis of time series.
We present visual search interfaces providing content-based access to time-oriented primary data. In a series of novel exploration-support techniques, we facilitate both gaining an overview of large and complex time-oriented primary data collections and seeking relations between data content and metadata. Throughout this thesis, we introduce VA as a means of designing effective and efficient visual-interactive systems. Our VA techniques empower data scientists to choose appropriate models and model parameters, as well as to involve users in the design. With both principles, we support the design of usable and useful interfaces that can be integrated into ESS. In this way, our contributions bridge the gap between search systems requiring exploration support and exploratory data analysis systems requiring visual querying capability. With the ESS presented in two case studies, we demonstrate that our techniques and systems support data-driven research efficiently and effectively.
Supporting data-driven research
Jürgen Bernard's dissertation investigates how researchers can be given intuitive and effective access to time-oriented primary data, even when their information need is initially undefined. Primary data describes phenomena in their original form and is therefore not subject to alteration or manipulation. Reusing stored primary data could also allow other researchers to take part in data-driven research. Time-oriented data in particular is often irreplaceable. Its size, heterogeneity, and time dependence pose major challenges for data-driven research. Retrieving unexplored knowledge requires suitable tools from confirmatory and, above all, exploratory data analysis.
The concept of Exploratory Search (ES)
In his thesis, Bernard puts the concept of Exploratory Search (ES) into practice for time-oriented primary data for the first time. Fundamentally, ES represents the idea of supporting a user's different information needs within a single system. The supported activities range from retrieving factual knowledge (search) to exploring entirely new search and information spaces (exploration). To this end, Bernard draws on techniques from Information Visualization and Visual Analytics.
Benefits for the user
By defining visual-interactive queries (query-by-sketch, query-by-example), users can search directly in the data content. The visual-interactive overview displays also enable users to explore unknown relationships in the search space and to use them to extend their knowledge. Opening the design process to the user and the strictly visual form of data representation ensure a user-centered design and also help communicate information and knowledge from time-oriented primary data.
ThesisNote
Darmstadt, TU, Diss., 2015
Author(s)
Contributors
Place of publication
Darmstadt