Coverage and diversity aware Top-k query for spatio-temporal posts
Large amounts of user-generated content are posted daily on the Web, including textual, spatial and temporal information. Exploiting this content to detect, analyze and monitor events and topics that have a potentially large span in space and time requires efficient retrieval and ranking based on criteria that cover all three dimensions. In this paper, we introduce a novel type of spatial-temporal-keyword query that combines keyword search with the task of maximizing the spatio-temporal coverage and diversity of the returned top-k results. We first describe a baseline algorithm based on related search result diversification problems. Then, we develop an efficient approach which exploits a hybrid spatial-temporal-keyword index to drastically reduce query execution time. To that end, we extend two state-of-the-art indices for top-k spatio-textual queries and describe how our proposed approach can be applied on top of them. We evaluate the efficiency of our algorithms by conducting experiments on two large, real-world datasets containing geotagged tweets and photos.
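To make the query type concrete, the following is a minimal sketch of a diversity-oriented top-k selection over keyword-matching posts. It is not the baseline or the index-based approach of this paper: it uses a generic greedy farthest-first heuristic from the result diversification literature and models only the diversity part of the objective. The names `Post`, `st_distance` and `diverse_top_k`, the weight `w`, and the assumption that coordinates and timestamps are pre-normalized to [0, 1] are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Post:
    """Hypothetical post record; x, y, t are assumed normalized to [0, 1]."""
    post_id: int
    x: float
    y: float
    t: float
    keywords: frozenset


def st_distance(a, b, w=0.5):
    """Weighted blend of spatial (Euclidean) and temporal distance.

    The weight w is an illustrative assumption, not a parameter from the paper.
    """
    spatial = sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2)
    temporal = abs(a.t - b.t)
    return w * spatial + (1 - w) * temporal


def diverse_top_k(posts, query_keywords, k, w=0.5):
    """Greedy farthest-first selection among posts matching all query keywords."""
    query = frozenset(query_keywords)
    # Keyword filter: keep posts that contain every query keyword.
    candidates = [p for p in posts if query <= p.keywords]
    if not candidates or k <= 0:
        return []
    # Deterministic seed: the earliest matching post (ties broken by id).
    seed = min(candidates, key=lambda p: (p.t, p.post_id))
    selected = [seed]
    remaining = [p for p in candidates if p is not seed]
    while remaining and len(selected) < k:
        # Pick the post whose nearest already-selected post is farthest away.
        best = max(
            remaining,
            key=lambda p: (min(st_distance(p, s, w) for s in selected), -p.post_id),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each greedy step scans all remaining candidates against all selected posts, so the sketch costs O(nk²) distance evaluations for n matching posts. That is the kind of exhaustive scan a hybrid spatial-temporal-keyword index is meant to avoid.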