2010
Book Article
Title
Data sets created in ImageCLEF
Abstract
One of the main components of any Text REtrieval Conference (TREC)-style information retrieval benchmark is a collection of documents, such as images, texts, sounds or videos, that is representative of a particular domain. Although many image collections exist both on-line and off-line, finding visual resources suitable for evaluation benchmarks such as ImageCLEF is challenging. For example, these resources are often expensive to purchase and subject to specific copyright licenses, restricting both the distribution of such data and future access to it for evaluation purposes. However, the various ImageCLEF evaluation tasks have managed to create and/or acquire almost a dozen document collections since 2003. This chapter begins by discussing the requirements and specifications for creating a suitable document collection for evaluating multi-modal and cross-lingual image retrieval systems. It then describes each of the eleven document collections created and used for ImageCLEF tasks between 2003 and 2009. The description includes the origins of each document collection, a summary of its content, and details regarding the distribution, benefits and limitations of each resource.