Fraunhofer-Gesellschaft
2009
Conference Paper
Title

Semantic high-level features for automated cross-modal slideshow generation

Abstract
This paper describes a technical solution for automated slideshow generation that extracts a set of high-level features from music, such as beat grid, mood, and genre, and intelligently combines it with high-level image features, such as mood, daytime, and scene classification. An advantage of this high-level concept is that it enables users to incorporate their preferences regarding the semantic aspects of music and images. For example, a user might request the system to automatically create a slideshow that plays soft music and shows pictures with sunsets from the last 10 years of his own photo collection. The high-level feature extraction on both the audio and the visual information is based on the same underlying machine-learning core, which processes different audio and visual low- and mid-level features. This paper describes the technical realization and evaluation of the algorithms with suitable test databases.
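The selection logic sketched in the abstract can be illustrated with a minimal example. The paper's actual feature extractors and data model are not public, so the structures and function below (`MusicTrack`, `Photo`, `build_slideshow`) are hypothetical stand-ins that only mirror the described idea: match a music track by mood, filter photos by semantic tags and date, and pace the slides on the beat grid.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

@dataclass
class MusicTrack:
    title: str
    mood: str               # e.g. "soft", "energetic"
    genre: str
    beat_grid: List[float]  # beat positions in seconds

@dataclass
class Photo:
    path: str
    mood: str
    daytime: str            # e.g. "sunset", "day", "night"
    scene: str              # e.g. "landscape", "portrait"
    taken: date

def build_slideshow(tracks: List[MusicTrack], photos: List[Photo],
                    want_mood: str, want_daytime: str,
                    since: date) -> Tuple[str, List[Tuple[float, str]]]:
    """Pick a track matching the requested mood and pair it with photos
    whose semantic tags and capture date satisfy the user's preferences."""
    track = next(t for t in tracks if t.mood == want_mood)
    selected = [p for p in photos
                if p.daytime == want_daytime and p.taken >= since]
    # Schedule one photo per beat; cycle through the selection if the
    # beat grid is longer than the photo list.
    slides = [(beat, selected[i % len(selected)].path)
              for i, beat in enumerate(track.beat_grid)]
    return track.title, slides

# The abstract's example: soft music with sunset pictures from the
# last 10 years of the user's own collection.
tracks = [MusicTrack("Calm Evening", "soft", "ambient", [0.0, 2.0, 4.0]),
          MusicTrack("Drive", "energetic", "rock", [0.0, 0.5, 1.0])]
photos = [Photo("img1.jpg", "calm", "sunset", "landscape", date(2007, 6, 1)),
          Photo("img2.jpg", "happy", "day", "portrait", date(2008, 3, 2))]
title, slides = build_slideshow(tracks, photos, "soft", "sunset",
                                date(2000, 1, 1))
print(title, slides)
```

In the paper itself, the mood, genre, daytime, and scene tags are not hand-assigned as above but produced by the shared machine-learning core from low- and mid-level audio and visual features.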
Author(s)
Dittmar, C.  
Dunker, P.
Begau, A.
Nowak, Stefanie
Gruhne, Matthias
Mainwork
International Workshop on Content-Based Multimedia Indexing, CBMI 2009. Proceedings  
Conference
International Workshop on Content-Based Multimedia Indexing (CBMI) 2009  
DOI
10.1109/CBMI.2009.32
Language
English
Institute(s)
Fraunhofer-Institut für Digitale Medientechnologie IDMT
Keyword(s)
  • audio signal
  • signal processing
  • image classification
  • music
  • feature extraction method
  • cross-modal analysis
  • high-level semantics