
Semantic high-level features for automated cross-modal slideshow generation

Dunker, P.; Dittmar, C.; Begau, A.; Nowak, Stefanie; Gruhne, Matthias


Institute of Electrical and Electronics Engineers -IEEE-:
International Workshop on Content-Based Multimedia Indexing, CBMI 2009. Proceedings : Chania, Greece, 3 - 5 June 2009
Piscataway, NJ: IEEE, 2009
ISBN: 978-1-424-44265-2
ISBN: 978-0-7695-3662-0
International Workshop on Content-Based Multimedia Indexing (CBMI) <7, 2009, Chania>
Fraunhofer IDMT
audio signal; signal processing; image classification; music; feature extraction method; cross-modal analysis; high-level semantics

This paper describes a technical solution for automated slideshow generation. It extracts a set of high-level features from music, such as beat grid, mood, and genre, and combines this set with high-level image features, such as mood, daytime, and scene classification. An advantage of this high-level concept is that users can incorporate their preferences regarding the semantic aspects of music and images. For example, a user might ask the system to automatically create a slideshow that plays soft music and shows pictures of sunsets from the last 10 years of their own photo collection. The high-level feature extraction for both the audio and the visual information is based on the same underlying machine learning core, which processes different low- and mid-level audio and visual features. The paper describes the technical realization and the evaluation of the algorithms on suitable test databases.
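The cross-modal combination outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the `Image` class, the `build_slideshow` function, the mood and scene labels, and the rule of changing slides every fourth beat are all hypothetical, chosen only to show how music-side features (beat grid, mood) might be matched with image-side features (mood, scene).

```python
from dataclasses import dataclass


@dataclass
class Image:
    path: str
    mood: str   # hypothetical label, e.g. "calm"
    scene: str  # hypothetical label, e.g. "sunset"


def build_slideshow(images, beat_times, music_mood, scene=None, beats_per_slide=4):
    """Pair mood-matching images with transition times taken from the beat grid."""
    # Keep images whose mood label matches the music, optionally filtered by scene.
    selected = [
        img for img in images
        if img.mood == music_mood and (scene is None or img.scene == scene)
    ]
    # Use every `beats_per_slide`-th beat as a transition time (an assumed rule).
    cut_times = beat_times[::beats_per_slide]
    # zip stops at the shorter sequence, so surplus images or beats are dropped.
    return [(t, img.path) for t, img in zip(cut_times, selected)]


images = [
    Image("a.jpg", "calm", "sunset"),
    Image("b.jpg", "aggressive", "city"),
    Image("c.jpg", "calm", "sunset"),
    Image("d.jpg", "calm", "beach"),
]
# Assumed beat grid of a 120 BPM track: one beat every 0.5 s.
beat_times = [i * 0.5 for i in range(16)]
slideshow = build_slideshow(images, beat_times, "calm", scene="sunset")
print(slideshow)
```

In a real system the labels and beat times would come from the classifiers and beat tracker the paper describes; here they are hard-coded so the matching step can be read in isolation.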