Multi-modal and multi-camera attention in smart environments

Schauerte, B.; Richarz, J.; Plötz, T.; Thurau, Christian; Fink, G.A.

doi:10.1145/1647314.1647370

2009

Conference Paper

Abstract

This paper considers the problem of multi-modal saliency and attention. Saliency is a cue that is often used for directing the attention of a computer vision system, e.g., in smart environments or for robots. Unlike the majority of recent publications on visual/audio saliency, we aim at a well-grounded integration of several modalities. The proposed framework is based on fuzzy aggregations and offers a flexible, plausible, and efficient way for combining multi-modal saliency information. Besides incorporating different modalities, we extend classical 2D saliency maps to multi-camera and multi-modal 3D saliency spaces. For experimental validation we realized the proposed system within a smart environment. The evaluation took place for a demanding setup under real-life conditions, including focus of attention selection for multiple subjects and concurrently active modalities.

Author(s)

Schauerte, B.

Richarz, J.

Plötz, T.

Thurau, Christian

Fink, G.A.

Mainwork

ICMI-MLMI 2009, International Conference on Multimodal Interfaces & the Workshop on Machine Learning for Multimodal Interfaces. Proceedings

Conference

Workshop on Machine Learning for Multimodal Interfaces (MLMI) 2009
