Here you will find scientific publications from the Fraunhofer Institutes.

Face retrieval on large-scale video data

Herrmann, C.; Beyerer, Jürgen

Postprint urn:nbn:de:0011-n-3562450 (3.0 MByte PDF)
MD5 Fingerprint: ff79362469c782485c4883e2b63716c5
© IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Created on: 20.8.2015

Guerrero, J.E.; Institute of Electrical and Electronics Engineers -IEEE-; IEEE Computer Society:
12th Conference on Computer and Robot Vision, CRV 2015. Proceedings : 3-5 June 2015, Halifax, Nova Scotia, Canada
Los Alamitos, Calif.: IEEE Computer Society Conference Publishing Services (CPS), 2015
ISBN: 978-1-4799-1986-4
Conference on Computer and Robot Vision (CRV) <12, 2015, Halifax/Nova Scotia>
Conference Paper, Electronic Publication
Fraunhofer IOSB
face recognition; video retrieval; large-scale; fisher vector; bag of words

Increasingly large amounts of video data raise the question of whether large-scale face retrieval is feasible. To find fast and accurate matching strategies, a suitable face track descriptor is constructed from local features, each extended by an encoding of the respective measurement conditions. This encoding allows all features of one face track to be collected in a single feature set, to which cumulative descriptors known from image or object retrieval, in particular bag of words and Fisher vectors, can be applied. These descriptors are known to be viable for large-scale retrieval applications. To explore large-scale video face retrieval, we first evaluate on the largest available public datasets, namely the YouTube Faces Database (YTF) and the Face in Action Database (FiA). Finally, the behaviour of face retrieval for increasing amounts of data is investigated by combining these datasets with 55K face tracks collected from about 100 hours of TV data, making it the largest collection of face tracks we are aware of.
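
The pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (pool_track_features, bag_of_words, rank_gallery), the per-frame condition vector, the pre-trained codebook, and the dot-product ranking are all assumptions made for illustration, and only the simpler bag-of-words variant is shown, not Fisher vectors. Each local feature is extended by a code for its frame's measurement conditions, all features of a track are stacked into one set, and the set is summarised as a normalised codeword histogram.

```python
import numpy as np


def pool_track_features(frame_features, frame_conditions):
    """Stack the local features of all frames of one face track.

    frame_features: list of (n_i, d) arrays, one per frame.
    frame_conditions: list of (c,) arrays, one per frame (hypothetical
    encoding of the measurement conditions, assumed for this sketch).
    Returns an (sum n_i, d + c) array.
    """
    rows = []
    for feats, cond in zip(frame_features, frame_conditions):
        # Repeat the frame's condition code once per local feature.
        cond_block = np.tile(cond, (feats.shape[0], 1))
        rows.append(np.hstack([feats, cond_block]))
    return np.vstack(rows)


def bag_of_words(features, codebook):
    """Encode a feature set as an L2-normalised codeword histogram.

    features: (n, m) array, codebook: (k, m) array of codewords
    (assumed to be learned beforehand, e.g. by k-means).
    """
    # Squared Euclidean distance from every feature to every codeword.
    diff = features[:, None, :] - codebook[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    nearest = dist2.argmin(axis=1)
    hist = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def rank_gallery(query, gallery):
    """Return gallery indices sorted by decreasing dot-product similarity."""
    scores = gallery @ query
    return np.argsort(-scores)
```

Because every track is reduced to one fixed-length vector regardless of its number of frames, comparing a query against a gallery becomes a single matrix-vector product, which is one reason such cumulative descriptors are attractive for large collections.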