2012
Conference Paper
Title
Perceptual hashing for the identification of telephone speech
Abstract
The hashing of audio content for the identification of specific recordings and their degradations has many applications. In particular, music identification is well established. In this paper, the perceptual hashing of speech is investigated and applied to the content-based identification of telephone spam. Based on well-known audio fingerprinting methods, various modifications and extensions have been developed and compared. We explore index-based search methods for matching sequences of feature vectors. We investigate the influence of the hash size on the recognition rate and, in particular, on the search efficiency in a large and constantly updated fingerprint database, as in a telephone speech scenario. It is shown that two 32-bit hashes with a unique time-distance allow for an efficient identification of telephone speech within a large call database.
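The index-based lookup described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that each audio frame has already been reduced to a 32-bit hash, that "two 32-bit hashes with a unique time-distance" means pairing the hash at frame t with the hash at frame t + delta for one fixed delta, and that lookup is by exact key match with no tolerance for bit errors. The names `PairIndex`, `pair_keys`, and `delta` are hypothetical.

```python
from collections import Counter, defaultdict


def pair_keys(hashes, delta):
    """Yield (key, t), where key joins the 32-bit hashes at frames t and t + delta.

    Assumption: `hashes` is a list of per-frame integer hashes; how they are
    computed from the speech signal is outside this sketch.
    """
    for t in range(len(hashes) - delta):
        h1 = hashes[t] & 0xFFFFFFFF
        h2 = hashes[t + delta] & 0xFFFFFFFF
        yield (h1 << 32) | h2, t


class PairIndex:
    """Inverted index from a 64-bit hash-pair key to (call_id, frame) postings."""

    def __init__(self, delta=8):
        self.delta = delta  # fixed time-distance in frames (illustrative value)
        self.table = defaultdict(list)

    def add(self, call_id, hashes):
        # New calls can be appended at any time, so the index stays updatable.
        for key, t in pair_keys(hashes, self.delta):
            self.table[key].append((call_id, t))

    def query(self, hashes):
        """Return (call_id, frame_offset, votes) for the best match, or None."""
        votes = Counter()
        for key, t in pair_keys(hashes, self.delta):
            # .get avoids creating empty postings lists for unseen keys.
            for call_id, t_ref in self.table.get(key, ()):
                # Matches from the same call at a consistent offset accumulate.
                votes[(call_id, t_ref - t)] += 1
        if not votes:
            return None
        (call_id, offset), count = votes.most_common(1)[0]
        return call_id, offset, count
```

In this sketch, combining two hashes into one key makes each key far more selective than a single 32-bit hash, which keeps the postings lists short as the database grows. A practical system would additionally need to tolerate bit errors caused by telephone-channel degradation, which exact matching does not.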
Conference