2023
Conference Paper
Title

Multilingual Query-by-Example Keyword Spotting with Metric Learning and Phoneme-to-Embedding Mapping

Abstract
In this paper, we propose a multilingual query-by-example keyword spotting (KWS) system based on a residual neural network. The model is first trained as a classifier on a multilingual keyword dataset extracted from Common Voice sentences and then fine-tuned using circle loss. We demonstrate that the model generalizes to new languages and report a mean reduction in equal error rate (EER) of 59.2% for previously seen languages and 47.9% for unseen languages compared to a competitive baseline. We show that the word embeddings learned by the KWS model can be accurately predicted from phoneme sequences using a simple LSTM model. Our system achieves promising accuracy for streaming keyword spotting and keyword search on Common Voice audio using just 5 examples per keyword. Experiments on the Hey-Snips dataset show good performance, with a false negative rate of 5.4% at only 0.1 false alarms per hour.
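The query-by-example idea and the EER metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings here are toy vectors standing in for the output of the KWS network, and averaging the example embeddings into one template, scoring by cosine similarity, and the function names `enroll_keyword`, `cosine_score`, and `equal_error_rate` are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def enroll_keyword(example_embeddings):
    """Build one keyword template from a few example embeddings.

    Assumption: the template is the normalized mean of the normalized
    examples. The abstract does not state how examples are combined.
    """
    if not example_embeddings:
        raise ValueError("at least one example embedding is required")
    normalized = [l2_normalize(e) for e in example_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def cosine_score(template, embedding):
    """Cosine similarity between a unit-length template and a query embedding."""
    query = l2_normalize(embedding)
    return sum(a * b for a, b in zip(template, query))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Returns the mean of the false negative and false positive rates at the
    threshold where the two rates are closest.
    """
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        fnr = sum(s < thr for s in target_scores) / len(target_scores)
        fpr = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(fnr - fpr)
        if gap < best_gap:
            best_gap, eer = gap, (fnr + fpr) / 2
    return eer


# Five toy 3-dimensional "embeddings" of one keyword (hypothetical values).
examples = [
    [0.9, 0.1, 0.0],
    [1.0, 0.0, 0.1],
    [0.8, 0.2, 0.0],
    [0.95, 0.05, 0.05],
    [1.0, 0.1, 0.0],
]
template = enroll_keyword(examples)
match_score = cosine_score(template, [1.0, 0.0, 0.0])
mismatch_score = cosine_score(template, [0.0, 1.0, 0.0])
```

In this sketch a keyword would be reported as detected when the score exceeds a chosen threshold; moving that threshold trades false negatives against false alarms, which is the trade-off the EER summarizes in a single number.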
Author(s)
Reuter, Paul Maria  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Rollwage, Christian  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Meyer, Bernd T.
Mainwork
ICASSP 2023, IEEE International Conference on Acoustics, Speech and Signal Processing. Proceedings  
Conference
International Conference on Acoustics, Speech, and Signal Processing 2023  
DOI
10.1109/ICASSP49357.2023.10095400
Language
English