August 4, 2023
Conference Paper
Title

Examining Speaker and Keyword Uniqueness: Partitioning Keyword Spotting Datasets for Federated Learning with the Largest Differencing Method

Abstract
Federated learning is a powerful training strategy for neural networks in which several independent clients train a model without sharing potentially sensitive data. However, real-world client-local data is usually biased: a single client might have access to only a few lighting conditions in computer vision, a few patient groups in a hospital, or a few speakers and keywords in a smart device performing keyword spotting. We help researchers better understand and estimate the expected performance impact by introducing a new method that partitions a given dataset into an arbitrary number of clients, each with unique properties, to simulate such conditions.
We apply the method to partition the Google Speech Commands dataset into clients with non-overlapping speakers and, additionally, unique keywords, and we share the script that creates the novel GSC-FL dataset. The results, obtained with convolutional neural networks, show that the performance of the final model is stable up to at least 16 clients, and that federated learning clearly outperforms models trained only on local data. However, giving each client unique speakers degrades performance, and the degradation grows further when keywords are also unique to each client. With only minor adjustments, our script can also be applied to partition any other dataset for federated learning investigations.
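The partitioning idea named in the title can be illustrated with a minimal sketch of the multiway Largest Differencing Method (Karmarkar-Karp). This is not the authors' published script: the function name `ldm_partition`, the speaker IDs, and the use of per-speaker utterance counts as weights are assumptions made for illustration only. Each speaker is assigned to exactly one client, so speakers do not overlap, while the heuristic tries to keep the number of utterances per client balanced.

```python
import heapq
from itertools import count


def ldm_partition(weights, k):
    """Split items into k groups with similar total weight (heuristic).

    weights: dict mapping an item id (here: a hypothetical speaker id)
             to a non-negative weight (here: its number of utterances).
    Returns a list of k lists of item ids; every id appears exactly once.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not weights:
        return [[] for _ in range(k)]

    tie = count()  # tie-breaker so the heap never compares bin lists
    heap = []
    for item, w in weights.items():
        # A partial solution holds k bins as (sum, members), sorted by
        # descending sum. Initially one bin holds the item, the rest are empty.
        bins = [(w, [item])] + [(0, []) for _ in range(k - 1)]
        heapq.heappush(heap, (-w, next(tie), bins))

    while len(heap) > 1:
        # Take the two partial solutions with the largest spread (max - min).
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        # Pair the largest bin of one with the smallest bin of the other.
        merged = [(sa + sb, ma + mb) for (sa, ma), (sb, mb) in zip(a, reversed(b))]
        merged.sort(key=lambda t: t[0], reverse=True)
        spread = merged[0][0] - merged[-1][0]
        heapq.heappush(heap, (-spread, next(tie), merged))

    return [members for _, members in heap[0][2]]


# Hypothetical utterance counts per speaker, split across two clients.
utterances = {"s1": 8, "s2": 7, "s3": 6, "s4": 5, "s5": 4}
clients = ldm_partition(utterances, 2)
client_sizes = [sum(utterances[s] for s in members) for members in clients]
```

The method is a heuristic, so the resulting client sizes are usually close to balanced but not guaranteed to be optimal. Making keywords unique per client as well, as the abstract describes, would need an additional step that this sketch does not cover.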
Author(s)
Wallbott, Paul
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Grollmisch, Sascha  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Köllmer, Thomas  
Fraunhofer-Institut für Digitale Medientechnologie IDMT  
Mainwork
Artificial Intelligence and Machine Learning. 34th Joint Benelux Conference, BNAIC/Benelearn 2022  
Conference
BeNeLux Conference on AI 2022  
Belgian-Dutch Conference on Machine Learning 2022  
DOI
10.1007/978-3-031-39144-6_11
Language
English
Keyword(s)
  • speech recognition
  • keyword spotting
  • federated learning
  • deep learning
  • multiway number partitioning
  • Trustworthy AI
