Fraunhofer-Gesellschaft
2017
Master Thesis
Title

Detecting double-talk (overlapping speech) in conversations using deep learning

Abstract
The work presented in this thesis aims to automatically detect double-talk (overlapping speech) in audio recordings of natural conversations using a deep convolutional neural network. In doing so, it avoids the manual engineering of problem-specific acoustic features that is prevalent in classical approaches. The characteristic challenges arising from the ephemeral nature of natural double-talk, in addition to the standard issues faced when developing a pattern recognition system, are handled using different methods. In particular, three measures are each evaluated for their impact: careful rebalancing of the training data to tackle the inherent class imbalance, pre-removal of silence, and two standard normalization procedures for reducing the mismatch between training and testing conditions. Furthermore, the shortcoming of the proposed neural network in modelling long-term temporal dependencies is documented, and an attempt to fix it with Viterbi decoding is reported. Satisfactory results were achieved on a large and representative test set, and multiple avenues have been opened for future work.
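The abstract does not say how Viterbi decoding is applied, so the following is only an illustrative sketch of one common way to smooth frame-level classifier outputs. It assumes a two-state Markov chain (0 = no overlap, 1 = overlap), treats per-frame overlap probabilities as emission scores, and uses a hypothetical self-transition probability `p_stay`. The function name, the parameter value, and the input probabilities are all assumptions for illustration and are not taken from the thesis.

```python
import math


def viterbi_smooth(overlap_probs, p_stay=0.9):
    """Return the most likely 0/1 state sequence (1 = overlap) for a list of
    per-frame overlap probabilities under a two-state Markov chain.

    p_stay is a hypothetical self-transition probability (assumption).
    """
    if not overlap_probs:
        return []

    eps = 1e-12  # guards against log(0)
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    def emission(p):
        # Log-scores for state 0 (no overlap) and state 1 (overlap).
        return (math.log(max(1.0 - p, eps)), math.log(max(p, eps)))

    # A uniform prior over the two states adds the same constant to both
    # scores, so it is omitted.
    scores = list(emission(overlap_probs[0]))
    backpointers = []

    for p in overlap_probs[1:]:
        emit = emission(p)
        new_scores, pointers = [], []
        for state in (0, 1):
            # Score of reaching `state` from each possible previous state.
            candidates = [
                scores[prev] + (log_stay if prev == state else log_switch)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] >= candidates[1] else 1
            pointers.append(best_prev)
            new_scores.append(candidates[best_prev] + emit[state])
        scores = new_scores
        backpointers.append(pointers)

    # Backtrack from the best final state.
    state = 0 if scores[0] >= scores[1] else 1
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed settings, an isolated one-frame spike such as `[0.1, 0.2, 0.9, 0.1, 0.1]` is smoothed away to all zeros, while a sustained run of high probabilities is kept as an overlap segment. A larger `p_stay` penalizes switching more strongly and therefore favors longer segments.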
Thesis Note
Aachen, TH, Master Thesis, 2017
Author(s)
Abdullah
Publishing Place
Aachen
Project(s)
KA3
Funder
Bundesministerium für Bildung und Forschung BMBF (Deutschland)  
DOI
10.24406/publica-fhg-281843
File(s)
N-477004.pdf (1.49 MB)
Rights
Under Copyright
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  