• English
  • Deutsch
  • Log In
    Password Login
    or
  • Research Outputs
  • Projects
  • Researchers
  • Institutes
  • Statistics
Repository logo
Fraunhofer-Gesellschaft
  1. Home
  2. Fraunhofer-Gesellschaft
  3. Konferenzschrift
  4. Prediction of perceived speech quality using deep machine listening
 
  • Details
  • Full
Options
2018
Conference Paper
Titel

Prediction of perceived speech quality using deep machine listening

Abstract
Subjective ratings of speech quality (SQ) are essential for evaluating algorithms for speech transmission and enhancement. In this paper we explore a non-intrusive model for SQ prediction based on the output of a deep neural net (DNN) from a regular automatic speech recognizer. The degradation of phoneme probabilities obtained from the net is quantified with the mean temporal distance proposed earlier for multi-stream ASR. The SQ predicted with this method is compared with average subject ratings from the TCD-VoIP speech quality database that covers several effects of SQ degradation that can occur in VoIP applications such as clipping, packet loss, echo effects, background noise and competing speakers. Our approach is tailored to speech and therefore not applicable when quality is degraded by a competing speaker, which is reflected by an insignificant correlation between model output and subjective SQ. In all other conditions mentioned above, the model reaches an average correlation of r=0.87, which is higher than the correlation achieved with the baseline ITU-T P.563 (r=0.71) and the American National Standard ANIQUE+ (r=0.75). Since the most robust ASR system is not necessarily the best model to predict SQ, we investigate the effect of the amount of training data on quality prediction.
Author(s)
Ooster, J.
Huber, R.
Meyer, B.T.
Hauptwerk
Interspeech 2018. Online resource
Konferenz
International Speech Communication Association (Interspeech Annual Conference) 2018
Thumbnail Image
DOI
10.21437/Interspeech.2018-1374
Externer Link
Externer Link
Language
English
google-scholar
Fraunhofer-Institut für Digitale Medientechnologie IDMT
  • Cookie settings
  • Imprint
  • Privacy policy
  • Api
  • Send Feedback
© 2022