Here you will find scientific publications from the Fraunhofer Institutes.

Predicting the quality of processed speech by combining modulation-based features and model trees

Cauchi, B.; Goetze, S.; Naylor, P.A.; Doclo, S.

Informationstechnische Gesellschaft -ITG-; Informationstechnische Gesellschaft -ITG-, Fachausschuss Sprachakustik:
Speech Communication. 12. ITG-Fachtagung Sprachkommunikation 2016 : 5. - 7. Oktober 2016 in Paderborn, CD-ROM
Berlin: VDE-Verlag, 2016 (ITG-Fachbericht 267)
ISBN: 3-8007-4275-6
ISBN: 978-3-8007-4275-2
5 pp.
Fachtagung Sprachkommunikation <12, 2016, Paderborn>
Fraunhofer IDMT

Many signal processing methods have been proposed to improve the quality of speech recorded in the presence of noise and reverberation. Evaluating these methods requires either perceptual measures, i.e., listening tests, or instrumental measures. Perceptual measures are typically more reliable but are costly and time-consuming. Instrumental measures, on the other hand, may correlate poorly with the perceived speech quality. In this paper, we propose to train an instrumental measure, combining modulation-based features and model trees, on the basis of perceptual scores obtained on a small corpus of speech data that has been processed by a combination of beamforming and spectral postfiltering. For evaluation purposes, the resulting measure is then applied to a larger corpus. Results show that using model trees to train the predicting function of an instrumental measure increases its correlation with perceptual scores.
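
To illustrate the general idea of a model tree (a decision tree whose leaves hold linear models rather than constants), the following is a minimal sketch, not the method used in the paper. It assumes a single hypothetical scalar feature and synthetic "perceptual scores"; the function names, the depth-1 tree, and the data are all assumptions made for illustration. It compares the correlation with the scores obtained by a single linear mapping against that of the two-leaf model tree.

```python
# Minimal sketch of a depth-1 model tree: one split, one linear model per leaf.
# Everything here (names, data, single feature) is hypothetical and synthetic;
# it is not the feature set or training procedure of the paper.

def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ a*x + b; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    b = my - a * mx
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def fit_model_stump(xs, ys, min_leaf=3):
    """Choose the threshold whose two per-leaf linear fits minimise the total squared error."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if sx[i - 1] == sx[i]:
            continue  # cannot place a threshold between identical feature values
        left = fit_line(sx[:i], sy[:i])
        right = fit_line(sx[i:], sy[i:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            threshold = (sx[i - 1] + sx[i]) / 2
            best = (total, threshold, left[:2], right[:2])
    if best is None:
        raise ValueError("not enough distinct points to split")
    return best[1], best[2], best[3]

def predict(model, x):
    """Route x to its leaf and apply that leaf's linear model."""
    threshold, left, right = model
    a, b = left if x <= threshold else right
    return a * x + b

def pearson(us, vs):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(us)
    mu = sum(us) / n
    mv = sum(vs) / n
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    su = sum((u - mu) ** 2 for u in us) ** 0.5
    sv = sum((v - mv) ** 2 for v in vs) ** 0.5
    return cov / (su * sv)

# Synthetic data: the scores follow two different linear regimes of the feature.
feature = [float(x) for x in range(10)]
scores = [1.0 + 0.5 * x if x <= 4 else 4.0 + 0.1 * (x - 5) for x in feature]

# Baseline: a single linear mapping from feature to score.
a, b, _ = fit_line(feature, scores)
r_line = pearson([a * x + b for x in feature], scores)

# Model tree with one split and a linear model in each leaf.
model = fit_model_stump(feature, scores)
r_tree = pearson([predict(model, x) for x in feature], scores)

print("single line:", r_line)
print("model tree :", r_tree)
```

On this synthetic data the piecewise-linear mapping follows the scores more closely than the single line. In practice, the correlation should be evaluated on data not used for training, as the abstract describes with its larger evaluation corpus.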