Using a blind EC mechanism for modelling the interaction between binaural and temporal speech processing

Röttges, Saskia; Hauth, Christopher F.; Rennies, Jan; Brand, Thomas

doi:10.1051/aacus/2022009

2022

Journal Article

Abstract

We reanalyzed a study that investigated binaural and temporal integration of speech reflections with different amplitudes, delays, and interaural phase differences. We used a blind binaural speech intelligibility model (bBSIM), which applies an equalization-cancellation (EC) process to model binaural release from masking. bBSIM is blind in that it requires only the mixed binaural speech and noise signals and no auxiliary information about the listening conditions. bBSIM was combined with two non-blind back-ends, the speech intelligibility index (SII) and the speech transmission index (STI), resulting in hybrid models. Furthermore, bBSIM was combined with the non-intrusive short-time objective intelligibility (NI-STOI) measure, resulting in a fully blind model. The fully non-blind reference model used in the previous study achieved the best prediction accuracy (R² = 0.91 and RMSE = 1 dB). The fully blind model yielded a coefficient of determination (R² = 0.87) similar to that of the reference model but also the highest root mean square error of the models tested in this study (RMSE = 4.4 dB). By adjusting the binaural processing errors of bBSIM as done in the reference model, the RMSE could be decreased to 1.9 dB. Furthermore, in this study, the dynamic range of the SII had to be adjusted to predict the low SRTs of the speech material used.
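
The abstract reports a high coefficient of determination together with a comparatively large RMSE for the fully blind model. The following minimal Python sketch illustrates how that combination can arise when predictions follow the measured pattern but carry a constant offset. It assumes that the coefficient of determination is computed as the squared Pearson correlation, which the abstract does not state. The SRT values and the helper names `rmse` and `r_squared` are hypothetical and are not taken from the study.

```python
import numpy as np

def rmse(measured, predicted):
    """Root mean square error between measured and predicted SRTs (dB)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))

def r_squared(measured, predicted):
    """Squared Pearson correlation (assumed definition, see lead-in)."""
    r = np.corrcoef(measured, predicted)[0, 1]
    return float(r ** 2)

# Hypothetical SRTs in dB SNR, invented for illustration only.
measured = np.array([-20.0, -18.0, -15.0, -12.0, -10.0])
# Predictions that track the pattern exactly but are shifted by 4 dB.
predicted = measured + 4.0

print(f"R^2  = {r_squared(measured, predicted):.2f}")
print(f"RMSE = {rmse(measured, predicted):.1f} dB")
```

Under this assumption, a constant bias leaves the correlation untouched while contributing fully to the RMSE, so the two measures can diverge as they do in the abstract.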

Author(s)

Röttges, Saskia

Hauth, Christopher F.

Rennies, Jan

Fraunhofer-Institut für Digitale Medientechnologie IDMT

Brand, Thomas

Journal

Acta Acustica
