Measurement and Prediction of Binaural-Temporal Integration of Speech Reflections
For speech intelligibility in rooms, the temporal integration of speech reflections is typically modeled by separating the room impulse response (RIR) into an early (assumed beneficial for speech intelligibility) and a late part (assumed detrimental). This concept was challenged in this study by employing binaural RIRs with systematically varied interaural phase differences (IPDs) and amplitude of the direct sound and a variable number of reflections delayed by up to 200 ms. Speech recognition thresholds in stationary noise were measured in normal-hearing listeners for 86 conditions. The data showed that direct sound and one or several early speech reflections could be perfectly integrated when they had the same IPD. Early reflections with the same IPD as the noise (but not as the direct sound) could not be perfectly integrated with the direct sound. All conditions in which the dominant speech information was within the early RIR components could be well predicted by a binaural speech intelligibility model using classic early/late separation. In contrast, when amplitude or IPD favored late RIR components, listeners appeared to be capable of focusing on these components rather than on the precedent direct sound. This could not be modeled by an early/late separation window but required a temporal integration window that can be flexibly shifted along the RIR.