2025
Conference Paper
Title
Cross-Talk Detection in the IVAS Stereo Codec Based on GCC-PHAT
Abstract
In real-time teleconferences over mobile networks, cross-talk can significantly impact the performance of parametric stereo codecs, particularly at low bitrates. When multiple speakers overlap, stereo quality can be improved by independently encoding the left and right channels, especially when inter-channel correlation is low. This approach can be implemented using the dual-mono EVS coder, for example. The recently standardized 3GPP IVAS codec incorporates both a parametric stereo model and an independent left/right stereo model based on the dual-mono EVS codec. To address cross-talk, the 3GPP IVAS codec features a cross-talk classifier that uses a multivariate statistical model based on GCC-PHAT and other spatial cues, allowing for seamless switching between the two stereo models. Listening tests show that the IVAS stereo codec enhances performance in single-talker scenarios while maintaining quality in cross-talk segments.
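For readers unfamiliar with GCC-PHAT (generalized cross-correlation with phase transform), the following is a minimal sketch of the generic technique in Python with NumPy. It is not the IVAS implementation: the function name `gcc_phat`, its parameters, and the small regularization constant are assumptions made for illustration, and the multivariate statistical classifier and the other spatial cues mentioned in the abstract are not modeled here.

```python
import numpy as np


def gcc_phat(left, right, max_lag):
    """Estimate the inter-channel delay with GCC-PHAT (illustrative sketch).

    Returns (lag, peak): `lag` is the delay of `right` relative to `left`
    in samples (positive when `right` lags `left`), and `peak` is the
    height of the PHAT-weighted cross-correlation at that lag.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(left) + len(right)
    spec_left = np.fft.rfft(left, n=n)
    spec_right = np.fft.rfft(right, n=n)

    # Cross-spectrum; PHAT weighting keeps only the phase information.
    cross = spec_right * np.conj(spec_left)
    cross /= np.abs(cross) + 1e-12  # assumed regularization to avoid 0/0

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Reorder so the array covers lags -max_lag .. +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    idx = int(np.argmax(cc))
    return idx - max_lag, float(cc[idx])
```

Calling `gcc_phat(left, right, max_lag=32)` on one frame of a stereo pair returns the lag of the strongest correlation peak and its height. With a single dominant talker the correlation tends to show one sharp peak, whereas overlapping talkers at different positions tend to flatten it or produce several competing peaks, which is the kind of cue a cross-talk detector could build on.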
Author(s)