2025
Conference Paper
Title
Coding Higher Order Ambisonics in 3GPP IVAS - Scaling Parametric Audio Coding to Higher Bitrates
Abstract
Immersive Voice and Audio Services (IVAS) is the immersive audio communications codec for 5G networks recently standardized by 3GPP (3rd Generation Partnership Project). It supports multiple immersive audio formats, including higher-order Ambisonics (HOA), which is known to be particularly challenging to compress given the bitrate and complexity limits of a mobile-communications scenario. Directional Audio Coding (DirAC) has been shown to be an effective parameterization of first-order Ambisonics and has been adopted for low-to-medium bitrates in IVAS. A higher-order parameterization can scale the codec to higher quality, overcoming known challenges of parameterizing from first-order Ambisonics. These challenges are addressed by combining the strengths of parametric higher-order Directional Audio Coding (HO-DirAC) and Spatial Reconstruction (SPAR). Scaling to higher bitrates is achieved by a hybrid parameterization scheme and a sector-based higher-order DirAC architecture specifically adapted to IVAS. This architecture estimates two directions of arrival (DoAs) together with a single global diffuseness, which facilitates deep integration into the IVAS codec and bitrate switching. Here we detail the Ambisonics coding in IVAS and its extensions at high bitrates. We show that the proposed system improves the perceptual quality for challenging audio scenes while keeping the complexity within the required limits.
Author(s)
Conference