Here you will find scientific publications from the Fraunhofer Institutes.

Entropy coding of spectral envelopes for speech and audio coding using distribution quantization

Korse, Srikanth; Jähnel, Tobias; Bäckström, Tom

International Speech Communication Association -ISCA-:
Understanding speech processing in humans and machines. Vol.4 : 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016); San Francisco, California, USA, 8-12 September 2016
Red Hook, NY: Curran, 2016
ISBN: 978-1-5108-3313-5
International Speech Communication Association (Interspeech Annual Conference) <17, 2016, San Francisco/Calif.>
Fraunhofer IIS

Speech and audio codecs model the overall shape of the signal spectrum using envelope models. In speech coding, the predominant approach is linear predictive coding, which offers high coding efficiency at the cost of computational complexity and a rigid system design. Audio codecs are usually based on scale factor bands, whose calculation and coding are simple, but whose coding efficiency is lower than that of linear prediction. In this work we propose an entropy coding approach for scale factor bands, with the objective of reaching the same coding efficiency as linear prediction while retaining a low computational complexity. The proposed method is based on quantizing the distribution of spectral mass using beta distributions. Our experiments show that the perceptual quality achieved with the proposed method is similar to that of linear predictive models at the same bit rate, while the design also allows variable bit-rate coding and can easily be scaled to different sampling rates. The algorithmic complexity of the proposed method is less than one third of that of traditional multi-stage vector quantization of linear predictive envelopes.
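
The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the general idea of treating scale factor band energies as a distribution of spectral mass and coding it with a beta model. Every detail here is an assumption, not the authors' method: the recursive halving of the band range, the uniform quantizer, the midpoint approximation of the cell probabilities, and the helper names `band_energies`, `split_fractions`, `quantize`, and `ideal_code_length` are all hypothetical.

```python
import math

def band_energies(spectrum, band_edges):
    """Sum of squared magnitudes inside each scale factor band."""
    return [sum(x * x for x in spectrum[lo:hi])
            for lo, hi in zip(band_edges[:-1], band_edges[1:])]

def split_fractions(energies):
    """Assumed scheme: halve the band range recursively and record the
    share of spectral mass that falls into the left half of each split."""
    fractions = []

    def recurse(lo, hi):
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        total = sum(energies[lo:hi])
        left = sum(energies[lo:mid])
        # An empty range carries no information; use an even split.
        fractions.append(left / total if total > 0 else 0.5)
        recurse(lo, mid)
        recurse(mid, hi)

    recurse(0, len(energies))
    return fractions

def beta_pdf(x, a, b):
    """Beta density, built from the standard library gamma function."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

def quantize(fraction, levels=8):
    """Uniform quantization of a fraction in [0, 1] to a cell index."""
    return min(levels - 1, int(fraction * levels))

def ideal_code_length(index, a, b, levels=8):
    """Ideal entropy-coder cost in bits for one cell index, with cell
    probabilities approximated by normalized midpoint beta densities."""
    dens = [beta_pdf((i + 0.5) / levels, a, b) for i in range(levels)]
    return -math.log2(dens[index] / sum(dens))
```

Under these assumptions, each split fraction lies in [0, 1], which is why a beta distribution is a natural probability model for it: an envelope with N bands yields N - 1 fractions, and an entropy coder driven by the beta model would spend fewer bits on the fractions that the model considers likely.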