Publica
Here you will find scientific publications from the Fraunhofer Institutes.
Apparatus and method for harmonic-percussive-residual sound separation using a structure tensor on spectrograms
 EP 3220386 A1: 20160318 

 English 
 Patent, Electronic Publication 
 Fraunhofer IIS 
Abstract
Apparatus and method for analysing a magnitude spectrogram of an audio signal for Harmonic-Percussive-Residual Sound Separation (HPSS), comprising: determining a change of a frequency for each time-frequency bin of a plurality of time-frequency bins of the magnitude spectrogram of the audio signal; and classifying each time-frequency bin into a signal component group depending on the change of the frequency.

A structure tensor is applied to the image of the spectrogram for preprocessing or feature extraction by edge and corner detection, in particular by calculating predominant orientation angles in the spectrogram. The structure tensor can be considered a black box whose input is a gray-scale image and whose outputs are, for each pixel, an angle corresponding to the direction of lowest change and a certainty or anisotropy measure for this direction. A local frequency change is extracted from the angles. From it, one can determine whether a time-frequency bin in the spectrogram belongs to a harmonic component (low local frequency change) or to a percussive component (high or infinite local frequency change).

Examples of application:
(figure 1) Distinguish between harmonic, percussive, and residual signal components by employing this orientation information.
(figure 5) Analyse an audio signal for upmixing to five audio output channels: front left, center, right, left surround and right surround. The harmonic weighting factor may be greater for generating the left, center and right output channels than for generating the left surround and right surround output channels. The percussive weighting factor may be smaller for generating the left, center and right output channels than for generating the left surround and right surround output channels.
(figure 6) Compute source separation metrics (source-to-distortion ratio, SDR; source-to-interference ratio, SIR; and source-to-artifacts ratio, SAR) in a recorded audio signal. For example, a vibrato in a singing voice has a high instantaneous frequency change rate; the assignment of a bin in the spectrogram to "residual" depends on the bin's anisotropy.
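The classification described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`gaussian_smooth`, `classify_bins`), the Gaussian smoothing, the central-difference gradients and all threshold values (`c`, `r_h`, `r_p`) are assumptions chosen for illustration. The local frequency change is expressed here as a slope in frequency bins per time frame, without conversion to physical units.

```python
import numpy as np


def gaussian_smooth(img, sigma):
    """Separable Gaussian smoothing with zero padding (NumPy only).

    Assumes both image dimensions are at least as long as the kernel.
    """
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    smooth_1d = lambda v: np.convolve(v, kernel, mode="same")
    return np.apply_along_axis(smooth_1d, 1, np.apply_along_axis(smooth_1d, 0, img))


def classify_bins(S, sigma=1.5, c=0.2, r_h=0.5, r_p=2.0, eps=1e-10):
    """Label each time-frequency bin as harmonic, percussive or residual.

    S: magnitude spectrogram, rows = frequency bins, columns = time frames.
    c, r_h, r_p: illustrative thresholds (assumptions, not from the source).
    """
    S = np.asarray(S, dtype=float)
    S_k = np.gradient(S, axis=0)  # partial derivative along frequency
    S_b = np.gradient(S, axis=1)  # partial derivative along time

    # Structure tensor elements: smoothed products of the partial derivatives.
    T11 = gaussian_smooth(S_b * S_b, sigma)
    T12 = gaussian_smooth(S_b * S_k, sigma)
    T22 = gaussian_smooth(S_k * S_k, sigma)

    # Eigenvalues are half_trace + root and half_trace - root.
    half_trace = 0.5 * (T11 + T22)
    root = np.sqrt((0.5 * (T11 - T22)) ** 2 + T12 ** 2)

    # Anisotropy ((lam1 - lam2) / (lam1 + lam2))**2, set to 0 in flat regions.
    C = np.zeros_like(S)
    np.divide(root, half_trace, out=C, where=half_trace > eps)
    C **= 2

    # Orientation of strongest change; the direction of lowest change is
    # perpendicular to it. Angles are measured from the time axis.
    theta = 0.5 * np.arctan2(2.0 * T12, T11 - T22)
    alpha = theta + 0.5 * np.pi

    # Local frequency change as a slope (bins per frame) along alpha.
    abs_rate = np.abs(np.tan(alpha))

    labels = np.full(S.shape, "residual", dtype="U10")
    labels[(C > c) & (abs_rate <= r_h)] = "harmonic"
    labels[(C > c) & (abs_rate > r_p)] = "percussive"
    return labels
```

In this sketch, a bin is assigned to "residual" either when its anisotropy is at or below `c` or when its slope falls between the two rate thresholds. A strongly modulated partial such as a vibrato could therefore end up in the residual or percussive group, depending on how `r_h` and `r_p` are chosen.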