Publica
Multiscale aggregation of phase information for complexity reduction of CNN based DOA estimation
 Bugallo, Mónica F. (General Chair); Institute of Electrical and Electronics Engineers (IEEE); European Association for Signal Processing (EURASIP): 27th European Signal Processing Conference, EUSIPCO 2019: A Coruña, Spain, September 2-6, 2019. Piscataway, NJ: IEEE, 2019. ISBN: 9789082797039; ISBN: 9789082797022; ISBN: 9781538673003. pp. 1353-1357
 European Signal Processing Conference (EUSIPCO) <27, 2019, A Coruña/Spain> 

 English
 Conference paper
 Fraunhofer IIS
Abstract
In a recent work on direction-of-arrival (DOA) estimation of multiple speakers with convolutional neural networks (CNNs), the phase component of the short-time Fourier transform (STFT) coefficients of the microphone signal is given as input, and small filters are used to learn the phase relations between neighboring microphones. Due to the chosen filter size, M − 1 convolution layers are required to achieve the best performance for a microphone array with M microphones. For arrays with a large number of microphones, this requirement leads to a high computational cost, making the method practically infeasible. In this work, we propose to expand the receptive field of the filters to reduce the computational cost of our previously proposed method. To realize this expansion, we use systematic dilations of the filters in each of the convolution layers. Different systematic dilation strategies for a specific microphone array are explored. Experimental analysis of the different strategies shows that an aggressive expansion strategy results in a considerable reduction in computational cost, while a relatively gradual expansion of the receptive field exhibits the best DOA estimation performance along with a reduction in computational cost.
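
The link between filter dilation and layer count described in the abstract can be illustrated with a short receptive-field calculation. The sketch below is not taken from the paper: it assumes filters that span 2 neighboring microphones with stride 1 (consistent with the M − 1 layer requirement, but an assumption nonetheless), and the function name `receptive_field` and the example dilation schedules for a hypothetical 8-microphone array are purely illustrative.

```python
from typing import Sequence


def receptive_field(dilations: Sequence[int], kernel_size: int = 2) -> int:
    """Number of microphones seen by one output unit after stacked
    stride-1 convolutions along the microphone axis.

    Each layer with dilation d widens the receptive field by
    (kernel_size - 1) * d. kernel_size=2 is an assumption, not a value
    stated in the abstract.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical schedules for an array with M = 8 microphones.
M = 8
schedules = {
    "no dilation": [1] * (M - 1),   # M - 1 layers, as in the earlier method
    "gradual": [1, 1, 2, 3],        # illustrative only
    "aggressive": [1, 2, 4],        # illustrative only
}

for name, dilations in schedules.items():
    rf = receptive_field(dilations)
    print(f"{name}: {len(dilations)} layers, receptive field = {rf}")
```

Under these assumptions, all three schedules cover the full 8-microphone aperture, but the dilated ones do so with 4 or 3 layers instead of 7, which is where the reduction in computational cost would come from.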