Open Access

ARTICLE

Sound Source Localization Based on SRP-PHAT Spatial Spectrum and Deep Neural Network

Xiaoyan Zhao 1,*, Shuwen Chen 2, Lin Zhou 3, Ying Chen 3,4

1 School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing, 211167, China.
2 School of Mathematics and Information Technology, Jiangsu Second Normal University, Nanjing, 210013, China.
3 School of Information Science and Engineering, Southeast University, Nanjing, 210096, China.
4 Department of Psychiatry, Columbia University and NYSPI, New York, 10032, USA.

* Corresponding Author: Xiaoyan Zhao. Email: email.

Computers, Materials & Continua 2020, 64(1), 253-271. https://doi.org/10.32604/cmc.2020.09848

Abstract

Microphone array-based sound source localization (SSL) is a challenging task in adverse acoustic scenarios. To address this, this paper presents a novel SSL algorithm based on a deep neural network (DNN) that uses the steered response power-phase transform (SRP-PHAT) spatial spectrum as its input feature. The SRP-PHAT spatial power spectrum is adopted as the input feature because it contains spatial location information, and a DNN is used to extract the effective location information from it because of its strength in extracting high-level features. Within each frame, the SRP-PHAT values at all steering positions are arranged into a vector, which serves as the DNN input. A DNN model that maps the SRP-PHAT spatial spectrum to the azimuth of the sound source is learned from the training signals, and the azimuth of the sound source in the testing signals is then estimated with the trained model. Experimental results demonstrate that the proposed algorithm significantly improves localization performance whether or not the training and testing conditions match, and that it is more robust to noise and reverberation.
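The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the far-field planar geometry, the speed of sound `C`, the function names `srp_phat_spectrum` and `mlp_azimuth`, and the treatment of azimuth estimation as picking the largest network output over a grid of candidate angles are all assumptions made here for illustration. The abstract does not specify the array geometry, the steering grid, or the network architecture.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def srp_phat_spectrum(frame, mic_xy, azimuths_deg, fs):
    """SRP-PHAT power at each candidate azimuth for one frame.

    frame: (M, N) array, one row of N samples per microphone.
    mic_xy: (M, 2) microphone positions in metres (far-field, planar assumption).
    Returns a vector with one entry per candidate azimuth.
    """
    frame = np.asarray(frame, dtype=float)
    mic_xy = np.asarray(mic_xy, dtype=float)
    n_mics, n_samples = frame.shape
    spectra = np.fft.rfft(frame, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)

    az = np.deg2rad(np.asarray(azimuths_deg, dtype=float))
    directions = np.stack([np.cos(az), np.sin(az)], axis=1)
    # Arrival delay of each microphone relative to the array origin:
    # a microphone closer to the source hears the wavefront earlier.
    delays = -(directions @ mic_xy.T) / C  # shape (A, M)

    power = np.zeros(len(az))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cross = spectra[i] * np.conj(spectra[j])
            cross /= np.abs(cross) + 1e-12  # PHAT weighting keeps phase only
            tdoa = delays[:, i] - delays[:, j]
            # Steering term that undoes the hypothesised inter-mic delay.
            steer = np.exp(2j * np.pi * freqs[None, :] * tdoa[:, None])
            power += np.real(steer @ cross)
    return power


def mlp_azimuth(feature, layers, azimuths_deg):
    """Forward pass of a hypothetical fully connected network.

    layers: list of (weight, bias) pairs; ReLU on hidden layers, and the
    azimuth whose output unit is largest is returned.
    """
    h = np.asarray(feature, dtype=float)
    for k, (w, b) in enumerate(layers):
        h = h @ w + b
        if k < len(layers) - 1:
            h = np.maximum(h, 0.0)
    return azimuths_deg[int(np.argmax(h))]
```

In this sketch the vector returned by `srp_phat_spectrum` plays the role of the per-frame DNN input, and `layers` stands in for weights that would be learned from training signals. With untrained weights the second function only shows the shape of the mapping from spatial spectrum to azimuth.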

Cite This Article

X. Zhao, S. Chen, L. Zhou and Y. Chen, "Sound source localization based on SRP-PHAT spatial spectrum and deep neural network," Computers, Materials & Continua, vol. 64, no. 1, pp. 253–271, 2020. https://doi.org/10.32604/cmc.2020.09848

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.