Robust Sound Source Localization Using Convolutional Neural Network Based on Microphone Array

Zhao, Xiaoyan; Zhou, Lin; Tong, Ying; Qi, Yuxiao; Shi, Jingang

doi:10.32604/iasc.2021.018823

Open Access

ARTICLE

Robust Sound Source Localization Using Convolutional Neural Network Based on Microphone Array

by Xiaoyan Zhao¹,*, Lin Zhou², Ying Tong¹, Yuxiao Qi¹, Jingang Shi³

1 School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing, 211167, China
2 School of Information Science and Engineering, Southeast University, Nanjing, 210096, China
3 University of Oulu, Oulu, 900014, FI, Finland

* Corresponding Author: Xiaoyan Zhao. Email: email

Intelligent Automation & Soft Computing 2021, 30(1), 361-371. https://doi.org/10.32604/iasc.2021.018823

Received 22 March 2021; Accepted 23 April 2021; Issue published 26 July 2021

Abstract

To improve the performance of microphone array-based sound source localization (SSL), this paper proposes a robust SSL algorithm using a convolutional neural network (CNN). The Gammatone sub-band steered response power-phase transform (SRP-PHAT) spatial spectrum is adopted as the localization cue because its features are correlated across consecutive sub-bands. Because a CNN has the “weight sharing” property and is well suited to processing tensor data, it is adopted to extract spatial location information from the localization cues. The Gammatone sub-band SRP-PHAT spatial spectra are calculated from the microphone signals, which are decomposed in the frequency domain by a Gammatone filter bank. The proposed algorithm takes as CNN input a two-dimensional feature matrix assembled from the Gammatone sub-band SRP-PHAT spatial spectra within a frame. Taking advantage of the powerful modeling capability of the CNN, the two-dimensional feature matrices from diverse environments are used together to train a CNN model that captures the mapping between the feature matrix and the azimuth of the sound source. The azimuth of a testing signal is then estimated by the trained CNN model. Experimental results show the superiority of the proposed algorithm for the SSL problem: it achieves significantly improved localization performance, as well as robustness and generality across various acoustic environments.
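The feature extraction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a far-field source, a planar array with known microphone positions, and a frequency-domain approximation of the Gammatone magnitude response (4th order, ERB-based bandwidths). The function names (`erb_center_freqs`, `gammatone_weights`, `subband_srp_phat`), the array geometry, the band count, and the azimuth grid are all hypothetical choices made for the example. The CNN itself is omitted; the resulting sub-band by azimuth matrix is the kind of two-dimensional input the abstract refers to.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed value)

def erb_center_freqs(f_lo, f_hi, n_bands):
    """Center frequencies equally spaced on the ERB-number scale."""
    to_erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    from_erb = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
    return from_erb(np.linspace(to_erb(f_lo), to_erb(f_hi), n_bands))

def gammatone_weights(freqs, centers, order=4):
    """Approximate squared magnitude response of each Gammatone filter."""
    bw = 1.019 * 24.7 * (4.37e-3 * centers + 1.0)          # bandwidth per band
    detune = (freqs[None, :] - centers[:, None]) / bw[:, None]
    return (1.0 + detune ** 2) ** (-order)                  # shape (K, F)

def subband_srp_phat(frame, mic_xy, fs, azimuths_deg, centers):
    """Return an (n_bands, n_azimuths) feature matrix for one frame.

    frame:  (n_mics, n_samples) time-domain signals
    mic_xy: (n_mics, 2) microphone positions in metres
    """
    n = frame.shape[1]
    spec = np.fft.rfft(frame * np.hanning(n), axis=1)       # (M, F)
    spec = spec / np.maximum(np.abs(spec), 1e-12)           # PHAT whitening
    freqs = np.fft.rfftfreq(n, 1.0 / fs)                    # (F,)
    az = np.deg2rad(azimuths_deg)
    unit = np.stack([np.cos(az), np.sin(az)], axis=1)       # (A, 2)
    delays = -(unit @ mic_xy.T) / C                         # (A, M) far-field delays
    steer = np.exp(2j * np.pi * freqs[None, None, :] * delays[:, :, None])
    # Steered power per azimuth and frequency bin. Squaring the channel sum
    # also includes the constant same-microphone terms, which do not move the peak.
    power = np.abs((steer * spec[None]).sum(axis=1)) ** 2   # (A, F)
    weights = gammatone_weights(freqs, centers)             # (K, F)
    return weights @ power.T                                # (K, A)

# Hypothetical demo: 6-microphone circular array, noise source at 120 degrees.
fs, n = 16000, 512
angles = 2 * np.pi * np.arange(6) / 6
mic_xy = 0.1 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
azimuths = np.arange(0, 360, 10)
centers = erb_center_freqs(100.0, 7000.0, 32)

rng = np.random.default_rng(0)
src = np.fft.rfft(rng.standard_normal(n))
freqs = np.fft.rfftfreq(n, 1.0 / fs)
true_dir = np.array([np.cos(np.deg2rad(120.0)), np.sin(np.deg2rad(120.0))])
true_delays = -(mic_xy @ true_dir) / C                      # per-microphone delay
shifted = src[None, :] * np.exp(-2j * np.pi * freqs[None, :] * true_delays[:, None])
frame = np.fft.irfft(shifted, n=n, axis=1)                  # (6, 512) synthetic frame

features = subband_srp_phat(frame, mic_xy, fs, azimuths, centers)
estimate = azimuths[np.argmax(features.sum(axis=0))]        # simple peak-picking baseline
```

Here `features` plays the role of the per-frame feature matrix, and `estimate` is only a peak-picking baseline obtained by summing over sub-bands; in the proposed algorithm a trained CNN maps the matrix to an azimuth instead.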

Keywords

Microphone array; sound source localization; convolutional neural network; gammatone sub-band steered response power-phase transform spatial spectrum

Cite This Article

APA Style

Zhao, X., Zhou, L., Tong, Y., Qi, Y., Shi, J. (2021). Robust sound source localization using convolutional neural network based on microphone array. Intelligent Automation & Soft Computing, 30(1), 361-371. https://doi.org/10.32604/iasc.2021.018823

Vancouver Style

Zhao X, Zhou L, Tong Y, Qi Y, Shi J. Robust sound source localization using convolutional neural network based on microphone array. Intell Automat Soft Comput. 2021;30(1):361-371. https://doi.org/10.32604/iasc.2021.018823

IEEE Style

X. Zhao, L. Zhou, Y. Tong, Y. Qi, and J. Shi, “Robust Sound Source Localization Using Convolutional Neural Network Based on Microphone Array,” Intell. Automat. Soft Comput., vol. 30, no. 1, pp. 361-371, 2021. https://doi.org/10.32604/iasc.2021.018823


Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
