Open Access
ARTICLE
Binaural Speech Separation Algorithm Based on Long and Short Time Memory Networks
Lin Zhou1, *, Siyuan Lu1, Qiuyue Zhong1, Ying Chen1, 2, Yibin Tang3, Yan Zhou3
1 School of Information Science and Engineering, Southeast University, Nanjing, 210096, China.
2 Department of Psychiatry, Columbia University and NYSPI, New York, 10032, USA.
3 College of Internet of Things Engineering, Hohai University, Changzhou, 213022, China.
* Corresponding Author: Lin Zhou. Email: .
Computers, Materials & Continua 2020, 63(3), 1373-1386. https://doi.org/10.32604/cmc.2020.010182
Received 15 February 2020; Accepted 28 February 2020; Issue published 30 April 2020
Abstract
Speaker separation in complex acoustic environments is one of the challenging
tasks in speech separation. In practice, speakers are very often stationary or moving
slowly during normal communication. In this case, the spatial features of consecutive
speech frames become highly correlated, which helps speaker separation by
providing additional spatial information. To fully exploit this information, we design a
separation system based on a Recurrent Neural Network (RNN) with long short-term memory
(LSTM), which effectively learns the temporal dynamics of spatial features. In detail, an
LSTM-based speaker separation algorithm is proposed that extracts the spatial features in
each time-frequency (TF) unit and forms the corresponding feature vector. Then, we treat
speaker separation as a supervised learning problem, where a modified ideal ratio mask
(IRM) is defined as the training target during LSTM learning. Simulations show that
the proposed system achieves attractive separation performance in noisy and reverberant
environments. Specifically, in tests under untrained acoustic conditions with limited
priors, e.g., unmatched signal-to-noise ratio (SNR) and reverberation, the proposed
LSTM-based algorithm still outperforms the existing DNN-based method in terms of PESQ
and STOI. This indicates that our method is more robust in untrained conditions.
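
For readers unfamiliar with mask-based training targets, the sketch below computes a conventional ideal ratio mask for each TF unit: the target energy divided by the total energy, optionally compressed by an exponent. It is only an illustration. The abstract does not define the authors' modified IRM, so the function name `ideal_ratio_mask`, the `beta` and `eps` parameters, and the toy power values are all assumptions, not the paper's formulation.

```python
import numpy as np

def ideal_ratio_mask(target_power, interference_power, beta=1.0, eps=1e-12):
    """Conventional IRM per TF unit (hypothetical helper, not the paper's modified IRM).

    target_power, interference_power: arrays of shape (frames, freq_bins)
    holding the energy of the target and of everything else in each TF unit.
    beta: compression exponent (0.5 is a common choice in the literature).
    eps: small constant that avoids division by zero in silent TF units.
    """
    target_power = np.asarray(target_power, dtype=float)
    interference_power = np.asarray(interference_power, dtype=float)
    ratio = target_power / (target_power + interference_power + eps)
    return ratio ** beta

# Toy power values (2 frames x 2 frequency bins), made up for illustration.
target = np.array([[4.0, 0.0], [1.0, 9.0]])
interference = np.array([[4.0, 2.0], [3.0, 1.0]])
mask = ideal_ratio_mask(target, interference)
```

Each mask value lies between 0 and 1. In a supervised setup of the kind the abstract describes, a network would be trained to predict such a mask from the per-unit feature vectors, and the predicted mask would then be applied to the mixture spectrogram.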
Cite This Article
L. Zhou, S. Lu, Q. Zhong, Y. Chen, Y. Tang
et al., "Binaural speech separation algorithm based on long and short time memory networks,"
Computers, Materials & Continua, vol. 63, no. 3, pp. 1373–1386, 2020.