Lin Zhou1,*, Kun Feng1, Tianyi Wang1, Yue Xu1, Jingang Shi2
Intelligent Automation & Soft Computing, Vol.30, No.2, pp. 527-537, 2021, DOI:10.32604/iasc.2021.018414
- 11 August 2021
Abstract Neural networks (NN) and clustering are two commonly used methods for speech separation based on supervised learning. Recently, deep clustering methods have shown promising performance. In this study, considering that the spectrum of a sound source is correlated over time and that its spatial position is stable over short periods, we combine spectral and spatial features for deep clustering. The logarithmic amplitude spectrum (LPS) and the interaural phase difference (IPD) function of each time-frequency (TF) unit of the binaural speech signal are extracted as features. Then, these features of…
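
The two per-TF-unit features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dft` and `frame_features`, the naive unwindowed DFT, the small constant `eps`, and the use of the raw wrapped phase difference as the "IPD function" are all assumptions made for illustration, since the abstract does not specify them.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of one real-valued frame (hypothetical helper).

    Returns only the non-negative-frequency bins, 0 .. n // 2.
    """
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def frame_features(left, right, eps=1e-12):
    """Per-bin features for one frame of a binaural signal (illustrative only).

    Returns (log_amp, ipd):
      log_amp[k] is the log amplitude of the left-channel spectrum at bin k,
      ipd[k] is the phase difference between left and right at bin k,
      wrapped to [-pi, pi] by cmath.phase.
    """
    spec_l, spec_r = dft(left), dft(right)
    # eps avoids log(0) in empty bins; its value is an assumption
    log_amp = [math.log(abs(s) + eps) for s in spec_l]
    ipd = [cmath.phase(l * r.conjugate()) for l, r in zip(spec_l, spec_r)]
    return log_amp, ipd


# Toy frame: a sinusoid at bin 1, with the right channel delayed by one sample.
n = 8
left = [math.cos(2 * math.pi * t / n) for t in range(n)]
right = [math.cos(2 * math.pi * (t - 1) / n) for t in range(n)]
log_amp, ipd = frame_features(left, right)
```

In this toy frame the one-sample delay appears as a phase difference of π/4 at bin 1, which is the kind of spatial cue the IPD carries. A practical front end would apply a window and process overlapping frames.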