期刊文献+

Monaural noisy speech separation combining sparse non-negative matrix factorization and deep attractor network

原文传递
导出
摘要 The performance of the monaural speech separation method is limited when the speech mixture is disordered by background noise.To obtain the enhanced separated speech from the noisy mixture,a monaural noisy speech separation method combining sparse nonnegative matrix factorization(SNMF)and deep attractor network(DANet)is proposed.This method firstly decomposes the noisy mixture into coefficients of speech and noise respectively.Then the speech coefficient is projected to a high-dimensional embedding space and a DANet is trained to force the embeddings to move to different clusters.The attractor points are used to separate the speech coefficients by masking method,and finally the enhanced separated speeches are reconstructed by the speech basis and their corresponding coefficients.Experimental results in various background noise environments show that the proposed algorithm effectively suppress the noises without decreasing the quality of reconstructed speech by comparison with different baseline methods.
出处 《Chinese Journal of Acoustics》 CSCD 2021年第2期266-280,共15页 声学学报(英文版)
基金 supported by the National Natural Science Foundation of China(61671095,61702065,61701067,61771085) the Project of Key Laboratory of Signal and Information Processing of Chongqing(CSTC2009CA2003) Chongqing Graduate Research and Innovation Project(CYS17219) the Research Project of Chongqing Educational Commission(KJ1600427,KJ1600429)。
  • 相关文献

参考文献6

二级参考文献80

  • 1Kim G, Lu Y, Hu Y, Loizou P C. An algorithm that im- proves speech intelligibility in noise for normal-hearing lis- teners. The Journal of the Acoustical Society of America, 2009, 126(3): 1486-1494.
  • 2Dillon H. Hearing Aids. New York: Thieme, 2001.
  • 3Allen J B. Articulation and intelligibility. Synthesis Lectures on Speech and Audio Processing, 2005, 1(1): 1-124.
  • 4Seltzer M L, Raj B, Stern R M. A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition. Speech Communication, 2004, 43(4): 379-393.
  • 5Weninger F, Erdogan H, Watanabe S, Vincent E, Le Roux J, Hershey J R, Schuller B. Speech enhancement with LSTM recurrent neural networks and its application to noise-robust ASR. In: Proceedings of the 12th International Conference on Latent Variable Analysis and Signal Separation. Liberec, Czech Republic: Springer International Publishing, 2015.91 -99.
  • 6Weng C, Yu D, Seltzer M L, Droppo J. Deep neural networks for single-channel multi-talker speech recognition. IEEE/ ACM Transactions on Audio, Speech, and Language Pro- cessing, 2015, 23(10): 1670-1679.
  • 7Boll S F. Suppression of acoustic noise in speech using spec- tral subtraction. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1979, 27(2): 113-120.
  • 8Chen J D, Benesty J, Huang Y T, Doclo S. New insights into the noise reduction wiener filter. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(4): 1218 -1234.
  • 9Loizou P C. Speech Enhancement: Theory and Practice. New York: CRC Press, 2007.
  • 10Liang S, Liu W J, Jiang W. A new Bayesian method incor- porating with local correlation for IBM estimation. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(3): 476-487.

共引文献109

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部