Monaural noisy speech separation combining sparse non-negative matrix factorization and deep attractor network


Abstract: The performance of monaural speech separation methods is limited when the speech mixture is corrupted by background noise. To obtain enhanced separated speech from a noisy mixture, a monaural noisy speech separation method combining sparse non-negative matrix factorization (SNMF) and a deep attractor network (DANet) is proposed. The method first decomposes the noisy mixture into speech coefficients and noise coefficients. The speech coefficients are then projected into a high-dimensional embedding space, and a DANet is trained to force the embeddings to gather into different clusters. The attractor points are used to separate the speech coefficients by masking, and finally the enhanced separated speech signals are reconstructed from the speech basis and the corresponding coefficients. Experimental results in various background noise environments show that, compared with different baseline methods, the proposed algorithm effectively suppresses noise without degrading the quality of the reconstructed speech.
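The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `snmf_activations` and `attractor_masks`, the multiplicative update for a Euclidean cost with an L1 penalty, the dot-product softmax mask (one common DANet formulation), and all toy data are assumptions chosen for illustration. In particular, the embeddings here are random stand-ins for the output of a trained network, and the dictionaries are random rather than learned from speech and noise.

```python
import numpy as np

def snmf_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative H with V ~ W @ H for a fixed dictionary W.

    Multiplicative updates for 0.5 * ||V - W H||_F^2 + sparsity * sum(H).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

def attractor_masks(E, Y, eps=1e-12):
    """E: (N, K) embeddings, Y: (N, C) one-hot source membership.

    Returns attractors (C, K), the mean embedding of each source,
    and soft masks (N, C) from a softmax over embedding-attractor dot products.
    """
    A = (Y.T @ E) / (Y.sum(axis=0)[:, None] + eps)
    logits = E @ A.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    M = np.exp(logits)
    M /= M.sum(axis=1, keepdims=True)
    return A, M

# Toy data: hypothetical sizes and random dictionaries (assumptions).
rng = np.random.default_rng(0)
F, T, Ks, Kn, C, K = 64, 40, 8, 4, 2, 5
W_s = rng.random((F, Ks))            # speech basis
W_n = rng.random((F, Kn))            # noise basis
W = np.hstack([W_s, W_n])
H_true = rng.random((Ks + Kn, T)) * (rng.random((Ks + Kn, T)) < 0.3)
V = W @ H_true                       # stand-in for a noisy magnitude spectrogram

# Step 1: decompose the mixture into speech and noise coefficients.
H = snmf_activations(V, W)
H_s, H_n = H[:Ks], H[Ks:]
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Step 2: embed each speech coefficient (random stand-in for a trained network).
labels = rng.integers(0, C, size=Ks * T)
Y = np.eye(C)[labels]
centers = rng.normal(size=(C, K))
E = centers[labels] + 0.1 * rng.normal(size=(Ks * T, K))

# Step 3: attractors give one soft mask per source over the speech coefficients.
A, M = attractor_masks(E, Y)
masks = M.T.reshape(C, Ks, T)

# Step 4: reconstruct each source from the speech basis and masked coefficients.
H_sep = masks * H_s[None, :, :]
S_sep = W_s @ H_sep                  # shape (C, F, T)
```

Because the soft masks sum to one across sources, the separated spectra in this sketch add up to the speech-only estimate `W_s @ H_s`; the noise part `W_n @ H_n` is simply left out of the reconstruction.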

Authors: GE Wanying, ZHANG Tianqi, FAN Congcong, ZHANG Tian

Affiliation: School of Communication and Information Engineering

Source: Chinese Journal of Acoustics (声学学报（英文版）), CSCD, 2021, No. 2, pp. 266-280 (15 pages)

Funding: Supported by the National Natural Science Foundation of China (61671095, 61702065, 61701067, 61771085), the Project of Key Laboratory of Signal and Information Processing of Chongqing (CSTC2009CA2003), the Chongqing Graduate Research and Innovation Project (CYS17219), and the Research Project of Chongqing Educational Commission (KJ1600427, KJ1600429).

Keywords: ATTRACTOR, SEPARATION, FACTORIZATION

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]


