Emotional speech conversion based on support vector regression (SVR)  Cited by: 2
Abstract  This paper proposes a new method for emotional speech conversion based on support vector regression (SVR). Prosodic features are extracted from Mandarin speech in 10 emotions, and the prosodic differences between neutral and emotional speech are compared and analyzed. SVR is then used to build prediction models for prosodic parameters such as pitch (fundamental frequency), duration, energy, and pauses, and the Straight algorithm is used to convert neutral speech into emotional speech. The 10 kinds of emotional speech converted with this method obtain an emotional mean opinion score (EMOS) of 3.4, which indicates that the converted speech can express emotion.
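The idea of learning a mapping from a neutral prosodic parameter to its emotional counterpart can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear model with one input feature, fitted by subgradient descent on the standard epsilon-insensitive SVR objective, whereas the paper does not state its kernel or solver here. The function names, hyperparameters, and the toy "normalized F0" data are all hypothetical.

```python
def eps_insensitive_loss(residual, eps):
    """SVR loss: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - eps)


def fit_linear_svr(xs, ys, eps=0.05, c=10.0, lr=0.01, epochs=2000):
    """Fit y ~ w*x + b by subgradient descent on
    0.5*w**2 + c * mean(eps_insensitive_loss(w*x + b - y, eps))."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = w, 0.0  # gradient of the regularizer 0.5*w**2
        for x, y in zip(xs, ys):
            residual = w * x + b - y
            if abs(residual) > eps:  # only points outside the tube contribute
                sign = 1.0 if residual > 0 else -1.0
                grad_w += c * sign * x / n
                grad_b += c * sign / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data: normalized neutral F0 values and made-up
# "emotional" targets that follow a simple linear relation.
neutral_f0 = [i / 19 for i in range(20)]
emotional_f0 = [1.2 * x + 0.1 for x in neutral_f0]

w, b = fit_linear_svr(neutral_f0, emotional_f0)
predicted = w * 0.5 + b  # predicted emotional F0 for a neutral value of 0.5
print(round(w, 2), round(b, 2), round(predicted, 2))
```

In a real system one such regressor would be trained per prosodic parameter and per emotion, typically with a kernel SVR from an established library, and the predicted parameters would then drive the speech modification and resynthesis stage.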
Source  Journal of Northwest Normal University (Natural Science), 2009, No. 1, pp. 62-66, 93 (6 pages in total). Indexed in: CAS, Peking University Core Journals
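Funding  Key Scientific Research Project of the Ministry of Education (208146); Northwest Normal University Research Backbone Cultivation Project (NWNU-KJCXGC-03-42)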
Keywords  SVR; emotional speech conversion; Straight algorithm
References (10)

  • 1. IIDA A, CAMPBELL N, HIGUCHI F, et al. A corpus-based speech synthesis system with emotion [J]. Speech Communication, 2003, 40(1): 161-187.
  • 2. TODA T, BLACK A W, TOKUDA K. Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter [J]. Acoustics, Speech, and Signal Processing, 2005, 30(1): 9-12.
  • 3. 彭柏, 许刚. Controlling formants in voice conversion by spectrum shifting [J]. Audio Engineering, 2007, 31(1): 39-43. Cited by: 2
  • 4. TAO J H, KANG Y G, LI A J. Prosody conversion from neutral speech to emotional speech [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(4): 1145-1154.
  • 5. NI J, HIRAI T, KAWAI H. Constructing phonetic-rich speech corpus while controlling time-dependent voice quality variability for English speech synthesis [C]// ICASSP 2006 Proceedings, IEEE International Conference, 2006: 881-884.
  • 6. BUSSO C, YILDIRIM S, KAZEMZADEH A. Investigating the role of phoneme-level modifications in emotional speech re-synthesis [C]// Proceedings of the EUROSPEECH, Interspeech, Lisbon, Portugal, 2005: 801-804.
  • 7. KAWAHARA H. STRAIGHT, exploration of the other aspect of VOCODER: perceptually isomorphic decomposition of speech sounds [J]. Acoustical Science and Technology, 2006, 27(6): 349-353.
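  • 8. 蒋丹宁, 蔡莲红. Classification of Chinese emotional speech based on prosodic features [C]// Proceedings of the 1st Chinese Conference on Affective Computing and Intelligent Interaction. Beijing: Institute of Automation, Chinese Academy of Sciences, 2003: 122-124.
  • 9. 李爱军. Research on Chinese emotional speech [C/OL]. (2006-11-08) [2007-03-05]. http://www.corpus4u.org/archive/index.php/f-80.html.
  • 10. CRISTIANINI N, SHAWE-TAYLOR J. An Introduction to Support Vector Machines [M]. 李国正, 王猛, 曾华军, trans. Beijing: Publishing House of Electronics Industry, 2004.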

Second-level references (12)

  • 1. CHILDERS D G, WU K. Quality of speech produced by analysis-synthesis [J]. Speech Communication, 1990, 9: 97-117.
  • 2. 吴宗济. Spectrograms of Mandarin Monosyllables [M]. Beijing: China Social Sciences Press, 1986.
  • 3. CHILDERS D G, WU K. Gender recognition from speech, Part 2: coarse analysis [J]. J. Acoust. Soc. Am., 1991, 90(4): 1851-1856.
  • 4. 吕士楠, 周同春, 谢咏圭. A preliminary study on synthesizing Chinese with the KLATT synthesizer [C]// Proceedings of the 2nd National Conference on Man-Machine Speech Communication. Beijing: Jincheng Press, 1992: 281-286.
  • 5. MUSTAFA K, BRUCE I C. Robust formant tracking for continuous speech with speaker variability [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(2): 435-444.
  • 6. RAO A, KUMARESAN R. On decomposing speech into modulated components [J]. IEEE Transactions on Speech and Audio Processing, 2000, 8(3): 240-254.
  • 7. BRUCE I C, KARKHANIS N V, YOUNG E D, et al. Robust formant tracking in noise [C]// Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. [S.l.]: IEEE Press, 2002, 1: 281-284.
  • 8. KANG G S, FRANSEN L J. Application of line-spectrum pairs to low-bit-rate speech encoders [C]// Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. [S.l.]: IEEE Press, 1985, 10: 244-247.
  • 9. SOONG F K, JUANG B H. Line spectrum pair and speech data compression [C]// Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. [S.l.]: IEEE Press, 1984, 9(1): 37-40.
  • 10. PALIWAL K K. A study of LSP representation for speaker-dependent and speaker-independent HMM-based speech recognition systems [C]// Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. [S.l.]: IEEE Press, 1992, 1: 97-100.

Co-citing literature (2)

Co-cited literature (32)

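  • 1. 高莉琴. Several issues in second language acquisition as seen from Uyghur speakers learning Chinese [J]. Applied Linguistics, 1994(1): 79-85. Cited by: 5
  • 2. 徐俊, 蔡莲红. Hierarchical prosody analysis and modeling for emotional conversion [J]. Journal of Tsinghua University (Science and Technology), 2009(S1): 1274-1277. Cited by: 7
  • 3. 蔡莲红, 崔丹丹, 蔡锐. Construction and analysis of TH-CoSS, a Mandarin Chinese speech synthesis corpus [J]. Journal of Chinese Information Processing, 2007, 21(2): 94-99. Cited by: 12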
  • 4. Zen H, Tokuda K, Black A W. Statistical parametric speech synthesis [J]. Speech Communication, 2009, 51(11): 1039-1064.
  • 5. Yamagishi J, Kobayashi T, Nakano Y, et al. Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2009, 17(1): 66-83.
  • 6. Nose T, Tachibana M, Kobayashi T. HMM-based style control for expressive speech synthesis with arbitrary speaker's voice using model adaptation [J]. IEICE Trans on Inf & Syst, 2009, E92-D(3): 489-497.
  • 7. Yang Hongwu, Meng H M, Cai Lianhong. Modeling the acoustic correlates of expressive elements in text genres for expressive text-to-speech synthesis [C]// Proceedings of International Conference on Spoken Language Processing. Pittsburgh, USA: [s.n.], 2006: 1806-1809.
  • 8. Wu Zhiyong, Meng H M, Yang Hongwu, et al. Modeling the expressivity of input text semantics for Chinese text-to-speech synthesis in a spoken dialog system [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2009, 17(8): 1567-1577.
  • 9. 崔丹丹. Research on emotional speech analysis and conversion [D]. Beijing: Tsinghua University, 2007.
  • 10. Guo Weitong, Yang Hongwu, Pei Dong, et al. Prosody conversion of Chinese northwest mandarin dialect based on five degree tone model [J]. International Journal of Digital Content Technology and its Applications, 2012, 6(17): 323-332.

Citing literature (2)

Second-level citing literature (5)
