
Prosody conversion of emotional speech based on the PAD three-dimensional emotion model (cited by: 3)
Abstract: This paper proposes a method for prosody conversion of emotional speech based on the PAD three-dimensional emotion model. Eleven typical emotions were selected, a text corpus was designed, and a speech corpus was recorded. The PAD values of the recorded utterances were annotated using psychological methods, and a five-degree tone model was used to model the pitch (F0) contour of emotional speech at the syllable level. On this basis, a Generalized Regression Neural Network (GRNN) was used to build a prosody conversion model that predicts the prosodic features of emotional speech (pitch contour, duration, and pause duration) from the PAD values of the emotion and the context parameters of the sentence. Speech is then re-synthesized with the STRAIGHT algorithm using the modified pitch contour, duration, and pause duration. Subjective evaluation shows that the speech converted to the 11 emotions achieves an average Emotional Mean Opinion Score (EMOS) of 3.6 and conveys the corresponding emotions.
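The prediction step described in the abstract can be illustrated with a minimal sketch of a GRNN in its standard form, that is, a Gaussian-kernel-weighted average of the training targets. This is not the authors' implementation. The feature layout (P, A, D plus one context value), the two output scale factors, the toy data, and the name `grnn_predict` are all assumptions made for illustration; the abstract does not specify the actual context parameters or network settings.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Predict an output vector for `query` as a kernel-weighted average.

    Each training input contributes its target, weighted by a Gaussian
    kernel of its squared Euclidean distance to the query. `sigma` is the
    smoothing (spread) parameter.
    """
    weights = []
    for x in train_x:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))

    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")

    n_out = len(train_y[0])
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(n_out)
    ]


# Made-up toy data (assumption): inputs are (P, A, D, context value),
# targets are (F0 scale factor, duration scale factor).
TRAIN_X = [
    (0.0, 0.0, 0.0, 0.0),
    (0.8, 0.6, 0.4, 0.0),
    (-0.6, -0.4, -0.3, 0.0),
]
TRAIN_Y = [
    (1.0, 1.0),
    (1.3, 0.9),
    (0.85, 1.2),
]

# Querying at a training point with a small sigma returns roughly that
# point's target; a larger sigma blends the targets of nearby points.
prediction = grnn_predict(TRAIN_X, TRAIN_Y, (0.8, 0.6, 0.4, 0.0), sigma=0.2)
print(prediction)
```

Because the output is a weighted average with non-negative weights, each predicted value stays within the range of the corresponding training targets. In a pipeline like the one the abstract describes, predicted prosodic parameters of this kind would then be passed to the re-synthesis stage.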
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2013, No. 5, pp. 230-235 (6 pages)
Funding: National Natural Science Foundation of China (No. 61263036, No. 60875015); Natural Science Foundation of Gansu Province (No. 1107RJZA112, No. 1208RJYA078)
Keywords: PAD emotion model; five-degree tone model; Generalized Regression Neural Network (GRNN); STRAIGHT algorithm; prosody conversion
