
Uyghur speech recognition based on deep neural network (cited by: 13)
Abstract: Speech recognition is currently implemented mainly with hidden Markov models (HMMs). Once triphones are taken into account, the number of model parameters grows sharply, and with limited training data these parameters cannot be trained well, which lowers the recognition rate. To improve the recognition rate, a speech recognition method based on a deep neural network (DNN) is proposed. Using Kaldi as the test platform, a neural network with four hidden layers was trained, and the resulting model was used for Uyghur speech recognition. Experimental results show that, compared with the basic monophone HMM and the triphone HMM, the DNN model reduces the Uyghur speech recognition error rate by 31.09% and 8.68%, respectively, and that all existing model optimization algorithms remain valid for this model.
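Source: Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2015, No. 8, pp. 2239-2244 (6 pages)
Funding: National Natural Science Foundation of China (61365005, 60965002); Xinjiang University Doctoral Graduate Research Start-up Fund (2014211B009); Xinjiang University Autonomous Region Natural Science Foundation Project (BS120124)
Keywords: speech recognition; model; deep neural network; triphone; hidden Markov model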

References (10)

  • 1. Nasirjan Tursun, Wushour Silamu. HMM-based Uyghur continuous speech recognition system [D]. Urumqi: Xinjiang University, 2008: 272-278.
  • 2. Ng A, Ngiam J, Foo C Y, et al. Unsupervised feature learning and deep learning [R]. deeplearning.stanford.edu/wiki/index.php, 2013.
  • 3. Yu D, Deng L. Deep learning and its relevance to signal and information processing [J]. IEEE Signal Processing Magazine, 2011, 28 (1): 145-154.
  • 4. Dahl G, Yu D, Deng L, et al. Context-dependent pre-trained deep neural networks for large vocabulary speech recognition [J]. IEEE Transactions on Audio, Speech and Language Processing, 2012, 20 (1): 34-42.
  • 5. Glorot X, Bengio Y. Understanding the difficulty of training deep feed-forward neural networks [J]. JMLR W&CP, 2010, 9: 249-256.
  • 6. Erhan D, Bengio Y, Courville A, et al. Why does unsupervised pre-training help deep learning? [J]. Machine Learning Research, 2010, 12: 201-208.
  • 7. Hinton G. A practical guide to training restricted Boltzmann machines [G]. LNCS 7700: Neural Networks: Tricks of the Trade, 2010.
  • 8. Yu D, Deng L. Efficient and effective algorithms for training single-hidden-layer neural network [J]. Pattern Recognition Letters, 2012, 33 (5): 554-558.
  • 9. Salakhutdinov R, Hinton G. A better way to pretrain deep Boltzmann machines [C] //NIPS Proceedings, 2012: 2456-2464.
  • 10. Povey D, Burget L. The subspace Gaussian mixture model: A structured model for speech recognition [J]. Computer Speech & Language, 2011, 25 (2): 404-439.

Co-cited documents: 82

Citing documents: 13

Second-level citing documents: 89
