Abstract
Speech recognition is currently implemented mainly with hidden Markov models (HMMs). Once triphones are taken into account, the number of model parameters increases sharply, and with limited training data these parameters cannot be trained well, which lowers the recognition rate. To improve the recognition rate, a speech recognition method based on a deep neural network (DNN) is proposed. Using Kaldi as the test platform, a neural network with four hidden layers was trained, and the resulting model was applied to Uyghur speech recognition. Experimental results show that, compared with the baseline monophone HMM and the triphone HMM, the DNN model reduces the Uyghur speech recognition error rate by 31.09% and 8.68%, respectively, and that all existing model optimization algorithms remain valid for this model.
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2015, Issue 8, pp. 2239-2244 (6 pages)
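Funding
National Natural Science Foundation of China (61365005, 60965002)
Xinjiang University Doctoral Graduate Research Start-up Fund (2014211B009)
Xinjiang University Autonomous Region Natural Science Foundation Project (BS120124)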
Keywords
speech recognition
model
deep neural network
triphone
hidden Markov model