Abstract
The typical hidden Markov model (HMM) is highly sensitive to its initial parameters. When trained from random parameters it often becomes trapped in a local optimum, and it performs poorly when applied to Web information extraction. This article proposes a Web information extraction algorithm based on simulated annealing (SA) and the HMM. The best SA parameters are selected by experimental comparison, and SA is combined with the Baum-Welch algorithm to optimize the HMM, which is then applied to Web information extraction. Experimental results show that the new algorithm markedly improves both the precision and the recall of information extraction.
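The abstract does not give the details of how SA and Baum-Welch are combined, so the following is only a minimal sketch of one plausible hybrid, not the article's algorithm. It assumes a discrete HMM, a single observation sequence of integer symbol ids, a geometric cooling schedule, and a Metropolis acceptance rule on the log-likelihood. All function names and parameter values (`sa_baum_welch`, `t0`, `cooling`, `sigma`, the `floor` smoothing constant) are hypothetical choices made for illustration.

```python
import math
import random

def normalize(row):
    s = sum(row)
    return [x / s for x in row]

def random_hmm(n_states, n_symbols, rng):
    """Draw random (pi, A, B); every row is strictly positive and sums to 1."""
    rows = lambda r, c: [normalize([rng.random() + 1e-3 for _ in range(c)])
                         for _ in range(r)]
    return rows(1, n_states)[0], rows(n_states, n_states), rows(n_states, n_symbols)

def forward(model, obs):
    """Scaled forward pass; returns (alphas, scales, log-likelihood)."""
    pi, A, B = model
    n, alphas, scales, prev = len(pi), [], [], None
    for t, o in enumerate(obs):
        if t == 0:
            a = [pi[i] * B[i][o] for i in range(n)]
        else:
            a = [sum(prev[j] * A[j][i] for j in range(n)) * B[i][o]
                 for i in range(n)]
        c = sum(a)
        prev = [x / c for x in a]
        alphas.append(prev)
        scales.append(c)
    return alphas, scales, sum(math.log(c) for c in scales)

def baum_welch_step(model, obs, floor=1e-6):
    """One Baum-Welch re-estimation; `floor` is an assumed smoothing constant."""
    pi, A, B = model
    n, m, T = len(pi), len(B[0]), len(obs)
    alphas, scales, _ = forward(model, obs)
    betas = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            betas[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * betas[t + 1][j]
                              for j in range(n)) / scales[t + 1]
    gamma = [[alphas[t][i] * betas[t][i] for i in range(n)] for t in range(T)]
    new_pi = normalize([g + floor for g in gamma[0]])
    new_A, new_B = [], []
    for i in range(n):
        # Expected transition counts i -> j, summed over time.
        xi = [floor + sum(alphas[t][i] * A[i][j] * B[j][obs[t + 1]]
                          * betas[t + 1][j] / scales[t + 1]
                          for t in range(T - 1)) for j in range(n)]
        new_A.append(normalize(xi))
        emit = [floor] * m
        for t in range(T):
            emit[obs[t]] += gamma[t][i]
        new_B.append(normalize(emit))
    return new_pi, new_A, new_B

def refine(model, obs, iters):
    for _ in range(iters):
        model = baum_welch_step(model, obs)
    return model

def perturb(model, sigma, rng):
    """Gaussian jitter on every row, clipped to stay positive, then renormalized."""
    jitter = lambda row: normalize([max(x + rng.gauss(0, sigma), 1e-6) for x in row])
    pi, A, B = model
    return jitter(pi), [jitter(r) for r in A], [jitter(r) for r in B]

def sa_baum_welch(obs, n_states, n_symbols, t0=1.0, cooling=0.9,
                  steps=30, sigma=0.1, seed=0):
    """Annealed search over HMM parameters; each candidate gets 3 Baum-Welch steps."""
    rng = random.Random(seed)
    current = refine(random_hmm(n_states, n_symbols, rng), obs, 3)
    cur_ll = forward(current, obs)[2]
    best, best_ll, temp = current, cur_ll, t0
    for _ in range(steps):
        cand = refine(perturb(current, sigma, rng), obs, 3)
        cand_ll = forward(cand, obs)[2]
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if cand_ll >= cur_ll or rng.random() < math.exp((cand_ll - cur_ll) / temp):
            current, cur_ll = cand, cand_ll
            if cur_ll > best_ll:
                best, best_ll = current, cur_ll
        temp *= cooling  # geometric cooling (an assumed schedule)
    return best, best_ll
```

In an extraction setting, the hidden states would presumably correspond to the fields to be extracted and the observation symbols to token classes from the page, so a call such as `sa_baum_welch(token_ids, n_states=4, n_symbols=20)` would return the best model found and its log-likelihood. Accepting occasional worse candidates at high temperature is what lets the search leave the local optimum that plain Baum-Welch would stay in.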
Source
Journal of University of South China: Science and Technology (《南华大学学报(自然科学版)》)
2011, No. 1, pp. 70-74 (5 pages)
Funding
Supported by a fund project of the Education Department of Hunan Province (O7C637)
Keywords
simulated annealing algorithm
hidden Markov model
Web information extraction