
Domain-specific concept discovery based on conditional random fields and information entropy (cited by: 5)

New words discovery method based on CRF and information entropy in specific domain
Abstract To address the automatic identification of existing concepts and the discovery of new concepts in a specific domain, this paper proposes an extraction method based on conditional random fields (CRF) and information entropy. A CRF first predicts the boundaries of concept words in the text. The predicted concepts are compared against the concepts in an existing dictionary to select new-concept candidates and locate their approximate positions in the text. Mutual information and left and right entropy are then used to judge, respectively, the internal cohesion and the boundary freedom of the concept within the concept window, thereby discovering new domain-specific concepts. Experiments show that this method performs better at concept discovery than using a CRF alone: the accuracy of concept discovery with the character-based and word-based models improves by 20.06% and 46.54%, respectively.
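The two statistical checks named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact formulas, so the pointwise mutual information estimate, the character-level neighbour counting, the function names, and both thresholds (`min_pmi`, `min_entropy`) are assumptions made for the example. The CRF step is omitted; the candidate is assumed to have already been proposed and split into two adjacent parts.

```python
import math
from collections import Counter


def occurrences(text, s):
    """Return the start index of every (possibly overlapping) match of s."""
    positions = []
    start = text.find(s)
    while start != -1:
        positions.append(start)
        start = text.find(s, start + 1)
    return positions


def pmi(text, left, right):
    """Pointwise mutual information of two adjacent parts (internal cohesion).

    Probabilities are rough estimates: match count divided by text length.
    """
    n = len(text)
    p_joint = len(occurrences(text, left + right)) / n
    p_left = len(occurrences(text, left)) / n
    p_right = len(occurrences(text, right)) / n
    if p_joint == 0:
        return float("-inf")
    return math.log2(p_joint / (p_left * p_right))


def branching_entropy(text, candidate, side):
    """Entropy of the characters adjacent to the candidate (boundary freedom)."""
    neighbours = Counter()
    for pos in occurrences(text, candidate):
        idx = pos - 1 if side == "left" else pos + len(candidate)
        if 0 <= idx < len(text):
            neighbours[text[idx]] += 1
    total = sum(neighbours.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in neighbours.values())


def is_new_concept(text, left, right, dictionary, min_pmi=1.0, min_entropy=0.9):
    """Accept a candidate that is not in the dictionary, is internally cohesive,
    and has varied neighbours on both sides. Thresholds are hypothetical."""
    candidate = left + right
    if candidate in dictionary:
        return False
    cohesive = pmi(text, left, right) >= min_pmi
    free = min(
        branching_entropy(text, candidate, "left"),
        branching_entropy(text, candidate, "right"),
    ) >= min_entropy
    return cohesive and free
```

Taking the minimum of the left and right entropy is one common design choice: a candidate whose neighbour on either side is nearly always the same character is probably a fragment of a longer term rather than a complete concept.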
Authors Fu Yao (付瑶); Wan Jing (万静); Xing Lidong (邢立栋) (College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China; Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China)
Source Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal; 2020, No. 3, pp. 708-711, 730 (5 pages)
Funding National Key Technology R&D Program of China (2015BAK03B04).
Keywords concept recognition; new concept discovery; conditional random field; information entropy; specific domain