Abstract
Drawing on the principle of rotated keywords and combining it with relevant statistical models, this paper compresses and refines the original word-extracting dictionary from both the positive and the negative direction, in order to reduce dimensionality and express topics accurately. An automatic classification test on a large corpus of news texts shows that the reduction algorithm is feasible for constructing a core keyword dictionary.
Source
Information Studies: Theory & Application (《情报理论与实践》)
CSSCI
Peking University Core Journal (北大核心)
2007, No. 5, pp. 678-680 (3 pages)
Keywords
word-extracting dictionary (抽词词典)
rotated keywords (关键词轮排)
automatic classification (自动分类)
algorithm (算法)