Abstract
Topic detection technology can promptly identify hot topics and emergent events in online public opinion, track those topics continuously, and follow the development of public opinion events in real time. Text clustering is an important method for topic detection and tracking. The traditional K-Means clustering algorithm is structurally simple and converges quickly, but it is sensitive to the choice of initial cluster centers and tends to become trapped in local optima. This paper introduces the differential evolution algorithm to improve K-Means. The improved algorithm combines the global optimization ability of differential evolution with the simplicity and efficiency of K-Means, and balances the accuracy and real-time requirements of online public opinion topic detection. Experiments show that the false detection rate, missed detection rate, and cost function of the improved algorithm all improve markedly, which supports its effectiveness and practicality for accurate topic detection.
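The abstract does not give the details of the hybrid algorithm, so the following is only a minimal sketch of one common way to combine the two methods, not the paper's implementation. It assumes that document vectors are already available as rows of a NumPy array `X`, that each differential evolution individual encodes a full set of `k` candidate centers, that fitness is the within-cluster sum of squared errors, and that the DE/rand/1/bin variant is used. The names `sse`, `kmeans`, `de_kmeans` and the parameters `pop_size`, `generations`, `F`, `CR` are illustrative choices, not taken from the paper.

```python
import numpy as np


def sse(centers, X):
    """Sum of squared distances from each point to its nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def kmeans(X, centers, iters=50):
    """Plain Lloyd iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)


def de_kmeans(X, k, pop_size=20, generations=30, F=0.5, CR=0.9, seed=0):
    """Search for good initial centers with DE/rand/1/bin, then refine with K-Means.

    Assumed design (not from the paper): fitness is the SSE of a candidate
    set of centers; pop_size must be at least 4.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Each individual is a (k, d) array of centers drawn from random data points.
    pop = np.stack([X[rng.choice(n, size=k, replace=False)] for _ in range(pop_size)])
    fit = np.array([sse(ind, X) for ind in pop])

    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.choice(others, size=3, replace=False)
            mutant = pop[a] + F * (pop[b] - pop[c])      # mutation
            mask = rng.random((k, d)) < CR               # binomial crossover
            mask.flat[rng.integers(k * d)] = True        # take at least one mutant gene
            trial = np.where(mask, mutant, pop[i])
            f_trial = sse(trial, X)
            if f_trial <= fit[i]:                        # greedy selection
                pop[i], fit[i] = trial, f_trial

    best = pop[fit.argmin()]
    return kmeans(X, best)
```

Calling `centers, labels = de_kmeans(X, k=3)` returns the refined centers and a cluster label per row of `X`. In this sketch the evolutionary stage only supplies the starting centers, which targets the initialization sensitivity mentioned above, while the final K-Means pass keeps the refinement cheap.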
Author
LI Li-rong (李丽蓉), Shanxi Police College, Taiyuan, Shanxi 030401
Source
Journal of Shanxi Police College (《山西警察学院学报》)
2021, No. 1, pp. 69-72 (4 pages)
Funding
Shanxi Province "1331 Project" Key Discipline Construction Program (1331KSC)
Shanxi Police College Innovation Team Program
Keywords
Internet public opinion
text clustering
topic detection