Abstract
【Purpose/significance】 Clustering online public opinion events not only makes public opinion information more hierarchical and organized, but also supports follow-up research such as personalized recommendation of public opinion events. 【Method/process】 Network representation learning (NRL) is combined with K-means, and public opinion events are clustered in four stages: event collection, event co-occurrence frequency analysis, event dimensionality-reduction mapping, and cluster analysis. After the events are collected, an event co-occurrence matrix is constructed from the co-occurrence relationships between events, and NRL algorithms are used to obtain a low-dimensional vector representation of each event. K-means is then applied: the number of groups is determined and the initial clusters are assigned; each cluster center is updated to the mean of the low-dimensional vectors of the events in that cluster; and the process iterates until clustering is complete. 【Result/conclusion】 An empirical study on 220 public opinion events already classified by the 蚁坊 (Yifang) public opinion monitoring software shows that K-means clustering incorporating NRL achieves good clustering results. 【Innovation/limitation】 Building on the mining of public opinion events, a K-means clustering method that incorporates network representation learning is proposed, yielding clearly organized public opinion events. However, the amount of data available to individual researchers is limited, which makes an optimal clustering result difficult to achieve; internet information platforms hold massive user data and could achieve better clustering results in support of follow-up work such as personalized recommendation.
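The pipeline described in the abstract (co-occurrence matrix, low-dimensional event vectors, K-means) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the NRL algorithm, so a truncated SVD of the co-occurrence matrix is swapped in here as a stand-in embedding. The toy documents, the function names, and the deterministic farthest-point initialization are all assumptions made for illustration.

```python
import numpy as np
from itertools import combinations

def cooccurrence_matrix(docs, n_events):
    """Count how often each pair of events is mentioned in the same document."""
    C = np.zeros((n_events, n_events))
    for events in docs:
        for i, j in combinations(sorted(set(events)), 2):
            C[i, j] += 1
            C[j, i] += 1
    return C

def embed(C, dim):
    """Stand-in for the NRL step (assumption): truncated SVD of C."""
    U, S, _ = np.linalg.svd(C)
    return U[:, :dim] * np.sqrt(S[:dim])

def initial_centers(X, k):
    """Assumed initialization: start at X[0], then repeatedly add the
    point farthest from the centers chosen so far."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    return np.array(centers)

def kmeans(X, k, n_iter=100):
    """Lloyd's iteration: assign each event to the nearest center, then
    move each center to the mean of the vectors assigned to it."""
    centers = initial_centers(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical toy data: each inner list holds the event ids mentioned together.
docs = [[0, 1], [1, 2], [0, 2], [0, 1, 2],
        [3, 4], [4, 5], [3, 5], [3, 4, 5]]
C = cooccurrence_matrix(docs, n_events=6)
X = embed(C, dim=2)
labels, centers = kmeans(X, k=2)
print(labels)
```

On this toy input, events 0 to 2 only co-occur with each other, as do events 3 to 5, so the two groups should end up in separate clusters. With real data the number of clusters and the embedding dimension would need to be chosen empirically.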
Authors
田世海
董月文
王健
TIAN Shi-hai; DONG Yue-wen; WANG Jian (School of Economics and Management, Harbin University of Science and Technology, Harbin 150040, China; School of Mechanical & Power Engineering, Harbin 150040, China)
Source
《情报科学》
CSSCI
PKU Core Journal (北大核心)
2021, Issue 2, pp. 129-136 (8 pages)
Information Science
Funding
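Natural Science Foundation of Heilongjiang Province, project "Research on the Guidance Mechanism for Online Public Opinion on Emergencies in the Era of Converged Media" (LH2019G017)
Heilongjiang Province Social Science Research Planning Project, "Research on the Cloud Service Model of the Heilongjiang Province Big Data Industry Alliance" (16GLB01).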