Abstract
This paper presents a community discovery method based on the content similarity of Web pages and the link relations between them. The method considers not only the hyperlinks between Web pages but, with particular emphasis, their similarity in content. It thereby overcomes the limitation of traditional community discovery algorithms, which ignore page content, so that the discovered communities are more coherent in content. Starting from an original community, the method also adds members dynamically: newly appearing Web pages that both have a link relation with pages in the original community and are relevant to its topic are added to that community. Experiments show that the method can be applied effectively to community discovery on the Web and that the resulting communities are more relevant in content.
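The dynamic-addition step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how content similarity is measured or what threshold is used, so everything concrete here is an assumption made for illustration: cosine similarity over raw term counts, the hypothetical function names `cosine_similarity` and `extend_community`, the `threshold` value of 0.3, and the toy pages. It is not the authors' implementation.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw term counts (an assumed measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extend_community(community, candidates, links, texts, threshold=0.3):
    """Return the community plus every candidate page that (1) shares a link,
    in either direction, with an original member and (2) reaches the assumed
    similarity threshold against at least one original member."""
    extended = set(community)
    for page in candidates:
        linked = any((page, m) in links or (m, page) in links for m in community)
        if not linked:
            continue  # no link relation with the original community
        best = max(cosine_similarity(texts[page], texts[m]) for m in community)
        if best >= threshold:
            extended.add(page)
    return extended


# Toy data: page ids, texts, and links are invented for this example.
texts = {
    "a": "python web crawler tutorial",
    "b": "web crawler design in python",
    "c": "python crawler tips for the web",   # linked and on topic
    "d": "chocolate cake recipe",             # linked but off topic
    "e": "python web crawler examples",       # on topic but not linked
}
links = {("a", "b"), ("c", "a"), ("d", "b")}  # directed (source, target) pairs
result = extend_community({"a", "b"}, ["c", "d", "e"], links, texts)
print(sorted(result))
```

In this toy run only page `c` joins the community: `d` is linked but unrelated in content, and `e` is related in content but has no link to a member, which mirrors the two conditions the abstract names.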
Source
《郑州大学学报(理学版)》
CAS
Peking University Core Journals (北大核心)
2011, Issue 1, pp. 75-79 (5 pages)
Journal of Zhengzhou University: Natural Science Edition
Funding
Key Scientific Research Project of the Hebei Provincial Department of Education
No. ZH200804
Keywords
community identification
similarity
link relation