Abstract
Searching heterogeneous information networks (HINs) for the community that contains a given query node has broad applications, such as friend recommendation and epidemic monitoring. Most existing HIN community search methods impose a strict requirement on the community's topology through a predefined subgraph pattern and ignore the attribute similarity between nodes, so communities with weak structural relationships but high attribute similarity are hard to locate. In addition, the global search mode these methods adopt struggles to handle large-scale network data effectively. To address these problems, we first design a disentangled graph neural network and a meta-path-based local modularity to measure the attribute similarity and the structural cohesion between nodes, respectively. We then use the 0/1 knapsack problem to optimize these two cohesion measures (attribute and structure), define the most valuable c-size community search problem, and propose a value-maximization community search model based on the disentangled graph neural network that performs a three-stage search. In the first stage, candidate subgraphs are constructed from the query information and the meta-path, which confines the search to the local neighborhood of the query node and keeps the whole model efficient. In the second stage, the disentangled graph neural network fuses heterogeneous graph information with user label information to compute the attribute similarity between nodes. In the third stage, a greedy algorithm, designed according to the community definition and the cohesion measures, finds the c-size community with high attribute similarity and structural cohesion. Finally, we test the model on real homogeneous and heterogeneous network datasets, and extensive experimental results verify its effectiveness and efficiency.
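The local-then-greedy search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model, and everything in it is an assumption made for illustration: the function names `candidate_subgraph` and `greedy_community`, the hop-limited breadth-first search standing in for meta-path-based candidate construction, the precomputed `sim` dictionary standing in for similarities produced by a disentangled graph neural network, and the `alpha`-weighted gain standing in for the knapsack-based combination of attribute similarity and meta-path-based local modularity.

```python
from collections import deque


def candidate_subgraph(adj, query, hops):
    """Return the nodes within `hops` steps of `query` (breadth-first search).

    Assumption: a hop limit replaces the meta-path-based construction.
    """
    depth = {query: 0}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        if depth[node] == hops:
            continue  # do not expand beyond the local range
        for nbr in adj.get(node, ()):
            if nbr not in depth:
                depth[nbr] = depth[node] + 1
                queue.append(nbr)
    return set(depth)


def greedy_community(adj, sim, query, c, hops=2, alpha=0.5):
    """Greedily grow a community of at most `c` nodes around `query`.

    `sim[n]` is a precomputed similarity between node n and the query
    (hypothetical input). `alpha` trades attribute similarity against
    a simple link-density term (an assumed stand-in for local modularity).
    """
    candidates = candidate_subgraph(adj, query, hops)
    community = {query}
    while len(community) < c:
        # Candidate nodes adjacent to the current community.
        frontier = {
            n for m in community for n in adj.get(m, ()) if n in candidates
        } - community
        if not frontier:
            break  # nothing left to add inside the local range

        def gain(n):
            links = sum(1 for m in adj.get(n, ()) if m in community)
            return alpha * sim.get(n, 0.0) + (1 - alpha) * links / len(community)

        # Sorting first makes tie-breaking deterministic.
        community.add(max(sorted(frontier), key=gain))
    return community
```

Restricting the frontier to the candidate set is what keeps the search local: a node with high similarity but outside the hop limit is never considered, which mirrors the efficiency argument made for the first stage.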
Authors
陈伟
周丽华
王亚峰
王丽珍
陈红梅
CHEN Wei; ZHOU Lihua; WANG Yafeng; WANG Lizhen; CHEN Hongmei (School of Information Science and Engineering, Yunnan University, Kunming 650500, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2024, Issue 3, pp. 90-101 (12 pages)
Computer Science
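Funding
National Natural Science Foundation of China (62062066, 61762090, 61966036, 62276227)
Key Project of the Yunnan Fundamental Research Program (202201AS070015)
Yunnan Key Laboratory of Intelligent Systems and Computing (202205AG070003)
Blockchain and Data Security Governance Engineering Research Center of the Yunnan Provincial Department of Education
Key Laboratory of Internet of Things Technology and Application of Universities in Yunnan Province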
Keywords
Heterogeneous information networks
Community search
Disentangled graph neural network
Meta-paths
Local modularity