
Shared nearest neighbor affinity based clustering algorithm
Cited by: 3
Abstract: To address the inaccurate results that density-based clustering algorithms produce on high-dimensional and multi-density datasets, a clustering algorithm based on Shared Nearest Neighbor Affinity (SNNA) is proposed. The algorithm combines k nearest neighbors with shared nearest neighbors and defines the shared nearest neighbor affinity as the local density measure of an object. It first extracts core points according to their affinity, then clusters the core points with breadth-first search, and finally assigns each non-core point to a cluster, which completes the clustering of the whole dataset. Experimental results show that the algorithm can find clusters of arbitrary shape, size, and density. Compared with similar algorithms, SNNA achieves higher clustering accuracy on high-dimensional data.
Authors: QIU Baozhi (邱保志); XIN Hang (辛杭) (School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2018, No. 18, pp. 184-187, 222 (5 pages)
Funding: Henan Province Basic and Frontier Fund (No. 152300410191)
Keywords: clustering; density; shared nearest neighbor; affinity; data mining
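
The three stages named in the abstract (core-point extraction by affinity, breadth-first clustering of core points, assignment of non-core points) can be illustrated with a minimal Python sketch. The abstract does not give the affinity formula or the core-point criterion, so everything specific here is an assumption made for illustration and not the authors' definition: the affinity of a point is taken to be the total number of neighbors it shares with its own k nearest neighbors, core points are those whose affinity reaches a quantile threshold, two core points are linked when each is among the other's k nearest neighbors, and a non-core point joins the cluster of its nearest core point. The names `snna_sketch`, `k`, and `core_quantile` are hypothetical.

```python
from collections import deque

import numpy as np


def snna_sketch(X, k=5, core_quantile=0.5):
    """Illustrative three-stage clustering; NOT the paper's exact method."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")

    # Pairwise Euclidean distances; a point is never its own neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)

    # k nearest neighbors of every point (stable sort makes ties deterministic).
    knn = np.argsort(d, axis=1, kind="stable")[:, :k]
    nbr_sets = [set(row.tolist()) for row in knn]

    # ASSUMED affinity: neighbors shared with each of the point's own neighbors.
    affinity = np.array(
        [sum(len(nbr_sets[i] & nbr_sets[j]) for j in nbr_sets[i]) for i in range(n)],
        dtype=float,
    )

    # Stage 1 (ASSUMED criterion): core points have affinity >= a quantile.
    is_core = affinity >= np.quantile(affinity, core_quantile)

    # Stage 2: breadth-first search over core points linked by mutual kNN.
    labels = np.full(n, -1, dtype=int)
    cluster = 0
    for start in range(n):
        if not is_core[start] or labels[start] != -1:
            continue
        labels[start] = cluster
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in knn[i].tolist():
                if is_core[j] and labels[j] == -1 and i in nbr_sets[j]:
                    labels[j] = cluster
                    queue.append(j)
        cluster += 1

    # Stage 3 (ASSUMED rule): each non-core point joins its nearest core point.
    core_idx = np.flatnonzero(is_core)
    for i in np.flatnonzero(~is_core):
        labels[i] = labels[core_idx[np.argmin(d[i, core_idx])]]
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated Gaussian blobs as toy input.
    data = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(20, 1, (30, 2))])
    print(snna_sketch(data, k=5))
```

The dense distance matrix keeps the sketch short but costs O(n²) memory; a spatial index or approximate neighbor search would be the usual substitute for larger datasets.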
