Effective method for cluster centers' initialization in K-means clustering (cited by: 87)
Abstract: Because the traditional K-means algorithm selects its initial cluster centers at random, its clustering results fluctuate widely. The existing max-min distance method tends to select initial cluster centers that lie too close together, which easily leads to clustering conflicts. To address these problems, this paper improves on the max-min distance method and proposes the maximum distance product method. Building on the concept of density, the method selects as the next cluster center the high-density point whose product of distances to all previously initialized cluster centers is largest. Theoretical analysis and comparative experiments show that, compared with the traditional K-means algorithm and the max-min distance method, the proposed method converges faster, achieves higher accuracy, and is more stable.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2011, No. 11, pp. 4188-4190 (3 pages)
Funding: Supported by a Chongqing Science and Technology Commission fund project (2008BB2191)
Keywords: K-means algorithm; density-based clustering; initial clustering centers; max-min distance; maximum distances product
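
The selection rule described in the abstract can be illustrated with a minimal Python sketch. The abstract gives only the core rule (choose the high-density point whose product of distances to all already-initialized centers is largest), so everything else here is an assumption made for illustration and may differ from the paper: the function name `max_distance_product_init`, the `radius` parameter, density measured as the number of neighbors within `radius`, the mean density as the "high-density" threshold, and the densest point as the first center.

```python
import math


def max_distance_product_init(points, k, radius):
    """Pick k initial centers from `points` (a list of equal-length tuples)."""
    n = len(points)
    # Pairwise Euclidean distances between all points.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumed density measure: number of other points within `radius`.
    density = [
        sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
        for i in range(n)
    ]

    # Assumed threshold: a point counts as "high density" if its density
    # is at least the mean density over all points.
    mean_density = sum(density) / n
    candidates = [i for i in range(n) if density[i] >= mean_density]
    if len(candidates) < k:
        raise ValueError("fewer high-density points than requested centers")

    # Assumed seed: the densest candidate becomes the first center.
    chosen = [max(candidates, key=lambda i: density[i])]

    while len(chosen) < k:
        remaining = [i for i in candidates if i not in chosen]
        # Core rule from the abstract: take the high-density point whose
        # product of distances to every center chosen so far is largest.
        best = max(
            remaining,
            key=lambda i: math.prod(dist[i][c] for c in chosen),
        )
        chosen.append(best)

    return [points[i] for i in chosen]
```

The returned points could then be passed as starting centers to any standard K-means routine. Restricting the candidates to high-density points keeps isolated outliers from being picked, while the distance product penalizes a candidate that sits close to any single existing center, which is the spreading effect the abstract contrasts with the max-min distance method.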
