
Clustering validity analysis based on silhouette coefficient (cited by: 123)
Abstract: Several methods exist for studying the validity of clustering results. After comparing a number of these validity analysis methods, this paper proposes a new validity analysis method based on the silhouette coefficient and applies it to evaluating the K-means algorithm. Compared with other validity analysis methods, the proposed method gives a better judgement of clustering quality, which experimental results on standard datasets confirm. The method is further applied to text clustering.
Source: 《计算机应用》 (Journal of Computer Applications), CSCD, Peking University Core Journal (北大核心), 2010, No. 12, pp. 139-141, 198 (4 pages)
Funding: National Natural Science Foundation of China (60903099)
Keywords: clustering; K-means algorithm; silhouette coefficient; validity analysis; unsupervised
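The abstract does not spell out how the paper's validity measure is computed, so the following is only a minimal sketch of the standard silhouette coefficient as defined by Kaufman and Rousseeuw, not the authors' specific method. For a point i, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and s(i) = (b - a) / max(a, b). The function names, the Euclidean distance, the toy points, and the two labelings are illustrative assumptions.

```python
import math
from collections import defaultdict


def silhouette_values(points, labels):
    """Return the silhouette s(i) = (b - a) / max(a, b) for every point."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    values = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            values.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i)
        a /= len(own) - 1
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Average silhouette over all points; higher suggests better separation."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good_score = mean_silhouette(points, good_labels)
bad_score = mean_silhouette(points, bad_labels)
print(good_score > bad_score)
```

In a K-means setting, one would typically compute this average for the labelings produced by several candidate values of K and prefer the K with the highest score. Whether the paper uses exactly this selection rule is not stated in the abstract.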

References (9)

  • 1 BEZDEK J C. Pattern recognition with fuzzy objective function algorithms [M]. New York: Plenum Press, 1981.
  • 2 HAND D, MANNILA H, SMYTH P. Principles of data mining [M]. Cambridge: MIT Press, 2001.
  • 3 TAN PANG-NING, STEINBACH M, KUMAR V. Introduction to data mining [M]. Boston, MA: Addison-Wesley, 2006.
  • 4 CHEN DUO, LI XUE. An adaptive cluster validity index for the fuzzy C-means [J]. International Journal of Computer Science and Network Security, 2007, 7(2): 146-156.
  • 5 KAUFMAN L, ROUSSEEUW P J. Finding groups in data: an introduction to cluster analysis [M]. New York: John Wiley & Sons, 1990.
  • 6 UCI Machine Learning Repository [EB/OL]. [2010-02-25]. http://www.ics.uci.edu/~mlearn/MLRepository.html.
  • 7 YAO Qingyun, LIU Gongshen, LI Xiang. A text clustering algorithm based on the vector space model [J]. Computer Engineering, 2008, 34(18): 39-41. (in Chinese) Cited by: 50
  • 8 PENG Jing, YANG Dongqing, TANG Shiwei, FU Yan, JIANG Hankui. A text clustering algorithm based on a semantic inner product space model [J]. Chinese Journal of Computers, 2007, 30(8): 1354-1363. (in Chinese) Cited by: 45
  • 9 LIU Tao, WU Gongyi, CHEN Zheng. An efficient unsupervised feature selection algorithm for text clustering [J]. Journal of Computer Research and Development, 2005, 42(3): 381-386. (in Chinese) Cited by: 37

