A Document Clustering Method Based on NMFSC
Abstract Based on an analysis of the characteristics of text data, this paper proposes a new text clustering method built on non-negative matrix factorization with sparseness constraints (NMFSC). The method factorizes the term-document matrix with NMFSC to reduce the dimensionality of the feature space, using the sparseness constraint to control sparsity more effectively, and then refines the resulting clusters using the similarity between documents within each cluster. Experiments show that the method attains a higher normalized mutual information (NMI) value than text clustering based on k-means or on standard NMF, indicating good clustering performance.
Authors 王永贵 (Wang Yonggui), 高月 (Gao Yue)
Source 《计算机系统应用》 (Computer Systems & Applications), 2011, No. 9, pp. 78-81, 156 (5 pages in total)
Keywords text clustering; cluster refinement; non-negative matrix factorization; sparse representation; normalized mutual information
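
The pipeline outlined in the abstract (factorize the term-document matrix, assign each document to a cluster, score the result with NMI) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes plain NMF with Lee-Seung multiplicative updates in place of the sparseness-constrained NMFSC solver, reports Hoyer's sparseness measure only as a diagnostic, assumes the geometric-mean normalization for NMI, and omits the cluster-refinement step because the abstract gives no detail on it. All function names and the toy matrix are hypothetical.

```python
import numpy as np


def hoyer_sparseness(v):
    """Hoyer's sparseness in [0, 1] for a vector of length >= 2:
    0 for a constant vector, 1 for a vector with a single nonzero entry."""
    v = np.abs(np.asarray(v, dtype=float))
    n = v.size
    l1, l2 = v.sum(), np.sqrt((v ** 2).sum())
    if l2 == 0:
        return 0.0
    return float((np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1))


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF, X ~ W @ H, via multiplicative updates.
    Assumption: stands in for NMFSC; no sparseness projection is applied."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def nmi(a, b):
    """Normalized mutual information, I(A;B) / sqrt(H(A) * H(B)).
    Assumption: the paper may use a different normalization."""
    a, b = np.asarray(a), np.asarray(b)
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1)          # contingency counts
    joint /= a.size                        # joint distribution
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / np.outer(pa, pb)[nz])).sum()
    ha = -(pa * np.log(pa)).sum()
    hb = -(pb * np.log(pb)).sum()
    if ha == 0 or hb == 0:
        return 1.0 if ha == hb else 0.0
    return float(mi / np.sqrt(ha * hb))


# Toy term-document matrix: rows are terms, columns are documents.
X = np.array([[3.0, 2.0, 0.0, 0.0],
              [2.0, 3.0, 0.0, 0.0],
              [0.0, 0.0, 4.0, 1.0],
              [0.0, 0.0, 1.0, 4.0]])
W, H = nmf(X, k=2)
labels = H.argmax(axis=0)                  # cluster = dominant factor per document
print("cluster labels:", labels)
print("NMI vs. reference:", nmi(labels, [0, 0, 1, 1]))
print("sparseness of H rows:", [hoyer_sparseness(row) for row in H])
```

In this sketch each column of `H` holds a document's coordinates in the reduced `k`-dimensional space, so taking the largest coordinate is the simplest assignment rule. A faithful NMFSC would additionally project the factors onto a target Hoyer sparseness after each update.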
