Abstract
Subspace clustering is an important branch and an active research topic in cluster analysis; it addresses the data sparsity problem that arises when clustering high-dimensional data. This paper proposes a subspace clustering algorithm based on the k most similar clusters. The algorithm uses an inter-cluster similarity measure to retain the k most similar clusters, applies a different local density threshold in each subspace, and uses the k most similar clusters to determine the subspace search direction. It extends the supported data types to both continuous and categorical data and can effectively cluster high-dimensional data. Experimental results show that the algorithm achieves better clustering results than CLIQUE and SUBCLU.
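The step of retaining the k most similar clusters can be illustrated with a minimal sketch. The abstract does not state the paper's similarity measure, so Jaccard similarity over cluster member sets is used here purely as a stand-in; the function names, the cluster labels, and the sample data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
import heapq


def jaccard(a, b):
    """Jaccard similarity of two clusters given as sets of point ids.

    Assumption: this is a stand-in measure, not the one defined in the paper.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def k_most_similar_pairs(clusters, k):
    """Return the k cluster pairs with the highest similarity.

    clusters: dict mapping a (hypothetical) cluster label to a set of point ids.
    Result: list of (similarity, label_a, label_b), most similar first.
    """
    scored = (
        (jaccard(clusters[p], clusters[q]), p, q)
        for p, q in combinations(sorted(clusters), 2)
    )
    return heapq.nlargest(k, scored)


# Hypothetical one-dimensional clusters, each found on a single attribute.
clusters = {
    "age:low": {1, 2, 3, 4},
    "income:low": {2, 3, 4, 5},
    "city:A": {7, 8, 9},
    "income:high": {1, 9},
}

top = k_most_similar_pairs(clusters, k=2)
for score, a, b in top:
    print(f"{a} ~ {b}: {score:.2f}")
```

Under this reading, cluster pairs that share many points are the candidates worth combining into a higher-dimensional subspace, which is one plausible way the retained pairs could steer the subspace search direction described in the abstract.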
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, Peking University Core Journals
2009, Issue 14, pp. 4-6 (3 pages)
Funding
National Natural Science Foundation of China (Grant Nos. 70671016, 60873180, 60673066)
Keywords
clustering algorithm
subspace clustering
high-dimensional data