Abstract
This paper presents SECDU, an efficient self-expanding clustering algorithm based on density units. The data space is first divided into equal-sized density units, and each data point is assigned to the unit that contains its position; clustering is then performed on the density units rather than on individual points. A cluster first forms in the region where the data are densest and then extends into the surrounding lower-density regions. As a cluster extends, its volume gradually increases and its density gradually decreases, until the density reaches a predefined limit. By compressing the data into density units while preserving the characteristics of the original data distribution, the algorithm substantially increases clustering speed while maintaining good clustering quality.
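The procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define how cluster density is measured or how neighboring units are chosen, so the sketch assumes that density is the average number of points per unit, that neighbors are units sharing a face, and that expansion always takes the densest neighboring unit first. The names `assign_units`, `neighbors`, `secdu_sketch`, and `min_density` are hypothetical.

```python
from collections import Counter
import heapq


def assign_units(points, lows, highs, bins):
    """Count the points falling in each equal-width density unit."""
    units = Counter()
    for p in points:
        idx = tuple(
            min(int((x - lo) / (hi - lo) * bins), bins - 1)
            for x, lo, hi in zip(p, lows, highs)
        )
        units[idx] += 1
    return units


def neighbors(idx):
    """Yield the units one step away along exactly one axis (assumed adjacency)."""
    for axis in range(len(idx)):
        for step in (-1, 1):
            yield idx[:axis] + (idx[axis] + step,) + idx[axis + 1:]


def secdu_sketch(units, min_density):
    """Grow clusters from the densest free unit toward sparser neighbors.

    Cluster density is assumed to be points per unit; a cluster stops
    growing before that average would fall below min_density.
    """
    labels, clusters = {}, []
    for seed, count in units.most_common():
        if seed in labels or count < min_density:
            continue
        cid = len(clusters)
        labels[seed] = cid
        members, total = [seed], count
        # Max-heap on unit density, implemented by negating the counts.
        frontier = [(-units[n], n) for n in neighbors(seed)
                    if n in units and n not in labels]
        heapq.heapify(frontier)
        while frontier:
            neg_count, unit = heapq.heappop(frontier)
            if unit in labels:
                continue
            # The densest candidate gives the best possible average, so if
            # it fails the limit, every other candidate fails as well.
            if (total - neg_count) / (len(members) + 1) < min_density:
                break
            labels[unit] = cid
            members.append(unit)
            total -= neg_count
            for n in neighbors(unit):
                if n in units and n not in labels:
                    heapq.heappush(frontier, (-units[n], n))
        clusters.append(members)
    return labels, clusters
```

Calling `assign_units(points, lows, highs, bins)` and passing the result to `secdu_sketch(units, min_density)` returns a mapping from unit index to cluster id and the list of units in each cluster. Units that never join a cluster are simply absent from the mapping. Because the loop works on unit counts rather than on individual points, its cost depends on the number of non-empty units, which is the compression effect the abstract describes.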
Source
Control and Decision (《控制与决策》)
EI
CSCD
Peking University Core Journals (北大核心)
2006, No. 9, pp. 974-978 (5 pages)
Funding
National Natural Science Foundation of China (60273079, 60573089)
Keywords
Clustering analysis
Density unit
Cluster space
Clustering algorithm