Abstract
Traditional clustering algorithms are generally designed for certain (deterministic) data and cannot handle the clustering of uncertain data, while existing density-based clustering algorithms for uncertain data are sensitive to their parameters and computationally inefficient. To address this, an optimal k-nearest neighbors and local density-based clustering algorithm for uncertain data (OLUC) is proposed. It builds on a new dissimilarity function for uncertain data together with the concepts of optimal k-nearest neighbors, local density, and mutual inclusion. The algorithm reduces parameter sensitivity and improves computational efficiency. It also optimizes the k-nearest neighbors in a dynamic, adaptive way, finds cluster centers quickly, and optimizes noise removal. Experimental results show that the proposed algorithm performs well on uncertain data sets both with and without noise.
Source
《控制与决策》
EI
CSCD
Peking University Core Journals (北大核心)
2016, Issue 3, pp. 541-546 (6 pages)
Control and Decision
Funding
Special Fund for Public Welfare Industry Research of the Ministry of Water Resources (201401044)
Keywords
k-nearest neighbors
local density
uncertain data
clustering algorithm