Abstract
DBSCAN uses a single, fixed pair of parameters, Eps and MinPts, so it performs poorly on data with multiple densities, and its time complexity is O(N²). To address these problems, this paper proposes a multi-density DBSCAN clustering algorithm based on region division. The algorithm uses the relative density difference between grid cells to divide the data space into regions of different density, computes the Eps for each region automatically from that region's density, and then clusters each region with DBSCAN, which improves accuracy. Because the search for density-connected points no longer has to traverse the entire data set, efficiency also improves. Experiments show that the algorithm clusters multi-density data effectively, adapts well to a variety of data sets, and runs efficiently.
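The pipeline described in the abstract (grid the data, split it into regions by relative density difference, derive one Eps per region, run DBSCAN per region) can be sketched roughly as follows. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the function names, the rule that adjacent cells join a region when their point counts differ by at most `max_rel_diff`, and the rule that a region's Eps is the mean distance to the `min_pts`-th nearest neighbour. It is a minimal pure-Python sketch, not the authors' method, and its neighbour search is still quadratic within each region.

```python
from collections import defaultdict, deque
from itertools import product
from math import dist, floor


def grid_cells(points, cell_size):
    """Map each grid cell (tuple of ints) to the indices of the points inside it."""
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(floor(c / cell_size) for c in p)].append(i)
    return cells


def split_regions(cells, max_rel_diff):
    """Assumed rule: grow regions over adjacent non-empty cells with similar counts."""
    region_of, regions = {}, []
    for start in cells:
        if start in region_of:
            continue
        region_of[start] = len(regions)
        members, queue = [start], deque([start])
        while queue:
            cur = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cur)):
                nb = tuple(c + o for c, o in zip(cur, offset))
                if nb not in cells or nb in region_of:
                    continue
                a, b = len(cells[cur]), len(cells[nb])
                if abs(a - b) / max(a, b) <= max_rel_diff:
                    region_of[nb] = len(regions)
                    members.append(nb)
                    queue.append(nb)
        regions.append([i for c in members for i in cells[c]])
    return regions


def region_eps(points, idx, min_pts):
    """Assumed rule: mean distance to the min_pts-th nearest neighbour in the region."""
    k = min(min_pts, len(idx) - 1)
    if k <= 0:
        return 0.0
    kth = [sorted(dist(points[i], points[j]) for j in idx)[k] for i in idx]
    return sum(kth) / len(kth)


def dbscan(points, idx, eps, min_pts):
    """Plain DBSCAN restricted to the points in idx; -1 marks noise."""
    labels, cluster = {i: None for i in idx}, 0

    def neighbours(i):
        return [j for j in idx if dist(points[i], points[j]) <= eps]

    for i in idx:
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point, keep expanding
        cluster += 1
    return labels


def multi_density_dbscan(points, cell_size, max_rel_diff, min_pts):
    """Cluster each density region separately with its own Eps."""
    labels, next_id = [-1] * len(points), 0
    for idx in split_regions(grid_cells(points, cell_size), max_rel_diff):
        local = dbscan(points, idx, region_eps(points, idx, min_pts), min_pts)
        for i, lab in local.items():
            if lab >= 0:
                labels[i] = next_id + lab
        next_id += max(local.values(), default=-1) + 1
    return labels
```

As a usage example, calling `multi_density_dbscan(points, cell_size=2.0, max_rel_diff=0.5, min_pts=4)` on a tight blob and a loose blob that lie far apart gives each blob its own Eps, whereas one global Eps tuned to the tight blob would tend to label the loose one as noise. The values of `cell_size` and `max_rel_diff` are hypothetical tuning parameters, not values taken from the paper.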
Authors
韩利钊
钱雪忠
罗靖
宋威
Han Lizhao; Qian Xuezhong; Luo Jing; Song Wei (Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University, Wuxi, Jiangsu 214122, China)
Source
《计算机应用研究》
CSCD
Peking University Core Journals
2018, Issue 6, pp. 1668-1671, 1685 (5 pages)
Application Research of Computers
Funding
Fundamental Research Funds for the Central Universities (JUSRP51510, JUSRP51635B)
Keywords
region division
multi-density
relative density difference
DBSCAN clustering