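Abstract: To address voltage limit violations in active distribution networks caused by large-scale integration of distributed photovoltaics (PV), and to reduce the temporal complexity of the control strategy, a dynamic cluster voltage control method for active distribution networks is proposed that considers node power reserve and the global importance of each node (GIN). First, taking the power reserve of each node in the system into account, a combined voltage sensitivity-power reserve (VS-PR) electrical distance measure is defined for the clustering algorithm. Next, the affinity propagation (AP) clustering algorithm is improved with the GIN algorithm to partition the network into clusters and select the dominant nodes. A cluster voltage control model of the active distribution network is then established and solved with a dynamic particle swarm optimization (D-PSO) algorithm. Finally, a comparative analysis on an IEEE 33-node simulation case built on the MATLAB 2021b platform verifies the correctness and effectiveness of the proposed dynamic cluster partitioning and voltage control method.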
Abstract: Most clustering algorithms describe the similarity of objects with a predefined distance function. Three distance functions that are widely used in two traditional clustering algorithms, k-means and hierarchical clustering, were investigated, and both theoretical analysis and detailed experimental results are given. The results show that the choice of distance function greatly affects the clustering results, and that comparing the results obtained with different distance functions can be used to detect the outliers of a cluster and to obtain information about cluster shape. In practice, it is suggested to apply the different distance functions separately, compare the clustering results, and pick out the "swing points", since such points may reveal more information to data analysts.
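The comparison suggested in this abstract can be sketched in a few lines. The abstract does not name its three distance functions, so the Euclidean, Manhattan, and Chebyshev distances below are assumptions, and `nearest_center` and `swing_points` are hypothetical helper names. The sketch only compares the nearest-center assignment step for fixed centers; it is not the authors' experimental procedure.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a: Point, b: Point) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Point, b: Point) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


# Assumed set of metrics; the abstract does not say which three were studied.
METRICS: Dict[str, Callable[[Point, Point], float]] = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "chebyshev": chebyshev,
}


def nearest_center(p: Point, centers: List[Point],
                   dist: Callable[[Point, Point], float]) -> int:
    """Index of the center closest to p under the given distance."""
    return min(range(len(centers)), key=lambda i: dist(p, centers[i]))


def swing_points(points: List[Point],
                 centers: List[Point]) -> List[Tuple[Point, Dict[str, int]]]:
    """Return points whose assigned cluster changes with the metric."""
    result = []
    for p in points:
        labels = {name: nearest_center(p, centers, d)
                  for name, d in METRICS.items()}
        if len(set(labels.values())) > 1:
            result.append((p, labels))
    return result


centers = [(0.0, 0.0), (5.0, 2.0)]
points = [(2.0, 2.0), (0.0, 1.0)]
for p, labels in swing_points(points, centers):
    print(p, labels)
```

In a fuller experiment each metric would drive its own complete clustering run, and the cluster labels would need to be matched across runs before comparison.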
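Abstract: Spatial clustering is one of the important tools of spatial data mining. This paper studies a max-min distance spatial clustering algorithm based on centroid distance: orchard-land parcel data are loaded, the centroid of each parcel is computed, the distances between cluster centers are evaluated, parcels that meet the criteria are clustered, and the clustering result is finally visualized. The algorithm was written in C# using the Visual Studio 2017 platform and the ArcGIS Engine component-based development environment. The experimental results show that: 1) by selecting cluster centers heuristically, max-min distance clustering overcomes the tendency of K-means to select cluster centers that lie too close together, adapts to the spatially fragmented distribution of orchard-land data in mountainous and hilly areas such as Songkou Town, and clusters the orchard land effectively and reasonably; 2) the spatial clustering results are divided into large, medium, and small classes according to contiguous area, and in future Songkou Town could focus on developing villages with larger contiguous orchard areas to form large-scale green plum plantations.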
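The heuristic center selection behind max-min distance clustering can be sketched as follows. This is the textbook form of the algorithm, written in Python rather than the C# used in the paper; the stopping ratio `theta`, the choice of the first point as the first center, and the function names are assumptions, not details taken from the abstract.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def max_min_centers(points: List[Point], theta: float = 0.5) -> List[Point]:
    """Pick centers one at a time, always taking the point that is farthest
    from its nearest existing center, until that distance drops to
    theta times the distance between the first two centers."""
    centers = [points[0]]  # assumption: start from the first point
    # Distance from every point to its nearest chosen center so far.
    nearest = [math.dist(p, points[0]) for p in points]
    ref = max(nearest)  # distance between the first and second center
    while True:
        i = max(range(len(points)), key=lambda k: nearest[k])
        if nearest[i] <= theta * ref:
            break
        centers.append(points[i])
        nearest = [min(d, math.dist(p, points[i]))
                   for d, p in zip(nearest, points)]
    return centers


def assign(points: List[Point], centers: List[Point]) -> List[int]:
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


# Hypothetical parcel centroids, for illustration only.
centroids = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
chosen = max_min_centers(centroids, theta=0.5)
print(chosen, assign(centroids, chosen))
```

Because every new center is the point farthest from all centers chosen so far, the centers cannot end up close together, which is the property the abstract contrasts with K-means initialization.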
Funding: Supported by the 863 National Plan Foundation of China under Grant No. 2007AA01Z333 and the Special Grand National Project of China under Grant No. 2009ZX02204-008.
Abstract: A fast leukocyte image scanning method based on max-min distance clustering is proposed. Because leukocytes make up a small proportion of human peripheral blood and are unevenly distributed, a large share of the captured images contain no leukocytes when the blood smear is scanned directly along an ordinary zigzag route with a high-power (100×) objective. Because a low-power (10×) objective has a larger field of view, low-power blood smear images can be captured and used to locate the leukocytes. The located positions make up a specific route, and scanning the blood smear along this route with the high-power objective yields leukocytes in almost all of the captured images. Because the number of captured images is still large and some leukocytes may be captured twice or more, a leukocyte clustering method based on max-min distance clustering is developed to reduce both the total number of captured images and the number of redundantly captured leukocytes. The method markedly improves scanning efficiency. The experimental results show that it shortens the scanning time from 8.0-14.0 min to 2.5-4.0 min when extracting 110 non-redundant individual high-power leukocyte images.
Funding: Project (41272304) supported by the National Natural Science Foundation of China; Project (51074177) jointly supported by the National Natural Science Foundation and Shanghai Baosteel Group Corporation, China; Project (CX2012B070) supported by Hunan Provincial Innovation Fund for Postgraduate Students, China.
Abstract: Two distance functions were established, one based on the spherical distance between structural surface normal vectors and one based on the Euclidean distance between poles in stereographic projection. Cluster analysis of the structural surfaces was carried out with the ATTA clustering method, which is based on ant colony piles, and the Silhouette index was introduced to evaluate the clustering quality. Clustering analysis of measured data from the Sanshandao Gold Mine shows that the ant colony ATTA-based clustering method performs better than K-means clustering. The clustering results of the ATTA method based on pole Euclidean distance and of the ATTA method based on normal vector spherical distance are highly consistent, and they are the closest to the pole isopycnic graph. The method can efficiently group structural planes and determine the dominant structural surface orientation. It makes up for the subjectivity and inaccuracy of the icon measurement approach and has great engineering value.
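One common way to measure a spherical distance between two normal vectors is the angle between them on the unit sphere. The sketch below makes that reading concrete; the abstract does not give the authors' formula, so the function names and the `axial` option (which treats a normal and its opposite as the same plane orientation) are assumptions.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def unit(v: Vector) -> Tuple[float, ...]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    return tuple(x / norm for x in v)


def spherical_distance(n1: Vector, n2: Vector, axial: bool = True) -> float:
    """Great-circle angle in radians between two normal vectors.

    With axial=True (an assumption), n and -n count as the same
    orientation, so the result lies in [0, pi/2].
    """
    a, b = unit(n1), unit(n2)
    dot = sum(x * y for x, y in zip(a, b))
    if axial:
        dot = abs(dot)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))


print(spherical_distance((0, 0, 1), (0, 0, -1)))         # same plane
print(spherical_distance((1, 0, 0), (0, 1, 0)))          # perpendicular
print(spherical_distance((0, 0, 1), (0, 0, -1), False))  # opposite vectors
```

A distance of this kind can be passed to any clustering method that accepts a pairwise distance in place of the Euclidean distance between projected poles.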
Funding: This work was supported by the Ulsan City & Electronics and Telecommunications Research Institute (ETRI) grant funded by the Ulsan City [22AS1600, the development of intelligentization technology for the main industry for manufacturing innovation and Human-mobile-space autonomous collaboration intelligence technology development in industrial sites].
Abstract: In a vehicular ad hoc network (VANET), a massive quantity of data needs to be transmitted on a large scale in short time durations. At the same time, vehicles move at high velocity, which leads to more vehicle disconnections. Both characteristics result in unreliable data communication in VANETs. A vehicle clustering algorithm groups the vehicles into clusters and is employed in VANETs to enhance network scalability and connection reliability. Clustering is considered one of the possible solutions for attaining effective interaction in VANETs, but one difficulty is reducing the number of clusters as the number of transmitting nodes increases. This article introduces an Evolutionary Hide Objects Game Optimization based Distance Aware Clustering (EHOGO-DAC) scheme for VANETs. The main aim of the EHOGO-DAC technique is to partition the VANET into distinct sets of clusters by grouping vehicles. The EHOGO-DAC technique is mainly based on the HOGO algorithm, which is inspired by traditional games in which a searching agent tries to identify hidden objects in a given space. The EHOGO-DAC technique derives a fitness function for the clustering process that includes the total number of clusters and the Euclidean distance. The experimental assessment of the EHOGO-DAC technique was carried out under distinct aspects, and the comparison showed improved outcomes for the EHOGO-DAC technique over recent approaches.
Abstract: In this paper, a new line-symmetry-based distance is first proposed, and its properties are described in detail. Kd-tree-based nearest neighbor search is used to reduce the complexity of computing the proposed distance. An evolutionary clustering technique is then developed that uses the new line-symmetry-based distance measure to assign points to different clusters. Adaptive mutation and crossover probabilities are used to accelerate the proposed clustering technique. The proposed GA with line-symmetry-distance-based (GALSD) clustering technique is able to detect any type of cluster, irrespective of geometrical shape and overlapping nature, as long as the clusters possess the characteristic of line symmetry. GALSD is compared with the well-known K-means clustering algorithm and a recently developed genetic point-symmetry-distance-based clustering technique (GAPS) on three artificial and two real-life data sets. The efficacy of the proposed line-symmetry-based distance is then shown in recognizing human faces in a given image.
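Abstract: The K-means clustering algorithm determines the initial number of clusters at random, and the large number of redundant features in raw data sets lowers clustering accuracy, while the cuckoo search (CS) algorithm suffers from slow convergence and weak local search ability. To address these problems, a K-means clustering algorithm based on adaptive cuckoo optimization feature selection (DCFSK) is proposed. First, to improve the search speed and accuracy of CS, an adaptive step-size factor is designed for the Lévy flight phase; to balance the global and local search of CS and speed up its convergence, the discovery probability is adjusted dynamically. This yields an improved dynamic CS algorithm (IDCS), on which a feature selection algorithm combined with dynamic CS (DCFS) is built. Second, to improve the accuracy of the traditional Euclidean distance, a weighted Euclidean distance is designed that accounts for the contributions of both samples and features to the distance calculation; to determine how the optimal number of clusters is selected, a weighted intra-cluster distance and a weighted inter-cluster distance are constructed from the improved weighted Euclidean distance. Finally, to overcome the defect that the traditional K-means objective function considers only the intra-cluster distance and ignores the inter-cluster distance, an objective function based on a median-based silhouette coefficient is proposed, and DCFSK is designed on that basis. Experimental results show that IDCS achieves better results on all metrics on 10 benchmark test functions, and that, compared with algorithms such as K-means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise), DCFSK gives the best clustering performance on 6 synthetic data sets and 6 UCI data sets.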
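To make the distance and evaluation terms concrete, the sketch below pairs a feature-weighted Euclidean distance with the standard silhouette coefficient. It is an illustration under stated assumptions, not the DCFSK formulation: the abstract gives no formulas, the sample weights it mentions are omitted, the weight vector `w` is hypothetical, and taking the median of the per-point silhouette values is only one possible reading of "median-based silhouette coefficient".

```python
import math
from statistics import mean, median
from typing import Callable, Dict, List, Sequence

Point = Sequence[float]


def weighted_euclidean(x: Point, y: Point, w: Sequence[float]) -> float:
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(wj * (a - b) ** 2 for wj, a, b in zip(w, x, y)))


def silhouette_values(points: List[Point], labels: List[int],
                      dist: Callable[[Point, Point], float]) -> List[float]:
    """Standard per-point silhouette s = (b - a) / max(a, b)."""
    clusters: Dict[int, List[int]] = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean(dist(points[i], points[j]) for j in members)
                for other, members in clusters.items() if other != lab)
        top = max(a, b)
        scores.append((b - a) / top if top > 0 else 0.0)
    return scores


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
w = (1.0, 1.0)  # hypothetical feature weights
scores = silhouette_values(points, labels,
                           lambda p, q: weighted_euclidean(p, q, w))
print(mean(scores), median(scores))
```

Setting a feature's weight to zero removes it from the distance entirely, which is how feature selection and the distance calculation interact in a scheme of this kind.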