
Hierarchy Clustering Analysis of Interval Data based on Hausdorff Distance (基于Hausdorff距离的区间数据的系统聚类分析)
Cited by: 8
Abstract: The Hausdorff distance defines a distance between two compact sets. Treating an interval number as a compact set, the paper defines a distance between interval numbers and then studies the distance between interval vectors, which gives the distance between two samples in cluster analysis. The Hausdorff distance between two clusters is defined in turn. To remove the influence of measurement scales on the clustering result, the standardization of interval data is studied. On this basis, a hierarchical clustering algorithm for interval data is given. A simulation study evaluates the validity of the method; the results show that the Hausdorff-distance method outperforms the traditional Euclidean-distance method under every experimental condition designed. Finally, interval data are constructed following the ideas of symbolic data analysis, and a worked example clusters several animal groups by physiological characteristics such as height and weight.
Source: Journal of Applied Statistics and Management (《数理统计与管理》), CSSCI, Peking University Core Journal, 2014, No. 4, pp. 634-641 (8 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (70701026, 71271147)
Keywords: interval data; clustering analysis; Hausdorff distance
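
The chain of distances described in the abstract (interval to interval, vector to vector, cluster to cluster) can be illustrated with a minimal sketch. Only the first step is a standard fact: for closed intervals [a1, b1] and [a2, b2], the Hausdorff distance reduces to max(|a1 - a2|, |b1 - b2|). Everything else is an assumption made for illustration and is not taken from the paper: the function names, the Euclidean aggregation of per-coordinate distances, and the naive merge loop. The paper's standardization step is omitted because the record does not describe it.

```python
def hausdorff_interval(a, b):
    """Hausdorff distance between closed intervals a=(lo, hi) and b=(lo, hi)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def interval_vector_distance(x, y):
    """Distance between two interval vectors (lists of intervals).

    ASSUMPTION: per-coordinate Hausdorff distances are combined with a
    Euclidean norm; the paper may aggregate them differently.
    """
    return sum(hausdorff_interval(a, b) ** 2 for a, b in zip(x, y)) ** 0.5


def cluster_hausdorff(A, B, dist=interval_vector_distance):
    """Hausdorff distance between two finite clusters of interval vectors."""
    def directed(P, Q):
        # Largest distance from a member of P to its nearest member of Q.
        return max(min(dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))


def agglomerate(samples, k):
    """Naive agglomerative clustering: merge the closest pair until k remain.

    ASSUMPTION: a plain merge loop, not the paper's exact algorithm.
    """
    clusters = [[s] for s in samples]
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: cluster_hausdorff(clusters[ij[0]],
                                                           clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Four one-dimensional interval samples forming two obvious groups.
samples = [[(0, 1)], [(0.5, 1.5)], [(10, 11)], [(10.5, 12)]]
result = agglomerate(samples, 2)
```

With the toy samples above, the two low intervals and the two high intervals end up in separate clusters. Real data with variables on different scales would first need a standardization step such as the one the paper studies.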

