Descriptive statistics and analysis of interval symbolic data with general distribution
Cited by: 15
Abstract: This paper studies interval symbolic data obtained by packaging large-scale individual data. Because the individuals within an interval often do not follow a uniform distribution in practice, it develops descriptive statistics and analysis methods for interval symbolic data with a general distribution. Symbolic data analysis is first reviewed, and the generally distributed interval variable is defined. The empirical distribution function and the empirical joint distribution function of generally distributed interval variables are then studied, and on this basis the computation of their descriptive statistics is discussed. Finally, a numerical example is given, and an application study set in the Chinese stock market is carried out using factor analysis for generally distributed interval symbolic data. The results show that the descriptive statistics computed in earlier work under the uniform-distribution assumption can be regarded as special cases of the formulas given in this paper. In addition, because the method is based on empirical distribution theory, it does not require the explicit form of the distribution function that the individuals follow within an interval, and the computation makes full use of the individual-level information within each interval.
Source: Systems Engineering-Theory & Practice (《系统工程理论与实践》; indexed in EI, CSSCI, CSCD, and the Peking University Core Journals list), 2011, No. 12, pp. 2367-2372 (6 pages)
Funding: National Natural Science Foundation of China (70701026); Tianjin Philosophy and Social Science Research Planning Project (TJGL11-099)
Keywords: symbolic data analysis; interval-valued data; descriptive statistics; general distribution
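
The abstract states that descriptive statistics derived under the uniform-distribution assumption are a special case of statistics built on empirical distributions, but it does not give the formulas. The following is a minimal sketch of that relationship, not the paper's own method. It assumes that each interval contributes its within-interval empirical distribution with equal weight 1/n, and it uses the standard uniform-case mean and variance for interval data. The function names `uniform_interval_stats` and `empirical_interval_stats` are hypothetical.

```python
import numpy as np


def uniform_interval_stats(lower, upper):
    """Mean and variance of an interval variable, assuming individuals
    are uniformly distributed on each interval [a_i, b_i]."""
    a = np.asarray(lower, dtype=float)
    b = np.asarray(upper, dtype=float)
    mean = np.mean((a + b) / 2.0)
    # Second moment of a uniform variable on [a, b] is (a^2 + ab + b^2) / 3.
    second_moment = np.mean((a**2 + a * b + b**2) / 3.0)
    return mean, second_moment - mean**2


def empirical_interval_stats(groups):
    """Mean and variance of the equal-weight mixture of the empirical
    distributions of the individuals packed into each interval.
    ASSUMPTION: this mixture form is illustrative, not taken from the paper."""
    first = np.mean([np.mean(g) for g in groups])
    second = np.mean([np.mean(np.square(g)) for g in groups])
    return first, second - first**2


# Two intervals, [0, 2] and [1, 5], whose individuals cluster near one end.
skewed = [[0.0, 0.1, 0.2, 2.0], [1.0, 4.5, 4.8, 5.0]]
print(uniform_interval_stats([0.0, 1.0], [2.0, 5.0]))
print(empirical_interval_stats(skewed))
```

When the individuals are spread evenly across each interval, the two functions agree closely, which matches the special-case claim. With skewed individuals, as in the usage lines above, the empirical statistics move away from the midpoint-based values because they use the individual-level information.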

