
Improvement of the SPRINT Algorithm

Cited by: 5
Abstract: After introducing the concept of a pure interval, this paper proposes a method for handling numeric attributes, based on pure-interval reduction, to improve the SPRINT algorithm. The method partitions the value range of a numeric attribute into multiple intervals using an equal-width histogram, reduces the pure intervals, and computes the minimum Gini value exactly within the impure intervals. This preserves the accuracy of the split while reducing the amount of computation.
Source: Computer Engineering (《计算机工程》), 2006, No. 16, pp. 55-57 (3 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: decision tree; SPRINT algorithm; pure intervals reduction; Gini index
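
The abstract does not spell out the procedure, so the following is only a minimal Python sketch of one way to read it, not the authors' implementation. It assumes that a "pure interval" is an equal-width bin whose records all share one class, and that "reduction" means skipping candidate split points that lie strictly inside such a bin, while bin boundaries and all points inside impure bins are still evaluated exactly. The names `best_split`, `n_bins`, and `skip_pure` are hypothetical, and the input is assumed to be non-empty.

```python
from collections import Counter


def gini(counts, n):
    """Gini index of a partition with the given class counts and size n."""
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels, n_bins=4, skip_pure=True):
    """Return (gini_split, threshold, evaluated) for one numeric attribute.

    Assumption (not from the paper): candidates strictly inside a pure
    equal-width bin are skipped when skip_pure is True.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    lo, hi = pairs[0][0], pairs[-1][0]
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal

    def bin_of(v):
        return min(int((v - lo) / width), n_bins - 1)

    # Collect the set of classes seen in each bin; one class means "pure".
    classes_in_bin = {}
    for v, c in pairs:
        classes_in_bin.setdefault(bin_of(v), set()).add(c)
    pure = {b for b, cs in classes_in_bin.items() if len(cs) == 1}

    left = Counter()
    right = Counter(c for _, c in pairs)
    best_gini, best_threshold, evaluated = float("inf"), None, 0

    for i in range(n - 1):
        v, c = pairs[i]
        left[c] += 1
        right[c] -= 1
        nxt = pairs[i + 1][0]
        if v == nxt:
            continue  # no threshold separates equal values
        b = bin_of(v)
        if skip_pure and b == bin_of(nxt) and b in pure:
            continue  # candidate lies strictly inside a pure bin
        evaluated += 1
        n_left, n_right = i + 1, n - (i + 1)
        g = n_left / n * gini(left, n_left) + n_right / n * gini(right, n_right)
        if g < best_gini:
            best_gini, best_threshold = g, (v + nxt) / 2

    return best_gini, best_threshold, evaluated


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6, 7, 8]
    labels = ["a", "a", "a", "a", "b", "b", "a", "b"]
    print(best_split(values, labels, skip_pure=True))
    print(best_split(values, labels, skip_pure=False))
```

On this toy data both calls select the same threshold, but the reduced variant evaluates fewer candidates. Under the stated assumption, accuracy is preserved because the weighted Gini index is not expected to reach its minimum strictly inside a run of same-class records, so only such points are dropped.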

