Abstract
This paper introduces the concept of a pure interval and, building on it, proposes a method based on pure-interval reduction for handling numeric attributes in the SPRINT algorithm. The method partitions the value range of each numeric attribute into multiple intervals using an equal-width histogram, reduces the pure intervals, and computes the minimum Gini value exactly within the impure intervals. This preserves the accuracy of the split while reducing the amount of computation.
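The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that an interval is "pure" when all of its records share one class label, and that "reducing" a pure interval means evaluating the Gini index only at its upper end instead of at every value inside it. The names `weighted_gini`, `best_split`, and `n_bins` are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from itertools import groupby


def weighted_gini(left, right):
    """Weighted Gini index of a binary split, given class counts per side."""
    def gini(counts):
        n = sum(counts.values())
        return 1.0 - sum((c / n) ** 2 for c in counts.values())

    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return n_left / n * gini(left) + n_right / n * gini(right)


def best_split(values, labels, n_bins=4):
    """Find the threshold t (test: value <= t) with the lowest weighted Gini.

    Assumption: a pure interval contributes one candidate (its largest
    value); an impure interval contributes every distinct value in it.
    Returns (threshold, gini, number_of_candidates_evaluated).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return None, None, 0

    # Equal-width histogram: assign each record to one of n_bins intervals.
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for v, y in zip(values, labels):
        bins[min(int((v - lo) / width), n_bins - 1)].append((v, y))

    left, right = Counter(), Counter(labels)
    best_t, best_g, evaluated = None, float("inf"), 0

    for records in bins:
        if not records:
            continue
        records.sort(key=lambda r: r[0])
        if len({y for _, y in records}) == 1:
            groups = [records]  # pure interval: treated as a single unit
        else:
            # impure interval: one group per distinct attribute value
            groups = [list(g) for _, g in groupby(records, key=lambda r: r[0])]
        for group in groups:
            for _, y in group:
                left[y] += 1
                right[y] -= 1
            if sum(right.values()) == 0:
                continue  # every record is on the left: not a real split
            evaluated += 1
            g = weighted_gini(left, right)
            if g < best_g:
                best_t, best_g = group[-1][0], g

    return best_t, best_g, evaluated
```

For the values 1 to 8 with labels `a a a a b a b b` and four intervals, this sketch evaluates 4 candidate thresholds instead of the 7 an exhaustive scan would check, and returns the threshold 4 with a weighted Gini of 0.1875.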
Source
《计算机工程》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2006, Issue 16, pp. 55-57 (3 pages)
Computer Engineering