Abstract
Decision trees are a common classification method in data mining. When a decision tree is constructed, the criterion used to select the splitting attribute at each node directly affects classification performance. Drawing on rough set theory, in which attribute importance can be measured by an attribute frequency function (among other methods), this paper uses that measure to select the splitting attribute for each branch and proposes a decision tree learning algorithm. The attribute frequency values can be computed from the discernibility matrix of the data set alone, so the computation is simple. Experimental results show that, compared with decision trees built by the traditional information-entropy-based method, trees built by the proposed method are structurally simpler and achieve better classification performance.
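The abstract does not give the exact form of the attribute frequency function, so the following Python sketch only illustrates the general idea under a stated assumption: an attribute's frequency is taken to be the number of discernibility matrix entries in which it appears. The function names (`discernibility_entries`, `attribute_frequency`, `choose_split`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def discernibility_entries(rows, attrs, decision):
    """Yield one discernibility matrix entry per pair of objects whose
    decision values differ: the set of condition attributes on which
    the two objects take different values."""
    for a, b in combinations(rows, 2):
        if a[decision] != b[decision]:
            diff = frozenset(x for x in attrs if a[x] != b[x])
            if diff:  # skip inconsistent pairs that no attribute separates
                yield diff


def attribute_frequency(rows, attrs, decision):
    """Count how many discernibility entries contain each attribute.
    Assumption: a plain occurrence count stands in for the paper's
    frequency function, whose exact definition is not in the abstract."""
    counts = Counter({x: 0 for x in attrs})
    for entry in discernibility_entries(rows, attrs, decision):
        counts.update(entry)
    return counts


def choose_split(rows, attrs, decision):
    """Pick the attribute with the highest frequency; on a tie, the
    attribute listed first in attrs wins."""
    freq = attribute_frequency(rows, attrs, decision)
    return max(attrs, key=lambda x: freq[x])


# Toy data, invented for illustration only.
ROWS = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

if __name__ == "__main__":
    print(attribute_frequency(ROWS, ["outlook", "windy"], "play"))
    print(choose_split(ROWS, ["outlook", "windy"], "play"))
```

In a full tree learner, `choose_split` would be called recursively on each subset produced by the chosen attribute, in the same place where an entropy-based learner would compute information gain.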
Source
Journal of Guangxi University of Technology (《广西工学院学报》)
CAS
2007, No. 4, pp. 1-4 (4 pages)
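Funding
Guangxi Natural Science Foundation project (桂科自0481016)
Guangxi Department of Education 2006 Research Fund project (149)
Guangxi University of Technology Doctoral Fund project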
Keywords
decision tree
rough set
attribute importance
attribute frequency