Abstract
This paper details the basic concepts and principles of the ID3 algorithm, together with its branching strategy and tree-construction process. To address ID3's tendency to favor attributes with many values, an attribution deflection threshold and the information gain ratio are introduced to improve the algorithm, and the properties of convex functions are used to simplify the computation of information gain in ID3. Experiments comparing the algorithm before and after the improvement show that the improved algorithm is effective.
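The bias the abstract refers to can be illustrated with a small sketch. The code below uses only the textbook definitions of information gain and information gain ratio; it does not reproduce the paper's attribution deflection threshold or its convex-function simplification, since the abstract does not specify either. The toy data and the names `row_id` and `windy` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def partition(values, labels):
    """Group the class labels by the attribute value of each sample."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return groups


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    remainder = sum(
        len(g) / n * entropy(g) for g in partition(values, labels).values()
    )
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain divided by the split information of the attribute."""
    split_info = entropy(values)
    if split_info == 0:  # attribute has a single value; the split is useless
        return 0.0
    return information_gain(values, labels) / split_info


# Hypothetical toy data: 8 samples, binary class label.
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
row_id = list(range(8))                            # unique value per sample
windy = ["f", "f", "f", "f", "t", "t", "t", "f"]   # two values, informative

for name, attr in [("row_id", row_id), ("windy", windy)]:
    print(name, information_gain(attr, labels), gain_ratio(attr, labels))
```

On this data, plain information gain ranks the identifier-like `row_id` above `windy` because every one-sample subset is trivially pure, whereas the gain ratio penalizes the large split information of `row_id` and ranks `windy` first.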
Source
《大连交通大学学报》
CAS
2015, No. 2, pp. 91-95 (5 pages)
Journal of Dalian Jiaotong University
Funding
Supported by the Scientific Research Program of the Education Department of Liaoning Province (L2012163)
Keywords
decision tree
ID3 algorithm
convex function
information gain ratio
attribution deflection threshold