Abstract
Rough set theory can handle only discrete data. To address this limitation, this paper proposes a decision-based, peel-off method for discretizing continuous attributes. Unlike traditional approaches to obtaining the candidate cut-point set, the method performs an initial discretization directly, by analyzing the range of values each continuous attribute takes within each decision class and by computing attribute significance. The paper also proposes a shifting principle for the candidate cut-point set that progressively narrows its range. Because each refinement step targets only the samples that cannot yet be classified unambiguously, the system converges quickly as the candidate cut-point set shrinks and the number of unambiguously classified samples grows. The discretized decision table is always consistent, so, compared with the many existing discretization methods that ignore decision consistency, the method retains as much of the system's useful information as possible. The method is domain-independent: it requires no domain knowledge and can be applied to continuous attributes in different fields.
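The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes, not the authors' procedure. It assumes a single continuous attribute, takes the boundaries of each decision class's value range as candidate cut points, flags the samples that still fall inside another class's range (the ones a further refinement step would target), and checks whether a discretized table is consistent. All function names are hypothetical, and attribute significance is not modeled here.

```python
def class_ranges(values, labels):
    """Return {decision class: (min, max)} for one continuous attribute."""
    ranges = {}
    for v, d in zip(values, labels):
        lo, hi = ranges.get(d, (v, v))
        ranges[d] = (min(lo, v), max(hi, v))
    return ranges


def candidate_cuts(values, labels):
    """Use the per-class range boundaries as candidate cut points (assumption).

    The global minimum and maximum are dropped because they separate nothing.
    """
    bounds = {b for r in class_ranges(values, labels).values() for b in r}
    return sorted(bounds - {min(values), max(values)})


def unresolved(values, labels):
    """Indices of samples whose value lies inside another class's range."""
    ranges = class_ranges(values, labels)
    return [
        i
        for i, (v, d) in enumerate(zip(values, labels))
        if any(lo <= v <= hi for c, (lo, hi) in ranges.items() if c != d)
    ]


def is_consistent(rows, labels):
    """True if no two samples share condition values but differ in decision."""
    seen = {}
    for row, d in zip(rows, labels):
        if seen.setdefault(tuple(row), d) != d:
            return False
    return True
```

In this sketch, samples returned by `unresolved` are the only ones that would need finer cut points, which mirrors the abstract's point that each step refines only the samples not yet classified unambiguously.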
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2007, Issue 8, pp. 208-210 (3 pages)
Funding
National High Technology Research and Development Program of China (863 Program), grant no. 2003AA114020
Keywords
Rough set theory
Attribute discretization
Candidate cut point
Decision consistency