Abstract
Feature selection, an active research topic in pattern recognition, is an important dimensionality reduction method. For continuous features, selection currently relies mainly on either discretization or "relevance" assessment of each feature's classification ability. This paper introduces the concept of interval number similarity and proposes a feature selection method for continuous features. Based on interval number similarity, the method defines an attribute similarity for each feature and uses it as heuristic information to rank the full feature set and select a feature subset. Experiments on UCI repository data sets show that this feature ranking and selection approach greatly improves the effectiveness and efficiency of classification on continuous features.
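The ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption made for illustration: each class is summarized per feature by its [min, max] interval, two intervals are compared by overlap length divided by the length of their combined span, a feature's attribute similarity is the mean of these pairwise values, and features with lower similarity are ranked first. The function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def interval_similarity(a, b):
    """Overlap length over combined span (an assumed definition, not the paper's)."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = max(a[1], b[1]) - min(a[0], b[0])
    if span == 0:
        return 1.0  # both intervals collapse to the same single point
    return overlap / span


def class_intervals(values, labels):
    """Map each class label to the (min, max) interval of one feature's values."""
    intervals = {}
    for v, y in zip(values, labels):
        lo, hi = intervals.get(y, (v, v))
        intervals[y] = (min(lo, v), max(hi, v))
    return intervals


def attribute_similarity(values, labels):
    """Mean pairwise similarity of the class intervals for one feature (assumed)."""
    intervals = list(class_intervals(values, labels).values())
    pairs = list(combinations(intervals, 2))
    if not pairs:
        return 1.0  # a single class offers no separation to measure
    return sum(interval_similarity(a, b) for a, b in pairs) / len(pairs)


def rank_features(X, labels):
    """Feature indices ordered from least to most similar across classes."""
    n_features = len(X[0])
    scores = [
        attribute_similarity([row[j] for row in X], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j])


def select_features(X, labels, k):
    """Keep the k top-ranked features (assumed selection rule)."""
    return rank_features(X, labels)[:k]


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 overlaps.
    X = [[1.0, 5.0], [2.0, 6.0], [10.0, 5.5], [11.0, 6.5]]
    y = [0, 0, 1, 1]
    print(rank_features(X, y))
    print(select_features(X, y, 1))
```

In this sketch no discretization step is needed, because the continuous values are only summarized as intervals. Min/max intervals are sensitive to outliers, so a percentile-based interval would be a reasonable alternative under the same assumptions.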
Source
Journal of Bohai University (Natural Science Edition) 《渤海大学学报(自然科学版)》
CAS
2014, No. 4, pp. 350-355 (6 pages)
Funding
National Natural Science Foundation of China (No. 60473125)
Keywords
feature selection
interval numbers
attribute similarity
continuous features