Abstract
To address redundant features and the curse of dimensionality in data mining, this paper builds on the theory of the ReliefF algorithm and principal component analysis to establish a two-stage (secondary) feature selection method based on ReliefF-optimized kernel principal component analysis, and reports experimental results for the method. The method can effectively handle data that is high-dimensional and contains redundant and irrelevant features. Combined with machine learning algorithms, it enables a data mining system to produce accurate and efficient results, giving decision makers a sound basis for their decisions. The experiments indicate that the algorithm achieves higher classification accuracy.
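The abstract names the two stages but gives no implementation details, so the following is only a minimal sketch of how a ReliefF filter followed by kernel PCA might be chained. It is not the paper's implementation. The function names (`relieff_weights`, `rbf_kernel_pca`, `two_stage_select`), the Manhattan distance, the RBF kernel, and the parameters `k`, `gamma`, `n_keep`, and `n_components` are all assumptions made for illustration.

```python
import numpy as np

def relieff_weights(X, y, k=5):
    """Simplified ReliefF: reward features that differ on nearest misses
    and penalize features that differ on nearest hits."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                        # avoid division by zero
    Xs = (X - X.min(axis=0)) / span              # scale each feature to [0, 1]
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))       # class prior probabilities
    w = np.zeros(d)
    for i in range(n):
        diff = np.abs(Xs - Xs[i])                # per-feature differences to sample i
        dist = diff.sum(axis=1)                  # Manhattan distance (assumed metric)
        for c in classes:
            idx = np.where(y == c)[0]
            idx = idx[idx != i]                  # never use the sample as its own neighbor
            near = idx[np.argsort(dist[idx])[:k]]
            if near.size == 0:
                continue
            contrib = diff[near].mean(axis=0)
            if c == y[i]:
                w -= contrib                     # nearest hits lower the weight
            else:                                # nearest misses raise it, prior-weighted
                w += prior[c] / (1.0 - prior[y[i]]) * contrib
    return w / n

def rbf_kernel_pca(X, n_components=2, gamma=1.0):
    """Kernel PCA with an RBF kernel; returns the projected training samples."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # center the kernel in feature space
    vals, vecs = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    vals = vals[::-1][:n_components]
    vecs = vecs[:, ::-1][:, :n_components]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def two_stage_select(X, y, n_keep, n_components, k=5, gamma=1.0):
    """Stage 1: keep the n_keep highest-weighted features (ReliefF).
    Stage 2: reduce the kept features with kernel PCA."""
    w = relieff_weights(X, y, k)
    keep = np.argsort(w)[::-1][:n_keep]          # feature indices, best first
    Z = rbf_kernel_pca(np.asarray(X, dtype=float)[:, keep], n_components, gamma)
    return keep, Z
```

In this sketch the ReliefF stage discards low-weight (irrelevant) features first, and the kernel PCA stage then compresses what remains, which is one plausible reading of "two-stage" selection. The resulting matrix `Z` would be passed to whatever classifier is being evaluated.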
Source
Journal of Harbin University of Science and Technology (《哈尔滨理工大学学报》)
CAS
PKU Core (北大核心)
2016, No. 1, pp. 106-109 (4 pages)
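Funding
Heilongjiang Province Postdoctoral Funding Project (LBH-Q11081)
Science and Technology Research Project of the Heilongjiang Provincial Department of Education (11551093)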
Keywords
data mining
feature selection
principal component analysis