Abstract
Naive Bayes is known to be an optimal classifier when the predictive attributes are statistically independent given the class attribute. This conditional independence assumption rarely holds in practical learning problems, and when it is violated, prediction accuracy can suffer to some extent. Here, principal component analysis (PCA) is applied as a preprocessing step on the original data to remove noise and give the data distribution a degree of independence. The approach was evaluated on UCI data sets with respect to both independence and prediction accuracy, with good results.
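The abstract does not give the exact procedure, so the following is only a minimal sketch of the general idea (PCA as a preprocessing step before a naive Bayes classifier), not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic correlated Gaussian data (not a UCI data set), the Gaussian likelihood model, the choice to keep all components, and the helper names `pca_fit`, `pca_transform`, and `GaussianNaiveBayes`.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (as columns)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return mean, eigvecs[:, order]

def pca_transform(X, mean, components):
    """Project data onto the principal directions learned from the training set."""
    return (X - mean) @ components

class GaussianNaiveBayes:
    """Naive Bayes with one independent Gaussian per feature and class (illustrative)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant avoids division by zero for near-constant features.
        self.vars_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_priors_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
        return self

    def predict(self, X):
        # log p(c) + sum_j log N(x_j | mean_cj, var_cj), evaluated for every class c
        diff = X[:, None, :] - self.means_[None, :, :]
        log_lik = -0.5 * (np.log(2 * np.pi * self.vars_) + diff ** 2 / self.vars_).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_priors_, axis=1)]

# Synthetic two-class data with strongly correlated features (an assumption, not UCI data).
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8, 0.6],
                [0.8, 1.0, 0.8],
                [0.6, 0.8, 1.0]])
X = np.vstack([rng.multivariate_normal([0, 0, 0], cov, 200),
               rng.multivariate_normal([3, 3, 3], cov, 200)])
y = np.repeat([0, 1], 200)
perm = rng.permutation(len(y))
X, y = X[perm], y[perm]
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

# Fit PCA on the training data only, then apply the same projection to the test data.
mean, components = pca_fit(X_train, n_components=3)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)

acc_raw = np.mean(GaussianNaiveBayes().fit(X_train, y_train).predict(X_test) == y_test)
acc_pca = np.mean(GaussianNaiveBayes().fit(Z_train, y_train).predict(Z_test) == y_test)
cov_Z = np.cov(Z_train, rowvar=False)  # off-diagonal entries vanish after PCA

print("accuracy on raw features:", acc_raw)
print("accuracy on PCA features:", acc_pca)
print("covariance of PCA features:\n", np.round(cov_Z, 6))
```

Note that PCA removes linear correlation in the pooled training data. It does not guarantee independence within each class, which is consistent with the abstract's wording that the data gain independence only "to a degree". Whether accuracy improves depends on the data set.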
Source
《仪器仪表学报》
Indexed in: EI, CAS, CSCD, PKU Core Journals (北大核心)
2004, Issue z3, pp. 384-386 (3 pages)
Chinese Journal of Scientific Instrument
Keywords
Pattern recognition
Naive Bayes
Conditional independence assumption
PCA