Abstract
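[Purposes] Co-training algorithms do not account for noise during view segmentation, and the two view classifiers can assign inconsistent labels to the same unlabeled sample. To address both problems, a co-training algorithm based on weighted principal component analysis and improved density peak clustering is proposed. [Methods] First, weighted principal component analysis is introduced to preprocess the data: a weighting coefficient for each feature is obtained from the dependency between features and class labels in the initial labeled samples, after which the weighted data are reduced in dimension and the high-contribution features are extracted for view segmentation. This strategy filters out much of the noise introduced during view segmentation and ensures that the key features in the data are divided evenly between the two views, so that the two classifiers cooperate more effectively. In addition, a "double turning point" method is proposed for density peak clustering to select cluster centers automatically, and the improved density peak clustering is used to determine the final class of samples with inconsistent labels, which reduces the probability of misclassification. [Findings] Compared with the baseline algorithms, the proposed algorithm shows a considerable improvement in classification accuracy and stability. [Conclusions] Weighted principal component analysis effectively filters out noisy features during view segmentation, and the improved density peak clustering reduces the probability that samples are mislabeled.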
[Purposes] Existing co-training algorithms ignore the effect of noise in view segmentation, and the two view classifiers may label the same unlabeled sample inconsistently. To address these problems, a co-training algorithm based on weighted principal component analysis (WPCA) and improved density peak clustering is proposed. [Methods] First, WPCA is introduced into data preprocessing. The weighting coefficients are obtained by linearly fitting the dependency between the features and the class labels of the initial labeled samples. Then the dimension of the weighted data is reduced, and high-contribution features are extracted for view segmentation. This strategy filters out noise in view segmentation and divides key features evenly between the two views, so the two classifiers work together more effectively. In addition, a "double turning point" method is proposed to select the cluster centers automatically in density peak clustering. The improved density peak clustering is then used to reclassify samples with inconsistent labels, which reduces the probability of sample misclassification. [Findings] Compared with the baseline algorithms, the proposed algorithm achieves better classification accuracy and stability. [Conclusions] Four experiments on 9 UCI datasets show that the proposed algorithm considerably improves classification accuracy and efficiency.
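The abstract does not spell out the "double turning point" rule, so the following is only a minimal sketch of the two standard quantities that density peak clustering ranks points by: the local density rho and the distance delta to the nearest point of higher density. Candidate cluster centers are the points where the product gamma = rho * delta is large. The function name `density_peak_scores`, the cutoff kernel, and the cutoff distance `d_c` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def density_peak_scores(X, d_c):
    """Return (rho, delta, gamma) per sample, using a cutoff kernel.

    Illustrative only: the paper's improved center selection is not shown.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density: number of other samples closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        if higher.any():
            # Distance to the nearest sample with strictly higher density.
            delta[i] = dist[i, higher].min()
        else:
            # Samples of maximal density get the largest distance instead.
            delta[i] = dist[i].max()
    return rho, delta, rho * delta
```

With two well-separated groups of points, the densest point of each group ends up with the largest gamma, which is the property any automatic center-selection rule builds on.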
Authors
龚旭
吕佳
GONG Xu; Lü Jia (Chongqing Center of Engineering Technology Research on Digital Agriculture Service, College of Computer and Information Sciences, Chongqing Normal University, Chongqing 401331, China)
Source
《重庆师范大学学报(自然科学版)》
CAS
Peking University Core Journals (北大核心)
2021, No. 4, pp. 87-96 (10 pages)
Journal of Chongqing Normal University:Natural Science
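Funding
National Natural Science Foundation of China (No. 11971084)
Science and Technology Innovation Project of the Chongqing Municipal Education Commission (No. KJCX220024)
Innovation Research Group Program of Chongqing Universities (No. CXQT20015)
Chongqing Graduate Research and Innovation Project (No. CYS20241).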