A Privacy-Preserving Pattern Classification Algorithm for Sparse Data Based on Principal Component Analysis


Abstract: Pattern classification learns from the original training samples, which can easily expose users' private information. To prevent privacy leakage during pattern classification without degrading classifier performance, this paper proposes a privacy-preserving pattern classification algorithm based on principal component analysis (PCA). The algorithm uses PCA to extract the principal components of the original training data, converts the original training set into a new sample set expressed in those components, and then trains the classifier on the new sample set. Simulation experiments are carried out on the Adult data set and the KDD CUP 99 data set, with precision and recall as the evaluation measures. The results show that extracting the principal components of the original feature attributes keeps the original attributes from being disclosed. PCA also removes noise to some extent, so the classifier performs better on the transformed data than on the original data set. Compared with existing algorithms, the proposed algorithm achieves better classification accuracy and better privacy protection.
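The pipeline described in the abstract (fit PCA on the training data, release only the component scores, train and evaluate a classifier on those scores) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not name the classifier, so a nearest-centroid rule stands in for it; the toy data replaces the Adult and KDD CUP 99 sets; and every function name here (`fit_pca`, `transform`, `run_demo`, and so on) is hypothetical. PCA is computed by power iteration with deflation so that the sketch needs only the standard library.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pca(rows, k, iters=300, seed=0):
    """Return (means, components): the top-k principal directions of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(dot(v, v))
        v = [x / norm for x in v]
        for _ in range(iters):  # power iteration toward the dominant eigenvector
            w = [dot(cov[i], v) for i in range(d)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(cov[i], v) for i in range(d)])
        components.append(v)
        # Deflation: subtract the direction just found so the next pass
        # converges to the next principal component.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return means, components


def transform(rows, means, components):
    """Project rows onto the components; only these scores are shared."""
    return [[dot([x - m for x, m in zip(r, means)], c) for c in components]
            for r in rows]


def fit_centroids(scores, labels):
    """Assumed stand-in classifier: one mean score vector per class."""
    sums, counts = {}, {}
    for s, y in zip(scores, labels):
        acc = sums.setdefault(y, [0.0] * len(s))
        for j, x in enumerate(s):
            acc[j] += x
        counts[y] = counts.get(y, 0) + 1
    return {y: [x / counts[y] for x in acc] for y, acc in sums.items()}


def predict(centroids, scores):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(centroids, key=lambda y: dist2(s, centroids[y])) for s in scores]


def precision_recall(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def make_toy_data(n_per_class, seed):
    """Two well-separated classes in 3-D (illustrative data only)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label, center in ((0, (0.0, 0.0, 0.0)), (1, (5.0, 5.0, 0.0))):
        for _ in range(n_per_class):
            rows.append([c + rng.uniform(-0.5, 0.5) for c in center])
            labels.append(label)
    return rows, labels


def run_demo():
    train_x, train_y = make_toy_data(40, seed=1)
    test_x, test_y = make_toy_data(20, seed=2)
    means, comps = fit_pca(train_x, k=2)
    centroids = fit_centroids(transform(train_x, means, comps), train_y)
    y_pred = predict(centroids, transform(test_x, means, comps))
    return precision_recall(test_y, y_pred)


if __name__ == "__main__":
    print(run_demo())
```

The classifier in this sketch only ever sees the projected scores, which is the property the abstract relies on for privacy. Note that an orthogonal projection alone can be inverted by anyone who also holds the means and components, so in this sketch those would have to stay with the data owner.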

Authors: Yuan Yongbin (原永滨), Yang Jing (杨静), Zhang Jianpei (张健沛), Yu Xu (于旭)

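Affiliations: College of Computer Science and Technology, Harbin Engineering University; College of Electrical Engineering and Automation, Fuzhou University; School of Information Science and Technology, Qingdao University of Science and Technology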

Source: Science & Technology Review (《科技导报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2014, No. 12, pp. 68-73 (6 pages)

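Funding: National Natural Science Foundation of China (61370083, 61073043, 61073041); Specialized Research Fund for the Doctoral Program of Higher Education (20112304110011, 20122304110012)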

Keywords: principal component analysis; pattern classification; privacy preservation; algorithm

Classification code: TP309 [Automation and Computer Technology: Computer System Architecture]



