Semi-supervised Active DBN Learning Algorithm Based on EM and Classification Loss (cited by: 2)

Abstract: To address the difficulty of obtaining class-labeled samples when building a dynamic Bayesian network (DBN) classification model, this paper proposes a semi-supervised active DBN learning algorithm based on EM and classification loss. The EM algorithm used in semi-supervised learning can exploit unlabeled samples to learn a DBN classifier, but incorrect class assignments are easily introduced during its iterations, which degrades the model's accuracy. Incorporating classification-loss-based active learning into EM lets the learner select useful unlabeled samples itself and ask the user to label them; once these samples are added to the training set, they maximally reduce the model's uncertainty in classifying the remaining unlabeled samples. Experiments show that the algorithm significantly improves the efficiency and performance of the DBN learner and converges quickly to the predetermined classification accuracy.
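The loop described in the abstract (EM over labeled and unlabeled samples, plus a query step that picks the sample expected to reduce classification uncertainty the most) can be illustrated with a minimal Python sketch. This is not the authors' implementation: a Gaussian naive Bayes model stands in for the DBN classifier, the mean posterior entropy over the remaining unlabeled pool stands in for the classification loss, and every name here (`fit_nb`, `posterior`, `em`, `expected_loss`, `select_query`, `var_floor`) is a hypothetical choice made only for illustration.

```python
import numpy as np

def fit_nb(X, R, var_floor=0.1):
    """M-step: fit Gaussian naive Bayes from soft class weights R (n x k).
    var_floor is an assumed smoothing term, not taken from the paper."""
    w = R.sum(axis=0) + 1e-9
    prior = w / w.sum()
    mean = (R.T @ X) / w[:, None]
    var = np.maximum((R.T @ X**2) / w[:, None] - mean**2, 0.0) + var_floor
    return prior, mean, var

def posterior(model, X):
    """E-step: class posteriors P(y | x) under the current model."""
    prior, mean, var = model
    diff = X[:, None, :] - mean                      # shape (n, k, d)
    ll = -0.5 * (diff**2 / var + np.log(2 * np.pi * var)).sum(axis=2)
    log_post = np.log(prior) + ll
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_post)
    return p / p.sum(axis=1, keepdims=True)

def em(X_lab, y_lab, X_unl, k, iters=10):
    """Semi-supervised EM: labeled samples keep hard labels,
    unlabeled samples get soft labels that are re-estimated each round."""
    R_lab = np.eye(k)[y_lab]
    model = fit_nb(X_lab, R_lab)
    if len(X_unl) == 0:
        return model
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(iters):
        R_unl = posterior(model, X_unl)
        model = fit_nb(X_all, np.vstack([R_lab, R_unl]))
    return model

def expected_loss(model, X_pool):
    """Stand-in classification loss: mean posterior entropy over the pool."""
    p = posterior(model, X_pool)
    return float(-(p * np.log(np.clip(p, 1e-12, 1.0))).sum(axis=1).mean())

def select_query(X_lab, y_lab, X_unl, k, candidates):
    """Pick the candidate whose labeling is expected to leave the lowest
    loss on the rest of the pool, averaging over its possible labels."""
    model = em(X_lab, y_lab, X_unl, k)
    p_unl = posterior(model, X_unl)
    best, best_loss = None, np.inf
    for i in candidates:
        rest = np.delete(X_unl, i, axis=0)
        loss = 0.0
        for c in range(k):
            # Pretend the user answered "class c" and retrain briefly.
            trial = em(np.vstack([X_lab, X_unl[i:i + 1]]),
                       np.append(y_lab, c), rest, k, iters=3)
            loss += p_unl[i, c] * expected_loss(trial, rest)
        if loss < best_loss:
            best, best_loss = i, loss
    return best
```

In use, the index returned by `select_query` would be shown to the user for labeling, the sample moved from the unlabeled pool to the labeled set, and the procedure repeated until the classifier reaches the target accuracy. Restricting `candidates` to a subset of the pool keeps the retraining cost manageable.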

Authors: Zhao Yue (赵悦), Mu Zhichun (穆志纯), Li Xiali (李霞丽), Pan Xiuqin (潘秀琴)

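Affiliations: School of Mathematics and Computer Science, Minzu University of China (中央民族大学); School of Information Engineering, University of Science and Technology Beijing (北京科技大学)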

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 4, pp. 656-660 (5 pages)

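Funding: Young Teachers' Research Fund of Minzu University of China; Key Discipline Co-construction Project of the Beijing Municipal Education Commission.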

Keywords: dynamic Bayesian networks; semi-supervised learning; active learning; expectation-maximization (EM) algorithm

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
