
A Comparative Study of Cost-Sensitive Learning Algorithm Based on Imbalanced Data Sets

Cited by: 30
Abstract: Most studies on imbalanced data sets focus either on reconstructing (re-sampling) the data set or on cost-sensitive learning alone, and overlook the fact that an imbalanced class distribution and unequal misclassification costs often occur together. After a brief review of cost-sensitive learning theory and existing learning algorithms, the proposed adaptive hybrid re-sampling algorithm, which is based on automated adaptive selection of the number of nearest neighbors, is compared experimentally with MetaCost, an algorithm based on minimizing misclassification cost. The experiments show that the proposed algorithm has certain advantages in cost-sensitive learning: it improves classification accuracy and reduces misclassification cost. The results also show that class imbalance has a large effect on the performance of cost-sensitive learning algorithms, and that when the class sizes differ greatly, reconstructing the sample class space gives better classification results.
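The abstract describes MetaCost as being based on minimum misclassification cost. As a rough illustration of that decision rule only (not the authors' code and not their re-sampling algorithm), the sketch below picks the class with the lowest expected cost given class-probability estimates. The function name `min_cost_class`, the cost matrix `COST`, and the probabilities are hypothetical values chosen for the example.

```python
def min_cost_class(probs, cost):
    """Return the index of the class with the lowest expected cost.

    probs[j]   : estimated probability that the true class is j
    cost[i][j] : cost of predicting class i when the true class is j
    """
    expected = [
        sum(p * c for p, c in zip(probs, row))  # expected cost of predicting this row's class
        for row in cost
    ]
    return min(range(len(cost)), key=expected.__getitem__)


# Hypothetical two-class setting: class 0 is the majority, class 1 the minority.
COST = [
    [0.0, 10.0],  # predict 0: missing a true minority example costs 10
    [1.0, 0.0],   # predict 1: a false alarm on a majority example costs 1
]

# The example is probably majority (0.8), yet predicting the minority class is cheaper:
# expected cost of predicting 0 is 0.2 * 10 = 2.0, of predicting 1 is 0.8 * 1 = 0.8.
print(min_cost_class([0.8, 0.2], COST))  # prints 1
```

With unequal costs the cheapest prediction can differ from the most probable class, which is why class imbalance and cost sensitivity interact.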
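Source: Microelectronics & Computer (《微电子学与计算机》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 8, pp. 146-149, 153 (5 pages)
Funding: National Natural Science Foundation of China (61075063); National High Technology Research and Development Program of China ("863" Program) (2009AA12Z117); Natural Science Foundation of Hubei Province (2010CDB05201); Hubei Provincial Department of Education Young and Middle-aged Researchers Project (Q20112604)
Keywords: classification; imbalanced data set; hybrid re-sampling; cost-sensitive learning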

