Hybrid network traffic classification feature selection method based on multilayer MapReduce (cited by: 1)
Abstract: Traditional feature selection methods are suited only to small-scale datasets and run inefficiently. To address these shortcomings, a hybrid feature selection method for network traffic classification based on multilayer MapReduce is proposed, combining the characteristics of the Filter and Wrapper approaches. The method first preprocesses the data with the Fisher score, removing some irrelevant features to reduce the dimensionality of high-dimensional data. It then adopts a sequential forward search strategy, using multilayer MapReduce to repeatedly select the feature with the best classification ability. Experimental results show that the method maintains high classification accuracy while reducing feature selection time, achieves a good speedup ratio, and improves the execution efficiency of network traffic classification.
Source: Journal of Guilin University of Electronic Technology (《桂林电子科技大学学报》), 2016, No. 2, pp. 123-128 (6 pages)
Funding: National Natural Science Foundation of China (61163058, 61363006); Open Fund of the Guangxi Key Laboratory of Trusted Software (KX201306)
Keywords: feature selection; Fisher score; SFS; MapReduce
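
The abstract describes a two-stage pipeline: a Fisher score filter that discards weak features, followed by a sequential forward search (SFS) wrapper whose candidate evaluations are distributed with MapReduce. The sketch below is one illustrative, single-machine reading of that idea, not the authors' implementation. It assumes a common form of the Fisher score (between-class scatter divided by within-class scatter), uses a nearest-centroid classifier as a stand-in for whatever classifier the paper evaluates, and mimics a single MapReduce layer with Python's built-in `map` and `functools.reduce`. All function names and the toy data are hypothetical.

```python
from functools import reduce
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Per-feature Fisher score: between-class scatter / within-class scatter.

    Assumption: this is one common textbook form, not necessarily the paper's.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        overall_mean = mean([row[j] for row in X])
        between = within = 0.0
        for c in classes:
            values = [row[j] for row, label in zip(X, y) if label == c]
            between += len(values) * (mean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier (stand-in wrapper)."""
    centroids = {}
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        centroids[c] = [mean([r[j] for r in rows]) for j in features]
    correct = 0
    for row, label in zip(X, y):
        predicted = min(
            centroids,
            key=lambda c: sum(
                (row[j] - m) ** 2 for j, m in zip(features, centroids[c])
            ),
        )
        correct += predicted == label
    return correct / len(y)


def sfs_round(X, y, selected, candidates):
    """One SFS step: 'map' scores each candidate, 'reduce' keeps the best."""
    mapped = map(
        lambda f: (f, centroid_accuracy(X, y, selected + [f])), candidates
    )
    # Strict '>' keeps the earlier (higher Fisher score) candidate on ties.
    return reduce(lambda a, b: b if b[1] > a[1] else a, mapped)


def hybrid_select(X, y, keep, k):
    """Filter to the `keep` best Fisher-score features, then pick `k` by SFS."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    candidates = ranked[:keep]
    selected = []
    while candidates and len(selected) < k:
        best, _ = sfs_round(X, y, selected, candidates)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (hypothetical): feature 0 and 2 separate the classes, feature 1 is noise.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 3.0, 0.1],
    [0.9, 4.0, 0.3],
    [3.0, 5.1, 0.9],
    [3.1, 2.9, 1.1],
    [2.9, 4.0, 1.0],
]
y = [0, 0, 0, 1, 1, 1]
print(hybrid_select(X, y, keep=2, k=2))
```

In this sketch each candidate's evaluation is independent of the others, which is what makes the "map" step easy to parallelize; in a real cluster setting the per-candidate work would run as separate map tasks and each SFS round would be one job in a chain.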