Abstract
Packets in encrypted traffic exhibit clear temporal features. Existing methods struggle to extract the temporal features implicit in traffic data and fail to fuse them effectively with spatial features. In addition, most public datasets suffer from class imbalance. Together, these issues make accurate classification of encrypted traffic highly challenging. To address them, a convolutional neural network model comprising a spatio-temporal feature extraction module and a hard-sample learning module is proposed. The spatio-temporal feature extraction module first uses convolution kernels of different dimensions to learn temporal and spatial features from the packet sequence simultaneously, and then fuses the extracted features with an adaptive weighted fusion strategy. The hard-sample learning module uses a focal loss function so that training concentrates on hard samples, which further balances classification performance across classes. Experimental results show that the method achieves classification accuracies of 99.38% and 99.46% on the ISCX VPN-nonVPN2016 and USTC-TFC2016 datasets, respectively, with F1 scores across traffic classes of 99.04% and 99.31%, outperforming current comparable methods.
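The hard-sample learning idea can be illustrated with the standard focal loss, which scales the cross-entropy of each sample by (1 - p_t)^gamma so that well-classified samples contribute little to training. The following is a minimal sketch in plain Python, not the authors' implementation: the function names, the example logits, and the value gamma = 2.0 are assumptions chosen for illustration, and the abstract does not state which hyperparameters the paper uses.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for a single sample: -(1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the true class.
    gamma=2.0 is an illustrative default, not a value from the paper.
    With gamma=0 the expression reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical logits for a three-class problem, true class 0.
easy = focal_loss([4.0, 0.0, 0.0], target=0)  # confidently correct sample
hard = focal_loss([0.0, 1.0, 0.0], target=0)  # misclassified sample
```

In this sketch the misclassified sample keeps most of its cross-entropy loss, while the confidently correct one is scaled down by several orders of magnitude, which is the mechanism that shifts training toward hard samples and under-represented classes.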
Authors
CHEN Tuo
SHI Hao
LI Xiangjie
WU Nengguang
(Hangzhou Medical College Affiliated People's Hospital, Zhejiang Provincial People's Hospital, Hangzhou 314408, China; School of Information Engineering, Hangzhou Medical College, Hangzhou 311399, China; Information Center, The First Affiliated Hospital of Fujian Medical University, Fuzhou 350005, China)
Source
Journal of South-Central University for Nationalities: Natural Science Edition
CAS
2024, No. 3, pp. 384-392 (9 pages)
Funding
Zhejiang Provincial Medical and Health Science and Technology Program (2022PY037).
Keywords
cyber security
encrypted traffic classification
spatio-temporal feature learning
fusion strategy