Abstract
To address the high time complexity and low accuracy of traditional data stream detection methods, this paper proposes an information entropy anomaly detection algorithm based on a sliding time window and k-distance pruning. The algorithm uses a sliding time window to turn the dynamic data stream into static segments. Once the stream has filled the current window, a k-distance pruning step performs a preliminary check on the data in that window and removes most of the normal data. The remaining suspected anomalies are then examined with an information entropy test, and the data points whose information entropy exceeds the preset threshold EA are output. Experiments show that the proposed algorithm, referred to as SWKC, has some advantage over traditional detection algorithms in both time complexity and accuracy.
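The pipeline described in the abstract (fill a window, prune by k-distance, then apply an entropy test against the threshold EA) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not give the distance metric, the pruning rule, or how a point's entropy is computed. The sketch assumes one-dimensional data, absolute difference as the distance, a hypothetical `keep` parameter (how many of the largest k-distance points survive pruning), equal-width binning with a hypothetical `bin_width`, and an assumed entropy score equal to the drop in window entropy when the point is removed. The names `detect_window`, `detect_stream`, and `step` are likewise illustrative.

```python
import math
from collections import Counter


def k_distance(values, i, k):
    """Distance from values[i] to its k-th nearest neighbour in the window."""
    dists = sorted(abs(values[i] - v) for j, v in enumerate(values) if j != i)
    return dists[k - 1]


def entropy(values, bin_width):
    """Shannon entropy in bits of the values after equal-width binning."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def detect_window(values, k, keep, bin_width, ea):
    """Return positions in one full window whose entropy score exceeds ea."""
    # Step 1 (pruning, assumed rule): rank points by k-distance and keep
    # only the `keep` largest as suspected anomalies.
    order = sorted(range(len(values)),
                   key=lambda i: k_distance(values, i, k), reverse=True)
    suspects = order[:keep]
    # Step 2 (entropy test, assumed score): drop in window entropy when
    # the suspect point is removed from the window.
    base = entropy(values, bin_width)
    flagged = []
    for i in suspects:
        rest = values[:i] + values[i + 1:]
        if base - entropy(rest, bin_width) > ea:
            flagged.append(i)
    return sorted(flagged)


def detect_stream(stream, window_size, step, k, keep, bin_width, ea):
    """Yield (stream_index, value) for each anomaly, reported once."""
    window, start, seen = [], 0, set()
    for x in stream:
        window.append(x)
        if len(window) == window_size:      # window is full: run detection
            for i in detect_window(window, k, keep, bin_width, ea):
                if start + i not in seen:
                    seen.add(start + i)
                    yield start + i, window[i]
            del window[:step]               # slide forward by `step` points
            start += step


# Made-up sample data with a single obvious outlier at stream index 4.
data = [12.1, 12.3, 11.8, 12.0, 55.0, 12.2, 11.9, 12.4, 12.1, 11.7, 12.0, 12.2]
result = list(detect_stream(data, window_size=6, step=3, k=2, keep=2,
                            bin_width=5.0, ea=0.1))
print(result)  # prints [(4, 55.0)]
```

In this sketch the pruning step is what keeps the cost down: the entropy score is recomputed only for the `keep` suspects rather than for every point in the window. Setting `step` equal to `window_size` gives non-overlapping windows.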
Source
《计算机应用研究》
CSCD
PKU Core Journal
2015, No. 12, pp. 3579-3581 (3 pages)
Application Research of Computers
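Funding
National Science and Technology Support Program of the Twelfth Five-Year Plan (2012BAF12B14)
Guizhou Province Major Science and Technology Special Fund Project (黔科合重大专项字(2012)6018)
Guizhou Province Industrial Research Project (黔科合GY字(2013)3020)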
Keywords
data stream
sliding window
k-distance
anomaly detection
information entropy