Abstract
Most existing frameworks for novel class detection divide the data stream into fixed-size chunks, which lowers novel class detection accuracy and slows processing. They also assume that all attributes of a data object carry the same weight, which is unrealistic. To address these problems, a classification algorithm for novel class detection in data streams with concept drift (DNCS) is proposed. The algorithm periodically checks the data distribution in the sliding window, dynamically adjusts the chunk size according to changes in that distribution, and updates the classification model to adapt to the new data. The framework uses an attribute-weighted clustering algorithm as the basic step for detecting novel classes. Experimental results show that DNCS achieves higher novel class detection accuracy and processing speed.
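The two ideas in the abstract, adapting the chunk size to distribution change and weighting attributes in the distance used for clustering, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the DNCS paper: the mean-shift drift score, the halve/double resizing rule, the threshold and size limits, and the function names `drift_score`, `next_chunk_size`, and `weighted_distance` are all hypothetical.

```python
from statistics import mean


def drift_score(reference, recent):
    """Largest per-attribute absolute difference in means between two windows.

    Assumed stand-in for a distribution-change test; the paper's actual
    test is not described in the abstract.
    """
    dims = len(reference[0])
    return max(
        abs(mean(row[d] for row in reference) - mean(row[d] for row in recent))
        for d in range(dims)
    )


def next_chunk_size(current, score, threshold=0.5, min_size=50, max_size=2000):
    """Shrink the chunk when the distribution shifts, grow it when stable.

    The halve/double rule and the default limits are illustrative assumptions.
    """
    if score > threshold:
        return max(min_size, current // 2)
    return min(max_size, current * 2)


def weighted_distance(x, y, weights):
    """Euclidean distance with one non-negative weight per attribute."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)) ** 0.5


# Usage: compare an older window with the most recent one, then resize.
reference = [[0.0, 0.0], [0.0, 2.0]]
recent = [[1.0, 1.0], [3.0, 1.0]]
score = drift_score(reference, recent)
chunk_size = next_chunk_size(400, score)
```

In this sketch a smaller chunk after a detected shift lets the model be refreshed sooner, while a larger chunk during stable periods reduces how often it is rebuilt. Setting an attribute's weight to zero removes it from the distance entirely, which is the extreme case of unequal attribute weights.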
Source
Journal of Guilin University of Electronic Technology
2015, No. 6, pp. 459-465 (7 pages)
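Funding
Guangxi Natural Science Foundation (2014GXNSFAA118395)
Scientific Research Project of the Guangxi Department of Education (2013YB094)
Guangxi Key Laboratory of Trusted Software Fund (KX201116)
Guilin University of Electronic Technology Graduate Education Innovation Program (GDYCSZ201466)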
Keywords
data stream
ensemble classifier
concept drift
novel class detection