Abstract
String matching against large-scale pattern sets is increasingly widely used in virus detection, content filtering, and similar applications, while short patterns have long been a major bottleneck to performance improvement. This paper analyzes and discusses short patterns and, building on an optimization of the jump algorithm, adopts the strategies of dynamic block size, dynamic Hash processing, and scenario-specific Hash function design. It also explores the relationship between multi-core processors and multi-threaded design. Experimental data show that the improved algorithm strategies can support string matching against pattern sets on the scale of one million entries.
Source
《计算机工程与应用》
CSCD
2014, No. 1, pp. 105-110, 129 (7 pages in total)
Computer Engineering and Applications
Funding
Beijing Municipal Education Commission Science and Technology Program, General Project (No. KM201110772014)