Abstract
Cross-site scripting (XSS) attacks occur when an attacker uses a Web application to deliver malicious code to other end users. To address Web applications that use user-supplied input without validating or encoding it, this paper proposes a recursive feature elimination algorithm based on regular-expression matching and sequential minimal optimization (SMO-RFE). First, the data are preprocessed, and a regular-expression matching algorithm is used to select a representative feature data set for the training set. Second, the SMO-RFE feature selection algorithm selects the optimal features. Third, the attack-related keywords are ranked and combined as features. Finally, the occurrence frequency of the feature keywords and the weight ratio of the feature values are summarized. The more frequently attack keywords occur, the more likely it is that a vulnerability exists. Experiments show that, after the data set has been filtered by the SMO-RFE algorithm, the detection accuracy for the SVM feature vectors is higher, which indicates that the algorithm can effectively detect XSS vulnerabilities.
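The preprocessing step described in the abstract (regular-expression matching followed by keyword frequency counting) might look roughly like the sketch below. The pattern names, the patterns themselves, and the sample inputs are illustrative assumptions; the abstract does not list the features the authors actually use, and the SMO-RFE selection step is not reproduced here.

```python
import re
from collections import Counter

# Hypothetical attack-keyword patterns, chosen only for illustration.
# The abstract does not specify the paper's actual feature list.
ATTACK_PATTERNS = {
    "script_tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "javascript_uri": re.compile(r"javascript\s*:", re.IGNORECASE),
    "event_handler": re.compile(
        r"\bon(?:error|load|click|mouseover)\s*=", re.IGNORECASE
    ),
    "alert_call": re.compile(r"\balert\s*\(", re.IGNORECASE),
    "document_cookie": re.compile(r"document\s*\.\s*cookie", re.IGNORECASE),
}


def extract_features(sample):
    """Return a dict mapping each pattern name to its match count in sample."""
    return {
        name: len(pattern.findall(sample))
        for name, pattern in ATTACK_PATTERNS.items()
    }


def keyword_frequencies(samples):
    """Sum the per-sample match counts over a collection of input strings."""
    totals = Counter()
    for sample in samples:
        totals.update(extract_features(sample))
    return totals


# Made-up inputs: two XSS-style payloads and one benign string.
samples = [
    "<script>alert(document.cookie)</script>",
    "<img src=x onerror=alert(1)>",
    "hello world",
]

if __name__ == "__main__":
    for name, count in keyword_frequencies(samples).most_common():
        print(name, count)
```

Each per-sample dictionary can serve as one row of a feature matrix, which a feature selection method such as the SMO-RFE algorithm named in the abstract would then rank and prune.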
Source
Netinfo Security (《信息网络安全》)
CSCD
2017, No. 10, pp. 55-62 (8 pages)
Funding
Guizhou Provincial Science Foundation (grant nos. 黔科合J字[2011]2328号 and 黔科合LH字[2014]7634号)