Abstract
To address the poor adaptability of existing automatic intrusion response decision-making, this paper proposes Q-AIRD, an automatic intrusion response decision-making method based on Q-Learning. Q-AIRD formalizes the states and actions of network attack and defense on the basis of an attack graph, and introduces an attack mode layer to identify attackers of differing capability so that response actions can be targeted accordingly. To suit the characteristics of intrusion response, response strategies are selected with the Softmax algorithm, augmented by a security threshold θ, a stable reward factor μ, and a penalty factor ν. A voting mechanism evaluates each strategy against multiple response purposes, meeting the need for multi-purpose response. On this basis, a Q-Learning-based automatic intrusion response decision algorithm is designed. Simulation experiments show that Q-AIRD has good adaptability and can make timely and effective intrusion response decisions.
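As a rough illustration of the two generic building blocks named in the abstract, the sketch below pairs Softmax (Boltzmann) action selection with the standard tabular Q-Learning update. It is not the Q-AIRD algorithm itself: the abstract does not define how θ, μ, and ν enter the selection rule, so they are deliberately left out. The function names, the `temperature` parameter, and the dictionary-of-lists Q-table layout are all assumptions made for this example.

```python
import math
import random


def softmax_probs(q_values, temperature=1.0):
    """Return a Boltzmann (softmax) distribution over actions.

    Higher Q-values get higher probability; `temperature` (an assumed
    parameter, not taken from the paper) controls how greedy the choice is.
    """
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax distribution."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply the standard tabular Q-Learning update in place.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    `q_table` is assumed to map each state to a list of per-action values.
    """
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

In a setting like the one the abstract describes, the states would presumably come from the attack graph and the actions would be candidate response measures, but that mapping is not specified here and would have to be taken from the full paper.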
Authors
LIU Jing (刘璟)
ZHANG Yuchen (张玉臣)
ZHANG Hongqi (张红旗)
Department of Cryptogram Engineering, Information Engineering University of PLA, Zhengzhou 450001, China
Source
《信息网络安全》
CSCD
Peking University Core Journals (北大核心)
2021, No. 6, pp. 26-35 (10 pages)
Netinfo Security
Funding
National Key Research and Development Program of China [2016YFF0204002, 2016YFF0204003]
National Natural Science Foundation of China [61902427, 61471344]
Keywords
reinforcement learning
automatic intrusion response
Softmax algorithm
multi-objective decision-making