UAV Anti-tank Policy Training Model Based on Curriculum Reinforcement Learning (基于课程强化学习的无人机反坦克策略训练模型)
Abstract In the intelligent era, the contest for the land battlefield is expanding from planar land control to three-dimensional land control, and UAV anti-tank operations play a crucial role in the contest for land control in future intelligent warfare. To address the problems that deep reinforcement learning faces when solving complex tasks, such as decision-space explosion and sparse rewards, this paper proposes a dynamic multi-agent curriculum learning method based on VDN. The method adds curriculum learning to the training process of multi-agent deep reinforcement learning and combines it with the Stein variational gradient descent algorithm to improve the curriculum learning process, thereby solving the problems of poor initial training performance, long training time, and difficult convergence that reinforcement learning exhibits in complex tasks. Curriculum learning models are constructed in both a multi-agent particle environment and a UAV anti-tank combat scenario, realizing the transfer of models and prior training knowledge from easy tasks to difficult ones. Experimental results show that, when the DyMA-CL curriculum learning mechanism is used to improve the reinforcement learning training process, agents learning difficult tasks achieve better initial training performance and faster model convergence, and therefore a better final result.
Authors LIN Zeyang (林泽阳); LAI Jun (赖俊); CHEN Xiliang (陈希亮); WANG Jun (王军) (College of Command and Control Engineering, Army Engineering University, Nanjing 210007, China)
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2023, No. 10, pp. 214-222 (9 pages)
Funding National Natural Science Foundation of China (61806221).
Keywords Deep reinforcement learning; Curriculum learning; Stein variational gradient descent; Unmanned aerial vehicle; Anti-tank
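
The abstract names Stein variational gradient descent (SVGD) as the component used to improve the curriculum learning process but does not say how it is applied. As a rough illustration only, the sketch below shows the generic SVGD particle update with an RBF kernel, not the authors' implementation. The function names (`rbf_kernel`, `svgd_step`), the fixed bandwidth `h`, the step size, and the toy one-dimensional Gaussian target are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(x, h):
    """Return the RBF kernel matrix and the summed kernel gradients.

    x has shape (n, d). k[i, j] = exp(-||x_i - x_j||^2 / h).
    grad_k[i] = sum_j grad_{x_j} k(x_j, x_i), the repulsive term of SVGD.
    """
    diff = x[:, None, :] - x[None, :, :]          # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / h)
    grad_k = (2.0 / h) * np.sum(k[:, :, None] * diff, axis=1)
    return k, grad_k


def svgd_step(x, grad_log_p, step=0.1, h=1.0):
    """Apply one SVGD update to the particle set x of shape (n, d).

    grad_log_p maps particles (n, d) to the score of the target density (n, d).
    """
    k, grad_k = rbf_kernel(x, h)
    # Kernel-weighted scores pull particles toward high density,
    # while grad_k pushes them apart so they do not collapse to one point.
    phi = (k @ grad_log_p(x) + grad_k) / x.shape[0]
    return x + step * phi


if __name__ == "__main__":
    # Toy target (assumption): a 1-D Gaussian with mean 2 and unit variance.
    rng = np.random.default_rng(0)
    particles = rng.normal(-5.0, 1.0, size=(50, 1))
    for _ in range(1000):
        particles = svgd_step(particles, lambda p: -(p - 2.0))
    print("particle mean:", particles.mean(), "particle std:", particles.std())
```

In a curriculum setting the particles would presumably stand for candidate task parameters rather than samples from a Gaussian, but that mapping is not specified in the abstract and is left open here.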