Abstract
General reinforcement learning methods for robot obstacle-avoidance path planning suffer from long learning times, poor exploration ability, and sparse rewards. To address these problems, an obstacle-avoidance path planning method for mobile robots based on an improved deep Q-network (DQN) is proposed. First, obstacle learning rules are added to the traditional DQN algorithm so that the robot remembers and avoids obstacles it has already encountered; this prevents repeated learning of the same obstacle and improves learning efficiency and success rate. Second, a reward optimization method is proposed. It assigns rewards according to the difference in visit counts between states, which balances how often each state is visited and prevents over-visiting. It also computes the Euclidean distance to the target point so that the robot favors paths that approach the target, and it removes the penalty for moving away from the target, which makes the reward mechanism adaptive. Finally, a dynamic exploration factor function is designed so that later training relies mainly on the learned reinforcement learning policy for action selection and learning, improving the performance and learning efficiency of the algorithm. Simulation results show that, compared with the traditional DQN algorithm, the improved algorithm shortens training time by 40.25%, raises the obstacle-avoidance success rate by 79.8%, and shortens path length by 2.25%.
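The three ideas named in the abstract (remembering obstacles, visit-count and distance-based rewards, and a decaying exploration factor) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the grid-world setting, the function names `allowed_actions`, `shaped_reward`, and `exploration_factor`, and all constants and functional forms are hypothetical and are not the authors' implementation.

```python
import math
from collections import defaultdict

# Hypothetical 4-connected grid world; states are (x, y) tuples.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def allowed_actions(state, known_obstacles):
    """Obstacle memory (assumed form): skip moves into cells already
    recorded as obstacles, so the same collision is not learned twice."""
    return [name for name, (dx, dy) in ACTIONS.items()
            if (state[0] + dx, state[1] + dy) not in known_obstacles]


def shaped_reward(state, next_state, goal, visits,
                  hit_obstacle=False, reached_goal=False,
                  visit_weight=0.1, approach_bonus=0.5):
    """Reward shaping (assumed form, illustrative constants)."""
    if reached_goal:
        return 10.0
    if hit_obstacle:
        return -10.0
    # Visit-count term: positive when the next state has been visited
    # less often than the current one, which discourages over-visiting.
    reward = visit_weight * math.tanh(visits[state] - visits[next_state])
    # Distance term: bonus for moving closer to the goal (Euclidean).
    # Moving away earns no bonus but is not penalised.
    if math.dist(next_state, goal) < math.dist(state, goal):
        reward += approach_bonus
    return reward


def exploration_factor(episode, total_episodes, eps_start=1.0, eps_end=0.05):
    """Dynamic epsilon (assumed exponential decay): explore early,
    exploit the learned policy in later training."""
    frac = min(max(episode / total_episodes, 0.0), 1.0)
    return eps_end + (eps_start - eps_end) * math.exp(-5.0 * frac)


if __name__ == "__main__":
    visits = defaultdict(int)
    visits[(0, 0)] = 3
    print(allowed_actions((0, 0), {(1, 0)}))
    print(shaped_reward((0, 0), (0, 1), (0, 5), visits))
    print(exploration_factor(50, 100))
```

In a DQN training loop, a sketch like this would sit around the environment step: `exploration_factor` would supply the epsilon for epsilon-greedy selection over `allowed_actions`, and `shaped_reward` would replace the sparse environment reward before the transition is stored in the replay buffer.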
Authors
TIAN Xiaoyuan (田箫源)
DONG Xiucheng (董秀成)
School of Electrical Engineering and Electronic Information, Xihua University, Chengdu 610000, China; Sichuan University Jinjiang College, Meishan 620860, China
Source
Journal of Chinese Inertial Technology (《中国惯性技术学报》)
EI
CSCD
PKU Core (北大核心)
2024, Issue 4, pp. 406-416 (11 pages)
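Funding
National Natural Science Foundation of China (11872069)
Sichuan Province Special Project for Central Government Guidance of Local Science and Technology Development (2021ZYD0034)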
Keywords
mobile robot
DQN algorithm
path planning
obstacle avoidance
deep reinforcement learning