Abstract
Traditional reinforcement learning methods face challenges such as large state spaces and sparse reward functions when dealing with complex tasks, which prevents robot manipulators from learning complex manipulation skills. To address these problems, a complex manipulation skill learning approach for robot manipulators based on hierarchical reinforcement learning is proposed. First, at the low level, an autoregressive hidden Markov model (HMM) based on the Beta process is used to decompose a complex manipulation task into several simple subtasks. Second, the SAC (soft actor-critic) algorithm is applied to each subtask to learn its skill and obtain the optimal policy for that subtask. Finally, building on the optimal subtask policies obtained at the low level, the high level uses an improved reinforcement learning algorithm based on the maximum entropy objective to learn the complex manipulation skill. Experimental results show that the proposed method effectively achieves learning, reproduction, and generalization of complex manipulation skills for robot manipulators, and that it outperforms other traditional reinforcement learning algorithms.
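To make the maximum entropy idea at the high level more concrete, the following minimal sketch shows the standard soft (Boltzmann) selection rule over a small set of options, together with the soft value it induces. It is an illustration of the general maximum entropy principle only, not the improved algorithm proposed in the paper, whose details are not given in this abstract. The function names, the temperature `alpha`, and the Q-values for the three subtask policies are all hypothetical.

```python
import math


def soft_value(q_values, alpha):
    """Soft (maximum entropy) value: alpha * log(sum(exp(q / alpha)))."""
    m = max(q_values)  # subtract the max for numerical stability
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))


def soft_policy(q_values, alpha):
    """Boltzmann distribution over options: pi(k) proportional to exp(q_k / alpha)."""
    v = soft_value(q_values, alpha)
    return [math.exp((q - v) / alpha) for q in q_values]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical high-level Q-values for three already-learned subtask policies.
q_subtasks = [1.0, 0.5, -0.2]
alpha = 0.5  # assumed temperature; larger values give more exploration

pi = soft_policy(q_subtasks, alpha)
print("selection probabilities:", pi)
print("soft value:", soft_value(q_subtasks, alpha))
```

Under this rule the soft value equals the expected Q-value plus `alpha` times the entropy of the selection distribution, which is the one-step form of the maximum entropy objective that SAC also optimizes.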
Authors
孟子晗
高翔
刘元归
马陈昊
MENG Zihan; GAO Xiang; LIU Yuangui; MA Chenhao (College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source
《现代电子技术》
2023, No. 19, pp. 116-124 (9 pages)
Modern Electronics Technique
Funding
Natural Science Foundation of Jiangsu Province (BK20210599)
Jiangsu Province Postdoctoral Research Funding Program (2019K030)
Keywords
robot manipulator
complex manipulation task
hierarchical reinforcement learning
sub-objective
autoregressive HMM
SAC algorithm