Abstract
Autonomous orbit transfer using electric thrusters is one of the key technologies for all-electric propulsion satellites. To address the orbit-raising problem of all-electric propulsion geostationary orbit (GEO) satellites, a reinforcement-learning-based method for optimizing the time-optimal low-thrust orbit transfer strategy is proposed. The method combines the generalized advantage estimator (GAE) with proximal policy optimization (PPO) and accounts for multiple orbital perturbations and the Earth-shadow constraint. To overcome the training difficulty caused by an excessively large state space and sparse rewards, training acceleration techniques, including action output mapping and hierarchical rewards, are proposed; these improve training efficiency and speed up convergence. Numerical simulations and comparisons with the traditional direct method, indirect method, and feedback control method indicate that the proposed method is simpler, more flexible, and more efficient, while ensuring the time optimality of the orbit transfer.
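The abstract names GAE and PPO as the two building blocks of the method. As a rough illustration only, the sketch below shows the textbook form of both: the GAE recursion over one trajectory and the PPO clipped surrogate objective. The function names, the default values of `gamma`, `lam`, and `eps`, and the toy inputs are assumptions made for this example; they are not taken from the paper, and the sketch does not include the authors' action output mapping or hierarchical reward.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory.

    `values` must hold len(rewards) + 1 entries; the last entry is the
    bootstrap value of the state reached after the final reward.
    gamma and lam are illustrative defaults, not values from the paper.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD error
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    `ratios` are new-policy / old-policy probability ratios per sample.
    eps is an illustrative clip range, not a value from the paper.
    """
    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)


if __name__ == "__main__":
    # Toy trajectory: two unit rewards, zero value estimates everywhere.
    adv = gae_advantages([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0)
    print(adv)
    print(ppo_clip_objective([1.5, 0.5], adv))
```

In a full training loop the advantages would typically be normalized and fed, together with the ratios from the current policy network, into a gradient step on the negated objective; those pieces are omitted here to keep the sketch small.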
Authors
HAN Mingren (韩明仁)
WANG Yufeng (王玉峰)
Affiliations: Beijing Institute of Control Engineering, Beijing 100094, China; Science and Technology on Space Intelligent Control Laboratory, Beijing 100094, China
Source
Systems Engineering and Electronics (《系统工程与电子技术》)
Indexed in: EI, CSCD, Peking University Core Journals
2022, No. 5, pp. 1652-1661 (10 pages)
Funding
Supported by the National Natural Science Foundation of China (11502017).
Keywords
all-electric propulsion satellite
low-thrust orbit transfer optimization
reinforcement learning
proximal policy optimization (PPO)
training acceleration method