Abstract
During avoidance maneuvers, a spacecraft is subject to multiple complex constraints. Traditional motion planning methods based on numerical optimization are sensitive to the initial guess and computationally slow when handling the corresponding models and constraints, which makes it difficult to respond in time to close-range orbital threats. To address this problem, this paper proposes a multi-constraint avoidance motion planning method for spacecraft based on deep reinforcement learning (DRL). First, a six-degree-of-freedom nonlinear dynamics model of the spacecraft and the associated attitude-orbit maneuver constraints are established. Then, a motion planning method based on the twin delayed deep deterministic policy gradient (TD3) algorithm is developed, in which the neural networks trained by TD3 generate, online, avoidance maneuver actions that satisfy the multiple constraints. Finally, a standardized DRL training environment matched to the planning method is constructed to ensure effective interaction between the agent and the environment during training. Simulation results show that the proposed method can generate avoidance actions in real time even when the expected encounter time is only tens of seconds; the planning period is less than 9 ms, far shorter than that of the Gauss pseudospectral method used for comparison.
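For readers unfamiliar with TD3, the sketch below illustrates the generic critic target that distinguishes the algorithm: target policy smoothing with clipped noise, followed by the minimum over two target critics. It is not the authors' implementation. The function name `td3_target`, the callables `target_actor`, `target_q1` and `target_q2`, and all default hyperparameters are hypothetical placeholders; the paper's reward design, network structure and constraint handling are not given in this record.

```python
import random


def clip(x, lo, hi):
    """Clamp a scalar to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0,
               rng=random):
    """Generic TD3 critic target for a single transition (illustrative only).

    target_actor(state) -> list of floats (action)
    target_q1/target_q2(state, action) -> float (value estimate)
    All names and defaults here are assumptions, not taken from the paper.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target
    # action, then clamp the result to the admissible action range.
    action = target_actor(next_state)
    smoothed = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -act_limit, act_limit)
        for a in action
    ]
    # Clipped double-Q: use the smaller of the two target critic estimates
    # to counteract value overestimation.
    q_min = min(target_q1(next_state, smoothed),
                target_q2(next_state, smoothed))
    # Bootstrapped target; no bootstrapping past a terminal state.
    return reward + gamma * (1.0 - float(done)) * q_min
```

In a full training loop, both critics would be regressed toward this target, while the actor and the target networks would be updated less often than the critics (the "delayed" part of TD3).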
Authors
吴健发
魏春岭
张海博
李克行
郝仁剑
WU Jianfa; WEI Chunling; ZHANG Haibo; LI Kehang; HAO Renjian (Beijing Institute of Control Engineering, Beijing 100094, China; Science and Technology on Space Intelligent Control Laboratory, Beijing 100094, China)
Source
《空间控制技术与应用》
CSCD
Peking University Core Journals
2023, No. 2, pp. 1-9 (9 pages)
Aerospace Control and Application
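Funding
National Natural Science Foundation of China (62203046, U21B6001)
Aerospace Field Fund (2022-JCJQ-JJ-0660)
Fund of the Science and Technology on Space Intelligent Control Laboratory (2022-JCJQ-LB-010-01)
Qian Xuesen Youth Innovation Fund of China Aerospace Science and Technology Corporation
Independent Research and Development Project of China Aerospace Science and Technology Corporation
China Postdoctoral Science Foundation (2022M713006)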
Keywords
avoidance maneuver
orbital threat
motion planning
deep reinforcement learning