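Abstract: Existing reentry guidance methods based on the deep deterministic policy gradient (DDPG) algorithm suffer from poor computational accuracy and insufficient adaptability to strong disturbances. To address these problems, a reentry guidance method based on long short-term memory-DDPG (LSTM-DDPG) is proposed, built on the DDPG training framework. The method decouples longitudinal and lateral guidance. For longitudinal guidance, the state and action spaces required for reinforcement learning are first constructed for the reentry guidance problem; next, the decision points and the command computation strategy within each guidance cycle are determined, and a reward function that accounts for overall performance is designed; then, an LSTM network is introduced to build the reinforcement learning training network, and an online policy update is used to improve the algorithm's applicability to multiple tasks. Lateral guidance uses a dynamic bank reversal method based on cross-range error to obtain the sign of the bank angle. Simulations are carried out for the reentry glide of the U.S. common aero vehicle-hypersonic (CAV-H). The results show that, compared with the traditional numerical predictor-corrector method, the proposed method achieves comparable terminal accuracy with higher computational efficiency; compared with existing DDPG-based reentry guidance methods, it achieves comparable computational efficiency with higher terminal accuracy and robustness.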
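The lateral logic described above, choosing the bank-angle sign from the cross-range error, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the abstract does not give the reversal threshold, so the velocity-dependent corridor `crossrange_threshold`, its coefficients, and the sign convention (positive cross-range is corrected by a negative bank angle) are all assumptions.

```python
def crossrange_threshold(velocity, c0=0.0, c1=1.0e-5):
    """Hypothetical corridor half-width (rad) that narrows as velocity (m/s) drops."""
    return c0 + c1 * velocity


def bank_sign(prev_sign, crossrange, velocity):
    """Return +1 or -1 for the bank-angle sign.

    The sign is kept while the cross-range error stays inside the corridor
    and is reversed only when the error leaves it (assumed convention:
    positive cross-range error is reduced by a negative bank angle).
    """
    limit = crossrange_threshold(velocity)
    if crossrange > limit:
        return -1
    if crossrange < -limit:
        return 1
    return prev_sign  # inside the corridor: no reversal


def bank_command(magnitude, prev_sign, crossrange, velocity):
    """Combine a longitudinal bank magnitude with the lateral sign logic."""
    sign = bank_sign(prev_sign, crossrange, velocity)
    return sign * abs(magnitude), sign
```

In such a scheme the magnitude would come from the longitudinal channel (here, the LSTM-DDPG policy), while the sign logic only decides when to reverse.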
Funding: Supported by the Natural Science Basic Research Program of Shaanxi (2022JQ-593).
Abstract: To address the shortcomings of single-step decision making in existing deep reinforcement learning approaches to real-time path planning for unmanned aerial vehicles (UAVs), a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, which combines the memory characteristics of recurrent neural networks (RNNs) with a deep reinforcement learning algorithm. In this algorithm, LSTM networks serve as the Q-value networks of the deep Q network (DQN) algorithm, which gives the Q-value network's decisions a degree of memory. Thanks to the LSTM network, the Q-value network can use previous environmental and action information, which avoids the problem of single-step decisions that consider only the current environment. In addition, a hierarchical reward and punishment function is designed for the specific problem of UAV real-time path planning, so that the UAV plans its path more reasonably. Simulations show that, compared with the traditional UAV autonomous path planning algorithm based on a feed-forward neural network (FNN), the proposed RPP-LSTM adapts to more complex environments and has significantly improved robustness and accuracy in real-time UAV path planning.
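To illustrate how an LSTM used as a Q-value network carries memory between decisions, the sketch below implements a single LSTM cell with a linear Q-value head in plain Python. It is only an illustration under stated assumptions: the class name `LSTMQNetwork`, the layer sizes, and the random, untrained weights are hypothetical and are not taken from the paper, and no training loop is shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LSTMQNetwork:
    """One LSTM cell followed by a linear head giving one Q-value per action."""

    def __init__(self, n_obs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        n_in = n_obs + n_hidden
        # One weight matrix per gate: input (i), forget (f), output (o),
        # and candidate cell value (g).
        self.w = {gate: rand(n_hidden, n_in) for gate in "ifog"}
        self.b = {gate: [0.0] * n_hidden for gate in "ifog"}
        self.w_q = rand(n_actions, n_hidden)
        self.n_hidden = n_hidden

    def initial_state(self):
        """Zero hidden state h and cell state c."""
        return [0.0] * self.n_hidden, [0.0] * self.n_hidden

    def step(self, obs, state):
        """Consume one observation; return Q-values and the updated state."""
        h, c = state
        x = list(obs) + h  # current observation joined with the previous hidden state
        pre = {gate: [a + b for a, b in zip(matvec(self.w[gate], x), self.b[gate])]
               for gate in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return matvec(self.w_q, h_new), (h_new, c_new)


def greedy_action(q_values):
    """Index of the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Because the state `(h, c)` is passed from step to step, feeding the same observation twice generally yields different Q-values; a feed-forward Q-network would return the same values both times.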
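Abstract: Accurate wind speed prediction along high-speed railways is a basic requirement of railway disaster early-warning systems. To improve the ability to respond to and handle emergencies caused by strong-wind disasters, a short-term wind speed prediction method for high-speed railway lines is proposed in which a long short-term memory (LSTM) neural network is optimized by the subtraction average based optimizer (SABO) algorithm. First, to handle the nonlinear and non-stationary characteristics of wind speed, the min-max (MM) method is used to normalize the wind speed data. Second, the “-v” method in the SABO algorithm is used to search for optimal values of the key parameters of the LSTM model, and the wind speed prediction model is built. Finally, the model is validated on measured wind speed data collected at acquisition points along the Baolan high-speed railway in China. The experimental results show that the SABO algorithm gives better optimization performance and higher prediction accuracy: the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) of the model are only 11.96%, 1.23%, and 16.47%, respectively, and the coefficient of determination (R-squared, R²) is 0.995. Compared with other models, the SABO-optimized LSTM neural network shows better fitting and higher prediction accuracy in short-term wind speed prediction, and offers a new method and approach for strong-wind prediction and early warning along high-speed railways.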
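The min-max normalization and the four evaluation metrics named above follow standard definitions, sketched here in plain Python. The function names are illustrative; the paper's own code and data are not available, so the values in the test below are made up.

```python
import math


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; also return the min and max used."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values], lo, hi  # constant series
    return [(v - lo) / span for v in values], lo, hi


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Returning the min and max alongside the scaled series makes it possible to map model outputs back to physical wind speeds before computing the metrics.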
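Abstract: Traditional transmission methods are affected by network configuration and policies, which restrict the transmission of data on ports such as remote desktop protocol ports and database ports, so abnormal data is identified with low accuracy. To address this, the long short-term memory (LSTM) algorithm is introduced and a network abnormal data identification method is designed, taking the Chinese-developed Kylin operating system as an example. A coefficient describing the degree of change of network abnormal data is introduced, a feature distribution function of the abnormal data is established to quantify its features, and the weights of abnormal network nodes in the Kylin system are calculated. With the node weights as input, the LSTM algorithm learns from the time-series data to identify the features of abnormal nodes and produce an identification result. Based on the abnormal node features, a comprehensive feature value of the network abnormal data in the Kylin system is calculated; the state space of the abnormal data, together with the related measurements and information entropy, is then used to output the most representative abnormal data. On this basis, abnormal data at network transmission nodes is identified and located. Comparative experiments show that the designed method not only improves the timeliness of identifying abnormal transmitted data but also accurately separates normal data from abnormal data.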
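The abstract does not say how information entropy enters the computation, so the following is only a sketch of the standard Shannon entropy over a window of discrete observations. The example in the usage note (destination ports seen at a node) is a hypothetical illustration, not the paper's feature definition.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

For example, a window in which every packet targets the same port has zero entropy, while a window spread evenly over four ports has an entropy of 2 bits; a sudden change in this value is one signal that an anomaly detector could use.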