Abstract
To address co-frequency interference between beams, spectrum scarcity, and uneven traffic distribution in multi-beam low Earth orbit (LEO) satellites, and to overcome the drawbacks of a single decision network (no capacity for self-correction, a tendency to become trapped in local optima, and inadequate consideration of long-term effects), a resource allocation algorithm based on decision performance evaluation was proposed. A service satisfaction index for each user was introduced to measure system fairness, system throughput was optimized while taking fairness into account, and the problem was formulated as a multi-objective optimization problem. The continuous, temporally correlated resource allocation process was modeled as a Markov decision process, and a decision-evaluation dual-network algorithm was proposed to solve it. The algorithm adjusts the decision network parameters according to the evaluation network's assessment, thereby improving the resource allocation scheme, while also updating the evaluation network's own parameters. Through this iterative optimization, the decision network achieves accurate predictions. Simulation results show that the proposed algorithm outperforms traditional resource allocation algorithms in both throughput and fairness.
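The interplay between a decision network and an evaluation network described in the abstract resembles a one-step actor-critic update, and a toy sketch may help make it concrete. Everything specific below is an assumption made for illustration and is not taken from the paper: the two-beam setup, the `DEMANDS` and `FRACTIONS` tables, the use of Jain's index over per-beam satisfaction as the fairness term, the weighted-sum reward, and the tabular (rather than neural) actor and critic.

```python
import math
import random

# Hypothetical toy setup: one unit of capacity is split between two beams.
FRACTIONS = [0.0, 0.25, 0.5, 0.75, 1.0]  # action -> share of capacity given to beam 0
DEMANDS = [(0.8, 0.2), (0.2, 0.8)]       # state -> normalized traffic demand per beam


def jain_index(values):
    """Jain's fairness index: 1.0 when all values are equal, 1/n in the worst case."""
    total = sum(values)
    squares = sum(v * v for v in values)
    return 0.0 if squares == 0 else total * total / (len(values) * squares)


def reward(state, action, weight=0.5):
    """Weighted sum of served traffic and fairness (an assumed scalarization)."""
    share = FRACTIONS[action]
    alloc = (share, 1.0 - share)
    demand = DEMANDS[state]
    served = [min(a, d) for a, d in zip(alloc, demand)]
    satisfaction = [s / d for s, d in zip(served, demand)]  # per-beam satisfaction in [0, 1]
    throughput = sum(served)  # capacity is normalized to 1
    return weight * throughput + (1 - weight) * jain_index(satisfaction)


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    peak = max(prefs)
    exps = [math.exp(p - peak) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=5000, gamma=0.9, lr_actor=0.1, lr_critic=0.1, seed=0):
    """One-step actor-critic with tabular parameters on the toy problem."""
    rng = random.Random(seed)
    prefs = [[0.0] * len(FRACTIONS) for _ in DEMANDS]  # decision (actor) parameters
    values = [0.0 for _ in DEMANDS]                    # evaluation (critic) parameters
    state = 0
    for _ in range(steps):
        probs = softmax(prefs[state])
        action = rng.choices(range(len(FRACTIONS)), weights=probs)[0]
        r = reward(state, action)
        next_state = rng.randrange(len(DEMANDS))  # toy dynamics: demand pattern changes at random
        # The critic's assessment of the decision: temporal-difference error.
        td_error = r + gamma * values[next_state] - values[state]
        # The critic updates its own estimate.
        values[state] += lr_critic * td_error
        # The actor shifts its preferences in the direction the critic endorses.
        for b in range(len(FRACTIONS)):
            grad = (1.0 if b == action else 0.0) - probs[b]
            prefs[state][b] += lr_actor * td_error * grad
        state = next_state
    return prefs, values
```

Calling `train()` returns the actor's preference table and the critic's value table; `softmax(prefs[s])` then gives the allocation probabilities for demand pattern `s`. The weighted sum is only one of several ways to scalarize a throughput-fairness trade-off, and the paper's own formulation may differ.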
Authors
WANG Chaowei (王朝炜)
PANG Mingliang (庞明亮)
WANG Su (王粟)
ZHAO Lingli (赵玲莉)
GAO Feifei (高飞飞)
CUI Gaofeng (崔高峰)
WANG Weidong (王卫东)
Affiliations: School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100875, China; China Mobile Communications Corporation, Beijing 100032, China; Department of Automation, Tsinghua University, Beijing 100084, China
Source
Journal on Communications (《通信学报》)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
2024, No. 7, pp. 37-47 (11 pages)
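Funding
Natural Science Foundation of Chongqing (No. CSTB2023NSCQ-LZX0118)
Beijing University of Posts and Telecommunications Doctoral Student Innovation Fund (No. CX2023139)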
Keywords
multi-beam satellite
deep reinforcement learning
multi-objective optimization
resource management