Funding: Supported in part by the National Key Research and Development Program of China under Grant No. 2021YFF0901300, and in part by the National Natural Science Foundation of China under Grant Nos. 62173076 and 72271048.
Abstract: The distributed permutation flow shop scheduling problem (DPFSP) has received increasing attention in recent years. The iterated greedy algorithm (IGA) is a powerful optimizer for this problem because of its straightforward, single-solution evolution framework. However, a potential drawback of IGA is that it does not use historical information, which can lead to an imbalance between exploration and exploitation, especially in large-scale DPFSPs. This paper therefore develops an IGA with memory and learning mechanisms (MLIGA) to efficiently solve the DPFSP with the objective of minimal makespan. In MLIGA, a memory mechanism makes a more informed selection of the initial solution at each stage of the search by extending, reconstructing, and reinforcing the information from previous solutions. In addition, a two-layer cooperative reinforcement learning approach intelligently determines the key parameters of IGA and the operations of the memory mechanism. To ensure that the experience generated by each perturbation operator is fully learned, and to reduce the number of prior parameters of MLIGA, a probability curve-based acceptance criterion is proposed that combines a cube root function with custom rules. Finally, a discrete adaptive learning rate is employed to enhance the stability of the memory and learning mechanisms. Complete ablation experiments verify the effectiveness of the memory mechanism, and the results show that it improves the performance of IGA to a large extent. Furthermore, comparative experiments involving MLIGA and five state-of-the-art algorithms on 720 benchmarks show that MLIGA has significant potential for solving large-scale DPFSPs. This indicates that MLIGA is well suited for real-world distributed flow shop scheduling.
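For readers unfamiliar with the iterated greedy framework that this abstract builds on, the following is a minimal sketch of a plain destruction-construction loop for a single-factory permutation flow shop. It is not MLIGA: there is no memory mechanism, no reinforcement learning, and no distribution of jobs over factories, and the acceptance rule ("accept if not worse") is a simplifying assumption rather than the probability curve-based criterion of the paper. The names `makespan`, `best_insertion`, `iterated_greedy`, and the parameters `d`, `iters`, and `seed` are hypothetical.

```python
import random

def makespan(perm, p):
    """Completion time of the last job on the last machine.
    p[j][m] is the processing time of job j on machine m."""
    n_machines = len(p[0])
    done = [0] * n_machines  # done[m]: completion time of the latest job on machine m
    for j in perm:
        done[0] += p[j][0]
        for m in range(1, n_machines):
            # a job starts on machine m once it has left machine m-1 and machine m is free
            done[m] = max(done[m], done[m - 1]) + p[j][m]
    return done[-1]

def best_insertion(perm, job, p):
    """Greedy construction step: try every position and keep the best one."""
    candidates = [perm[:i] + [job] + perm[i:] for i in range(len(perm) + 1)]
    return min(candidates, key=lambda c: makespan(c, p))

def iterated_greedy(p, d=2, iters=200, seed=0):
    """Plain iterated greedy; d jobs are removed and reinserted per iteration."""
    rng = random.Random(seed)
    current = list(range(len(p)))
    best = current[:]
    for _ in range(iters):
        partial = current[:]
        # destruction: remove d randomly chosen jobs
        removed = [partial.pop(rng.randrange(len(partial))) for _ in range(d)]
        # construction: reinsert each removed job at its best position
        for job in removed:
            partial = best_insertion(partial, job, p)
        # simplified acceptance: keep the new solution if it is not worse
        if makespan(partial, p) <= makespan(current, p):
            current = partial
        if makespan(current, p) < makespan(best, p):
            best = current[:]
    return best, makespan(best, p)
```

Calling `iterated_greedy([[3, 2], [1, 4], [2, 1]])` returns a job permutation together with its makespan. Because each iteration restarts from the current solution only, nothing learned in earlier iterations is reused, which is the gap the abstract's memory mechanism is meant to address.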
Funding: Supported by the National Natural Science Foundation of China (61273070, 61203092), the Enterprise-college-institute Cooperative Project of Jiangsu Province (BY2015019-21), the 111 Project (B12018), and the Fundamental Research Funds for the Central Universities (JUSRP51733B).
Abstract: For a class of non-uniform output sampling hybrid systems with actuator faults and bounded disturbances, an iterative learning fault diagnosis algorithm is proposed. First, to measure the impact of a fault on the system between consecutive output sampling instants, the actual fault function is transformed into an equivalent fault model using the integral mean value theorem, and the non-uniform sampling hybrid system is converted into a continuous system with time-varying delay using the output delay method. Next, an observer-based fault diagnosis filter with a virtual fault is designed to estimate the equivalent fault, and an iterative learning regulation algorithm updates the virtual fault repeatedly so that it approximates the actual equivalent fault after a number of iterative learning trials. The algorithm can therefore detect and estimate system faults adaptively. Simulation results for an electro-mechanical control system model with different types of faults illustrate the feasibility and effectiveness of the algorithm.
Funding: Supported by the National Natural Science Foundation of China (62173333, 12271522), the Beijing Natural Science Foundation (Z210002), and the Research Fund of Renmin University of China (2021030187).
Abstract: For unachievable tracking problems, where the system output cannot precisely track a given reference, the objective becomes achieving the best possible approximation of the reference trajectory. This study investigates solutions using the P-type learning control scheme. First, we demonstrate that gradient information is necessary for achieving the best approximation. We then propose an input-output-driven learning gain design to handle the imprecise gradients of a class of uncertain systems. However, the desired performance may not be attainable when the information is incomplete. To address this issue, an extended iterative learning control scheme is introduced. In this scheme, the tracking errors are modified through output data sampling, which has a low memory footprint and offers flexibility in learning gain design. The input sequence is shown to converge to the desired input, resulting in an output that is closest to the given reference in the least-squares sense. Numerical simulations validate the theoretical findings.
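As background for the P-type scheme mentioned in this abstract, the sketch below shows the classic P-type iterative learning update, u_{k+1}(t) = u_k(t) + L e_k(t+1), on a scalar first-order plant. It covers only the ordinary achievable case, where the reference can be tracked exactly; it is not the extended scheme of the paper and does not reproduce its input-output-driven gain design. The plant parameters `a` and `b`, the gain value, and the function names are illustrative assumptions.

```python
def run_trial(u, a=0.5, b=1.0):
    """Simulate y(t+1) = a*y(t) + b*u(t) from y(0) = 0 (hypothetical plant)."""
    y = [0.0]
    for ut in u:
        y.append(a * y[-1] + b * ut)
    return y

def p_type_ilc(ref, gain=0.8, trials=50):
    """Repeat the same task and correct the input with the previous trial's error.
    For this plant the error contracts when |1 - gain*b| < 1."""
    T = len(ref) - 1
    u = [0.0] * T
    for _ in range(trials):
        y = run_trial(u)
        e = [ref[t] - y[t] for t in range(T + 1)]
        # P-type update: the input at time t is corrected by the error at time t+1
        u = [u[t] + gain * e[t + 1] for t in range(T)]
    y = run_trial(u)
    return u, max(abs(ref[t] - y[t]) for t in range(1, T + 1))
```

For example, `p_type_ilc([0.0] + [1.0] * 10)` returns the learned input sequence and the remaining maximum tracking error. In an unachievable problem no input drives this error to zero, which is why the abstract targets the least-squares closest output instead.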
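Abstract: To improve the convergence accuracy of the Harris hawks optimization (HHO) algorithm and address its tendency to become trapped in local optima, an HHO algorithm based on iterative chaotic elite opposition-based learning and a golden sine strategy (gold sine HHO, GSHHO) is proposed. The population is initialized with an infinite iterative chaotic map, and an elite opposition-based learning strategy selects high-quality individuals, which improves population quality and strengthens the algorithm's global search ability. A convergence factor adjustment strategy recomputes the prey energy to balance global exploration and local exploitation. In the exploitation phase of HHO, a golden sine strategy replaces the original position update method to improve local exploitation. The effectiveness of GSHHO is evaluated on 9 test functions and on grid maps of different sizes. The experimental results show that GSHHO achieves good optimization accuracy and stability across the test functions. In 2 robot path planning experiments, it reduces path length by 4.4% and 3.17% relative to the original HHO algorithm and improves stability by 52.98% and 63.12%, respectively.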
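The population screening step described in this abstract can be illustrated with a small sketch of opposition-based learning with dynamic bounds. The abstract does not give the formula, so the form used here, opposite = k * (lo + hi) - x with a random k in [0, 1) and per-dimension bounds taken from the current population, is a commonly used variant and an assumption. The chaotic initialization, the convergence factor, and the golden sine update are not shown. All function and parameter names are hypothetical.

```python
import random

def sphere(x):
    """Simple test objective (minimization); used only for illustration."""
    return sum(v * v for v in x)

def elite_opposition_step(pop, fitness, lb, ub, rng):
    """Generate an opposite point for every individual, then keep the best
    len(pop) individuals from the union of originals and opposites."""
    dim = len(pop[0])
    # dynamic bounds: per-dimension minimum and maximum over the current population
    lo = [min(ind[d] for ind in pop) for d in range(dim)]
    hi = [max(ind[d] for ind in pop) for d in range(dim)]
    opposites = []
    for ind in pop:
        k = rng.random()
        opp = [k * (lo[d] + hi[d]) - ind[d] for d in range(dim)]
        # coordinates that leave the search range [lb, ub] are resampled uniformly
        opp = [v if lb <= v <= ub else rng.uniform(lb, ub) for v in opp]
        opposites.append(opp)
    merged = sorted(pop + opposites, key=fitness)
    return merged[:len(pop)]
```

Because the originals stay in the candidate pool, the best fitness in the returned population is never worse than before the step; the opposites only add candidates.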
Abstract: In this paper we discuss policy iteration methods for the approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
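As a reference point for the approximate methods discussed in this abstract, here is a minimal sketch of exact policy iteration on a small finite-state discounted problem with cost minimization. It does not implement feature-based aggregation or any neural network component; it only shows the evaluation and improvement steps that those methods approximate. The data layout (`P[a][s][t]`, `g[a][s]`), the parameter names, and the two-state example in the note below are assumptions made for illustration.

```python
def policy_iteration(P, g, alpha=0.9, eval_sweeps=500, max_iters=100):
    """P[a][s][t]: probability of moving from s to t under action a.
    g[a][s]: expected stage cost of action a in state s.
    Returns a policy (one action per state) and its cost vector J."""
    n_actions, n_states = len(P), len(P[0])
    policy = [0] * n_states
    J = [0.0] * n_states
    for _ in range(max_iters):
        # policy evaluation: repeatedly apply the Bellman operator of the fixed policy
        J = [0.0] * n_states
        for _ in range(eval_sweeps):
            J = [g[policy[s]][s]
                 + alpha * sum(P[policy[s]][s][t] * J[t] for t in range(n_states))
                 for s in range(n_states)]

        def q(s, a):
            """Cost of applying action a in state s, then following the evaluated policy."""
            return g[a][s] + alpha * sum(P[a][s][t] * J[t] for t in range(n_states))

        # policy improvement: pick the cost-minimizing action in every state
        new_policy = [min(range(n_actions), key=lambda a: q(s, a))
                      for s in range(n_states)]
        if new_policy == policy:
            break
        policy = new_policy
    return policy, J
```

As a usage example, take two states where action 0 stays put and action 1 switches state: `P = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]` and `g = [[1, 0], [2, 0]]`. Staying in state 0 forever costs 1 per stage, while paying 2 once to move to the cost-free state 1 is cheaper, so the method settles on switching in state 0 and staying in state 1.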
Funding: This work was supported by the China Scholarship Council Scholarship, the National Key Research and Development Program of China (2017YFB0306400), the National Natural Science Foundation of China (62073069), and the Deanship of Scientific Research (DSR) at King Abdulaziz University (RG-48-135-40).
Abstract: Group scheduling problems have attracted much attention owing to their many practical applications. This work proposes a new bi-objective serial-batch group scheduling problem that considers sequence-dependent setup time, release time, and due time constraints. It originates from an important industrial process, namely the wire rod and bar rolling process in steel production systems. Two objective functions, the number of late jobs and the total setup time, are minimized. A mixed integer linear program is established to describe the problem. To obtain its Pareto solutions, we present a memetic algorithm that integrates a population-based nondominated sorting genetic algorithm II and two single-solution-based improvement methods: an insertion-based local search and an iterated greedy algorithm. Computational results on extensive industrial data at the scale of a one-week schedule show that the proposed algorithm performs well on the problem and outperforms its peers. Its high accuracy and efficiency indicate strong potential for solving industrial-size group scheduling problems.
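The notion of Pareto solutions used in this abstract can be made concrete with a short sketch of dominance filtering for two minimized objectives. It extracts only the first nondominated front, the basic building block of nondominated sorting; it is not the memetic algorithm of the paper. The function names are hypothetical, and the sample values in the note below are made up for illustration, not taken from the paper's data.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in
    at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates, in their original order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With objective pairs of the form (number of late jobs, total setup time), the hypothetical candidates `[(3, 40), (2, 55), (4, 35), (3, 50), (2, 60)]` reduce to `[(3, 40), (2, 55), (4, 35)]`: the schedule (3, 50) is dominated by (3, 40), and (2, 60) by (2, 55).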
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62203342, 62073254, 92271101, 62106186, and 62103136), the Fundamental Research Funds for the Central Universities (Grant Nos. XJS220704, QTZX23003, and ZYTS23046), the Project Funded by China Postdoctoral Science Foundation (Grant No. 2022M712489), and the Natural Science Basic Research Program of Shaanxi (Grant No. 2023-JC-YB-585).
Abstract: In this study, we propose a compensated distributed adaptive learning algorithm for heterogeneous multi-agent systems with repetitive motion, where the leader's dynamics are unknown and the controlled system's parameters are uncertain. The multi-agent systems are treated as hybrid-order nonlinear systems, which relaxes the strict requirement in some existing work that all agents be of the same order. For the theoretical analysis, we design a composite energy function with virtual gain parameters to relax the restriction that the controller gain depends on global information. To keep the controller stable, we introduce a smooth continuous function that improves the piecewise controller and avoids possible chattering. Theoretical analysis proves the convergence of the presented algorithm, and simulation experiments verify its effectiveness.