In many research disciplines, hypothesis tests are applied to evaluate whether findings are statistically significant or could be explained by chance. The Wilcoxon-Mann-Whitney (WMW) test is among the most popular hypothesis tests in medicine and the life sciences for analyzing whether two groups of samples are equally distributed. This nonparametric statistical homogeneity test is commonly applied in molecular diagnosis. Generally, solving the WMW test exactly requires a high combinatorial effort for large sample cohorts containing a significant number of ties. Hence, the P value is frequently approximated by a normal distribution. We developed EDISON-WMW, a new approach to calculate the exact permutation P value of the two-tailed unpaired WMW test without any corrections required and allowing for ties. The method relies on dynamic programming to solve the combinatorial problem of the WMW test efficiently. Beyond a straightforward implementation of the algorithm, we present different optimization strategies and a parallel solution. Using our program, the exact P value for large cohorts containing more than 1000 samples with ties can be calculated within minutes. We demonstrate the performance of this novel approach on randomly generated data, benchmark it against 13 other commonly applied approaches, and evaluate molecular biomarkers for lung carcinoma and chronic obstructive pulmonary disease (COPD). We found that approximated P values were generally higher than the exact solution provided by EDISON-WMW. Importantly, the algorithm can also be applied to high-throughput omics datasets, where hundreds or thousands of features are included. To provide easy access to the multi-threaded version of EDISON-WMW, a web-based solution of our algorithm is freely available at http://www.ccb.uni-saarland.de/software/wtest/.
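The dynamic-programming idea behind an exact WMW P value can be illustrated with a minimal sketch. It is not the EDISON-WMW implementation: the function names `doubled_midranks` and `exact_wmw_two_sided` are hypothetical, ties are handled by assuming midranks (doubled so they stay integers), and "two-tailed" is assumed to mean rank sums at least as far from the null mean as the observed one. The table `ways[k][s]` counts the ways to choose `k` pooled samples whose doubled ranks sum to `s`.

```python
from math import comb

def doubled_midranks(values):
    """Return 2 * midrank for every value, so tied ranks stay integers."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks2 = [0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # 1-based ranks start+1 .. end+1 are tied; twice their mean is the sum of both ends
        for pos in range(start, end + 1):
            ranks2[order[pos]] = (start + 1) + (end + 1)
        start = end + 1
    return ranks2

def exact_wmw_two_sided(x, y):
    """Exact two-sided permutation P value of the rank-sum statistic (illustrative)."""
    pooled = list(x) + list(y)
    ranks2 = doubled_midranks(pooled)
    m, total = len(x), len(pooled)
    observed = sum(ranks2[:m])
    max_sum = sum(ranks2)
    # ways[k][s]: number of ways to choose k pooled samples with doubled rank sum s
    ways = [[0] * (max_sum + 1) for _ in range(m + 1)]
    ways[0][0] = 1
    for r in ranks2:
        for k in range(m, 0, -1):  # descending k so each sample is used at most once
            prev, cur = ways[k - 1], ways[k]
            for s in range(r, max_sum + 1):
                cur[s] += prev[s - r]
    center = m * (total + 1)  # null mean of the doubled rank sum
    deviation = abs(observed - center)
    extreme = sum(c for s, c in enumerate(ways[m]) if abs(s - center) >= deviation)
    return extreme / comb(total, m)

print(exact_wmw_two_sided([1, 2, 3], [4, 5, 6]))
```

The table has on the order of m times the total rank sum entries, which is why the optimizations and parallelization mentioned in the abstract matter for cohorts with more than 1000 samples.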
Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, and the main results for discrete-time and continuous-time systems are surveyed, respectively. Then, research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, extensions of ADP for addressing control problems in complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly promote ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era. In addition, they play a vital role in promoting environmental protection and industrial intelligence.
To address the output feedback problem for linear discrete-time systems, this work proposes a new adaptive dynamic programming (ADP) technique based on the internal model principle (IMP). The proposed method, termed IMP-ADP, does not require complete state feedback, merely the measurement of input and output data. More specifically, based on the IMP, the output control problem is first converted into a stabilization problem. An observer is then designed to reconstruct the full state of the system from the measured inputs and outputs. Moreover, the technique includes both a policy iteration algorithm and a value iteration algorithm to determine the optimal feedback gain without using a dynamic system model. Importantly, with this concept one does not need to solve the regulator equation. Finally, the control method is tested on a grid-connected LCL inverter system to demonstrate that it provides the desired performance in terms of both tracking and disturbance rejection.
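Policy iteration for an optimal feedback gain can be illustrated on a scalar linear discrete-time system. The sketch below is a deliberately simplified, model-based variant: it assumes the parameters `a` and `b` are known, whereas the IMP-ADP method described above works from input and output data without a system model. The function name and all parameter names are hypothetical.

```python
def policy_iteration_lqr(a, b, q, r, k0, iterations=50):
    """Policy iteration for x[t+1] = a*x[t] + b*u[t] with cost sum(q*x^2 + r*u^2), u = -k*x."""
    k = k0
    for _ in range(iterations):
        closed_loop = a - b * k
        if abs(closed_loop) >= 1.0:
            raise ValueError("the gain must be stabilizing")
        # Policy evaluation: p solves p = q + r*k^2 + closed_loop^2 * p
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)
        # Policy improvement: minimize r*u^2 + p*(a*x + b*u)^2 over u
        k = a * b * p / (r + b * b * p)
    return k, p

gain, value = policy_iteration_lqr(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0)
print(gain, value)
```

Each pass alternates evaluating the current gain and improving it; at convergence `p` satisfies the scalar discrete-time Riccati equation, which is the fixed point the data-driven algorithms aim to reach without knowing `a` and `b`.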
The use of dynamic programming (DP) algorithms to learn Bayesian network structures is limited by their high space complexity and the difficulty of learning the structure of large-scale networks. Therefore, this study proposes a DP algorithm based on node block sequence constraints. The proposed algorithm constrains the traversal of the parent graph using the M-sequence matrix and prunes the traversal of the order graph using the node block sequence, which considerably reduces time consumption and space complexity. Experimental results show that, compared with existing DP algorithms, the proposed algorithm obtains learning results more efficiently with less than 1% loss of accuracy and can be used to learn larger-scale networks.
The residential energy scheduling of solar energy is an important research area of the smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid, and renewable energy resources combine into a nonlinear, time-varying, indefinite, and complex system that is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grids. To enhance the electricity efficiency of the residential microgrid, this paper presents an action-dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are as follows. First, weather-type classification is adopted to establish three types of programming models based on the features of solar energy. In addition, the priorities of different energy resources are set to reduce the loss in electrical energy transmission. Second, three ADHDP-based neural networks, which can update themselves during application, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method effectively reduces the total electricity cost and improves load balancing. A comparison with the particle swarm optimization algorithm further shows that the proposed method is promising for cost-saving energy management.
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. However, because of this, optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are first proposed. Thus, the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is used to implement the optimal parallel tracking control, with a critic neural network (NN) approximating the value function online. The stability of the closed-loop system is analyzed using Lyapunov theory, and the tracking error and NN weight errors are shown to be uniformly ultimately bounded (UUB). Also, the optimal parallel controller guarantees the continuity of the control input when there are finite jump discontinuities in the reference signals. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.
Three important aspects of phase-mining must be optimized: the number of phases, the geometry and location of each phase-pit (including the ultimate pit), and the ore and waste quantities to be mined in each phase. A model is presented in which a sequence of geologically optimum pits is first generated and then dynamically evaluated to simultaneously optimize the above three aspects, with the objective of maximizing the overall net present value. In this model, the dynamic nature of the problem is fully taken into account with respect to both time and space, and the model is robust in accommodating different pit wall slopes and different bench heights. The model is applied to a large deposit consisting of 2 044 224 blocks and proves to be both efficient and practical.
Based on service-oriented architecture (SOA), an approach to service recovery decision-making based on Bellman dynamic programming is proposed to make valid recovery decisions. Both the attributes and the processes of services in the controllable distributed information system are analyzed as preparatory work. Using the idea of service composition as a reference, the approach translates recovery decision-making into an artificial intelligence (AI) planning problem in two steps. The first is self-organization based on a logical view of the network, and the second is the definition of evaluation standards. By applying Bellman dynamic programming to solve the planning problem, the approach offers timely emergency response and optimal recovery source selection, meeting multiple quality of service (QoS) requirements. Experimental results demonstrate the rationality and optimality of the approach, and the theoretical analysis of its computational complexity and the comparison with conventional methods show its high efficiency.
This paper studies the adaptive scheduling problem of multiple electronic support measures (multi-ESM) in a ground moving radar target tracking application. It is a sequential decision-making problem in an uncertain environment. For adaptive selection of appropriate ESMs, we generalize an approximate dynamic programming (ADP) framework to the dynamic case. We define the environment model and the agent model, respectively. To handle the challenge of partial observability, we apply the unscented Kalman filter (UKF) algorithm for belief state estimation. To reduce the computational burden, a simulation-based rollout approach with a redesigned base policy is proposed to approximate the long-term cumulative reward. Meanwhile, Monte Carlo sampling is incorporated into the rollout to estimate the expectation of the rewards. The experiments indicate that our method outperforms other strategies, particularly in larger-scale problems.
Unmanned aerial vehicles (UAVs) may play an important role in data collection and offloading in vast areas where wireless sensor networks are deployed, and the UAV's action strategy has a vital influence on applicability and computational complexity. Dynamic programming (DP) is well suited to UAV path planning, but it has problems with applicability to special terrain environments and with algorithmic complexity. Based on an analysis of DP, this paper proposes a hierarchical directional DP (DDP) algorithm based on direction determination and a hierarchical model. We compare our method with Q-learning and the DP algorithm experimentally, and the results show that our method improves terrain applicability while greatly reducing computational complexity.
A stochastic resource allocation model based on the principles of Markov decision processes (MDPs) is proposed in this paper. In particular, a general-purpose framework is developed that takes into account resource requests for both instant and future needs. The framework can handle two types of reservations (i.e., specified and unspecified time interval reservation requests) and implements an overbooking business strategy to further increase business revenues. The resulting dynamic pricing problems can be regarded as sequential decision-making problems under uncertainty, which are solved by means of algorithms based on stochastic dynamic programming (DP). In this regard, Bellman's backward principle of optimality is exploited to provide all the implementation mechanisms for the proposed reservation pricing algorithm. The curse of dimensionality, an inevitable issue for DP, arises for both instant resource requests and future resource reservations. To address such scalability issues, an approximate dynamic programming (ADP) technique based on linear function approximations is applied. Several examples are provided to show the effectiveness of the proposed approach.
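Bellman's backward principle for a dynamic pricing problem can be sketched on a toy finite-horizon model. The sketch makes strong assumptions that are not in the abstract: one resource type, at most one request per period, a hypothetical `price_to_prob` table giving the sale probability at each posted price, and no reservations or overbooking.

```python
def backward_induction(horizon, capacity, price_to_prob):
    """Finite-horizon DP over remaining capacity; returns first-period values and the policy."""
    # value[c] holds the expected future revenue with c units left; zero after the horizon
    value = [0.0] * (capacity + 1)
    policy = []
    for _ in range(horizon):  # step backwards from the last period to the first
        new_value = [0.0] * (capacity + 1)
        best_price = [None] * (capacity + 1)
        for c in range(1, capacity + 1):
            for price, prob in price_to_prob.items():
                # Sell one unit with probability prob, otherwise keep all c units
                expected = prob * (price + value[c - 1]) + (1.0 - prob) * value[c]
                if best_price[c] is None or expected > new_value[c]:
                    new_value[c], best_price[c] = expected, price
        value = new_value
        policy.append(best_price)
    policy.reverse()  # policy[t][c] is the price to post in period t with c units left
    return value, policy

values, policy = backward_induction(2, 1, {1.0: 0.9, 2.0: 0.5})
print(values[1], policy[0][1])
```

The table grows with the product of horizon and capacity here, but with several resource types and reservation states it grows exponentially, which is the curse of dimensionality that motivates the ADP approximation.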
As a valuable energy source in the iron- and steel-making process, by-product gas is widely used for heating and technical processes in steel plants. After being used according to the technical requirements, the surplus by-product gas is usually supplied to buffer boilers to produce steam. With the rapid development of energy conservation technology and energy consumption levels, the surplus gas in steel plants continues to grow. Therefore, it is important to allocate surplus gas among buffer boilers. A dynamic programming model of this problem is established in this work, considering the ramp rate constraint of the boilers and the influence of installing gasholders. A case study is then carried out. It shows that dynamic programming dispatch yields more steam generation and lower specific gas consumption than the current proportionate dispatch based on the nominal capacities of the boilers. The boiler ramp rate constraint, which is often ignored, is considered, and its contribution to the validity of the results is pointed out. Finally, the significance of installing gasholders is studied.
A method of minimizing ranking inconsistency is proposed for a decision-making problem in which rankings of alternatives are given by multiple decision makers according to multiple criteria. For each criterion, the total inconsistency between the group's rankings of all alternatives and those of every decision maker is first defined, after the decision maker weights with respect to that criterion are considered. Similarly, the total inconsistency between the group's final rankings and those under every criterion is determined after the criteria weights are taken into account. Then two nonlinear integer programming models, minimizing the two total inconsistencies respectively, are developed and transformed into two dynamic programming models to obtain, separately, the group's rankings of all alternatives with respect to each criterion and their final rankings. A supplier selection case illustrates the proposed method, and a discussion of the results verifies its effectiveness. This work develops a new measure of the inconsistency of ordinal preferences in multi-criteria group decision-making (MCGDM) and extends the Cook-Seiford social selection function to MCGDM by considering the weights of criteria and decision makers; it can obtain a unique ranking result.
A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control problem for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation problem. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and that the iterative performance index function simultaneously converges to the optimum. The policy iteration algorithm is implemented via neural networks, and the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
This paper presents a hierarchical dynamic routing protocol (HDRP) based on the discrete dynamic programming principle. The proposed protocol can adapt to dynamic and large computer networks (DLCN) with clustering topology. The procedures for realizing routing updates and decisions are presented. A proof of correctness and a complexity analysis of the protocol are also given. The performance measures of the HDRP, including throughput and average message delay, are evaluated by simulation. The study shows that the HDRP provides a new viable approach to routing decisions for DLCN or high-speed networks with clustering topology.
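The discrete dynamic programming principle behind routing can be sketched with a flat Bellman-Ford style recursion, in which the cost of a node is the minimum over its outgoing links of the link delay plus the cost of the neighbor. This is an illustration only, not the HDRP itself: the clustering hierarchy is not modeled, link delays are assumed non-negative, and the function and variable names are hypothetical.

```python
def shortest_path_costs(links, nodes, destination):
    """Bellman recursion: cost[v] = min over links (v, w) of delay(v, w) + cost[w]."""
    cost = {v: float("inf") for v in nodes}
    next_hop = {v: None for v in nodes}
    cost[destination] = 0.0
    for _ in range(len(nodes) - 1):  # a shortest path uses at most len(nodes) - 1 links
        changed = False
        for (v, w), delay in links.items():
            if cost[w] + delay < cost[v]:
                cost[v], next_hop[v] = cost[w] + delay, w
                changed = True
        if not changed:  # stop early once no routing-table entry improves
            break
    return cost, next_hop

links = {("A", "B"): 1.0, ("B", "C"): 2.0, ("A", "C"): 5.0, ("C", "D"): 1.0}
cost, next_hop = shortest_path_costs(links, ["A", "B", "C", "D"], "D")
print(cost["A"], next_hop["A"])
```

Each node only needs the current costs of its neighbors to update its own entry, which is what makes this recursion a natural basis for distributed routing updates.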
A number of considerations should be taken into account in the dynamic control of robot manipulators, which are highly complex nonlinear systems. In this article, we provide a detailed presentation of the mechanical and electrical implications of robots equipped with DC motor actuators. This model takes into account all nonlinear aspects of the system. Then, we develop computational algorithms for optimal control based on dynamic programming. The robot's trajectory must be predefined, but the performance criteria and constraints applying to the system are not limited, and we may adapt them freely to the robot and the task being studied. As an example, a manipulator arm with 3 degrees of freedom is analyzed.
Color inconsistency between views is an important problem to be solved in multi-view video systems. A multi-view video color correction method using dynamic programming is proposed. Three-dimensional histograms are constructed with sequential conditional probability in HSI color space. Then, dynamic programming is used to seek the best color mapping relation, given by the minimum cost path between the target image histogram and the source image histogram. Finally, a video tracking technique is applied to correct the multi-view video. Experimental results show that the proposed method obtains better subjective and objective performance in color correction.
Replicas can improve data reliability in distributed systems. However, traditional algorithms for replica management are based on the assumption that all replicas have uniform reliability, which is inaccurate in some real systems. To address this problem, a novel algorithm based on dynamic programming is proposed to manage the number and distribution of replicas across different nodes. Using a Markov model, replica management is organized as a multi-phase process, and the recursion equations are provided. The algorithm considers the heterogeneity of nodes, the expense of maintaining replicas, and the occupied space. Under these constraints, the algorithm achieves high data reliability in a distributed system. The results of a case analysis demonstrate the feasibility of the algorithm.
In the short-term operation of natural gas networks, the impact of demand uncertainty is not negligible. To address this issue, we propose a two-stage robust model for the power cost minimization problem in gunbarrel natural gas networks. The demands between pipelines and compressor stations are uncertain and limited by a budget parameter, since it is unlikely that all the uncertain demands reach their maximal deviation simultaneously. While solving the two-stage robust model, we encounter a bilevel problem that is challenging to solve. We formulate it as a multi-dimensional dynamic programming problem and propose approximate dynamic programming methods to accelerate the calculation. Numerical results based on a real network in China show an average speedup of 7 times over the original dynamic programming algorithm without compromising optimality. Numerical results also verify the advantage of the robust model over the deterministic model when facing uncertainties. These findings offer short-term operation methods for handling uncertainties in gunbarrel natural gas network management.
Funding: supported in part by the National Natural Science Foundation of China (62222301, 62073085, 62073158, 61890930-5, 62021003), the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800-5), and the Beijing Natural Science Foundation (JQ19013).
Funding: supported by the National Science Fund for Distinguished Young Scholars (62225303), the Fundamental Research Funds for the Central Universities (buctrc202201), the China Scholarship Council, and the High Performance Computing Platform, College of Information Science and Technology, Beijing University of Chemical Technology.
Funding: Shaanxi Science Fund for Distinguished Young Scholars, Grant/Award Number: 2024JC-JCQN-57; Xi'an Science and Technology Plan Project, Grant/Award Number: 2023JH-QCYJQ-0086; Scientific Research Program Funded by Education Department of Shaanxi Provincial Government, Grant/Award Number: P23JP071; Engineering Technology Research Center of Shaanxi Province for Intelligent Testing and Reliability Evaluation of Electronic Equipments, Grant/Award Number: 2023-ZC-GCZX-0047; 2022 Shaanxi University Youth Innovation Team Project.
Funding: supported in part by the National Natural Science Foundation of China (61533017, U1501251, 61374105, 61722312).
Funding: Supported in part by the National Key Research and Development Program of China (2018AAA0101502, 2018YFB1702300), in part by the National Natural Science Foundation of China (61722312, 61533019, U1811463, 61533017), and in part by the Intel Collaborative Research Institute for Intelligent and Automated Connected Vehicles.
Abstract: This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. Because of this, optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are first proposed, so that the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is used to implement the optimal parallel tracking control, with a critic neural network (NN) approximating the value function online. The stability of the closed-loop system is analyzed using Lyapunov theory, and the tracking error and NN weight errors are shown to be uniformly ultimately bounded (UUB). In addition, the optimal parallel controller guarantees the continuity of the control input when the reference signals contain finitely many jump discontinuities. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.
Funding: Project (50974041) supported by the National Natural Science Foundation of China; Project (20090042120040) supported by the Doctoral Program Foundation of the Ministry of Education, China; Project (20093910) supported by the Natural Science Foundation of Liaoning Province, China.
Abstract: Three important aspects of phase-mining must be optimized: the number of phases, the geometry and location of each phase-pit (including the ultimate pit), and the ore and waste quantities to be mined in each phase. A model is presented in which a sequence of geologically optimum pits is first generated and then dynamically evaluated to optimize these three aspects simultaneously, with the objective of maximizing the overall net present value. The model fully accounts for the dynamic nature of the problem with respect to both time and space, and it is robust in accommodating different pit wall slopes and different bench heights. The model is applied to a large deposit consisting of 2,044,224 blocks and proves to be both efficient and practical.
Abstract: Based on service-oriented architecture (SOA), a Bellman-dynamic-programming-based approach to service recovery decision-making is proposed to make valid recovery decisions. As preparatory work, both the attributes and the processes of services in the controllable distributed information system are analyzed. Using the idea of service composition as a reference, the approach translates recovery decision-making into an artificial intelligence (AI) planning problem in two steps. The first is self-organization based on a logical view of the network, and the second is the definition of evaluation standards. By applying Bellman dynamic programming to solve the planning problem, the approach offers timely emergency response and optimal recovery source selection while meeting multiple quality of service (QoS) requirements. Experimental results demonstrate the rationality and optimality of the approach, and the theoretical analysis of its computational complexity and the comparison with conventional methods show its high efficiency.
Funding: Supported by the National Natural Science Foundation of China (61573285, 61305133).
Abstract: This paper studies the adaptive scheduling problem of multiple electronic support measures (multi-ESM) in a ground moving radar target tracking application. It is a sequential decision-making problem in an uncertain environment. For adaptive selection of appropriate ESMs, we generalize an approximate dynamic programming (ADP) framework to the dynamic case and define the environment model and the agent model. To handle partial observability, we apply the unscented Kalman filter (UKF) algorithm for belief state estimation. To reduce the computational burden, a simulation-based rollout approach with a redesigned base policy is proposed to approximate the long-term cumulative reward. Monte Carlo sampling is combined with the rollout to estimate the expectation of the rewards. The experiments indicate that our method outperforms other strategies owing to its better performance in larger-scale problems.
Funding: Supported by the National Natural Science Foundation of China (91648204, 61601486), the State Key Laboratory of High Performance Computing Project Fund (1502-02), and the Research Programs of National University of Defense Technology (ZDYYJCYJ140601).
Abstract: Unmanned aerial vehicles (UAVs) may play an important role in data collection and offloading in vast areas where wireless sensor networks are deployed, and the UAV's action strategy strongly affects both applicability and computational complexity. Dynamic programming (DP) is well suited to UAV path planning, but it has problems with applicability to special terrain environments and with algorithmic complexity. Based on an analysis of DP, this paper proposes a hierarchical directional DP (DDP) algorithm based on direction determination and a hierarchical model. We compare our method with Q-learning and the DP algorithm experimentally, and the results show that our method improves terrain applicability while greatly reducing computational complexity.
Abstract: A stochastic resource allocation model based on the principles of Markov decision processes (MDPs) is proposed in this paper. In particular, a general-purpose framework is developed that takes into account resource requests for both instant and future needs. The framework can handle two types of reservations (i.e., reservation requests with specified and unspecified time intervals) and can implement an overbooking business strategy to further increase business revenues. The resulting dynamic pricing problems can be regarded as sequential decision-making problems under uncertainty, which are solved by means of stochastic dynamic programming (DP) based algorithms. In this regard, Bellman's backward principle of optimality is exploited to provide all the implementation mechanisms for the proposed reservation pricing algorithm. The curse of dimensionality inevitably arises in DP for both instant resource requests and future resource reservations. To address this scalability issue, an approximate dynamic programming (ADP) technique based on linear function approximations is applied. Several examples are provided to show the effectiveness of the proposed approach.
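Illustrative sketch (not the paper's algorithm): Bellman's backward principle mentioned above can be shown on a deliberately simplified pricing problem. The setup is an assumption made only for illustration: a finite horizon, one potential request per period, a small set of candidate prices, and a fixed acceptance probability per price. Reservations and overbooking are not modeled, and the names `backward_induction`, `prices`, and `accept_prob` are hypothetical.

```python
def backward_induction(horizon, capacity, prices, accept_prob):
    """Finite-horizon pricing by backward induction (toy model, assumed setup).

    value[t][c] is the expected revenue from period t onward with c units left.
    policy[t][c] is the price to post in period t (None means post no price).
    """
    value = [[0.0] * (capacity + 1) for _ in range(horizon + 1)]
    policy = [[None] * (capacity + 1) for _ in range(horizon)]

    # Sweep backward in time: the terminal values value[horizon][c] are all 0.
    for t in range(horizon - 1, -1, -1):
        for c in range(1, capacity + 1):
            # Baseline: sell nothing in this period.
            best_value, best_price = value[t + 1][c], None
            for p in prices:
                q = accept_prob[p]  # probability the request accepts price p
                expected = (q * (p + value[t + 1][c - 1])
                            + (1.0 - q) * value[t + 1][c])
                if expected > best_value:
                    best_value, best_price = expected, p
            value[t][c] = best_value
            policy[t][c] = best_price
    return value, policy


if __name__ == "__main__":
    value, policy = backward_induction(
        horizon=2, capacity=2, prices=[10, 20], accept_prob={10: 0.8, 20: 0.5}
    )
    print(value[0][2], policy[0][2])
```

The table has `(horizon + 1) * (capacity + 1)` entries, so it grows quickly once the state includes several resource types or reservation intervals, which is the scalability issue the abstract attributes to the curse of dimensionality.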
Funding: Project (L2012082) supported by the Science and Technology Research Funds of Liaoning Provincial Education Department, China.
Abstract: As a valuable energy source in the iron- and steel-making process, by-product gas is widely used for heating and in technical processes in steel plants. After the technical requirements have been met, the surplus by-product gas is usually sent to buffer boilers to produce steam. With the rapid development of energy conservation technology and improvements in the energy consumption level, the amount of surplus gas in steel plants continues to grow. Therefore, it is important to dispatch surplus gas among buffer boilers. A dynamic programming model of this problem is established in this work, considering the ramp rate constraint of boilers and the influence of setting gasholders. A case study is then presented. It shows that dynamic programming dispatch yields more steam generation and lower specific gas consumption than the current proportionate dispatch, which depends on the nominal capacities of the boilers. The previously ignored boiler ramp rate constraint is considered, and its contribution to the validity of the result is pointed out. Finally, the significance of setting gasholders is studied.
Funding: Supported by the National Natural Science Foundation of China (60904059, 60975049), the Philosophy and Social Science Foundation of Hunan Province (2010YBA104), and the National High Technology Research and Development Program of China (863 Program) (2009AA04Z107).
Abstract: A method of minimizing ranking inconsistency is proposed for a decision-making problem in which rankings of alternatives are given by multiple decision makers according to multiple criteria. First, for each criterion, the total inconsistency between the group's rankings of all alternatives and those of every decision maker is defined, taking into account the decision maker weights with respect to that criterion. Similarly, the total inconsistency between the group's final rankings and the rankings under every criterion is determined, taking into account the criteria weights. Two nonlinear integer programming models that minimize these two total inconsistencies are then developed and transformed into two dynamic programming models, which yield, respectively, the group's rankings of all alternatives with respect to each criterion and the final rankings. A supplier selection case illustrates the proposed method, and a discussion of the results verifies its effectiveness. This work develops a new measure of the inconsistency of ordinal preferences in multi-criteria group decision-making (MCGDM), extends the Cook-Seiford social selection function to MCGDM by considering the weights of criteria and decision makers, and can obtain a unique ranking result.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61034002, 61233001, 61273140, 61304086, and 61374105) and the Beijing Natural Science Foundation, China (Grant No. 4132078).
Abstract: A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control problem for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation problem. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and that the iterative performance index function simultaneously converges to the optimum. The policy iteration algorithm is implemented via neural networks, and the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
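Illustrative sketch (not the paper's method): the alternation between policy evaluation and policy improvement that underlies policy iteration can be shown on a small finite problem. The example below is a tabular, discounted, deterministic analogue chosen purely for illustration. The paper instead treats discrete-time chaotic systems with neural network approximation, and the three-state chain, the discount factor `gamma`, and all function names here are assumptions.

```python
def policy_iteration(next_state, cost, gamma=0.9, tol=1e-10):
    """Tabular policy iteration for a deterministic, discounted problem.

    next_state[s][a] is the successor of state s under action a.
    cost[s][a] is the one-step cost; the goal is to minimize total cost.
    """
    n_states = len(cost)
    n_actions = len(cost[0])
    policy = [0] * n_states  # arbitrary initial policy

    def q_value(s, a, value):
        return cost[s][a] + gamma * value[next_state[s][a]]

    while True:
        # Policy evaluation: iterate the fixed-point equation for this policy.
        value = [0.0] * n_states
        while True:
            delta = 0.0
            for s in range(n_states):
                new_v = q_value(s, policy[s], value)
                delta = max(delta, abs(new_v - value[s]))
                value[s] = new_v
            if delta < tol:
                break

        # Policy improvement: switch only when an action is strictly better.
        stable = True
        for s in range(n_states):
            best_a = min(range(n_actions), key=lambda a: q_value(s, a, value))
            if q_value(s, best_a, value) < q_value(s, policy[s], value) - 1e-9:
                policy[s] = best_a
                stable = False
        if stable:
            return policy, value


if __name__ == "__main__":
    # State 0 is a cost-free target; action 0 stays, action 1 moves toward 0.
    next_state = [[0, 0], [1, 0], [2, 1]]
    cost = [[0.0, 0.0], [1.0, 2.0], [1.0, 2.0]]
    print(policy_iteration(next_state, cost))
```

In this toy chain, staying put costs 1 per step forever, so the improved policy moves toward the target from both non-target states.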
Abstract: This paper presents a hierarchical dynamic routing protocol (HDRP) based on the discrete dynamic programming principle. The proposed protocol can adapt to dynamic and large computer networks (DLCN) with clustering topology. The procedures for realizing routing updates and routing decisions are presented. A proof of correctness and a complexity analysis of the protocol are also given. The performance measures of the HDRP, including throughput and average message delay, are evaluated by simulation. The study shows that the HDRP provides a new approach to routing decisions for DLCN or high-speed networks with clustering topology.
Abstract: A number of considerations should be taken into account in the dynamic control of robot manipulators, which are highly complex nonlinear systems. In this article, we provide a detailed presentation of the mechanical and electrical implications of robots equipped with DC motor actuators. This model takes into account all nonlinear aspects of the system. We then develop computational algorithms for optimal control based on dynamic programming. The robot's trajectory must be predefined, but the performance criteria and constraints applying to the system are not limited and can be adapted freely to the robot and the task being studied. As an example, a manipulator arm with 3 degrees of freedom is analyzed.
Funding: Supported by the National Natural Science Foundation of China (60672073), the Program for New Century Excellent Talents in University (NCET-06-0537), the Natural Science Foundation of Ningbo (2008A610016), and the K. C. Wong Magna Fund in Ningbo University.
Abstract: Color inconsistency between views is an important problem to be solved in multi-view video systems. A multi-view video color correction method using dynamic programming is proposed. Three-dimensional histograms are constructed with sequential conditional probability in HSI color space. Then, dynamic programming is used to seek the best color mapping relation as the minimum-cost path between the target image histogram and the source image histogram. Finally, a video tracking technique is applied to correct the multi-view video. Experimental results show that the proposed method achieves better subjective and objective performance in color correction.
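Illustrative sketch (not the paper's method): the idea of finding a mapping as a minimum-cost path between two histograms can be shown with a small dynamic program. Everything specific here is an assumption made for illustration: the histograms are one-dimensional rather than three-dimensional, the local cost is the absolute difference of bin counts, and the path may step right, down, or diagonally, as in dynamic time warping. The name `min_cost_path` is hypothetical.

```python
def min_cost_path(source_hist, target_hist):
    """Return (total_cost, path) for a monotone bin-to-bin alignment.

    path is a list of (source_bin, target_bin) pairs from (0, 0) to the
    last bins, minimizing the summed absolute difference of bin counts.
    """
    n, m = len(source_hist), len(target_hist)
    inf = float("inf")
    total = [[inf] * m for _ in range(n)]

    # Forward pass: cheapest cost of reaching each cell.
    for i in range(n):
        for j in range(m):
            local = abs(source_hist[i] - target_hist[j])
            if i == 0 and j == 0:
                best_prev = 0.0
            else:
                best_prev = min(
                    total[i - 1][j] if i > 0 else inf,
                    total[i][j - 1] if j > 0 else inf,
                    total[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            total[i][j] = local + best_prev

    # Backward pass: follow the cheapest predecessor back to the origin.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((total[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((total[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((total[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return total[n - 1][m - 1], path


if __name__ == "__main__":
    print(min_cost_path([0, 4, 4, 0], [0, 4, 0]))
```

Each pair in the returned path can be read as "map this source bin to this target bin"; in the example, the two equal source bins are both assigned to the single matching target bin.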
Abstract: Replicas can improve data reliability in a distributed system. However, traditional replica management algorithms assume that all replicas have uniform reliability, which is inaccurate in some real systems. To address this problem, a novel algorithm based on dynamic programming is proposed to manage the number and distribution of replicas across different nodes. Using a Markov model, replica management is organized as a multi-phase process, and the recursion equations are provided. The algorithm considers the heterogeneity of nodes, the expense of maintaining replicas, and the storage space occupied. Under these constraints, the algorithm achieves high data reliability in a distributed system. The results of a case analysis demonstrate the feasibility of the algorithm.
Funding: Partially supported by the National Science Foundation of China (Grants 71822105 and 91746210).
Abstract: In the short-term operation of natural gas networks, the impact of demand uncertainty is not negligible. To address this issue, we propose a two-stage robust model for the power cost minimization problem in gunbarrel natural gas networks. The demands between pipelines and compressor stations are uncertain and governed by a budget parameter, since it is unlikely that all the uncertain demands reach their maximal deviation simultaneously. Solving the two-stage robust model involves a bilevel problem that is challenging to solve. We formulate it as a multi-dimensional dynamic programming problem and propose approximate dynamic programming methods to accelerate the calculation. Numerical results based on a real network in China show that we obtain a sevenfold speedup on average, without compromising optimality, compared with the original dynamic programming algorithm. Numerical results also verify the advantage of the robust model over the deterministic model when facing uncertainties. These findings offer short-term operation methods that help gunbarrel natural gas network management handle uncertainties.