Electrical and Electronic Engineering - Research Publications

  • Item
    Regularizing policy iteration for recursive feasibility and stability
    Granzotto, M ; de Silva, OL ; Postoyan, R ; Nesic, D ; Jiang, Z-P (IEEE, 2022)
  • Item
    Exploiting homogeneity for the optimal control of discrete-time systems: Application to value iteration
    Granzotto, M ; Postoyan, R ; Busoniu, L ; Nesic, D ; Daafouz, J (IEEE, 2021)
    To investigate solutions of (near-)optimal control problems, we extend and exploit a notion of homogeneity recently proposed in the literature for discrete-time systems. Assuming the plant dynamics are homogeneous, we first derive a scaling property of its solutions along rays, provided the sequence of inputs is suitably modified. We then consider homogeneous cost functions and reveal how the optimal value function scales along rays. This result can be used to construct (near-)optimal inputs on the whole state space by solving the original problem only on a given compact manifold of smaller dimension. Compared with related work in the literature, we impose no conditions on the homogeneity degrees. We demonstrate the strength of this new result by presenting a new approximate scheme for value iteration, which is one of the pillars of dynamic programming. The new algorithm provides guaranteed lower and upper estimates of the true value function at any iteration and has several appealing features in terms of reduced computation. A numerical case study is provided to illustrate the proposed algorithm.
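The scaling property described in this abstract can be illustrated on the simplest homogeneous case. The sketch below assumes a scalar linear plant with a quadratic stage cost, which is an illustrative choice and not the class of systems treated in the paper; the function name `rollout_cost` and all parameter values are hypothetical. Scaling the initial state and every input by `lam` scales the final state by `lam` and the accumulated cost by `lam**2`.

```python
def rollout_cost(x0, inputs, a=1.2, b=1.0, q=1.0, r=0.5):
    """Simulate x+ = a*x + b*u and sum the stage costs q*x^2 + r*u^2.

    The plant and cost are assumptions chosen for illustration only.
    Returns (accumulated cost, final state).
    """
    x, total = x0, 0.0
    for u in inputs:
        total += q * x * x + r * u * u  # stage cost at the current state
        x = a * x + b * u               # one step of the plant
    return total, x


lam = 3.0
inputs = [-0.5, 0.2]
cost, x_end = rollout_cost(1.0, inputs)
# Scale the initial state and the whole input sequence along the same ray.
cost_scaled, x_end_scaled = rollout_cost(lam * 1.0, [lam * u for u in inputs])
```

Because the relation holds for every input sequence, it also holds for the minimizing one, so in this special case the optimal value function satisfies V(lam * x) = lam**2 * V(x).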
  • Item
    When to stop value iteration: stability and near-optimality versus computation
    Granzotto, M ; Postoyan, R ; Nešić, D ; Buşoniu, L ; Daafouz, J (2020-11-19)
    Value iteration (VI) is a ubiquitous algorithm for optimal control, planning, and reinforcement learning schemes. Under the right assumptions, VI is a vital tool to generate inputs with desirable properties for the controlled system, like optimality and Lyapunov stability. As VI usually requires an infinite number of iterations to solve general nonlinear optimal control problems, a key question is when to terminate the algorithm to produce a “good” solution, with a measurable impact on optimality and stability guarantees. By carefully analysing VI under general stabilizability and detectability properties, we provide explicit and novel relationships describing the stopping criterion’s impact on near-optimality, stability and performance, which allows these desirable properties to be tuned against the induced computational cost. The considered class of stopping criteria encompasses those encountered in the control, dynamic programming and reinforcement learning literature, and it allows new ones to be considered, which may be useful to further reduce the computational cost while still ensuring stability and near-optimality properties. We therefore lay a foundation to endow machine learning schemes based on VI with stability and performance guarantees, while reducing computational complexity.
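The idea of terminating VI with a stopping criterion can be sketched for a small tabular problem. The example below is a minimal illustration under stated assumptions: a deterministic system with finitely many states and inputs, a discounted cost, and termination when successive iterates differ by at most `eps` in the sup norm. That criterion is a common choice and is assumed here; it is not claimed to be the one analysed in the paper. The names `value_iteration`, `next_state`, and `stage_cost` are hypothetical.

```python
import numpy as np


def value_iteration(next_state, stage_cost, gamma=0.9, eps=1e-9, max_iter=10_000):
    """Tabular value iteration for a deterministic finite system.

    next_state[x, u] : index of the successor state (integer array)
    stage_cost[x, u] : nonnegative stage cost
    Stops once max_x |V_{k+1}(x) - V_k(x)| <= eps (assumed criterion).
    Returns (value estimate, greedy input per state, iterations used).
    """
    V = np.zeros(next_state.shape[0])
    for k in range(1, max_iter + 1):
        # Q[x, u] = stage cost plus discounted value of the successor state.
        Q = stage_cost + gamma * V[next_state]
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) <= eps:
            return V_new, Q.argmin(axis=1), k
        V = V_new
    return V, Q.argmin(axis=1), max_iter


# Three states on a line; input 0 stays put, input 1 moves one step toward state 0.
next_state = np.array([[0, 0], [1, 0], [2, 1]])
stage_cost = np.array([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]])
V, policy, iterations = value_iteration(next_state, stage_cost)
```

A larger `eps` stops the loop earlier at the price of a coarser value estimate, which is the computation-versus-guarantee trade-off the abstract refers to.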
  • Item
    Stable Near-Optimal Control of Nonlinear Switched Discrete-Time Systems: An Optimistic Planning-Based Approach
    Granzotto, M ; Postoyan, R ; Busoniu, L ; Nesic, D ; Daafouz, J (IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2022-05)
    Originating in the artificial intelligence literature, optimistic planning (OP) is an algorithm that generates near-optimal control inputs for generic nonlinear discrete-time systems whose input set is finite. This technique is therefore relevant for the near-optimal control of nonlinear switched systems for which the switching signal is the control, and no continuous input is present. However, OP exhibits several limitations that prevent its application in a standard control engineering context: for instance, it requires that the stage cost take values in [0,1], an unnatural prerequisite, and that the cost function be discounted. In this paper, we modify OP to overcome these limitations, and we call the new algorithm OPmin. New near-optimality and performance guarantees for OPmin are derived, which have major advantages compared to those originally given for OP. We also prove that a system whose inputs are generated by OPmin in a receding-horizon fashion exhibits stability properties. As a result, OPmin provides a new tool for the near-optimal, stable control of nonlinear switched discrete-time systems for generic cost functions.
  • Item
    Finite-horizon discounted optimal control: stability and performance
    Granzotto, M ; Postoyan, R ; Busoniu, L ; Nesic, D ; Daafouz, J (Institute of Electrical and Electronics Engineers (IEEE), 2021)
    Motivated by (approximate) dynamic programming and model predictive control problems, we analyse the stability of deterministic nonlinear discrete-time systems whose inputs minimize a discounted finite-horizon cost. We assume that the system satisfies stabilizability and detectability properties with respect to the stage cost. Then, a Lyapunov function for the closed-loop system is constructed and a uniform semiglobal stability property is ensured, where the adjustable parameters are both the discount factor and the horizon length, which corresponds to the number of iterations for dynamic programming algorithms like value iteration. Stronger stability properties such as global exponential stability are also provided by strengthening the initial assumptions. We give bounds on the discount factor and the horizon length under which stability holds. In addition, we provide new relationships between the optimal value functions of the discounted, undiscounted, infinite-horizon and finite-horizon costs respectively, which appear to be very different from those available in the literature.
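The role of the discount factor and the horizon length as tuning parameters for stability can be seen on a toy case. The sketch below assumes a scalar linear plant x+ = a*x + b*u with stage cost q*x^2 + r*u^2, which is an illustrative simplification and not the nonlinear setting of the paper; the function name `closed_loop_pole` and the numbers used are hypothetical. With V_0 = 0, each step of the backward recursion gives V_k(x) = p_k*x^2, and the first input of the horizon-N problem is a linear feedback computed from p_{N-1}.

```python
def closed_loop_pole(a, b, q, r, gamma, horizon):
    """Closed-loop pole a - b*K when u minimizes a discounted finite-horizon cost.

    Scalar linear-quadratic case, assumed for illustration only.
    The closed loop is asymptotically stable iff the returned value
    has magnitude strictly less than 1.
    """
    p = 0.0  # V_0 = 0
    for _ in range(horizon - 1):
        # One dynamic programming step: p_{k+1} = q + gamma*p_k*a^2*r / (r + gamma*p_k*b^2)
        p = q + gamma * p * a * a * r / (r + gamma * p * b * b)
    # Minimizer of r*u^2 + gamma*p*(a*x + b*u)^2 is u = -gain * x.
    gain = gamma * p * a * b / (r + gamma * p * b * b)
    return a - b * gain


# Open-loop unstable plant (a = 2) under three parameter choices.
short_horizon = closed_loop_pole(2.0, 1.0, 1.0, 1.0, gamma=1.0, horizon=1)
longer_horizon = closed_loop_pole(2.0, 1.0, 1.0, 1.0, gamma=1.0, horizon=3)
heavy_discount = closed_loop_pole(2.0, 1.0, 1.0, 1.0, gamma=0.3, horizon=200)
```

In this example the horizon-1 controller applies no input and leaves the plant unstable, the undiscounted horizon-3 controller stabilizes it, and a discount factor of 0.3 fails to stabilize it even with a long horizon.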
  • Item
    Exploiting homogeneity for the optimal control of discrete-time systems: application to value iteration
    Granzotto, M ; Postoyan, R ; Buşoniu, L ; Nešić, D ; Daafouz, J (2021-09-22)
    To investigate solutions of (near-)optimal control problems, we extend and exploit a notion of homogeneity recently proposed in the literature for discrete-time systems. Assuming the plant dynamics are homogeneous, we first derive a scaling property of its solutions along rays, provided the sequence of inputs is suitably modified. We then consider homogeneous cost functions and reveal how the optimal value function scales along rays. This result can be used to construct (near-)optimal inputs on the whole state space by solving the original problem only on a given compact manifold of smaller dimension. Compared with related work in the literature, we impose no conditions on the homogeneity degrees. We demonstrate the strength of this new result by presenting a new approximate scheme for value iteration, which is one of the pillars of dynamic programming. The new algorithm provides guaranteed lower and upper estimates of the true value function at any iteration and has several appealing features in terms of reduced computation. A numerical case study is provided to illustrate the proposed algorithm.
  • Item
    Optimistic planning for the near-optimal control of nonlinear switched discrete-time systems with stability guarantees
    Granzotto, M ; Postoyan, R ; Busoniu, L ; Nesic, D ; Daafouz, J (IEEE, 2020-03-12)
    Originating in the artificial intelligence literature, optimistic planning (OP) is an algorithm that generates near-optimal control inputs for generic nonlinear discrete-time systems whose input set is finite. This technique is therefore relevant for the near-optimal control of nonlinear switched systems, for which the switching signal is the control. However, OP exhibits several limitations, which prevent its application in a standard control context. First, it requires the stage cost to take values in [0, 1], an unnatural prerequisite as it excludes, for instance, quadratic stage costs. Second, it requires the cost function to be discounted. Third, it applies to reward maximization, and not cost minimization. In this paper, we modify OP to overcome these limitations, and we call the new algorithm OPmin. We then make stabilizability and detectability assumptions, under which we derive near-optimality guarantees for OPmin, and we show that the obtained bound has major advantages compared to the bound originally given for OP. In addition, we prove that a system whose inputs are generated by OPmin in a receding-horizon fashion exhibits stability properties. As a result, OPmin provides a new tool for the near-optimal, stable control of nonlinear switched discrete-time systems for generic cost functions.
  • Item
    Stability guarantees for nonlinear discrete-time systems controlled by approximate value iteration
    Postoyan, R ; Granzotto, M ; Busoniu, L ; Scherrer, B ; Nesic, D ; Daafouz, J (IEEE, 2020-03-12)
    Value iteration is a method to generate optimal control inputs for generic nonlinear systems and cost functions. Its implementation typically leads to approximation errors, which may have a major impact on the closed-loop system performance. In this case, we speak of approximate value iteration (AVI). In this paper, we investigate the stability of systems for which the inputs are obtained by AVI. We consider deterministic discrete-time nonlinear plants and a class of general, possibly discounted, costs. We model the closed-loop system as a family of systems parameterized by tunable parameters, which are used for the approximation of the value function at different iterations, the discount factor and the iteration step at which we stop running the algorithm. It is shown, under natural stabilizability and detectability properties as well as mild conditions on the approximation errors, that the family of closed-loop systems exhibits local practical stability properties. The analysis is based on the construction of a Lyapunov function given by the sum of the approximate value function and the Lyapunov-like function that characterizes the detectability of the system. By strengthening our conditions, asymptotic and exponential stability properties are guaranteed.