Electrical and Electronic Engineering - Research Publications

Permanent URI for this collection

Search Results

Now showing 1 - 2 of 2
  • Item
    Thumbnail Image
    When to stop value iteration: stability and near-optimality versus computation
    Granzotto, M ; Postoyan, R ; Nešić, D ; Buşoniu, L ; Daafouz, J ( 2020-11-19)
    Value iteration (VI) is a ubiquitous algorithm for optimal control, planning, and reinforcement learning schemes. Under the right assumptions, VI is a vital tool to generate inputs with desirable properties for the controlled system, like optimality and Lyapunov stability. As VI usually requires an infinite number of iterations to solve general nonlinear optimal control problems, a key question is when to terminate the algorithm to produce a “good” solution, with a measurable impact on optimality and stability guarantees. By carefully analysing VI under general stabilizability and detectability properties, we provide explicit and novel relationships of the stopping criterion’s impact on near-optimality, stability and performance, thus allowing to tune these desirable properties against the induced computational cost. The considered class of stopping criteria encompasses those encountered in the control, dynamic programming and reinforcement learning literature and it allows considering new ones, which may be useful to further reduce the computational cost while endowing and satisfying stability and near-optimality properties. We therefore lay a foundation to endow machine learning schemes based on VI with stability and performance guarantees, while reducing computational complexity.
  • Item
    Thumbnail Image
    Exploiting homogeneity for the optimal control of discrete-time systems: application to value iteration
    Granzotto, M ; Postoyan, R ; Buşoniu, L ; Nešić, D ; Daafouz, J ( 2021-09-22)
    To investigate solutions of (near-)optimal control problems, we extend and exploit a notion of homogeneity recently proposed in the literature for discrete-time systems. Assuming the plant dynamics is homogeneous, we first derive a scaling property of its solutions along rays provided the sequence of inputs is suitably modified. We then consider homogeneous cost functions and reveal how the optimal value function scales along rays. This result can be used to construct (near-)optimal inputs on the whole state space by only solving the original problem on a given compact manifold of a smaller dimension. Compared to the related works of the literature, we impose no conditions on the homogeneity degrees. We demonstrate the strength of this new result by presenting a new approximate scheme for value iteration, which is one of the pillars of dynamic programming. The new algorithm provides guaranteed lower and upper estimates of the true value function at any iteration and has several appealing features in terms of reduced computation. A numerical case study is provided to illustrate the proposed algorithm.