Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems

Vamvoudakis KG

Automatica, Vol.61, 274-281, 2015

DOI: 10.1016/j.automatica.2015.08.017

This work proposes a novel Q-learning algorithm for solving non-zero sum Nash games of linear time-invariant systems with N players (control inputs) and centralized uncertain/unknown dynamics. We first formulate the Q-function of each player as a parametrization of the state and the control inputs of all the players. An integral reinforcement learning approach is used to develop a model-free structure of N actors/N critics that estimates the parameters of the N coupled Q-functions online while also guaranteeing closed-loop stability and convergence of the control policies to a Nash equilibrium. A fourth-order simulation example with five players is presented to show the efficacy of the proposed approach. © 2015 Elsevier Ltd. All rights reserved.
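
For orientation, the class of problem named in the abstract is commonly written as the N-player linear-quadratic differential game sketched below. This is the standard textbook formulation, not reproduced from the paper itself; the symbols A, B_j, H_i, R_ij and the quadratic cost are assumed notation introduced here for illustration.

```latex
% Assumed notation: x is the state, u_j the input of player j,
% A and B_j the (unknown) system matrices, H_i >= 0 and R_{ij} the cost weights.
\begin{align}
  \dot{x}(t) &= A\,x(t) + \sum_{j=1}^{N} B_j\,u_j(t), \qquad x(0) = x_0, \\
  J_i(x_0; u_1, \dots, u_N) &= \frac{1}{2} \int_{0}^{\infty}
      \Big( x^{\top} H_i\, x + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Big)\, dt,
      \qquad i = 1, \dots, N, \\
  J_i(x_0; u_i^{*}, u_{-i}^{*}) &\le J_i(x_0; u_i, u_{-i}^{*})
      \quad \text{for all admissible } u_i, \quad i = 1, \dots, N.
\end{align}
```

The last line is the Nash equilibrium condition: no player can lower its own cost by unilaterally deviating from its policy while the other players (written u_{-i}) keep theirs.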

Keywords: Q-learning; Nash games; Uncertain systems; Model-free formulation