Energy, Vol.133, 348-365, 2017
Deep transfer Q-learning with virtual leader-follower for supply-demand Stackelberg game of smart grid
This paper proposes a novel deep transfer Q-learning (DTQ) algorithm with a virtual leader-follower pattern for the supply-demand Stackelberg game of a smart grid. Each generator is regarded as a supplier agent and each load as a demander agent, so that economic dispatch (ED) and demand response (DR) can be solved simultaneously by DTQ. To maximize the total payoff of all agents, a virtual leader-follower pattern is employed to achieve reliable collaboration among them. Q-learning with a cooperative swarm is then adopted for knowledge learning by each agent, balancing exploration and exploitation in an unknown environment. Furthermore, the original extremely large-scale knowledge matrix is decomposed into several small-scale knowledge matrices through a binary state-action chain, while continuous actions can still be generated for continuous variables. Lastly, a deep belief network (DBN) is used for knowledge transfer, so that DTQ can exploit prior knowledge from source tasks to rapidly obtain an optimal solution of a new task. Case studies evaluate the performance of DTQ for the supply-demand Stackelberg game on a 94-agent system and on the practical Shenzhen power grid of southern China. (C) 2017 Published by Elsevier Ltd.
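The abstract does not give DTQ's update rules, so the following is only a minimal sketch of the two generic ingredients it names: a standard tabular Q-learning update with epsilon-greedy exploration, and a chain of binary actions decoded into one value of a bounded variable. All names (`q_update`, `epsilon_greedy`, `bits_to_value`) and the parameter values are hypothetical illustrations, not the authors' implementation. The decoder yields a discretized value, whereas the paper states that continuous actions can be generated.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (textbook form, not the DTQ rule).

    q maps each state to a list of action values.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    values = q[state]
    return values.index(max(values))


def bits_to_value(bits, low, high):
    """Decode a chain of binary actions into a value in [low, high].

    Assumption: k binary decisions index one of 2**k evenly spaced levels,
    so each step needs only a two-column table instead of one table with
    2**k columns.
    """
    index = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * index / (2 ** len(bits) - 1)


# Hypothetical usage: two states, two actions per state.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
rng = random.Random(0)
chosen = epsilon_greedy(q_table, 1, epsilon=0.0, rng=rng)
updated = q_update(q_table, 0, 1, reward=1.0, next_state=1)
setpoint = bits_to_value([1, 0, 1], low=0.0, high=70.0)
```

With three binary decisions the decoder distinguishes eight levels; each extra bit doubles the resolution while adding only one more small table, which is the kind of size reduction the decomposition is meant to provide.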