Evaluation of reinforcement learning for optimal control of building active and passive thermal storage inventory

Liu SM; Henze GP

Journal of Solar Energy Engineering-Transactions of The ASME, Vol.129, No.2, 215-225, 2007

This paper describes an investigation of machine learning for supervisory control of active and passive thermal storage capacity in buildings. Previous studies show that the utilization of active or passive thermal storage, or both, can yield significant peak cooling load reduction and associated electrical demand and operational cost savings. In this,study, a model-free learning control is investigated for the operation of electrically driven chilled water systems in heavy-mass commercial buildings. The reinforcement learning controller learns to operate the building and cooling plant based on the reinforcement feedback (monetary cost of each action, in this study) it receives for past control actions. The learning agent interacts with its environment by commanding the global zone temperature setpoints and thermal energy storage charging/discharging rate. The controller extracts information about the environment based solely on the reinforcement signal; the controller does not contain a predictive or system model. Over time and by exploring the environment, the reinforcement learning controller establishes a statistical summary of plant operation, which is continuously updated as operation continues. The present analysis shows that learning control is a feasible methodology to find a near-optimal control strategy,for exploiting the active and passive building thermal storage capacity, and also shows that the, learning performance is affected by the dimensionality of the action and state space, the learning rate and several other factors. It is found that it takes a long time to learn control strategies for tasks associated with large state and action spaces.