Energy and Buildings, Vol. 183, 428-442, 2019
A systematic feature selection procedure for short-term data-driven building energy forecasting model development
An accurate building energy forecasting model is key to real-time model-based control of building energy systems and to building-grid integration. Data-driven models, although they have lower engineering cost during development, often suffer from poor generalization caused by high data dimensionality. Feature selection, the process of selecting a subset of relevant features, can counter high dimensionality, increase model interpretability, and enhance model generalization. In building energy modeling research, features are often selected based on domain knowledge. A comprehensive methodology to guide systematic feature selection when developing building energy forecasting models is lacking. This research proposes a systematic feature selection procedure for developing building energy forecasting models that attempts to integrate statistical analysis, building physics, and engineering experience. The proposed procedure comprises three steps: (Step 1) feature pre-processing based on domain knowledge, (Step 2) feature removal through filter methods to remove irrelevant and redundant variables, and (Step 3) feature grouping through a wrapper method to search for the best feature set. Two case studies are presented, using simulated and real building data. The simulated building data are generated from a simulation model of a medium-size office building (a DOE reference building). The real building data are obtained from a medium-size campus building in Philadelphia, PA. In both cases, the energy forecasting models developed using the proposed systematic feature selection procedure are compared with models developed using other feature selection techniques. Results show that the models developed using the proposed procedure have better accuracy and generalization. (C) 2018 Elsevier B.V. All rights reserved.
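The filter and wrapper steps (Steps 2 and 3) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not name the specific filter statistics, wrapper search, or forecasting model, so the sketch assumes Pearson correlation for the filter, greedy forward selection for the wrapper, and an ordinary least-squares model scored by hold-out RMSE. The function names, thresholds (`min_relevance`, `max_redundancy`), and synthetic data are all hypothetical. Step 1 is omitted because it depends on domain knowledge rather than on a generic algorithm.

```python
import numpy as np


def filter_features(X, y, min_relevance=0.1, max_redundancy=0.95):
    """Filter step (assumed form): drop features weakly correlated with the
    target, then drop features nearly collinear with an already-kept one."""
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Visit candidates from most to least relevant, skipping irrelevant ones.
    order = [j for j in np.argsort(-relevance) if relevance[j] >= min_relevance]
    kept = []
    for j in order:
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) > max_redundancy
            for k in kept
        )
        if not redundant:
            kept.append(int(j))
    return kept


def holdout_rmse(X_train, y_train, X_val, y_val):
    """Fit least squares with an intercept; return RMSE on the hold-out set."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_val = np.column_stack([np.ones(len(X_val)), X_val])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.sqrt(np.mean((A_val @ coef - y_val) ** 2)))


def forward_wrapper(X_train, y_train, X_val, y_val, candidates):
    """Wrapper step (assumed form): greedily add the feature that most
    reduces hold-out RMSE; stop when no candidate improves it."""
    selected, best = [], float("inf")
    remaining = list(candidates)
    while remaining:
        scores = {
            j: holdout_rmse(
                X_train[:, selected + [j]], y_train,
                X_val[:, selected + [j]], y_val,
            )
            for j in remaining
        }
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break
        best = scores[j_best]
        selected.append(j_best)
        remaining.remove(j_best)
    return selected, best


# Synthetic stand-in data (hypothetical, not from the paper's case studies).
rng = np.random.default_rng(0)
n = 2000
temperature = rng.normal(size=n)                       # relevant
occupancy = rng.normal(size=n)                         # relevant
unrelated = rng.normal(size=n)                         # irrelevant
temperature_copy = temperature + 0.01 * rng.normal(size=n)  # redundant
X = np.column_stack([temperature, occupancy, unrelated, temperature_copy])
y = 3.0 * temperature + 2.0 * occupancy + 0.1 * rng.normal(size=n)

# Chronological split: earlier samples train, later samples validate.
split = int(0.75 * n)
kept = filter_features(X[:split], y[:split])
selected, rmse = forward_wrapper(
    X[:split], y[:split], X[split:], y[split:], kept
)
print("kept after filter:", kept)
print("selected by wrapper:", selected, "hold-out RMSE:", rmse)
```

On this synthetic data the filter should discard the unrelated column and one of the two near-identical temperature columns, and the wrapper should then keep the remaining temperature and occupancy features. The chronological split is a design choice made here so that validation mimics forecasting on unseen future data.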