Energy and Buildings, Vol.196, 71-82, 2019
A methodology for energy multivariate time series forecasting in smart buildings based on feature selection
The massive collection of data via emerging technologies like the Internet of Things (IoT) requires finding optimal ways to reduce the created features that have a potential impact on the information that can be extracted through the machine learning process. The mining of knowledge related to a concept is done on the basis of the features of data. The process of finding the best combination of features is called feature selection. In this paper we deal with multivariate time-dependent series of data points for energy forecasting in smart buildings. We propose a methodology to transform the time-dependent database into a structure that standard machine learning algorithms can process, and then, apply different types of feature selection methods for regression tasks. We used Weka for the tasks of database transformation, feature selection, regression, statistical test and forecasting. The proposed methodology improves MAE by 59.97% and RMSE by 40.75%, evaluated on training data, and it improves MAE by 42.28% and RMSE by 36.62%, evaluated on test data, on average for 1-step-ahead, 2-step-ahead and 3-step-ahead when compared to not applying any feature selection methodology. (C) 2019 Elsevier B.V. All rights reserved.