Computers & Chemical Engineering, Vol.29, No.7, 1647-1659, 2005
Variable selection and data pre-processing in NN modelling of complex chemical processes
Neural network models are now a powerful tool for identifying complicated processes. However, because they are data-driven "black box" models, they cannot escape the "garbage in, garbage out" rule. This work proposes a simultaneous data balancing and variable selection procedure based on traditional statistical techniques and modern information-theoretic approaches. It is applied to a complicated dataset of restricted quality from a commercial aldol condensation unit (BASF). Based on the pre-processed database, a neural model for predicting the process yield has been developed. The results confirm the importance of the pre-processing stage, both for generalization accuracy and for the simpler network structure that results from the data and variable selection procedure. Finally, an analysis of the model trends has been carried out to assess the qualitative characteristics of the model, which was then used in industrial test runs and resulted in an improvement of the process operation. (c) 2005 Elsevier Ltd. All rights reserved.
Keywords: neural networks; process modelling; data preprocessing; variable selection; variable entropy; variable information