Industrial & Engineering Chemistry Research, Vol.56, No.16, 4804-4817, 2017
Active Selection of Informative Data for Sequential Quality Enhancement of Soft Sensor Models with Latent Variables
When the training data carry insufficient information, soft sensor models inevitably produce inaccurate predictions in industrial applications. This work develops an active learning method that sequentially selects informative data to enhance latent variable model (LVM)-based soft sensors. A Gaussian process model links the score variables of the LVM to the input process variables, which allows the prediction variance to be formulated. An uncertainty index for the LVM is then presented; it combines the variances of the predicted outputs with the changes of the predicted outputs per unit change in the designed inputs. Without any prior knowledge of the process, the index is used sequentially to identify the regions from which new informative data should be collected to improve model quality. Additionally, an evaluation criterion is proposed to monitor the active learning procedure. Through exploration and exploitation analysis of the current model, the active learning procedure can thus effectively discover meaningful data to include in the soft sensor model. The proposed strategy can be applied to any type of LVM. Its effectiveness is demonstrated through a numerical example and a real industrial plant in Taiwan with multiple outputs.
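The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract does not give the index, so the equal-weight sum of normalised predictive variance and normalised output slope used here is an assumption, as are the function names, the kernel settings, and the toy data. For brevity the Gaussian process is fitted directly to a single output rather than to LVM score variables.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_query, noise_var=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def scale01(z):
    """Rescale an array to [0, 1]; a constant array maps to zeros."""
    span = z.max() - z.min()
    return (z - z.min()) / span if span > 0 else np.zeros_like(z)

def uncertainty_index(x_train, y_train, candidates, weight=0.5, step=1e-3):
    """Hypothetical index: weighted sum of predictive variance (exploration)
    and absolute output change per unit input change (exploitation)."""
    _, var = gp_predict(x_train, y_train, candidates)
    mean_plus, _ = gp_predict(x_train, y_train, candidates + step)
    mean_minus, _ = gp_predict(x_train, y_train, candidates - step)
    slope = np.abs(mean_plus - mean_minus) / (2.0 * step)  # central difference
    return weight * scale01(var) + (1.0 - weight) * scale01(slope)

# Toy data (assumed): a sparse, unevenly sampled sine curve.
x_train = np.array([0.0, 0.2, 0.4, 2.0])
y_train = np.sin(3.0 * x_train)
candidates = np.linspace(0.0, 2.0, 41)

index = uncertainty_index(x_train, y_train, candidates)
next_x = candidates[np.argmax(index)]  # input at which to collect the next sample
print("next input to sample:", next_x)
```

In a sequential procedure, the measured output at `next_x` would be appended to the training set and the index recomputed, repeating until a stopping criterion such as the evaluation criterion mentioned in the abstract is met.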