Industrial & Engineering Chemistry Research, Vol.59, No.11, 5010-5021, 2020
Variable-Scale Probabilistic Just-in-Time Learning for Soft Sensor Development with Missing Data
Just-in-time learning (JITL) has been widely applied in data-driven modeling to deal with nonlinearity in industrial processes. To mitigate the effect of noise on JITL, probabilistic JITL (PJITL) selects samples based on their probability distributions. PJITL can also cope with missing data. However, traditional JITL-based methods, including PJITL, cannot flexibly select the number of training samples for each query sample, which in turn affects the prediction accuracy for some query samples. To solve this problem, we propose a method named "variable-scale PJITL" (VS-PJITL), which determines the size of the local model for each query sample using a new sample selection criterion. The same sample selection criterion also applies to JITL based on the Euclidean distance, giving variable-scale JITL (VS-JITL). VS-PJITL, PJITL, JITL, and VS-JITL are then compared on a simulated data set and on a real industrial data set from the catalytic naphtha reforming process. In both cases, VS-PJITL outperforms the original PJITL (root-mean-square error reduced by 0.3355 and 0.4778).
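
The abstract does not state the sample selection criterion itself, so the following is only a minimal Python sketch of the general idea: a fixed-size JITL selection contrasted with a variable-size one, both based on the Euclidean distance. The function names, the `ratio` cutoff rule, and the bounds `k_min` and `k_max` are assumptions made for illustration and are not the criterion proposed in the paper. The probabilistic (PJITL) similarity and the handling of missing data are not covered here.

```python
import numpy as np

def select_fixed(X, x_query, k):
    """Classic JITL: indices of the k nearest samples by Euclidean distance."""
    d = np.linalg.norm(X - x_query, axis=1)
    return np.argsort(d)[:k]

def select_variable(X, x_query, ratio=2.0, k_min=5, k_max=50):
    """Hypothetical variable-scale rule (NOT the paper's criterion).

    Keeps every neighbour whose distance is at most `ratio` times the
    distance of the k_min-th nearest neighbour, so the local model size
    varies between k_min and k_max from one query sample to the next.
    """
    d = np.linalg.norm(X - x_query, axis=1)
    order = np.argsort(d)
    cutoff = ratio * d[order[k_min - 1]]
    k = int(np.sum(d[order[:k_max]] <= cutoff))
    return order[:max(k, k_min)]

def local_predict(X, y, x_query, idx):
    """Fit a least-squares linear model on the selected samples and predict."""
    A = np.hstack([X[idx], np.ones((len(idx), 1))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return float(np.append(x_query, 1.0) @ coef)
```

For a query sample `xq`, `local_predict(X, y, xq, select_variable(X, xq))` builds a local model whose size adapts to how densely the historical data surround `xq`, whereas `select_fixed` always uses the same number of samples.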