Chemical Engineering Research & Design, Vol.94, 466-474, 2015
Multivariate data modeling using modified kernel partial least squares
Two problems deserve attention when using kernel partial least squares (KPLS): overfitting, and how to eliminate useless information mixed into the independent variables X. In this paper, the stochastic gradient boosting (SGB) method is adopted to address overfitting, and a new method called kernel net analyte preprocessing (KNAP) is proposed to remove undesirable systematic variation in X that is unrelated to Y. Combining the two methods yields a modeling approach named modified KPLS (MKPLS). Two simulation experiments are carried out to evaluate the performance of MKPLS. The simulation results show that the MKPLS method is not only resistant to overfitting but also improves prediction accuracy. (C) 2014 The Institution of Chemical Engineers. Published by Elsevier B.V. All rights reserved.
Keywords: Kernel partial least squares; Stochastic gradient boosting; Kernel net analyte preprocessing; Overfitting; Multivariate data modeling; Data preprocessing
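
The abstract does not spell out how SGB is coupled with KPLS, and it gives no details of KNAP, so the following is only a minimal sketch of the general idea behind stochastic gradient boosting for squared-error regression: each round fits a base learner to the current residuals on a random subsample drawn without replacement and adds a shrunken copy of it to the ensemble. As a loudly stated assumption, a small RBF kernel ridge regressor is swapped in as the base learner in place of KPLS, and KNAP is not attempted. All function names and parameter values (`sgb_fit`, `subsample`, `shrinkage`, `gamma`, `lam`) are hypothetical and are not taken from the paper.

```python
import math
import random


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_kernel_ridge(X, r, gamma, lam):
    """Stand-in base learner (NOT KPLS): kernel ridge fit to residuals r."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, r)
    return lambda x: sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X))


def sgb_fit(X, y, n_rounds=30, subsample=0.5, shrinkage=0.1,
            gamma=1.0, lam=1.0, seed=0):
    """Stochastic gradient boosting for squared loss (illustrative only)."""
    rng = random.Random(seed)
    n = len(X)
    f0 = sum(y) / n                      # initial model: the mean response
    pred = [f0] * n
    learners = []
    m = max(2, int(subsample * n))       # subsample size per round
    for _ in range(n_rounds):
        idx = rng.sample(range(n), m)    # random subsample, no replacement
        Xs = [X[i] for i in idx]
        rs = [y[i] - pred[i] for i in idx]   # residuals = negative gradient
        h = fit_kernel_ridge(Xs, rs, gamma, lam)
        learners.append(h)
        pred = [p + shrinkage * h(x) for p, x in zip(pred, X)]
    return lambda x: f0 + shrinkage * sum(h(x) for h in learners)


if __name__ == "__main__":
    # Toy nonlinear data, purely synthetic and unrelated to the paper.
    X = [[-3.0 + 6.0 * i / 39] for i in range(40)]
    y = [math.sin(x[0]) for x in X]
    model = sgb_fit(X, y)
    mse = sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y)
    print("training MSE:", mse)
```

The subsampling and the shrinkage factor are the two ingredients usually credited with making boosting less prone to overfitting; whether MKPLS uses them in this form is not stated in the abstract.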