Chemical Engineering Science, Vol.66, No.12, 2606-2615, 2011
Analysis and refinement of the training set in predicting a variety of constant pure compound properties by the targeted QSPR method
The possibility of obtaining reliable predictions of a wide variety of constant properties is examined. To this end, a modified version of the Targeted QSPR method (Brauner et al., 2006) is applied. New statistical indicators are introduced that enable reliable estimation of the prediction uncertainty for the (unknown) property of the target compound, based on the training set data. It is shown that while increasing the number of descriptors in the QSPR gives a better representation of the training set data, it may significantly degrade the prediction of the target compound property value. If necessary, improved prediction is achievable by using the statistical information to refine the training set, rather than by increasing the number of descriptors used. It is demonstrated that, by proper adjustment of the training set, the great majority of the constant properties can be predicted within the experimental error level. © 2011 Elsevier Ltd. All rights reserved.
Keywords: Computation chemistry; Parameter identification; Molecular descriptor; Systems engineering; QSPR; Property prediction
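
The abstract's point about descriptor count can be illustrated with a small numerical sketch. This is not the Targeted QSPR procedure or its statistical indicators: it is ordinary least squares on synthetic data, and every name and number in it (`fit_and_predict`, the ten-compound training set, the eight random descriptors, the noise level) is a hypothetical chosen for illustration. It shows only that fitting nested linear models with more descriptors can never worsen the training-set fit, while the error on a held-out target compound is free to grow.

```python
import numpy as np

def fit_and_predict(X_train, y_train, x_target, k):
    """Fit a linear model (with intercept) on the first k descriptors.

    Returns the training-set RMSE and the predicted property of the target.
    """
    A = np.column_stack([np.ones(len(X_train)), X_train[:, :k]])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    rmse = float(np.sqrt(np.mean((A @ coef - y_train) ** 2)))
    y_hat = float(np.concatenate([[1.0], x_target[:k]]) @ coef)
    return rmse, y_hat

# Synthetic data (assumption): only descriptor 0 carries real signal,
# the remaining descriptors are pure noise.
rng = np.random.default_rng(0)
n_train, n_desc = 10, 8
X = rng.normal(size=(n_train + 1, n_desc))
y = 300.0 + 25.0 * X[:, 0] + rng.normal(scale=2.0, size=n_train + 1)

# The last row plays the role of the target compound and is held out.
X_train, y_train = X[:-1], y[:-1]
x_target, y_target = X[-1], y[-1]

results = [fit_and_predict(X_train, y_train, x_target, k)
           for k in range(1, n_desc + 1)]
train_rmse = [r[0] for r in results]
target_err = [abs(r[1] - y_target) for r in results]

for k, (rmse, err) in enumerate(zip(train_rmse, target_err), start=1):
    print(f"{k} descriptors: train RMSE = {rmse:.3f}, target error = {err:.3f}")
```

The training RMSE is non-increasing in the number of descriptors by construction, since each model's column space contains the previous one. The target error has no such guarantee, which is why a low training residual alone does not indicate a reliable prediction for the target compound.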