Journal of Chemical and Engineering Data, Vol.60, No.5, 1377-1387, 2015
Quantitative Structure-Property Relationship Predictions of Critical Properties and Acentric Factors for Pure Compounds
Knowledge of critical constants and phase boundary pressure properties is essential to understanding thermodynamic behavior of substances and is often required in practical process design applications. Where critically evaluated data are unavailable, a quantitative structure-property relationship (QSPR) regression method can be used to relate molecular properties (descriptors) to properties of interest. The relationship is trained and tested using existing critically evaluated data and is dynamic; as new data become available, the relationship can be updated to reflect changes. In this work, we use support vector regression (SVR) to develop estimation methods for critical properties and acentric factors based on critically evaluated data for over 900 pure compounds. From three-dimensional geometry and connectivity information, we calculate over 500 descriptors for each compound. A matrix of descriptor values defines the input vectors for SVR, whereas critically evaluated data for critical temperature, the ratio of critical temperature to critical pressure, and saturation reduced pressure form the targets. We determine optimal SVR parameters by minimizing the sum of absolute deviations between the SVR outputs and the target values. We use a genetic algorithm to find the Pareto front points that optimize the output fit while reducing the number of input vectors (descriptors). We use a single Pareto front point to make a final evaluation in SVR. To define uncertainties of predicted values, we use uncertainty propagation calculations based on a Monte Carlo method that employs Latin hypercube sampling.
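
The last step, Monte Carlo uncertainty propagation with Latin hypercube sampling, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that the critical pressure is recovered as Pc = Tc / (Tc/Pc) from two of the regression targets, and that the two inputs carry independent normal uncertainties. The function names (`latin_hypercube`, `propagate_pc`) and all numerical values are hypothetical and are not taken from the paper.

```python
import random
from statistics import NormalDist, mean, stdev


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in the unit hypercube, stratified per dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of n_samples equal-width strata.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return list(zip(*columns))


def propagate_pc(tc, u_tc, ratio, u_ratio, n_samples=2000, seed=0):
    """Estimate the mean and standard deviation of Pc = Tc / (Tc/Pc).

    Assumption (not from the source): both inputs are independent and
    normally distributed with standard uncertainties u_tc and u_ratio.
    """
    rng = random.Random(seed)
    tc_dist = NormalDist(tc, u_tc)
    ratio_dist = NormalDist(ratio, u_ratio)
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    pc_values = []
    for p_tc, p_ratio in latin_hypercube(n_samples, 2, rng):
        p_tc = min(max(p_tc, eps), 1.0 - eps)
        p_ratio = min(max(p_ratio, eps), 1.0 - eps)
        # Map stratified uniform samples through each inverse CDF.
        tc_sample = tc_dist.inv_cdf(p_tc)
        ratio_sample = ratio_dist.inv_cdf(p_ratio)
        pc_values.append(tc_sample / ratio_sample)
    return mean(pc_values), stdev(pc_values)


# Hypothetical inputs: Tc = 500 K +/- 5 K, Tc/Pc = 125 K/MPa +/- 2.5 K/MPa.
pc_mean, pc_std = propagate_pc(tc=500.0, u_tc=5.0, ratio=125.0, u_ratio=2.5)
print(f"Pc = {pc_mean:.3f} MPa, standard uncertainty = {pc_std:.3f} MPa")
```

Because each input range is split into equal-probability strata that are each sampled exactly once, Latin hypercube sampling typically covers the input distributions more evenly than plain random sampling for the same number of model evaluations.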