International Journal of Molecular Sciences, Vol.5, No.2, 56-66, 2004
QSAR study of skin sensitization using local lymph node assay data
Allergic Contact Dermatitis (ACD) is a common work-related skin disease that often develops as a result of repetitive skin exposures to a sensitizing chemical agent. A variety of experimental tests have been suggested to assess the skin sensitization potential. We applied a method of Quantitative Structure-Activity Relationship (QSAR) to relate measured and calculated physical-chemical properties of chemical compounds to their sensitization potential. Using statistical methods, each of these properties, called molecular descriptors, was tested for its propensity to predict the sensitization potential. A few of the most informative descriptors were subsequently selected to build a model of skin sensitization. In this work sensitization data for the murine Local Lymph Node Assay ( LLNA) were used. In principle, LLNA provides a standardized continuous scale suitable for quantitative assessment of skin sensitization. However, at present many LLNA results are still reported on a dichotomous scale, which is consistent with the scale of guinea pig tests, which were widely used in past years. Therefore, in this study only a dichotomous version of the LLNA data was used. To the statistical end, we relied on the logistic regression approach. This approach provides a statistical tool for investigating and predicting skin sensitization that is expressed only in categorical terms of activity and non-activity. Based on the data of compounds used in this study, our results suggest a QSAR model of ACD that is based on the following descriptors: nDB (number of double bonds), C-003 (number of CHR3 molecular subfragments), GATS6M (autocorrelation coefficient) and HATS6m (GETAWAY descriptor), although the relevance of the identified descriptors to the continuous ACD QSAR has yet to be shown. The proposed QSAR model gives a percentage of positively predicted responses of 83% on the training set of compounds, and in cross validation it correctly identifies 79% of responses.