Abstract |
In this work, several computational methods are applied to feature selection for supervised learning problems. The methods are compared in three case studies covering two representative supervised learning settings: (i) regression, with multivariate calibration of soil carbonate content using Fourier transform mid-infrared (FT-MIR) spectral information and descriptor selection in quantitative structure retention time relationship modeling; and (ii) classification, with diagnosis of prostate cancer patients using gene expression information. Besides the quantitative performance measures often used in feature selection studies (error and accuracy), a qualitative measure, the selection index (SI), is introduced to evaluate the methods in terms of the quality of the selected features. Robustness is evaluated by introducing artificially generated noise variables to both datasets. |