Journal of the Chinese Institute of Chemical Engineers, Vol.38, No.1, 63-70, 2007
Prediction of disulfide connectivity in proteins with support vector machine
Disulfide bonds stabilize protein structures and play an important role in protein folding. Accurate prediction of disulfide connectivity is therefore important for determining the structure-function relationships of proteins. The accuracy of conventional disulfide connectivity predictions that use sequence information alone is limited. In this study, we aimed to improve the prediction accuracy of disulfide connectivity by using a support vector machine (SVM) with prior knowledge of disulfide bonding states and evolutionary information. The separations among the oxidized cysteine residues on a protein sequence were encoded into vectors named cysteine separation profiles (CSPs). Our previous prediction of disulfide connectivity for non-redundant proteins in SwissProt release no. 39 (SP39) sharing less than 30% sequence identity yielded an accuracy of 49% using the CSP method alone. In this study, for proteins from the same dataset, a higher fourfold cross-validation accuracy of 62% was achieved using an SVM with CSP as a feature. (c) 2007 Taiwan Institute of Chemical Engineers. Published by Elsevier B.V. All rights reserved.
Keywords: disulfide bond; protein folding; disulfide connectivity; support vector machine (SVM); cysteine separation profile (CSP)
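
The abstract describes encoding the separations among oxidized cysteine residues into a vector. As a minimal sketch of that idea, the function below assumes the profile is the list of sequence gaps between consecutive oxidized cysteines, sorted by position. The function name, the 1-based positions, and the example values are hypothetical and are not taken from the paper, whose exact encoding may differ.

```python
from typing import List, Sequence


def cysteine_separation_profile(positions: Sequence[int]) -> List[int]:
    """Return the gaps between consecutive oxidized cysteine positions.

    Assumption: `positions` holds the sequence indices of the oxidized
    (disulfide-bonded) cysteines of one protein, in any order.
    """
    ordered = sorted(positions)
    # Pair each position with its successor and take the difference.
    return [nxt - cur for cur, nxt in zip(ordered, ordered[1:])]


# Hypothetical protein with four oxidized cysteines (two disulfide bonds).
example_positions = [5, 12, 30, 41]
example_profile = cysteine_separation_profile(example_positions)
print(example_profile)
```

A protein with 2n oxidized cysteines gives a profile of length 2n - 1 under this assumption. Such a vector could then be supplied as one input feature to a classifier such as an SVM.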