International Journal of Molecular Sciences, Vol.16, No.8, 17315-17330, 2015
DeepCNF-D: Predicting Protein Order/Disorder Regions by Weighted Deep Convolutional Neural Fields
Intrinsically disordered proteins or protein regions are involved in key biological processes including regulation of transcription, signal transduction, and alternative splicing. Accurately predicting order/disorder regions ab initio from the protein sequence is a prerequisite step for further analysis of functions and mechanisms for these disordered regions. This work presents a learning method, weighted DeepCNF (Deep Convolutional Neural Fields), to improve the accuracy of order/disorder prediction by exploiting the long-range sequential information and the interdependency between adjacent order/disorder labels and by assigning different weights for each label during training and prediction to solve the label imbalance issue. Evaluated by the CASP9 and CASP10 targets, our method obtains 0.855 and 0.898 AUC values, which are higher than the state-of-the-art single ab initio predictors.
Keywords:intrinsically disordered proteins;prediction of disordered regions;machine learning;deep learning;deep convolutional neural network;conditional neural field