Biotechnology Progress, Vol.24, No.3, 570-575, 2008
Residue-specific contact order and contact breadth in single-domain proteins: Implications for folding as a function of chain elongation
Cotranslational protein misfolding and aggregation are often responsible for inclusion body formation during in vivo protein expression. This study addresses the relations between protein folding/misfolding and the distribution of intramolecular interactions across different regions of the polypeptide chain in soluble single-domain proteins. The sequence regions examined here include the C terminus, which is synthesized last in the cell. Emphasis is placed on two parameters reporting on short- and long-range interactions, i.e., residue-specific contact order (RCO) and a new descriptor of intramolecular protein interaction networks denoted as residue-specific contact breadth (RCB). RCB reports the average spread in sequence of the residues serving as interaction counterparts. We show that both RCO and RCB are maximized at the chain termini for a large fraction of single-domain soluble proteins. A direct implication of this result is that the C terminus of the polypeptide chain, which is synthesized last during ribosome-assisted translation, plays a key role in the generation of native-like structure by establishing long-range interactions and generating contacts with interaction counterparts widely distributed across the sequence. Comparison of our computational predictions with the experimental behavior of selected proteins shows that the presence of large RCO and RCB at the chain termini correlates with the protein's ability to fold properly only after the C terminus has been synthesized, whereas their absence correlates with the ability to fold during chain elongation.
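
To make the two descriptors concrete, the following is a minimal sketch of how per-residue values of this kind could be computed from a list of residue-residue contacts. The abstract does not give the formulas, so everything here is an assumption for illustration: RCO for residue i is taken as the mean sequence separation |i - j| over its contact partners j, RCB is taken as the population standard deviation of the partners' sequence positions, and both are divided by chain length. The function name `residue_contact_metrics` and the input format are hypothetical, and the definitions used in the paper may differ.

```python
from statistics import mean, pstdev


def residue_contact_metrics(contacts, n_residues):
    """Return (rco, rcb) lists with one value per residue.

    contacts   : iterable of (i, j) pairs of 0-based residue indices in contact
    n_residues : chain length

    ASSUMED definitions (not taken from the paper):
      rco[i] = mean(|i - j|) over partners j of residue i, divided by chain length
      rcb[i] = population std. dev. of partner positions j, divided by chain length
    Residues with no contacts get 0.0 for both values.
    """
    # Collect the set of contact partners for every residue (symmetric).
    partners = {i: set() for i in range(n_residues)}
    for i, j in contacts:
        if i == j:
            continue  # ignore self-contacts
        partners[i].add(j)
        partners[j].add(i)

    rco, rcb = [], []
    for i in range(n_residues):
        js = sorted(partners[i])
        if not js:
            rco.append(0.0)
            rcb.append(0.0)
            continue
        # Average sequence separation to the interaction counterparts.
        rco.append(mean(abs(i - j) for j in js) / n_residues)
        # Spread in sequence of the interaction counterparts.
        rcb.append(pstdev(js) / n_residues)
    return rco, rcb


# Toy 6-residue chain: residue 0 contacts residues 3 and 5, residue 1 contacts 2.
rco, rcb = residue_contact_metrics([(0, 5), (0, 3), (1, 2)], n_residues=6)
```

In this toy example residue 0 has distant and well-separated partners, so it scores high on both measures, while residue 1 has a single neighboring partner and scores low on both. This mirrors the distinction the abstract draws between long-range, widely distributed contacts and short-range ones.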