Journal of Physical Chemistry B, Vol.114, No.13, 4652-4663, 2010
Prediction of Hydration Structures around Hydrophilic Surfaces of Proteins by Using the Empirical Hydration Distribution Functions from a Database Analysis
We developed a knowledge-based program that predicts hydration structures around hydrophilic surfaces of proteins as a probability density. The program assumes that the three-dimensional distribution of hydration water molecules on a hydrophilic surface can be reconstructed by summing the empirical hydration distribution functions of the solvent-exposed polar atoms composing the surface. The probability functions of polar atoms in the CO, NHn (n = 1,3), and OH groups were calculated from 17 984 protein structures solved by X-ray crystallography at resolutions better than 2.2 angstrom (Matsuoka, D.; Nakasako, M. J. Phys. Chem. B 2009, 113, 11274-11292). The program was first tested on human lysozyme. The predicted probability density enveloped more than 85% of the crystal water sites found in the crystal structure refined at a resolution of 0.95 angstrom, and the density peaks suggested as hydration sites were located within 1.5 angstrom of more than 75% of the crystal water sites. The density also reproduced the hydration structure in a narrow solvent-accessible channel leading from the surface to the lysozyme interior. We further tested whether the program can predict the water clusters in the transmembrane channels of bacteriorhodopsin and aquaporin. In bacteriorhodopsin, the predicted distributions differed between the ground state and the photoreaction intermediate indispensable for its function. The program reproduced the interfacial hydration in a Per-Arnt-Sim-related protein-protein complex and the hydration of metastable conformations in the domain motion of glutamate dehydrogenase. Taken together, the results for these various types of protein hydration suggest that the program may be a useful tool for characterizing the surface properties of proteins and for discussing the relevance of hydration structures to their biological functions. In addition, it can be used to predict hydration structures of proteins whose structures are available only at resolutions insufficient to identify water molecules.
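The superposition assumption described in the abstract, and the comparison of density peaks with crystal water sites, can be illustrated with a minimal sketch. This is not the authors' program: the isotropic Gaussian shell in `shell_density` (with hypothetical parameters `r0` and `sigma`) is only a stand-in for the empirical, database-derived distribution functions, and the threshold-based `peak_points` is a crude substitute for proper peak detection. All function names and coordinates are illustrative assumptions.

```python
import math
from itertools import product


def shell_density(dist, r0=2.8, sigma=0.4):
    """Hypothetical stand-in for an empirical hydration distribution function:
    an isotropic Gaussian shell peaking r0 angstrom from a polar atom."""
    return math.exp(-0.5 * ((dist - r0) / sigma) ** 2)


def grid_points(lo, hi, step):
    """Return the points of a cubic grid spanning [lo, hi] on each axis."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return list(product(axis, axis, axis))


def hydration_density(polar_atoms, points):
    """Sum the contributions of all polar atoms at every grid point."""
    return [
        sum(shell_density(math.dist(p, atom)) for atom in polar_atoms)
        for p in points
    ]


def peak_points(points, density, rel_threshold=0.9):
    """Crude peak picking (an assumption): keep grid points whose density
    is at least rel_threshold times the maximum density."""
    cutoff = rel_threshold * max(density)
    return [p for p, d in zip(points, density) if d >= cutoff]


def fraction_within(sites, peaks, cutoff=1.5):
    """Fraction of reference water sites lying within cutoff angstrom
    of at least one predicted peak."""
    hits = sum(
        1 for s in sites if any(math.dist(s, p) <= cutoff for p in peaks)
    )
    return hits / len(sites)


if __name__ == "__main__":
    # Two made-up polar atoms and one made-up "crystal water" site.
    atoms = [(0.0, 0.0, 0.0), (5.6, 0.0, 0.0)]
    waters = [(2.8, 0.5, 0.0)]
    pts = grid_points(-4.0, 8.0, 1.0)
    rho = hydration_density(atoms, pts)
    peaks = peak_points(pts, rho)
    print(len(peaks), fraction_within(waters, peaks))
```

In this toy setup the summed density is highest where the shells of the two atoms overlap, which is the kind of site that a superposition of per-atom distributions would flag as a likely hydration position. A realistic implementation would need anisotropic, group-specific distribution functions and a solvent-exposure check for each atom.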