Proteins are biomolecules that play a key role in a wide variety of vital functions, such as metabolism and signal transmission. Each protein is a linear chain of amino acids that folds into a flexible three-dimensional structure. A protein's flexibility is widely believed to be essential for its function. Indeed, proteins usually achieve their main functions by binding other molecules, called ligands. Binding requires shape and chemical complementarity of the two molecules at their binding interface. Conformational selection theory suggests that the protein and the ligand each exist as an ensemble of continuously deforming conformations, and that the most compatible conformations recognize each other and bind together. Binding conformations of proteins often differ significantly from non-binding ones. To understand a protein's function, one must be able to determine or predict these binding conformations.
Protein motion occurs on timescales that span several orders of magnitude. Thermal fluctuations, which occur in picoseconds, are small-amplitude, uncorrelated, harmonic motions of the individual atoms. In contrast, conformational deformations closely related to the protein's function occur in microseconds to milliseconds. These slow deformations are usually large-scale, correlated, anharmonic motions that correspond to transitions between meta-stable states, such as binding and non-binding states. In this dissertation, we are mainly interested in modeling the structural heterogeneity associated with such slow deformations.
This dissertation presents new computational methods to study the flexibility of folded proteins in the context of three important biological problems.
Computational modeling of structural heterogeneity in the folded state of a protein is a challenging problem, mainly because of the high dimensionality of the protein's conformation space and the very small relative volume of its feasible motion space. Although our methods are specific to each of the three problems, they share the same sample-and-select approach: they combine efficient sampling algorithms, which represent structural heterogeneity in a folded protein by a collection of sampled conformations, with selection algorithms, which reliably pick the sampled conformations that provide a solution to the problem. In addition, they share several techniques, such as efficient kinematic modeling, fast collision detection to enforce van der Waals volume exclusion among atoms, and optimization. This dissertation demonstrates the power of geometric computation and efficient sampling to model structural heterogeneity in folded proteins.
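The sample-and-select idea can be illustrated with a deliberately simplified toy model. The sketch below is not the dissertation's method: it assumes a planar chain of beads with unit bond lengths in place of a protein, uniform random joint angles in place of an efficient sampler, a brute-force pairwise distance test in place of fast collision detection, and a hypothetical selection criterion (end-to-end distance closest to a target). All function names and parameter values are illustrative assumptions.

```python
import math
import random


def chain_coords(angles, bond=1.0):
    """Kinematic model: bead positions of a planar chain from its joint angles."""
    x, y, heading = 0.0, 0.0, 0.0
    coords = [(x, y)]
    for a in angles:
        heading += a
        x += bond * math.cos(heading)
        y += bond * math.sin(heading)
        coords.append((x, y))
    return coords


def has_clash(coords, min_dist=0.8):
    """Volume exclusion: True if two non-bonded beads are closer than min_dist."""
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):  # skip directly bonded neighbours
            if math.dist(coords[i], coords[j]) < min_dist:
                return True
    return False


def sample(n_joints, n_samples, rng):
    """Sampling step: draw random joint angles, keep clash-free conformations."""
    kept = []
    for _ in range(n_samples):
        angles = [rng.uniform(-math.pi / 2, math.pi / 2) for _ in range(n_joints)]
        if not has_clash(chain_coords(angles)):
            kept.append(angles)
    return kept


def select(conformations, target_span):
    """Selection step: pick the conformation whose end-to-end distance
    is closest to target_span (an assumed, illustrative criterion)."""
    def score(angles):
        coords = chain_coords(angles)
        return abs(math.dist(coords[0], coords[-1]) - target_span)
    return min(conformations, key=score)


rng = random.Random(0)  # fixed seed so the run is repeatable
ensemble = sample(n_joints=8, n_samples=2000, rng=rng)
best = select(ensemble, target_span=4.0)
print(len(ensemble), "clash-free samples; selected angles:", best)
```

In this toy setting the ensemble stands in for the collection of sampled conformations, and the selection function stands in for a problem-specific criterion. A realistic treatment would need a three-dimensional kinematic model and a collision detector that scales to thousands of atoms.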