Abstract. Optimizing amino acid conformation and identity is a central problem in computational protein design. Protein design algorithms must allow realistic protein flexibility during this optimization, or they may fail to find the lowest-energy sequence. Most design algorithms implement side-chain flexibility by allowing the side chains to move between a small set of discrete, low-energy states, which we call rigid rotamers. In this work we show that allowing continuous side-chain flexibility (which we call continuous rotamers) greatly improves protein flexibility modeling. We present a large-scale study that compares the sequences and lowest-energy conformations in 69 protein-core redesigns using a rigid-rotamer model versus a continuous-rotamer model. We show that in nearly all of our redesigns the sequence found by the continuous-rotamer model differs from, and has a lower energy than, the one found by the rigid-rotamer model. Moreover, the sequences found by the continuous-rotamer model are more similar to the native sequences. We then show that the seemingly easy solution of sampling more rigid rotamers within the continuous region is not a practical alternative to a continuous-rotamer model: at computationally feasible resolutions, using more rigid rotamers was never better than a continuous-rotamer model and almost always resulted in higher energies. Finally, we present a new protein design algorithm based on the dead-end elimination (DEE) algorithm, which we call iMinDEE, that makes the use of continuous rotamers feasible in larger systems. iMinDEE guarantees finding the optimal answer while pruning the search space with nearly the same efficiency as DEE.
Author Summary. Computational protein design is a promising field with many biomedical applications, such as drug design and the redesign of enzymes to perform non-natural chemical reactions. An essential feature of any protein design algorithm is the ability to accurately model the flexibility that occurs in real proteins. In enzyme design, for example, an algorithm must predict how the designed protein will change during binding and catalysis. In this work we present a large-scale study of 69 protein redesigns that shows the necessity of modeling more realistic protein flexibility. Specifically, we model the continuous space around low-energy conformations of amino-acid side chains, and compare it against the standard rigid approach of modeling only a small discrete set of low-energy conformations. We show that by allowing the side chains to move in the continuous space around low-energy conformations during the protein design search, we obtain very different sequences that better match real protein sequences. Moreover, we propose a new protein design algorithm which shows that, contrary to conventional wisdom, we can search the continuous space around side chains with nearly the same efficiency as algorithms that model only a discrete set of conformations.
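The difference between the two models can be illustrated with a toy calculation. The sketch below is not the energy function or search used in this work: the one-dimensional energy, the two candidate "sequences" A and B, the ideal angle of 60 degrees, and the 9-degree window are all assumptions chosen for illustration. It scores each candidate once at the ideal rotamer angle only (rigid) and once at the minimum over a small interval around that angle (continuous).

```python
import math

def make_energy(clash_strength, offset):
    """Build a hypothetical 1-D energy over one side-chain dihedral chi (degrees)."""
    def energy(chi):
        # Torsional well centered on the ideal rotamer angle (assumed: 60 degrees).
        torsion = 1.0 - math.cos(math.radians(chi - 60.0))
        # Gaussian steric clash centered exactly on the ideal angle.
        clash = clash_strength * math.exp(-((chi - 60.0) / 5.0) ** 2)
        return offset + torsion + clash
    return energy

def rigid_energy(energy, ideal_chi=60.0):
    """Rigid rotamer: evaluate the energy only at the ideal angle."""
    return energy(ideal_chi)

def continuous_energy(energy, ideal_chi=60.0, half_width=9.0, step=0.1):
    """Continuous rotamer: lowest energy within +/- half_width of the ideal angle.

    The fine grid is a stand-in for a real minimizer and is only workable
    because this toy has a single degree of freedom.
    """
    n = int(round(2 * half_width / step))
    return min(energy(ideal_chi - half_width + k * step) for k in range(n + 1))

# Two hypothetical candidates: A clashes at the ideal angle but relaxes well;
# B has no clash but a uniformly higher baseline energy.
candidates = {"A": make_energy(5.0, 0.0), "B": make_energy(0.0, 1.0)}

def best(score):
    """Return the candidate with the lowest energy under the given scoring model."""
    return min(candidates, key=lambda name: score(candidates[name]))

rigid_pick = best(rigid_energy)
continuous_pick = best(continuous_energy)
print("rigid model picks:", rigid_pick)
print("continuous model picks:", continuous_pick)
```

In this toy the rigid model rejects candidate A because of a clash that a rotation of a few degrees relieves, so the two models select different candidates. This is only meant to show the kind of effect described above; it says nothing about how often it occurs in real designs, and a grid of this kind does not scale to many interacting side chains.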