Abstract. Optimizing amino acid conformation and
identity is a central problem in computational protein design. Protein
design algorithms must model realistic protein flexibility during this
optimization, or they may fail to find the lowest-energy sequence.
Most design algorithms implement side-chain
flexibility by allowing the side chains to move between a small set of
discrete, low-energy states, which we call rigid rotamers. In
this work we show that allowing continuous side-chain flexibility
(which we call continuous rotamers) greatly improves protein
flexibility modeling. We present a large-scale study that compares
the sequences and best energy conformations in 69 protein-core
redesigns using a rigid-rotamer model versus a continuous-rotamer
model. We show that in nearly all of our redesigns the sequence found
by the continuous-rotamer model is different and has a lower energy
than the one found by the rigid-rotamer model. Moreover, the sequences
found by the continuous-rotamer model are more similar to the native
sequences. We then show that the seemingly easy solution of sampling
more rigid rotamers within the continuous region is not a practical
alternative to a continuous-rotamer model: at computationally feasible
resolutions, using more rigid rotamers was never better than a
continuous-rotamer model and almost always resulted in higher
energies. Finally, we present a new protein design algorithm based on
the dead-end elimination (DEE) algorithm, which we call
iMinDEE, that makes the use of continuous rotamers feasible
in larger systems. iMinDEE guarantees finding the optimal
answer while pruning the search space with close to the same efficiency
as DEE.
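For readers unfamiliar with DEE, the pruning idea can be sketched on a toy problem. The block below is a minimal illustration of the classic rigid-rotamer Goldstein elimination criterion, not of iMinDEE, whose details are not given in this abstract. All energies, the helper names (`e_pair`, `goldstein_dee`, `brute_force_gmec`) and the tolerance `EPS` are hypothetical values chosen only for the example.

```python
from itertools import product

# Hypothetical toy problem: 3 residue positions, 3 rigid rotamers each.
E_SELF = [
    [0.0, 1.5, 3.0],
    [2.0, 0.5, 4.0],
    [1.0, 1.2, 0.2],
]
EPS = 1e-9  # tolerance so floating-point noise never triggers a pruning


def e_pair(i, r, j, s):
    """Made-up symmetric pairwise energy between rotamer r at i and s at j."""
    if i > j:
        i, r, j, s = j, s, i, r
    return 0.3 * (((i + 1) * (r + 2) + (j + 3) * (s + 1)) % 5)


def total_energy(conf):
    """Sum of self and pairwise energies for one rotamer assignment."""
    n = len(conf)
    energy = sum(E_SELF[i][conf[i]] for i in range(n))
    energy += sum(
        e_pair(i, conf[i], j, conf[j]) for i in range(n) for j in range(i + 1, n)
    )
    return energy


def goldstein_dee(e_self, pair):
    """Prune rotamer r at position i if some competitor t is better than r
    for every possible choice of rotamers at the other positions."""
    n = len(e_self)
    alive = [set(range(len(row))) for row in e_self]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                for t in sorted(alive[i]):
                    if t == r:
                        continue
                    gap = e_self[i][r] - e_self[i][t]
                    for j in range(n):
                        if j != i:
                            gap += min(
                                pair(i, r, j, s) - pair(i, t, j, s)
                                for s in alive[j]
                            )
                    if gap > EPS:  # r can never be part of the optimum
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


def brute_force_gmec(e_self):
    """Enumerate every assignment; feasible only for toy problems."""
    return min(product(*(range(len(row)) for row in e_self)), key=total_energy)


alive = goldstein_dee(E_SELF, e_pair)
gmec = brute_force_gmec(E_SELF)
print("surviving rotamers per position:", alive)
print("global minimum energy conformation:", gmec)
```

Because a rotamer is removed only when a competitor is strictly better in every environment, the lowest-energy assignment found by exhaustive enumeration always survives the pruning.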
Author Summary. Computational protein design is a promising field with many biomedical applications, such as drug design and the redesign of enzymes to perform non-natural chemical reactions. An essential feature of any protein design algorithm is the ability to accurately model the flexibility that occurs in real proteins. In enzyme design, for example, an algorithm must predict how the designed protein will change during binding and catalysis. In this work we present a large-scale study of 69 protein redesigns that shows the necessity of modeling more realistic protein flexibility. Specifically, we model the continuous space around low-energy conformations of amino-acid side chains, and compare it against the standard rigid approach of modeling only a small discrete set of low-energy conformations. We show that by allowing the side chains to move in the continuous space around low-energy conformations during the protein design search, we obtain very different sequences that better match real protein sequences. Moreover, we propose a new protein design algorithm which shows that, contrary to conventional wisdom, the continuous space around side chains can be searched with close to the same efficiency as algorithms that model only a discrete set of conformations.
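The difference between the rigid and continuous models can be illustrated with a one-dimensional toy. The sketch below assumes a single side-chain dihedral, a made-up energy function (`toy_energy`), three ideal rotamers, and a window of plus or minus 9 degrees around each one; none of these values are taken from the text above. It compares the energy at each rigid rotamer with the energy after a simple golden-section minimization inside the window.

```python
import math

ROTAMERS = [-60.0, 60.0, 180.0]  # hypothetical ideal dihedral angles (degrees)
HALF_WIDTH = 9.0                 # assumed half-width of the continuous window


def toy_energy(chi_deg):
    """Made-up energy: threefold torsion term plus a clash bump near 62 degrees."""
    torsion = 1.0 + math.cos(math.radians(3.0 * chi_deg))
    clash = 4.0 * math.exp(-(((chi_deg - 62.0) / 6.0) ** 2))
    return torsion + clash


def golden_section_min(f, lo, hi, tol=1e-6):
    """Locate a local minimum of f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0


def minimized_energy(center):
    """Best energy found inside the window around one rotamer.

    The rigid value is kept as a fallback, so the result can never be
    higher than the rigid-rotamer energy.
    """
    x = golden_section_min(toy_energy, center - HALF_WIDTH, center + HALF_WIDTH)
    return min(toy_energy(center), toy_energy(x))


# Map each rotamer to (rigid energy, minimized energy within its window).
results = {c: (toy_energy(c), minimized_energy(c)) for c in ROTAMERS}
for center, (rigid, continuous) in results.items():
    print(f"rotamer {center:7.1f}: rigid {rigid:.3f}  continuous {continuous:.3f}")
```

In this toy, the rotamer at 60 degrees sits almost on top of the clash, so its rigid energy is high, while a shift of a few degrees inside the window removes most of the penalty. A rigid model would discard that rotamer; a continuous model would not.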