The Curious Wavefunction: Strain Energies in Ligand Binding: Round Two- Fight!

Or why to be wary of ligands in the PDB, force field energies, and anybody who tells you not to be wary of these two

One of the longstanding questions in protein-ligand binding has been: what is the energy penalty that has to be paid in order for a protein to bind a ligand? Another question is: what is the strain energy that the ligand pays in order to be bound? Contrary to what one might initially think, the two questions are not the same. Strain energy is the price paid to twist the conformation of the ligand into the binding conformation. The free energy of binding accounts for everything else that has to be paid, in addition to the strain energy, in order to bind the ligand.

A few years ago, this question shot into the limelight because of a publication in J. Med. Chem. by Perola et al. from Vertex. The authors did a meticulous study of hundreds of ligands in their protein-bound complexes, some from the PDB and others proprietary. They used force fields to estimate the difference between the energy of the bound conformation of each ligand and that of the nearest local energy minimum conformation, that is, the strain energy penalty. For most ligands, they obtained strain energies ranging from 2 to 5 kcal/mol. But what raised eyebrows was that for a rather significant minority of ligands, the strain energies seemed to be more than 10 kcal/mol, and for some they seemed to be up to 20 kcal/mol.

These are extremely high numbers. To understand why this is so, consider a fact that I have frequently emphasized on this blog: the concentration of a particular conformation in solution is already very small (well under 1% at room temperature) if the free energy difference between it and a stable conformation is only about 3 kcal/mol. For a conformation to pay that much of an energy penalty in order to transform itself into the bound conformation would already be a stretch, considering its low concentration. For a conformation to pay an energy penalty of 20 kcal/mol does not make sense at all in this light, since such a conformation should be non-existent. Plus, think about the fact that hydrogen bonds usually contribute about 5 kcal/mol and that barriers of only up to about 20 kcal/mol can be crossed readily at room temperature, a figure significantly greater than the rotational barriers in most molecules, and this number for the strain energy penalty starts looking humongous. Where exactly would it come from?
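To put rough numbers on the population argument, here is a minimal sketch that treats the ligand as a two-state system (one stable conformation, one strained conformation) and applies the Boltzmann distribution. The two-state simplification, the temperature of 298.15 K, and the function names are my assumptions for illustration; they are not taken from either paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def relative_population(delta_g_kcal, temperature_k=298.15):
    """Boltzmann ratio [strained]/[stable] for a given free energy gap."""
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


def fraction_two_state(delta_g_kcal, temperature_k=298.15):
    """Fraction of molecules in the strained state, assuming only two states."""
    ratio = relative_population(delta_g_kcal, temperature_k)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Gaps chosen to match the values discussed in the post.
    for gap in (1.0, 3.0, 10.0, 20.0):
        print(f"{gap:5.1f} kcal/mol -> fraction {fraction_two_state(gap):.2e}")
```

With a 3 kcal/mol gap the strained state comes out at well under 1% of the population, and at 20 kcal/mol it is vanishingly small, which is the point of the paragraph above.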

Perola's paper generated a lot of buzz, which is a good thing. It was discussed by speakers at a conference in March last year that I attended. Now, a paper in J. Comp. Chem. seems to clear the air a little. In a nutshell, the authors conclude that the strain energies they have calculated seldom, if ever, surpass 2 kcal/mol. Needless to say, this is a huge difference compared to the earlier study.

Why such a startling difference? It seems that as always, the answer strongly depends on the method and the data.

First of all, the PDB is not as flawless as people assume it is. Most people who are crystallizing protein-ligand complexes are first and foremost interested in the structure of the protein. They often do a poor job of fitting ligands to the electron density; Gerard Kleywegt of Uppsala University has done some marvelous work on detecting errors in PDB ligands, and his review on this should be a must-read for all scientists even marginally connected with crystallography. Because of poor fits, conformations of ligands in the PDB can be completely unrealistic or, at the very least, brutally strained. Amides can be cis or non-planar, and more rarely planar aromatic rings can be deformed. There can be severe steric clashes which are not easily apparent. Quite naturally, relaxing such conformations leads to huge drops in energy. Therein lies the first source of the unrealistically large strain energy differences.

The second factor has to do with the vagaries and inadequacies of force fields, often unknown to crystallographers but known to experienced computational chemists. Force fields are quite poor at determining energies, and their results are especially skewed by an overemphasis on electrostatic interactions which the force fields are ill-equipped to damp. Now consider what happens when a ligand from a PDB entry that has both a positively and a negatively charged group in it is optimized. If you relax it to the nearest local energy minimum, these two groups will snap together and form a very strong ionic bond. This leads to a huge overstabilization of the minimized conformation, thus again giving the illusion of a large strain energy difference between the PDB conformation and the local minimum.
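A back-of-the-envelope sketch shows how large the undamped electrostatic term can get. It evaluates Coulomb's law for two unit point charges using the standard conversion factor of about 332.06 kcal·Å/(mol·e²), once with a dielectric constant of 1 (vacuum) and once with roughly 78.4 (an approximate value for bulk water). The uniform dielectric is a crude stand-in that I am assuming for illustration; it is not the continuum model used in the paper, and the 3 Å separation is a hypothetical distance.

```python
COULOMB_KCAL = 332.0637  # kcal*Angstrom/(mol*e^2), standard MM conversion factor


def coulomb_energy(q1, q2, distance_angstrom, dielectric=1.0):
    """Point-charge Coulomb energy in kcal/mol, screened by a uniform dielectric."""
    return COULOMB_KCAL * q1 * q2 / (dielectric * distance_angstrom)


if __name__ == "__main__":
    r = 3.0  # hypothetical separation after the charged groups snap together
    e_vacuum = coulomb_energy(+1.0, -1.0, r, dielectric=1.0)
    e_water = coulomb_energy(+1.0, -1.0, r, dielectric=78.4)  # approximate water value
    print(f"vacuum: {e_vacuum:8.1f} kcal/mol")
    print(f"water:  {e_water:8.1f} kcal/mol")
```

In vacuum the ion pair is worth on the order of a hundred kcal/mol, while with water-like screening it shrinks to a kcal/mol or so. An undamped force field minimization therefore rewards the collapsed conformation far too generously.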

Finally, the devil is in the details. In doing the initial refinement of the conformation, the earlier study used a constraint called the flat-bottom potential in optimizing the PDB ligands in their bound state. However, the flat-bottom potential, which exacts no penalty for atomic movement within a certain short distance and then suddenly ramps up the penalty, is not physically realistic. A better method might be to use a harmonic potential, which continuously and smoothly exacts a penalty that grows with the square of the atomic displacement.
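The difference between the two restraints is easy to see in a small sketch. The functional forms below are common textbook choices, and the force constant and tolerance are hypothetical values picked for illustration; the post does not give the exact forms or parameters used in either study.

```python
def harmonic_restraint(displacement, k=10.0):
    """Penalty grows smoothly from zero: 0.5 * k * d^2 (k in kcal/mol/A^2)."""
    return 0.5 * k * displacement ** 2


def flat_bottom_restraint(displacement, tolerance=0.3, k=10.0):
    """No penalty within +/- tolerance (Angstrom); harmonic wall beyond it."""
    excess = max(0.0, abs(displacement) - tolerance)
    return 0.5 * k * excess ** 2


if __name__ == "__main__":
    # Compare the two penalties at a few atomic displacements.
    for d in (0.0, 0.1, 0.3, 0.5, 1.0):
        print(f"d={d:.1f} A  harmonic={harmonic_restraint(d):6.3f}  "
              f"flat-bottom={flat_bottom_restraint(d):6.3f}")
```

Inside the tolerance the flat-bottom restraint lets atoms drift for free, so a ligand can relax noticeably away from the crystallographic coordinates without being charged for it, whereas the harmonic restraint penalizes every displacement from the start.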

The present study takes all these factors into account and also replaces the force field results with well-established quantum chemical energy determinations at the B3LYP/6-31G* level. The authors use this method to calculate the energies of the bound and local energy minimum conformations. Secondly, they use a well-established continuum solvation model (PCM), as implemented in the latest version of the Gaussian program, to account for damping effects due to solvation. Thirdly, as indicated above, they use the harmonic potential for optimization. Fourthly and most importantly, for the cases where the strain energy seems unusually high (and even there they apply a strict threshold: anything greater than 2 kcal/mol), the authors closely investigate the relevant PDB entries and find that indeed, the ligands were not fit well into the electron density and had unrealistically strained conformations.

Once they tackled these problems, the strain energies all fell to between 0.5 and 2 kcal/mol, which seems to be a realistic penalty that a conformation with a respectable concentration in solution could pay. There is now a second question: what is the maximum strain energy penalty that a ligand can pay to be transformed into the bound conformation? The authors are working on this question, and we will await their answer.

But this study reiterates two important lessons that should be remembered by anyone dealing with structure at all times:
1. Don't trust the PDB
2. Don't trust force field energies

Better still, as old Fox Mulder said, trust no one and nothing.

References:
1. Keith T. Butler, F. Javier Luque, Xavier Barril (2009). Toward accurate relative energy predictions of the bioactive conformation of drugs. Journal of Computational Chemistry, 30 (4), 601-610. DOI: 10.1002/jcc.21087

2. Emanuele Perola, Paul S. Charifson (2004). Conformational Analysis of Drug-Like Molecules Bound to Proteins: An Extensive Study of Ligand Reorganization upon Binding. Journal of Medicinal Chemistry, 47 (10), 2499-2510. DOI: 10.1021/jm030563w

3. A. Davis, S. St-Gallay, G. Kleywegt (2008). Limitations and lessons in the use of X-ray structural information in drug design. Drug Discovery Today, 13 (19-20), 831-841. DOI: 10.1016/j.drudis.2008.06.006

4 comments:

Anonymous 6:34 PM, January 27, 2009
Have the force fields described in the post been applied to the folding of RNA? If so, how well do they work? If not, have force fields been invented for RNA folding? If so, how well do they work? I have an ulterior motive for asking.

Retread

Wavefunction 8:08 AM, January 28, 2009
Although I have not personally used MM for nucleic acid modeling, here are some references that I thought were relevant from a PubMed search. Depending on what you want to accomplish, though, I would be strongly critical of using classical force fields to model the highly charged and ionic nucleic acids, because of the well-known force field inadequacies in modeling electrostatic interactions that I have mentioned in previous posts. Let me know if you can't get your hands on specific references.

Trans Hoogsteen/Sugar Edge Base Pairing in RNA. Structures, Energies, and Stabilities from Quantum Chemical Calculations.
Mládek A, Sharma P, Mitra A, Bhattacharyya D, Šponer J, Šponer JE.
J Phys Chem B. 2009 Jan 16. [Epub ahead of print]

Motifs in nucleic acids: molecular mechanics restraints for base pairing and base stacking.
Harvey SC, Wang C, Teletchea S, Lavery R.
J Comput Chem. 2003 Jan 15;24(1):1-9.
PMID: 12483670 [PubMed - indexed for MEDLINE]

Simulations of nucleic acids and their complexes.
Giudice E, Lavery R.
Acc Chem Res. 2002 Jun;35(6):350-7. Review.
PMID: 12069619 [PubMed - indexed for MEDLINE]

Development and current status of the CHARMM force field for nucleic acids.
MacKerell AD Jr, Banavali N, Foloppe N.
Biopolymers. 2000-2001;56(4):257-65.
PMID: 11754339 [PubMed - indexed for MEDLINE]
Anonymous 9:41 PM, February 01, 2009
"Amides can be cis or non-planar, and more rarely planar aromatic rings can be deformed"

LOL. This reminds me how I spent an inordinate amount of time trying to force the refinement of a very large ligand into a perfect plane, as it should be based on chemical considerations. (I even went to the trouble of calculating the lowest energy conformation in Gaussian.) Only after every attempt to do so failed did I wake up and realize the reality: it wasn't planar! It was clearly bent. And high-quality data at 1.45 A was telling me so all along. Yep, a conjugate of several planar aromatic rings was badly bent because of a tight interaction with the protein.

Trust me: PDB has plenty of errors of various kinds but it's not as full of shit as some people make it. Compared to the rest of biology, it's a paradise of hard-core trustable data! (I quit work in cell biology because of this).

One thing to keep in mind is that, owing to the limited resolution, the positions of most atoms in the majority of protein structures are uncertain to ~0.15-0.25 A. And when a ligand does not bind tightly, it is even less well defined. So it's just a matter of spotting those "too good to be true" cases.

Wavefunction 10:39 AM, February 05, 2009
I remember reading a rule of thumb somewhere which says that the uncertainty in atomic positions is a sixth of the resolution. So for a 2.4 A structure the uncertainty will be about 0.4 A/atom. This is not inconsiderable.
