The Curious Wavefunction: Can you at least get the solvation energy right?

Basic physical property measurement and prediction is not supported at the granting level and is considered too far from the issues directly affecting drug development to have been pursued by industry. This has left a critical gap in the basic scientific method that drives theoretical methods forward, that is, the observation, hypothesis, and testing methodology that Bacon, al-Haytham, and others championed and that Galileo applied to great effect in the formulative years of modern science...if basic physical science is supported in this area there is great potential for improvement and eventual achievement of long-desired goals of molecular modeling in the pharmaceutical industry- Prescient Soothsayers of Solvation

Sometimes it's a wonder computational predictions of protein and ligand activity work at all. Consider the number of factors we still don't have a good handle on: calculating protein conformational entropy is virtually beyond reach, calculating hydrogen bond strengths that depend intimately on the surrounding environment is still quite tricky, and calculating the favourable entropy gain from expelling water molecules from the active site is still a murky area.

But there are things even simpler than these which we have not learnt to calculate well. Foremost among these is a crucial factor influencing every instance of protein-ligand binding: the interaction of both partners with bulk water. If we can't even get the aqueous solvation energy right, can we make a statement about progress in modeling protein-ligand interactions at all? Water has probably been the most studied solvent for decades and dozens of water models have sprung up, none of which is significantly superior to the rest in calculating the properties of this deceptively simple liquid.

The two foremost implicit methods (as opposed to explicit solvent methods like MD) currently used for calculating solvation energy are the Born-based (generalized Born) method and ones based on the Poisson-Boltzmann equation. Calculating solvation energy will ultimately involve getting the basic science right. With this view in mind, a group from OpenEye and AstraZeneca narrate their successes and failures in SAMPL1, a blind test for calculating solvation energies of 56 druglike organic molecules. They do a fine job of investigating individual cases and discussing the effect of two crucial variables on the solvation energies: atomic radii (the solvation energy scales inversely with the radius) and, even more importantly, charges. The group essentially fiddle around with these two variables, modifying the charges and the atomic radii until they get the solvation energy about right. It's a classic case of both the virtues and pitfalls of parametrization, and it indicates that real parametrization should not involve blindly adding terms to get experimental agreement but should instead focus on the two or three scientifically most interesting and important variables.
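To see why radii and charges are the two knobs that matter, a minimal sketch of the textbook Born model for a single spherical ion helps. This is not the Poisson-Boltzmann machinery used in the paper; the function name and the default dielectric constant of 78.5 for water are my own illustrative choices.

```python
# Textbook Born model: a point charge at the center of a sphere,
# transferred from vacuum into a continuum dielectric.
# Illustrative sketch only; not the method used in the SAMPL1 paper.

COULOMB_KCAL = 332.0637  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def born_solvation_energy(charge, radius, eps_solvent=78.5):
    """Born solvation energy in kcal/mol.

    charge      -- net charge in units of the electron charge
    radius      -- Born radius in Angstrom (must be positive)
    eps_solvent -- dielectric constant of the solvent (assumed 78.5 for water)
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    return -0.5 * COULOMB_KCAL * charge**2 / radius * (1.0 - 1.0 / eps_solvent)


if __name__ == "__main__":
    for r in (1.0, 1.5, 2.0):
        print(f"q=1, r={r:.1f} A: {born_solvation_energy(1.0, r):8.2f} kcal/mol")
```

Halving the radius doubles the magnitude of the solvation energy and doubling the charge quadruples it, which is why small adjustments to either parameter can move the answer by more than the errors being chased.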

Believe it or not, there are a dozen different methods for calculating atomic charges in computational chemistry. Fixed charge models don't capture a very important phenomenon, polarization, that can profoundly affect bond strengths and especially hydrogen bond strengths. In real life charges on atoms don't stay constant in a changing environment. At the same time there is no one "correct" charge model, and as in the case of models in general, what matters ultimately is a model that works. In an earlier blind test, the group had used a particular quantum chemical method called AM1-BCC to calculate charges, and this gave them a mean error of about 2 kcal/mol in the solvation energy. AM1-BCC is a well-established semiempirical method that actually calculates slightly overpolarized charges, thus fortuitously and conveniently mimicking the change in charge distribution for a molecule as it transfers from the gas phase to the aqueous medium. In this paper the group calculate charges at the DFT level and find that this makes a significant difference for a large subset of the previous molecules.

Another interesting phenomenon investigated in the study is the effect of conformations on the calculation of solvation energy. The first axiomatic truth to realize is that molecules exist as several different conformations in both the gas and aqueous phases. But low energy conformations for a typical organic molecule in the gas phase will be very different from aqueous conformations. Conformations calculated in the gas phase are typically 'collapsed' and have oppositely charged polar groups too close for comfort because of the lack of intervening solvent that would usually break them up. If you want to use only one conformation per phase for a solvation energy calculation, you would use a collapsed gas phase conformation and a relatively extended aqueous phase conformation. Ideally though you should be more realistic and use multiple conformations. The study examined the effect of using multiple conformations to calculate the vacuum and aqueous phase partition functions and hence the solvation free energy. Interestingly, the results obtained with multiple conformations are generally worse than the results obtained with single conformations! The likely culprit is added noise introduced by unrealistic calculated conformations. The authors also find, not surprisingly, that using different charges for different conformations of the same molecule can make a difference, although not much. At the same time charges for certain atoms don't change much if the atoms are buried; a failure to realize this leads to two screaming outliers, which however provide a good opportunity to learn what's wrong.
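For readers who want to see what "multiple conformations" means in practice, here is a minimal sketch of the standard statistical-mechanics recipe: Boltzmann-sum the conformer energies in each phase and take the difference of the resulting free energies. It is not necessarily the authors' exact protocol; the function names are hypothetical, and the aqueous energies are assumed to already include each conformer's solvation contribution.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def ensemble_free_energy(energies, temperature=298.15):
    """Free energy -RT ln(sum_i exp(-E_i/RT)) of a set of conformers (kcal/mol)."""
    if not energies:
        raise ValueError("need at least one conformer energy")
    rt = R_KCAL * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    z = sum(math.exp(-(e - e_min) / rt) for e in energies)
    return e_min - rt * math.log(z)


def solvation_free_energy(gas_energies, aq_energies, temperature=298.15):
    """Solvation free energy as the aqueous minus the gas phase ensemble free energy."""
    return (ensemble_free_energy(aq_energies, temperature)
            - ensemble_free_energy(gas_energies, temperature))


if __name__ == "__main__":
    # Made-up energies in kcal/mol, purely to exercise the functions.
    gas = [0.0, 1.2, 2.5]
    aqueous = [-6.0, -5.8, -4.9]
    print(f"single conformer: {solvation_free_energy(gas[:1], aqueous[:1]):.2f}")
    print(f"ensemble:         {solvation_free_energy(gas, aqueous):.2f}")
```

With one conformer per phase the expression collapses to a simple energy difference. Every extra conformer adds to a partition function, so a spurious low-energy conformation in either phase shifts the result, which is one way the noise mentioned above can creep in.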

There are several interesting paragraphs on how the authors played with the atomic radii and the charges and how they explained and were puzzled by outliers. In the end, a particular combination of DFT charges along with a particular combination of radii (termed ZAP10 radii) provided the smallest error in calculation of solvation energies. Interestingly some radii had to be maintained at their default Bondi radii values (which are derived from crystal data) in order to work well.

What I like about this study is that it is told from the real-time viewpoint and illustrates the calculation as it actually evolved. The pitfalls and the possibilities are cogently explored. Certain functional groups and atom types seem to perform better than others. It is clear that much care is devoted to understanding the basic science.

The basic science is also going to involve the accurate experimental determination of solvation energies. Such measurements are typically considered too mundane and basic to be funded. And yet, as the authors make clear in the paragraph quoted at the beginning, it's only such measurements that are going to aid the calculation of aqueous solvation energies. And these calculations are going to be ultimately key to calculating drug-protein interactions. After all, if you cannot even get the solvation energy right...

Nicholls, A., Wlodek, S., & Grant, J. (2009). The SAMPL1 Solvation Challenge: Further Lessons Regarding the Pitfalls of Parametrization. The Journal of Physical Chemistry B, 113 (14), 4521-4532. DOI: 10.1021/jp806855q

2 comments:

Jean-Claude Bradley, 7:23 AM, September 03, 2009
How much of this is applicable to non-aqueous solvation?
Wavefunction, 1:48 PM, September 03, 2009
Good question. For one thing there's not much prediction about non-aqueous solvents because of the paucity of experimental data. I can't see why the theoretical framework itself should not be applicable to any solvent though.
